PredTox is a program for estimating molecular environmental properties, primarily aquatic physical properties and toxicity endpoints, using a Linear Solvation Energy Relationship (LSER) model. Input compounds are written in SMILES, a linear notation for representing molecular structures. For each compound, the output consists of five computed LSER model variable values (Vi, Pi*, Beta, Alpha, and Delta, called vector values). These values are used to evaluate a set of multiple linear regression equations that give the compound's estimated aquatic properties, including numerous physical properties and toxicity endpoints.
Each compound is analyzed and broken down into a set of constituent fragments and aromatic rings. PredTox contains a built-in set of fragments and aromatic rings, each with its own vector values; the vector values for a compound are computed by summing those of its constituents. For each fragment, one of several sets of values is chosen depending on whether the fragment is adjacent to an aliphatic or aromatic ring, is embedded in an aliphatic ring, or shares a ring with other fragments.
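The summation described above can be sketched in Python as follows. This is a minimal illustration, not PredTox's actual data structures: the `Env` labels, the fragment and ring names, and every number in the tables are hypothetical placeholders chosen only to show how a context-dependent value set is picked for each fragment before summing.

```python
from enum import Enum
from typing import Dict, List, Tuple

# Five LSER vector values in the order (Vi, Pi*, Beta, Alpha, Delta).
Vector = Tuple[float, ...]


class Env(Enum):
    """Hypothetical labels for the structural contexts a fragment can be in."""
    ADJ_ALIPHATIC = "adjacent to aliphatic"
    ADJ_AROMATIC = "adjacent to aromatic ring"
    IN_ALIPHATIC_RING = "embedded in aliphatic ring"
    SHARED_RING = "shares a ring with other fragments"


# Placeholder tables: these are NOT real PredTox vector values.
FRAGMENTS: Dict[str, Dict[Env, Vector]] = {
    "frag_A": {
        Env.ADJ_ALIPHATIC: (0.125, 0.25, 0.5, 0.0, 0.0),
        Env.ADJ_AROMATIC: (0.125, 0.5, 0.25, 0.0, 0.0),
    },
    "frag_B": {
        Env.ADJ_ALIPHATIC: (0.25, 0.0, 0.125, 0.25, 0.0),
        Env.IN_ALIPHATIC_RING: (0.25, 0.125, 0.125, 0.25, 0.0),
    },
}
AROMATIC_RINGS: Dict[str, Vector] = {
    "ring_X": (0.5, 0.5, 0.125, 0.0, 1.0),
}


def add(a: Vector, b: Vector) -> Vector:
    """Component-wise sum of two vectors."""
    return tuple(x + y for x, y in zip(a, b))


def total_vector(fragments: List[Tuple[str, Env]], rings: List[str]) -> Vector:
    """Sum the value set selected for each fragment, then add each ring."""
    total: Vector = (0.0, 0.0, 0.0, 0.0, 0.0)
    for name, env in fragments:
        total = add(total, FRAGMENTS[name][env])
    for ring in rings:
        total = add(total, AROMATIC_RINGS[ring])
    return total
```

For example, `total_vector([("frag_A", Env.ADJ_AROMATIC), ("frag_B", Env.IN_ALIPHATIC_RING)], ["ring_X"])` adds one value set per fragment plus the ring's values. Keying each fragment's table by context keeps the choice of value set separate from the summation itself.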
The purpose of this document is to describe the overall organization of PredTox and some of its algorithms and data structures. The intended audience is those who wish to gain a better understanding of the program logic. The PredTox Theory of Operations document listed below provides the details for using the software. Also listed are references to the LSER theory and operations.
PredTox supports both batch and interactive modes of operation. Interactive mode supports multiple documents, each working with a single compound, as well as multiple views of the same document for more convenient viewing of a long analysis. An interactive document consists of two connected panes: a left-hand selection pane and a right-hand output pane. The selection pane is used to select the compound to be processed and to specify any processing options. The output pane shows the result of the molecular analysis for the selected compound. Interactive analysis output may be printed in the usual way. In batch mode, PredTox processes an input file of compounds and stores an output file for subsequent viewing or printing; the number of compounds that may be processed in batch mode is limited only by the amount of disk space available to hold the input and output files.
In both modes, compounds may be selected either by giving a CAS number or a name or partial name that is matched against a database supplied with PredTox, or via a SMILES string. In interactive mode, all compounds that contain the given partial name are presented in a dialog box; the user selects one compound from the list, which is then analyzed. In batch mode, all matches to a partial name are processed.
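The difference between the two modes when a partial name matches several compounds can be sketched as below. This is only an illustration under stated assumptions: the `Compound` record, the three-entry `DATABASE`, the function names, and the case-insensitive substring match are all hypothetical, and the `choose` callback stands in for the interactive dialog box. Direct SMILES input is not covered here.

```python
from typing import Callable, List, NamedTuple, Optional


class Compound(NamedTuple):
    cas: str
    name: str
    smiles: str


# Tiny stand-in for the compound database supplied with the program.
DATABASE: List[Compound] = [
    Compound("71-43-2", "benzene", "c1ccccc1"),
    Compound("108-90-7", "chlorobenzene", "Clc1ccccc1"),
    Compound("64-17-5", "ethanol", "CCO"),
]


def find_matches(query: str, database: List[Compound] = DATABASE) -> List[Compound]:
    """Return compounds whose CAS number equals the query, or else whose
    name contains it (case-insensitive matching is an assumption)."""
    q = query.strip().lower()
    by_cas = [c for c in database if c.cas == q]
    if by_cas:
        return by_cas
    return [c for c in database if q in c.name.lower()]


def select_interactive(
    query: str, choose: Callable[[List[Compound]], Compound]
) -> Optional[Compound]:
    """Interactive mode: present all matches and analyze the one chosen."""
    matches = find_matches(query)
    if not matches:
        return None
    return choose(matches)


def select_batch(query: str) -> List[Compound]:
    """Batch mode: every match to the partial name is processed."""
    return find_matches(query)
```

With this sketch, `select_batch("benzene")` yields both benzene and chlorobenzene, while `select_interactive("benzene", choose)` yields only the single compound that `choose` returns.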
The level of detail for each analysis may be varied, to include or exclude detailed per-fragment or per-ring vector values, or to include or exclude the regression calculations.
Regression parameters are specified in a regression parameter file; this file may be changed to run the same regressions with new parameter values, or to specify new regressions.
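As a rough illustration of why keeping regression parameters outside the program is useful, the sketch below treats each regression as plain data: an intercept plus one coefficient per vector value. The `Regression` type, the name `property_1`, and all numbers are hypothetical, and the actual layout of the regression parameter file is not shown here. The sketch also assumes a plain linear combination of the five vector values.

```python
from typing import Dict, NamedTuple

# Names of the five LSER vector values used as regression variables.
VARIABLES = ("Vi", "Pi*", "Beta", "Alpha", "Delta")


class Regression(NamedTuple):
    intercept: float
    coefficients: Dict[str, float]  # one coefficient per variable name


def evaluate(reg: Regression, vector: Dict[str, float]) -> float:
    """Evaluate intercept + sum(coefficient * vector value)."""
    return reg.intercept + sum(
        reg.coefficients.get(name, 0.0) * vector[name] for name in VARIABLES
    )


# Placeholder parameters, as they might look after being read from a file.
regressions = {
    "property_1": Regression(
        1.0, {"Vi": 2.0, "Pi*": -0.5, "Beta": 0.25, "Alpha": 0.0, "Delta": 1.0}
    ),
}

# Placeholder vector values for one compound.
vector = {"Vi": 0.5, "Pi*": 1.0, "Beta": 2.0, "Alpha": 0.25, "Delta": 0.0}

for name, reg in regressions.items():
    print(name, evaluate(reg, vector))
```

Because `evaluate` knows nothing about any particular property, replacing an entry in `regressions` or adding a new one changes the results without touching the evaluation code, which mirrors the role the parameter file plays.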
Vector values for the built-in fragments may be overridden via a vector value override file. Overridden values are marked in the program output.
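One way to picture the override mechanism is as a merge of two tables that remembers which entries were replaced, so the output can flag them. The following is a sketch under assumptions: the function names, the in-memory dictionaries, and the `*` marker are hypothetical, and the format of the override file and of the real output marking are not represented.

```python
from typing import Dict, Tuple

# Five LSER vector values in the order (Vi, Pi*, Beta, Alpha, Delta).
Vector = Tuple[float, ...]


def apply_overrides(
    builtin: Dict[str, Vector], overrides: Dict[str, Vector]
) -> Dict[str, Tuple[Vector, bool]]:
    """Return each fragment's effective values plus an 'overridden' flag."""
    merged: Dict[str, Tuple[Vector, bool]] = {}
    for name, values in builtin.items():
        if name in overrides:
            merged[name] = (overrides[name], True)
        else:
            merged[name] = (values, False)
    return merged


def format_row(name: str, values: Vector, overridden: bool) -> str:
    """Format one output row; '*' (a hypothetical marker) flags an override."""
    mark = "*" if overridden else " "
    numbers = " ".join(f"{v:7.3f}" for v in values)
    return f"{mark} {name:<10} {numbers}"


# Placeholder values only, not real PredTox data.
builtin = {
    "frag_A": (0.125, 0.25, 0.5, 0.0, 0.0),
    "frag_B": (0.25, 0.0, 0.125, 0.25, 0.0),
}
overrides = {"frag_B": (0.3, 0.0, 0.125, 0.25, 0.0)}

for name, (values, overridden) in apply_overrides(builtin, overrides).items():
    print(format_row(name, values, overridden))
```

Carrying the flag alongside the values, rather than overwriting the built-in table in place, is what makes it possible to mark overridden entries at output time.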
PredTox was developed using Microsoft Visual Studio 97 running on Windows 95 and tested on Windows XP.
Send questions and comments on this program to Dr. James P. Hickey, USGS Great Lakes Science Center, 1451 Green Road, Ann Arbor, MI 48105, firstname.lastname@example.org.