See Introduction to DMax Assistant™ for a general introduction to DMax Assistant™ technology.
The screenshots further down this page illustrate the key functionalities of DMax Chemistry Assistant™. The illustrations are based on data taken from the NCI human tumor cell line growth screen (COLO 205 Colon). In this experiment, "growth inhibition of the COLO 205 human Colon tumor cell line is measured as a screen for anti-cancer activity" (NCI). The value "logGI50" is the base-10 logarithm of the concentration required for 50% inhibition of growth (with the concentration expressed in mol/L, a logGI50 of -6 corresponds to 1 µM).
DMax Chemistry Assistant™ complements your screening technology with hypothesis formulation technology. The formulated hypotheses:
- are natural language expressions that relate molecule structure to observations such as bioactivity and toxicity;
  - these values can be either numerical (e.g., logGI50) or categorical (e.g., high, medium, low)
  - multiple observations can be related to molecule structure simultaneously, even a mixture of categorical and numerical values (e.g., high activity and low toxicity)
- are generated from
  - the molecular 2D structures (e.g., SDfiles)
  - one or more observations of interest (numerical and/or categorical)
  - optional: any other properties of the molecules you want to include
- do not require descriptors or fingerprints: DMax Chemistry Assistant™ has the unique ability to start from individual functional groups and rings, and to construct hypotheses that combine these building blocks with expressions such as:
  - is connected to
  - is fused to
  - the ring has 2 substituents at distance ..., where the first substituent is ... and the second substituent is ...
  - is linked via a conjugated system to
- are automatically validated on a separate test set;
- are combined into a predictive model that ranks new compound collections on the basis of predicted target values;
  - multiple values can be predicted simultaneously
  - predicted values can be exported to an Excel or text file, or can be saved with the molecule structures in SDfile format
- are illustrated using color codes that match the text with the molecule drawing
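As a rough illustration of the inputs listed above: in an SDfile, observations such as logGI50 are usually stored as data items that follow each structure block. The sketch below is not part of DMax Chemistry Assistant™. The function name `read_sd_data_items`, the compound names, and the values are hypothetical, and the parser only handles the simple case of `> <name>` headers followed by value lines and a blank line.

```python
def read_sd_data_items(sdf_text):
    """Return one dict of data items (field name -> text value) per SDfile record."""
    records = []
    # Records in an SDfile are terminated by a line containing "$$$$".
    for block in sdf_text.split("$$$$"):
        if not block.strip():
            continue
        lines = block.splitlines()
        # Data items come after the structure block, which ends with "M  END".
        start = 0
        for idx, line in enumerate(lines):
            if line.startswith("M  END"):
                start = idx + 1
                break
        items = {}
        i = start
        while i < len(lines):
            line = lines[i]
            if line.startswith(">") and "<" in line:
                # Header line such as "> <logGI50>": take the text between < and >.
                name = line[line.index("<") + 1 : line.rindex(">")]
                i += 1
                value_lines = []
                # The value runs until the next blank line.
                while i < len(lines) and lines[i].strip():
                    value_lines.append(lines[i])
                    i += 1
                items[name] = "\n".join(value_lines)
            i += 1
        records.append(items)
    return records


# Two made-up records; the structure blocks are left empty for brevity.
sample = "\n".join([
    "hypothetical-1", "", "",
    "  0  0  0  0  0  0  0  0  0  0999 V2000",
    "M  END",
    "> <logGI50>", "-6.3", "",
    "> <toxicity>", "low", "",
    "$$$$",
    "hypothetical-2", "", "",
    "  0  0  0  0  0  0  0  0  0  0999 V2000",
    "M  END",
    "> <logGI50>", "-4.8", "",
    "$$$$",
    "",
])

for record in read_sd_data_items(sample):
    print(record)
```

Values are returned as text, so a numerical observation would still need a `float(...)` conversion, while a categorical one (e.g., "low") can be used as is.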
After the data have been loaded, the first step is to select the Background Knowledge and to choose "COLO 205 Colon: logGI50" as the target value.
Click the Create button to start the automated formulation of hypotheses.
Below is an example of an automatically generated hypothesis that explains (with very high confidence) low values for "COLO 205 Colon: logGI50". Note that lower values are preferred, since they indicate that the compound is effective at lower concentrations. NCI classifies compounds with logGI50 below -6 as "active"; those compounds are effective at nanomolar concentrations.
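To make the threshold concrete, the following minimal sketch converts a logGI50 value back to a concentration and applies the "below -6" rule quoted above. It assumes that logGI50 is the base-10 logarithm of a molar concentration. The function names and the two input values are illustrative only and are not part of DMax Chemistry Assistant™.

```python
def gi50_molar(log_gi50):
    """Convert logGI50 (assumed: log10 of a molar concentration) to mol/L."""
    return 10.0 ** log_gi50


def is_active(log_gi50, threshold=-6.0):
    """Apply the 'logGI50 below -6' activity rule quoted in the text."""
    return log_gi50 < threshold


# Illustrative values only.
for value in (-5.96, -6.5):
    micromolar = gi50_molar(value) * 1e6
    print(f"logGI50 {value}: {micromolar:.2f} µM, active={is_active(value)}")
```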
Below are the statistics (obtained on a separate test set) for the hypothesis in isolation.
On the basis of all generated hypotheses (for low and high values), a model is constructed for the prediction of "COLO 205 Colon: logGI50". The performance of that model (again, on a separate test set) is shown below.
Finally, we can use the generated hypotheses to rank a new collection, e.g., "nci_sample" in the example below.
Notice in the left panel that the new compound "524367" is predicted to have a "COLO 205 Colon: logGI50" value of -5.96. The hypotheses underlying that prediction are listed in the bottom panel. In this case, just one hypothesis applies.
The examples underlying this hypothesis are shown in the right panel. For instance, one of the examples from our NCI data set that supports the hypothesis is "477482".
Notice that color codes link the ranked molecules in the left panel to the hypothesis text in the bottom panel and to the reference molecules in the right panel.
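Conceptually, ranking a collection on predicted logGI50 amounts to sorting compounds so that the lowest predicted value comes first. The sketch below shows only that ordering step and makes no claim about how DMax Chemistry Assistant™ computes its predictions. Apart from compound "524367" and its predicted value of -5.96, the compound names and values are made up, and `rank_by_predicted_value` is a hypothetical helper.

```python
predictions = {
    "524367": -5.96,          # predicted value quoted in the text
    "hypothetical-A": -4.80,  # made-up compound and value
    "hypothetical-B": -6.45,  # made-up compound and value
}


def rank_by_predicted_value(predicted):
    """Sort (compound, value) pairs so the lowest predicted logGI50 comes first."""
    return sorted(predicted.items(), key=lambda pair: pair[1])


ranked = rank_by_predicted_value(predictions)
for position, (compound, value) in enumerate(ranked, start=1):
    print(position, compound, value)
```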
© 2002-2007 PharmaDM, NV. All rights reserved.