AKos
Consulting & Solutions GmbH

a chemoinformatics company
 

Abstract: We present a program for the prediction of biological activity spectra for drug-like organic substances. New lead compounds can be found on the basis of predicted biological activity spectra. In-house and Internet versions of the PASS program are discussed.

Keywords: biological activity spectrum, computer-aided prediction, computer system PASS (Prediction of Activity Spectra for Substances), applications in computer-aided drug discovery, prediction via Internet.

Introduction

Most known biologically active substances have many different biological activities, which cause both the main (therapeutic) and supplementary (side) actions. Some of these activities are found during the initial preclinical study; others are unfortunately found too late, in clinical trials. Sometimes additional activities are discovered many years after the first launch of a drug and become the basis for a new therapeutic application.

Most computer-aided drug-discovery methods are used to study a single activity, or only a few activities, of a compound class [1-5]. A program that simultaneously predicts pharmacological effects, mechanisms of action, and specific toxicities on the basis of the 2D chemical structure is the tool of choice for getting an early indication of whether a compound could be a potential lead.

Victor Avidon proposed this idea more than 35 years ago [6, 7]. The technology was originally developed and tested on new chemical compounds synthesized in the USSR [8, 9] in the framework of the national registration system of the USSR. The program has been revised several times. The theoretical analysis went through several approaches, and the accumulated experience of finding new leads allows constant improvement [10-14].

The PASS team continuously collects and evaluates information about new pharmaceutical substances and lead compounds in order to update the PASS training set and to extend the predictive abilities of PASS to new chemical classes and novel biological activities:

 

 


Figure 1. Increase of the number of compounds over years that are abstracted for the knowledge base

Figure 2. Increase of the number of predictable activities over years

The current version of PASS predicts about 4130 pharmacological effects, mechanisms of action, and other effects (see Table 1) [15]. We provide a list of activities (the present list is not complete).

In the following we describe the methods used in PASS, how you can evaluate PASS yourself on the Internet or with an evaluation version, and examples of practical applications.

 

 

 

 

 


 

Number | Area | Examples
261 | pharmacotherapeutic actions | Anxiolytic
66 | anti-infective actions | Antileishmanial
72 | actions blocking a certain process | Apoptosis antagonist
40 | actions stimulating a certain process | Apoptosis agonist
140 | actions blocking the activity of a certain endogenous substance | Acetylcholine antagonist
71 | actions simulating the activity of a certain endogenous substance | Acetylcholine agonist
5 | actions blocking the release of a certain endogenous substance | Cytochrome C release inhibitor
9 | actions stimulating the release of a certain endogenous substance | Acetylcholine release stimulant
9 | actions blocking the uptake of a certain endogenous substance | Adenosine uptake inhibitor
2219 | actions inhibiting a certain enzyme | 12-Lipoxygenase inhibitor
41 | actions stimulating the action of a certain enzyme | ATPase stimulant
268 | actions blocking a certain receptor | 5-Hydroxytryptamine 1 antagonist
121 | actions stimulating a certain receptor | 5-Hydroxytryptamine 1 agonist
28 | actions blocking a certain channel | Chloride channel antagonist
5 | actions stimulating a certain channel | Calcium channel agonist
28 | actions blocking a certain transporter | GABA transporter 1 inhibitor
128 | being a substrate of a certain metabolic enzyme | CYP3A4 substrate
24 | actions inhibiting a certain metabolic enzyme | CYP3A4 inhibitor
13 | actions inducing a certain metabolic enzyme | CYP3A4 inducer
28 | actions inhibiting a certain protein | Collagen inhibitor
8 | actions inhibiting the expression of a certain transcription factor | Transcription factor Rho inhibitor
2 | actions stimulating the expression of a certain transcription factor | TP53 expression enhancer
389 | actions that cause a certain adverse/toxic effect | Carcinogen

Table 1. List of biological effects

Presentation of biological activities in PASS

Let us define biological activity as the result of a compound's interaction with a biological entity. In clinical studies the entity is the human organism. In preclinical testing it can be animals (in vivo) or experimental models (in vitro). The biological activity depends on a compound's structure, charge distribution, physico-chemical properties, and more. The activity also depends on the biological entity (species, sex, age, etc.) and on the mode of treatment (dose, route, etc.). Any biologically active compound reveals a wide spectrum of different effects. Some of them are useful in the treatment of diseases, but others cause various side and toxic effects. All activities caused by the compound together are considered to be the "biological activity spectrum of the substance".

If the experimental conditions cannot be defined narrowly, i.e. if the differences in species, sex, age, dose, route, etc. are neglected, the biological activity can be identified only qualitatively ("yes"/"no", "active"/"inactive"). Thus, the "biological activity spectrum" is defined as an "intrinsic" property of a compound that depends only on its structure and physico-chemical characteristics. A qualitative presentation allows the integration of information about biologically active compounds collected from many different sources into the general PASS training set. Any property of chemical compounds that is determined by their structural features can be used for prediction by PASS. It has been shown that the applicability of PASS is broader than the prediction of biological activities. For instance, this approach was successfully used for the prediction of a general property of organic molecules such as drug-likeness (Anzali et al., 2001).

Chemical structure description in PASS.

The 2D structure of compounds is chosen as the basis for the description of the chemical structure because this is the only information available at the early stage of research. Thus, using the structural formula as input data, one can obtain the estimates of biological activity profiles even for virtual molecules, prior to their chemical synthesis and biological testing.

Many different characteristics of chemical compounds can be calculated on the basis of the 2D structure. The earliest versions of PASS (Poroikov et al., 1993; Filimonov et al., 1995; Filimonov and Poroikov, 1996) used the Substructure Superposition Fragment Notation (SSFN) codes (Avidon et al., 1982). However, SSFN, like many other structural descriptors, reflects a human abstraction of the chemical structure rather than the nature of the ligand-target interactions, which are the molecular mechanisms of biological activities.

 

The Multilevel Neighbourhoods of Atoms (MNA) descriptors (Filimonov et al., 1999) have certain advantages in comparison with SSFN. These descriptors are based on a molecular structure representation that includes the hydrogen atoms according to the valences and partial charges of the other atoms and does not specify the types of bonds. MNA descriptors are generated as a recursively defined sequence:

  • the zero-level MNA descriptor for each atom is the notation A of the atom itself;

  • any next-level MNA descriptor for the atom is the substructure notation A(D1D2...Di...), where Di is the previous-level MNA descriptor for the i-th immediate neighbour of the atom A.

The notation A of the atom may include not only the atomic type but also additional information about the atom. In particular, if the atom is not part of a ring, it is marked by "-". The neighbour descriptors D1D2...Di are arranged in a unique lexicographic order. The iterative generation of MNA descriptors can be continued to cover the first, second, etc. neighbourhoods of each atom.

The molecular structure is represented in PASS by the set of unique MNA descriptors of the 1st and 2nd levels (Figure 3). Substances are considered to be equivalent in PASS if they have the same set of MNA descriptors. Since MNA descriptors do not represent the stereochemical features of a molecule, substances whose structures differ only stereochemically are formally considered equivalent.

1st level: HC, HO, CHCC, CHCN, CCCC, CCOO, NCC, OHC, OC

2nd level: C(C(CC-H)C(CC-C)-H(C)), C(C(CC-H)C(CN-H)-H(C)), C(C(CC-H)C(CN-H)-C(C-O-O)), C(C(CC-H)N(CC)-H(C)), C(C(CC-C)N(CC)-H(C)), N(C(CNH)C(CNH)), H(C(CCH)), H(C(CNH)), H(O(HC)), -C(C(CC-C)-O(-H-C)-O(-C)), O(H(O)C(COO)), O(C(COO))

Figure 3. Structural formula of nicotinic acid and its MNA descriptors of the 1st (left column) and 2nd (right column) levels
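The recursive definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PASS implementation: hydrogens must be listed explicitly by the caller, ring membership is supplied as a flag rather than perceived from the structure, the "unique lexicographic order" is approximated by Python's default string ordering, and the function names `mna_descriptors` and `equivalent` are hypothetical.

```python
def mna_descriptors(atoms, bonds, in_ring, max_level=2):
    """Return a list of descriptor sets, one per level from 0 to max_level.

    atoms   -- element symbols, hydrogens listed explicitly
    bonds   -- (i, j) index pairs; bond types are ignored
    in_ring -- booleans, True if the atom belongs to a ring
    """
    neighbours = [[] for _ in atoms]
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)

    # Zero level: the atom notation A, with "-" marking non-ring atoms.
    labels = [sym if ring else "-" + sym for sym, ring in zip(atoms, in_ring)]
    current = list(labels)
    levels = [set(current)]
    for _ in range(max_level):
        # Next level: A(D1D2...Di), neighbour descriptors in sorted order.
        current = [
            labels[i] + "(" + "".join(sorted(current[j] for j in neighbours[i])) + ")"
            for i in range(len(atoms))
        ]
        levels.append(set(current))
    return levels


def equivalent(levels_a, levels_b):
    """Structures count as equivalent if their 1st- and 2nd-level sets match."""
    return levels_a[1:3] == levels_b[1:3]


# Methanol, CH3OH, with explicit hydrogens and no ring atoms.
atoms = ["C", "O", "H", "H", "H", "H"]
bonds = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5)]
levels = mna_descriptors(atoms, bonds, [False] * 6)
print(sorted(levels[1]))
```

Because only the set of descriptors is kept, two stereoisomers with the same connectivity produce identical sets, which matches the equivalence rule stated above.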

New QNA (Quantitative Neighbourhoods of Atoms) descriptors were recently developed, which allow the analysis of quantitative structure-activity relationships (Filimonov et al., 2009).

Mathematical Approach

The PASS algorithm for predicting the biological activity spectrum is based on Bayesian estimates of the probabilities that a molecule belongs to the classes of active and inactive compounds, respectively. The mathematical method is described in several publications (Lagunin et al., 2000; Stepanchikova et al., 2003; Poroikov and Filimonov, 2005; Filimonov and Poroikov, 2006; Filimonov and Poroikov, 2008), and its details will not be discussed here.

Since the main purpose of PASS is the prediction of activity spectra for new molecules, the general principle of the PASS algorithm is to exclude from the SAR Base (knowledge base) those substances that are equivalent to the substance under prediction.

The structure for which the PASS prediction is to be carried out is presented as a molfile (for a set of molecules, as an SDFile). The predicted activity spectrum is presented in PASS as a list of activities with the probabilities "to be active" Pa and "to be inactive" Pi calculated for each activity (Figure 6). The list is arranged in descending order of Pa-Pi; thus, the more probable activities appear at the top of the list. Only activities with Pa>Pi are considered possible for a particular compound. The list can be shortened at any desired cutoff value, but Pa>Pi is used by default. If the user chooses a rather high value of Pa as the cutoff for the selection of probable activities, the chance of confirming the predicted activities by experiment is high too, but many existing activities will be lost. For instance, if Pa>90% is used as a cutoff, about 90% of the real activities will be lost; for Pa>80%, the portion of lost activities is 80%, etc.

It is necessary to keep in mind that the probability Pa reflects the similarity of the molecule under prediction to the structures of the molecules that are most typical of the subset of "actives" in the training set. Therefore, there is usually no direct correlation between the Pa values and quantitative characteristics of the activities.

Even an active and potent compound whose structure does not resemble the typical structures of "actives" from the training set may obtain a low Pa value during the prediction (even negative Pa-Pi values may be observed). This follows from the way the estimates are constructed: the Pa values for "actives" and the Pi values for "inactives" are distributed uniformly.
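The selection and ordering rule can be written down compactly. The sketch below is an illustration only: the function name `rank_spectrum` and the tuple layout are assumptions, three of the sample rows reuse Pa and Pi values reported for Cavinton in Table 2, and the fourth row is an invented entry with Pa < Pi that shows the default filter.

```python
def rank_spectrum(predictions, pa_cutoff=0.0):
    """Keep activities with Pa > Pi (and Pa above an optional cutoff),
    sorted by descending Pa - Pi.

    predictions -- iterable of (activity_name, pa, pi) tuples
    """
    kept = [
        (name, pa, pi)
        for name, pa, pi in predictions
        if pa > pi and pa > pa_cutoff
    ]
    return sorted(kept, key=lambda row: row[1] - row[2], reverse=True)


predictions = [
    ("Spasmolytic", 0.540, 0.039),
    ("Peripheral vasodilator", 0.929, 0.004),
    ("Insulin promoter", 0.417, 0.238),
    ("Hypothetical activity", 0.100, 0.300),  # invented row, Pa < Pi
]

for name, pa, pi in rank_spectrum(predictions):
    print(f"{pa:.3f} {pi:.3f} {name}")

# A stricter cutoff shortens the list further.
shortlist = rank_spectrum(predictions, pa_cutoff=0.5)
```

Raising `pa_cutoff` trades completeness for confidence, as described above: fewer activities survive, but those that do are more likely to be confirmed.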


 

Taking this into account, the following interpretation of prediction results is possible. If, for instance, Pa=0.9, then for 90% of the "actives" from the training set the corresponding estimates are lower than for this compound, and for only 10% of the "actives" they are higher. If one declines the suggestion that this compound is active, one will make a wrong decision with a probability of 10%.

If Pa > 0.7, the chance of finding the activity experimentally is high. In many cases, however, the compound may turn out to be a close analogue of known pharmaceutical agents.

If 0.5 < Pa < 0.7, the chance of finding the activity experimentally is lower, but the compound is probably not so similar to known pharmaceutical agents.

If Pa < 0.5, more than half of the "actives" from the training set have higher estimates for this activity than the compound under prediction. If one declines the suggestion that this compound is active, one will make a wrong decision with a probability of less than 0.5. In such a case the probability of confirming this kind of activity in the experiment is small, but if it is confirmed, the chance is more than 50% that this structure has NOT been reported with this activity and might be a valuable lead compound.

If the predicted biological activity spectrum is wide, the structure of the compound is quite simple and does not contain features that are responsible for the selectivity of its biological action.
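As a compact summary of these three bands, the following sketch maps a Pa value to the interpretation given above. The function name `interpret_pa` is hypothetical, and the handling of the exact boundary values 0.5 and 0.7, which the text leaves open, is an assumption.

```python
def interpret_pa(pa):
    """Map a Pa value to the qualitative interpretation described in the text.

    Boundary values (exactly 0.5 or 0.7) fall into the lower band here;
    that choice is an assumption, not something the source specifies.
    """
    if pa > 0.7:
        return "high chance of confirmation; possibly a close analogue of known agents"
    if pa > 0.5:
        return "lower chance of confirmation; probably less similar to known agents"
    return "small chance of confirmation; if confirmed, possibly a novel lead"


for value in (0.9, 0.6, 0.3):
    print(value, "->", interpret_pa(value))
```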

If it appears that the structure under prediction contains several new MNA descriptors (in comparison with the descriptors from the compounds of the training set), then the structure has low similarity with any structure from the training set, and the results of prediction should be considered as rather rough estimates.

Based on these criteria, one may choose which activities to test for the studied compounds as a compromise between the novelty of the expected pharmacological action and the risk of obtaining a negative result in experimental testing. Certainly, one could also take into account a particular interest in some kinds of activity, the available experimental facilities, etc.

We have developed a special application, CWM Lead Finder, which uses clustering algorithms to match the biological activity spectra of a set of compounds with known biological activity against those of a set of untested compounds.

 

 

Mathematical Method

The accuracy and efficiency of more than 200 different mathematical approaches were tested to select the most relevant algorithms [16]. One of the methods that provides a satisfactory quality of prediction is described below in more detail.

Definitions:

$n$ is the total number of compounds in the training set;
$n_i$ is the number of compounds that have the descriptor $i$;
$n_j$ is the number of compounds that reveal the activity $j$;
$n_{ij}$ is the number of compounds that have both the descriptor $i$ and the activity $j$;

$p_j = n_j/n$ is the estimate of the a priori probability of the activity $j$;
$p_{ij} = n_{ij}/n_i$ is the estimate of the conditional probability of the activity $j$ for the descriptor $i$;
$m$ is the number of descriptors for the compound under prediction;
$r_i = n_i/(n_i + 0.5/m)$ is a regulating factor;

$Pr_j$ is the initial estimate of the probability of the activity $j$ for the compound under prediction;
$CP_j$ is the cutting point;
$E_{1j}(CP_j)$ is the estimate of the 1st kind error probability;
$E_{2j}(CP_j)$ is the estimate of the 2nd kind error probability;

The 1st kind error is observed when the compound under prediction is actually active but Prj < CPj;
The 2nd kind error is observed when the compound under prediction is actually inactive but Prj > CPj.

LOO is the leave-one-out procedure.

For each compound in the training set, the values n and ni (for the descriptors i of this compound) are changed to n-1 and ni-1, and, when the compound has the activity j, nj and nij are also changed to nj-1 and nij-1; the estimates Prj are then calculated.

MEP is the maximal error of prediction (see below).


Algorithm of Prediction

Structural descriptors are generated for the compound under prediction. The following values are calculated for each activity:

$$u_j = \sum_i \arcsin\{r_i(2p_{ij}-1)\}, \qquad v_j = \sum_i \arcsin\{r_i(2p_j-1)\}$$

$$s_j = \sin(u_j/m), \qquad t_j = \sin(v_j/m)$$

$$Pr_j = \frac{1}{2}\left(1 + \frac{s_j - t_j}{1 - s_j t_j}\right)$$
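The three formulas translate directly into code. The sketch below assumes that the counts defined earlier are already available for the compound under prediction; the function name `initial_estimate` and the input layout are hypothetical, and descriptors that do not occur in the training set (a descriptor count of zero) are simply rejected, since the text does not say how they are handled.

```python
import math


def initial_estimate(n, n_j, descriptor_counts):
    """Initial estimate Pr_j for one activity j.

    n                 -- total number of compounds in the training set
    n_j               -- number of compounds that reveal activity j
    descriptor_counts -- one (n_i, n_ij) pair per descriptor of the compound
    """
    m = len(descriptor_counts)
    p_j = n_j / n
    u = v = 0.0
    for n_i, n_ij in descriptor_counts:
        if n_i <= 0:
            raise ValueError("descriptor absent from the training set")
        r_i = n_i / (n_i + 0.5 / m)   # regulating factor, always below 1
        p_ij = n_ij / n_i
        u += math.asin(r_i * (2 * p_ij - 1))
        v += math.asin(r_i * (2 * p_j - 1))
    s = math.sin(u / m)
    t = math.sin(v / m)
    return (1 + (s - t) / (1 - s * t)) / 2


# Descriptors as frequent among actives as in the whole set give 0.5.
print(initial_estimate(100, 50, [(10, 5), (20, 10)]))
# A descriptor enriched among actives pushes the estimate above 0.5.
print(initial_estimate(100, 10, [(10, 9)]))
```

Because the regulating factor keeps every arcsine argument strictly inside (-1, 1), both s and t stay inside (-1, 1) as well, so the denominator 1 - s·t never vanishes and the result lies between 0 and 1.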

Validation criteria: The LOO estimates of $Pr_j$ are calculated for each compound in the training set. The estimates of $E_{1j}(CP_j)$ and $E_{2j}(CP_j)$ are calculated for each activity. The cross point $CP_j^*$, at which

$$E_{1j}(CP_j^*) = E_{2j}(CP_j^*),$$

is determined. The maximal error of prediction MEP is:

$$MEP_j = E_{1j}(CP_j^*) = E_{2j}(CP_j^*)$$

Results of the prediction:

The probability to be active is:

$$P_a = E_{1j}(Pr_j)$$

The probability to be inactive is:

$$P_i = E_{2j}(Pr_j)$$

The result of the prediction is presented as the list of activities with the corresponding Pa and Pi, restricted to activities with Pa-Pi > 0 and sorted in descending order of the difference Pa-Pi.
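Read together with the error definitions, Pa is the share of training-set "actives" whose leave-one-out estimate lies below the estimate of the new compound, and Pi is the share of "inactives" whose estimate lies above it. A minimal sketch, assuming the LOO estimates are available as plain lists and that the error probabilities are taken as empirical fractions (the function names and the sample numbers are invented for illustration):

```python
def first_kind_error(cut, active_estimates):
    """E1: fraction of known actives whose LOO estimate falls below the cut."""
    return sum(e < cut for e in active_estimates) / len(active_estimates)


def second_kind_error(cut, inactive_estimates):
    """E2: fraction of known inactives whose LOO estimate lies above the cut."""
    return sum(e > cut for e in inactive_estimates) / len(inactive_estimates)


def pa_pi(pr, active_estimates, inactive_estimates):
    """Pa = E1(Pr) and Pi = E2(Pr) for a compound with initial estimate pr."""
    return (
        first_kind_error(pr, active_estimates),
        second_kind_error(pr, inactive_estimates),
    )


# Invented LOO estimates for one activity.
actives = [0.6, 0.7, 0.8, 0.9]
inactives = [0.1, 0.2, 0.3, 0.65]
print(pa_pi(0.75, actives, inactives))  # (0.5, 0.0)
```

In this toy example half of the actives score below 0.75 and none of the inactives score above it, so the compound would be listed with Pa = 0.5 and Pi = 0.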

 

Process of PASS Development

Figure 4. The Process of PASS Development


The Training Set

The current PASS training set consists of about 270,000 biologically active compounds, including already launched drugs and drug candidates under clinical or advanced preclinical testing. Since 1972 this training set has been compiled from many sources, including publications, patents, databases, private communications, etc. For the majority of the compounds included in the training set, the biological activity spectrum was studied in detail.

In PASS Pro the customer can easily create a custom training set. A training set consists of an SDFile with the field activity_prediction. This file is read into PASS. It takes about 5 minutes to read a training set of 1000 compounds.
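To check such a file before loading it, the data field can be read with a few lines of standard-library Python. The sketch relies only on the general SDFile layout (records separated by `$$$$`, data items introduced by a `> <field>` header and terminated by a blank line). How PASS expects the activity names to be written inside activity_prediction is not stated here, so the one-name-per-line layout and the helper name `read_sd_field` are assumptions.

```python
def read_sd_field(sd_text, field="activity_prediction"):
    """Return, for each SD record, the list of non-empty lines in `field`."""
    records = []
    for record in sd_text.split("$$$$"):
        if not record.strip():
            continue  # skip the empty tail after the last separator
        values = []
        capture = False
        for line in record.splitlines():
            if line.startswith(">") and "<" in line:
                # Data header line such as "> <activity_prediction>".
                capture = f"<{field}>" in line
                continue
            if capture:
                if line.strip():
                    values.append(line.strip())
                else:
                    capture = False  # a blank line ends the data item
        records.append(values)
    return records


# Two skeleton records with empty molblocks; the names are placeholders.
sample = """mol1
  sketch

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <activity_prediction>
Vasodilator
Spasmolytic

$$$$
mol2
  sketch

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <activity_prediction>
Antihypoxic

$$$$
"""
print(read_sd_field(sample))
```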

Validation of PASS

The quality of prediction can be estimated by leave-one-out cross-validation (LOO CV). Each compound in turn is removed from the training set, and its activity spectrum is predicted on the basis of the remaining part of the training set. The result is compared with the known activity of the compound, and the maximal error of prediction (MEP) is calculated and averaged over all compounds and activities.

The average accuracy of prediction is about 95.3% according to the LOO CV estimation, while for different kinds of activity the prediction accuracy varies from 70.7% (Antineoplastic, Myeloid leukemia) to 99.9% (p21-activated kinase 1 inhibitor).

Accuracy of PASS Prediction

The accuracy of PASS predictions depends on several factors, of which the quality of the training set seems to be the most important. A perfect training set would include comprehensive information about the biological activities known or possible for each compound. In other words, the whole biological activity spectrum would have to be thoroughly investigated for each compound included in the PASS training set. Unfortunately, no database exists with information about biologically active compounds tested against every kind of biological activity. Therefore, the information about the known biological activities of any compound is always incomplete.

 

 

 

We investigated the influence of this incompleteness on the prediction accuracy for new compounds. About 20000 "principal compounds" from the MDDR database (SYMYX MDL) were used to create heterogeneous training and evaluation sets. In turn, 20, 40, 60 and 80% of the information was excluded at random from the training set. Either structural data or biological activity data were removed in two separate computer experiments. In both cases it was shown that even if up to 60% of the information is excluded, the results of prediction are still satisfactory (Poroikov et al., 2000). Thus, despite the incompleteness of the information in the training set, the PASS algorithm is robust enough to give reasonable prediction results.

PASS predictions were performed for about 250000 molecules from the Open NCI database (Poroikov et al., 2003). This information is presented at the NCI web-site (http://cactus.nci.nih.gov/ncidb2/) in a searchable mode. One can combine different terms in a query using Boolean operators. For example, with the query "Angiogenesis inhibitor AND Pa>0.9 AND Pi<0.2 NOT acid NOT amide" we identified 85 hits. Seven compounds were tested at NCI, and four showed angiogenesis inhibitory activity at a level of approximately 10-100 µM (Poroikov et al., 2003). Also, on the basis of the results of anti-HIV testing of compounds from the Open NCI database, we estimated that by using PASS predictions one could significantly (up to 17 times) increase the fraction of "actives" in the selected subset (Poroikov et al., 2003).

 PASS on the Internet

PASS INet service (http://www.ibmc.msk.ru/PASS) provides the possibility for any registered user to obtain PASS predictions free-of-charge (Lagunin et al., 2000; Sadym et al., 2003; Filimonov and Poroikov, 2006; Geronikaki et al., 2008a). The user obtains the PASS predictions by submitting a molfile or drawing the structure with a Marvin applet.

By January 1st, 2010 the number of registered users exceeded 5000, and over 115000 predictions had been obtained. Based on the prediction results, researchers select the most promising substances for chemical synthesis and biological testing. Comparison of PASS prediction results for different chemical series with various kinds of biological activity provides independent validation. Currently, about thirty independent papers have been published in which the agreement of PASS predictions with experiment is described. For example, thanks to PASS predictions, new antileishmanial agents were found among the 2-substituted 6-nitro- and 6-amino-benzothiazoles (Delmas et al., 2002), 7-substituted 9-chloro and 9-amino-2-methoxyacridines (Di Giorgio et al., 2003), and beta-carboline alkaloids (Di Giorgio et al., 2004); new anxiolytics were found among quinazolines (Goel et al., 2005), thiazoles, pyrazoles, isatins, a-fused imidazoles and other chemical series (Geronikaki et al., 2004); new anti-inflammatory agents were found among substituted amides and hydrazides of dicarboxylic acids (Dolzhenko et al., 2003) and 1-acylaminoalkyl-3,4-dialkoxybenzene derivatives (Labanauskas et al., 2005); etc. (for a review, see Geronikaki et al., 2008a).

 

 

Also, on the basis of PASS predictions new antihypertensive and antiinflammatory agents with dual mechanisms of actions were discovered (Lagunin et al., 2003; Geronikaki et al., 2008b), which demonstrated the capability of PASS in finding multitargeted agents exhibiting additive/synergistic effects. PASS applications for predicting biological activity spectra of organic molecules including known drug substances are described in detail (Poroikov et al., 2001; Poroikov and Filimonov, 2002; Poroikov et al., 2007).

PASS INet, however, does not provide the full functionality of the commercial version of PASS. In particular, an earlier version of the SAR Base is implemented in PASS INet, the program predicts a smaller number of biological activities, and only a single molecule, supplied as a molfile, can be submitted at a time. In the commercial version of PASS (Figure 5) SDFiles are used. Further analysis of the prediction results is done with PharmaExpert.

We also provide continuous support for commercial licenses, answering questions and supplying the latest versions of PASS as they appear.

 

 

 

 

 


 

Figure 5. PASS user interface and example of prediction results (displayed in a graphic mode)

 

 

In the commercial version of PASS the user can evaluate the contribution of each atom in a molecule to the required biological activity (Figure 6).

The color of each atom depends on the contribution of the atom to the activity.

Green              Pa = 1, Pi = 0

Red                 Pa = 0, Pi = 1

Blue                Pa = 0, Pi = 0

Grey               Pa = 0.33, Pi = 0.33

Thus, Green means a positive impact of a particular fragment on the activity; Red means a negative impact of a particular fragment on the activity; Blue and Grey mean a neutral impact of a particular fragment on the activity. Based on this information, a medicinal chemist can modify the structure in order to increase the probability of the desired pharmacological activity or decrease the probability of toxic action.
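One simple mapping that reproduces all four reference colours listed above is to use Pi as the red channel, Pa as the green channel, and the remainder 1 - Pa - Pi as the blue channel. Whether PASS computes its atom colours this way is not stated, so the sketch below is only an assumption that happens to be consistent with the legend; the function name `atom_colour` is hypothetical.

```python
def atom_colour(pa, pi):
    """Return an (R, G, B) tuple in 0..255 for one atom.

    Assumed mapping: red = Pi, green = Pa, blue = 1 - Pa - Pi.
    """
    red, green = pi, pa
    blue = max(0.0, 1.0 - pa - pi)
    return tuple(round(255 * channel) for channel in (red, green, blue))


print(atom_colour(1.0, 0.0))    # green
print(atom_colour(0.0, 1.0))    # red
print(atom_colour(0.0, 0.0))    # blue
print(atom_colour(0.33, 0.33))  # three nearly equal channels, i.e. grey
```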

 

Figure 6. Influence of particular atoms in a molecule on a particular activity (antihypertensive in this example).

 

 

PharmaExpert

PharmaExpert (Poroikov et al., 2005; PharmaExpert Program Package, 2006) is a tool for the analysis of PASS predictions. It was developed to analyze the biological activity spectra of substances predicted by the PASS program. The software provides a flexible mechanism for selecting compounds with the required biological activity profiles. The different kinds of biological activity are divided into six classes: mechanisms of action, pharmacological effects, toxic/adverse effects, metabolic terms, transporter terms and gene expression terms.

PharmaExpert analyzes the "mechanism-effect(s)" and "effect-mechanism(s)" relationships, identifies probable drug-drug interactions for pairs of molecules, and searches for molecules with the required activity profile(s) and/or acting on multiple targets (Figure 7). The analysis is based on a knowledge base of "mechanism-effect(s)" relationships that has been collected from the literature for more than 12 years and currently includes about 8000 relationships.

PharmaExpert also generates reports allowing users to prepare automatically the analysis of biological activity profiles for a set of compounds.


Figure 7. Example of PharmaExpert search for antineoplastic multitargeted ligands

 

Revealing New Effects and Mechanisms of Action  

The approach is illustrated below by the prediction of the biological activity spectrum for the well-known cerebrotonic drug Cavinton (Vinpocetine), which was launched by Gedeon Richter (Hungary) more than twenty years ago. Its structure and predicted biological activity spectrum are given below.

Cavinton has been used in medical practice for twenty years. The many activities found in preclinical testing and clinical trials during this period are compared with the result of the prediction. According to the available literature, only 16 of the 47 predicted activities of Cavinton have been found so far. These activities are marked by "+" in Table 2.

In particular, PASS predicts the vasodilator and spasmolytic activities (Pa=0.855 and 0.540). This corresponds to the well-known pharmacological effects of Cavinton: it causes vasodilatation and increases brain blood flow and metabolism. Antihypoxic and Antiischemic effects are also predicted for Cavinton (Pa=0.700 and 0.656, respectively), and Cavinton is used for these purposes. Cavinton is predicted to be a Lipid peroxidase inhibitor (Pa=0.650), an agent for cognition disorders treatment (0.648), an agent for acute neurological disorders treatment (0.577), etc. Cavinton has all these activities.

The predicted biological activity spectrum of Cavinton suggests several new applications of the substance. Among them are: Multiple sclerosis treatment (Pa=0.900); Antineoplastic enhancer (0.812), Antineoplastic Alkaloid (0.225) and Antitumor-Cytostatic (0.236); Antiparkinsonian rigidity-relieving (0.271) and Antiparkinsonian tremor-relieving (0.243); etc. While Multiple sclerosis treatment and Antineoplastic enhancer are predicted with high probability, the other additionally predicted activities have relatively small values of Pa.

Similarly, the predicted activity spectrum for any compound provides ideas for further testing. As a result, new effects and mechanisms may be found for old substances. By varying the cutoff value of Pa, one may choose the desired level of novelty versus the acceptable risk of a negative result.

No | Pa | Pi | Activity | Experiment | Reference
1 | 0.929 | 0.004 | Peripheral vasodilator | |
2 | 0.900 | 0.000 | Multiple sclerosis treatment | |
3 | 0.855 | 0.005 | Vasodilator | + | [17, 18]
4 | 0.844 | 0.003 | Abortion inducer | + | [17]
5 | 0.812 | 0.001 | Antineoplastic enhancer | |
6 | 0.760 | 0.006 | Coronary vasodilator | + | [19]
7 | 0.732 | 0.007 | Spasmogenic | |
8 | 0.700 | 0.036 | Antihypoxic | + | [17, 20, 21]
9 | 0.650 | 0.004 | Lipid peroxidase inhibitor | + | [22, 23]
10 | 0.648 | 0.008 | Cognition disorders treatment | + | [17, 24, 25]
11 | 0.656 | 0.021 | Antiischemic | + | [17, 26-28]
12 | 0.577 | 0.013 | Acute neurologic disorders treatment | + | [17, 18]
13 | 0.540 | 0.039 | Spasmolytic | + | [18]
14 | 0.519 | 0.026 | Antianginal agent | |
15 | 0.486 | 0.037 | Antihypertensive | + | [18]
16 | 0.449 | 0.035 | Antiarrhythmic | + | [29]
17 | 0.432 | 0.063 | Sympatholytic | |
18 | 0.438 | 0.077 | Sedative | + | [18]
19 | 0.500 | 0.152 | Antiinflammatory, Pancreatic | |
20 | 0.328 | 0.020 | Antidepressant, Imipramin-like | |
21 | 0.300 | 0.010 | Thrombolytic | + | [17, 18, 20]
22 | 0.342 | 0.075 | Psychotropic | + | [18]
23 | 0.276 | 0.023 | Alpha 2 adrenoreceptor antagonist | + | [30]
24 | 0.273 | 0.029 | Anesthetic intravenous | |
25 | 0.547 | 0.304 | Vascular (peripheral) disease treatment | |
26 | 0.225 | 0.006 | Antineoplastic Alkaloid | |
27 | 0.291 | 0.086 | Cholinergic antagonist | |
28 | 0.263 | 0.066 | Benzodiazepine agonist partial | |
29 | 0.417 | 0.238 | Insulin promoter | |

Table 2. Predicted biological activity spectrum for Cavinton

 

Determining Relevant Screens for a Particular Compound

Testing can be organized in descending order of the difference (Pa-Pi) for the different activities. For example, Cavinton should be studied in the following tests: Peripheral vasodilator (0.929-0.004), Multiple sclerosis treatment (0.900-0.000), Vasodilator (0.855-0.005), Abortion inducer (0.844-0.003), Antineoplastic enhancer (0.812-0.001), Coronary vasodilator (0.760-0.006), etc.

In this case both the safety and the efficacy of a new compound will be characterized more comprehensively. Moreover, the economic viability of such an approach to testing has been shown to be more than 500% [32].

Selecting the Most Prospective Compounds for High-Throughput Screening

Sometimes one is interested in activities that are not yet included in PASS, and the data needed to train one's own knowledge base for PASS Pro are not available. In such cases two other strategies are suitable.

The first strategy is based on the hypothesis that the more activities are predicted for a compound, the higher is the chance of finding some useful pharmacological action for this compound. For each compound the following value is calculated: $P = \frac{1}{n}\sum \frac{P_a}{P_a+P_i}$, where n is the number of biological activities under consideration and the sum runs over these activities.

All compounds are arranged in descending order of their P values, and only the compounds with the highest values of P are selected for screening.

The second strategy is based on the hypothesis that the more "novel" a compound is, the higher is the probability of finding a new chemical entity (NCE). Thus, the compounds with the highest number of new descriptors are selected.
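The first strategy can be sketched as follows. The function name `mean_activity_score`, the compound identifiers and the Pa/Pi numbers are invented for illustration; every pair is assumed to satisfy Pa + Pi > 0.

```python
def mean_activity_score(spectrum):
    """P = (1/n) * sum of Pa / (Pa + Pi) over the n activities considered.

    spectrum -- list of (pa, pi) pairs, each with pa + pi > 0
    """
    return sum(pa / (pa + pi) for pa, pi in spectrum) / len(spectrum)


# Invented spectra for three hypothetical compounds.
library = {
    "cmpd-A": [(0.9, 0.1), (0.5, 0.5)],
    "cmpd-B": [(0.2, 0.6), (0.3, 0.3)],
    "cmpd-C": [(0.8, 0.2), (0.6, 0.2)],
}

# Arrange compounds in descending order of P and keep the top two.
ranked = sorted(library, key=lambda name: mean_activity_score(library[name]), reverse=True)
selected = ranked[:2]
print(selected)
```

How many top-ranked compounds to keep is a practical decision that depends on the screening capacity; the value two is used here only to keep the example short.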
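The second strategy reduces to counting, for each candidate, the descriptors that never occur in the training set. A minimal sketch, assuming descriptors are available as strings; the function name, the compound identifiers and the descriptor strings are placeholders:

```python
def count_new_descriptors(compound_descriptors, training_descriptors):
    """Number of distinct descriptors of a compound absent from the training set."""
    return len(set(compound_descriptors) - set(training_descriptors))


# Placeholder descriptor strings standing in for MNA descriptors.
training = {"C(CC-H)", "C(CN-H)", "N(CC)", "-H(C)"}
candidates = {
    "cmpd-X": ["C(CC-H)", "N(CC)", "-H(C)"],
    "cmpd-Y": ["C(CC-H)", "S(CC)", "-F(C)"],
    "cmpd-Z": ["C(CN-H)", "S(CC)"],
}

# Most "novel" compounds first.
by_novelty = sorted(
    candidates,
    key=lambda name: count_new_descriptors(candidates[name], training),
    reverse=True,
)
print(by_novelty)
```

Note that this pulls in the opposite direction from the reliability remark made earlier: a structure with several new descriptors has low similarity to the training set, so its predicted spectrum is only a rough estimate.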

Both strategies were tested on datasets including 10,000 - 70,000 compounds and their efficacy is shown [31].

Another approach is to use CWM Lead Finder.


 

Experimental Verification

PASS predictions have been confirmed by experiment. Some examples are given below.

Activity spectra were predicted for 300 new chemical compounds synthesized at the Chemical-Pharmaceutical Research Institute (Novokuznetsk). Twenty compounds were selected for testing as probable antiulcer agents. Nine compounds were synthesized and tested. Potent antiulcer activity was found for 5 of these compounds. These new antiulcer agents are NCEs [33]. The economic advantage in this study is about (300/20) × 100% = 1500%.

Activity spectra were predicted for 520 new chemical compounds synthesized at the Institute of Organic Chemistry of the Russian Academy of Sciences (Moscow). Fourteen compounds were selected for testing as the most promising. The results of 22 experiments on 5 different kinds of activity agreed with the predictions in 20 cases. The accuracy of prediction is thus about 90%.

Based on the predicted biological activity spectra for about 20 macroheterocyclic compounds, 2 antitumor leads were found [34].

New antibacterial agents were found on the basis of the biological activity spectra predicted for derivatives of 1-amino-4-(5-aryloxazolyl-2)-butadienes-1,3 [35].

Analgesic, antiinflammatory, antioxidant and some additional activities were predicted and confirmed by experiment for some thiazole derivatives [36].

Benefits of PASS

In silico screening in the early stages of the research. Only a 2D structure is required as input for PASS.  

Reasonable accuracy of prediction. The average accuracy of prediction in leave-one-out cross-validation (for ~205,000 compounds and ~3750 kinds of biological activity from the PASS training set) is about 95%. The PASS algorithm produces rather robust estimates of structure-activity relationships despite the incompleteness of the training set (Poroikov et al., 2000).

PASS parameters represent the biological space. PASS represents the properties of molecules in biological space in contrast to many other descriptors, which reflect the structural properties of molecules. PASS parameters can be used for clustering of compounds according to their biological properties, not according to their structural similarity. 
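
As an illustration of comparing compounds in biological space, the sketch below measures the similarity of two predicted activity profiles rather than of two structures. It assumes that each compound is represented by a vector of Pa values over the same ordered list of activities and uses cosine similarity as one possible measure; the choice of measure, the profiles and the names are assumptions made for this example, not the method built into PASS.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equally long activity profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero profile is treated as similar to nothing
    return dot / (norm_u * norm_v)


# Hypothetical Pa values for three activities (same order in every profile).
profiles = {
    "cmpd_A": [0.9, 0.1, 0.0],
    "cmpd_B": [0.8, 0.2, 0.1],
    "cmpd_C": [0.0, 0.1, 0.9],
}
sim_ab = cosine_similarity(profiles["cmpd_A"], profiles["cmpd_B"])
sim_ac = cosine_similarity(profiles["cmpd_A"], profiles["cmpd_C"])
print(f"A vs B: {sim_ab:.2f}, A vs C: {sim_ac:.2f}")
```

A pairwise similarity of this kind could be passed to any standard clustering procedure; compounds A and B would fall into the same cluster regardless of how different their structures are.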

Predictions are rather fast. Calculating biological activity spectra for 10,000 compounds on an ordinary PC takes about 5 min; PASS can therefore be used effectively to analyze databases of millions of structures.
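
A rough estimate of the run time for a large database follows directly from the quoted figure. The sketch assumes that the time scales linearly with the number of compounds, which the text does not state; the function name is illustrative.

```python
# Throughput taken from the text: 10,000 compounds in about 5 minutes.
COMPOUNDS_PER_MINUTE = 10_000 / 5


def estimated_minutes(n_compounds):
    """Estimated run time in minutes, assuming linear scaling."""
    return n_compounds / COMPOUNDS_PER_MINUTE


minutes = estimated_minutes(1_000_000)
print(f"1,000,000 compounds: about {minutes:.0f} min ({minutes / 60:.1f} h)")
```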

Standard structure format is used. Standard SDFile or molfile formats (MDL/Symyx) are used as input for PASS.
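
For readers unfamiliar with the SDFile layout: it is a concatenation of molfile blocks, each record ending with a line containing "$$$$", and the first line of each block holds the molecule name. The sketch below splits such a file into records with plain Python. It illustrates the file layout only and is not the reader used by PASS; the sample records are toy entries with empty atom and bond blocks.

```python
def iter_sdf_records(text):
    """Yield each record of an SDFile as a string, without the $$$$ delimiter line."""
    record = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            yield "\n".join(record)
            record = []
        else:
            record.append(line)
    if any(ln.strip() for ln in record):
        yield "\n".join(record)  # tolerate a missing final delimiter


def molecule_name(record):
    """Return the first header line of a record, which holds the molecule name."""
    lines = record.splitlines()
    return lines[0].strip() if lines else ""


# Two toy records (hypothetical names, no atoms or bonds).
sample_sdf = (
    "mol_1\n"
    "  hypothetical\n"
    "\n"
    "  0  0  0  0  0  0  0  0  0  0999 V2000\n"
    "M  END\n"
    "$$$$\n"
    "mol_2\n"
    "  hypothetical\n"
    "\n"
    "  0  0  0  0  0  0  0  0  0  0999 V2000\n"
    "M  END\n"
    "$$$$\n"
)
names = [molecule_name(rec) for rec in iter_sdf_records(sample_sdf)]
print(names)
```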

Only an ordinary PC is necessary. PASS and PharmaExpert run on a personal computer under Windows NT/XP/Vista/Windows 7.

Limitations

Naturally, the PASS approach has some limitations. They are:

  • The PASS approach can be applied only to so-called "drug-like" substances.
  • PASS can be applied only to activities for which the training set includes at least 5 active compounds.
  • The accuracy of PASS predictions is significantly higher than that of a random guess. However, PASS cannot predict the activity spectrum for essentially new compounds that have no descriptor in the training set.
  • In some cases PASS predicts both agonist and antagonist (stimulator and blocker) actions simultaneously. In such cases only experiment can clarify the intrinsic activity of the compound, but the compound probably has an affinity for the corresponding receptor (enzyme).
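
Two of these limitations lend themselves to simple automated checks. The sketch below assumes that the training data are available as (compound, activity) pairs, each listed once, and that the user supplies a list of opposing activity pairs; all names, activity labels and data are hypothetical.

```python
from collections import Counter

MIN_ACTIVES = 5  # threshold quoted in the list of limitations


def predictable_activities(training_records):
    """Return the activities with at least MIN_ACTIVES active compounds.

    training_records: iterable of (compound_id, activity_name) pairs.
    """
    counts = Counter(activity for _, activity in training_records)
    return {activity for activity, n in counts.items() if n >= MIN_ACTIVES}


def conflicting_predictions(predicted, opposite_pairs):
    """Return the opposing pairs whose two members are both predicted for one compound."""
    return [(a, b) for a, b in opposite_pairs if a in predicted and b in predicted]


# Hypothetical training records: five antiulcer actives, one analgesic active.
records = [(f"c{i}", "antiulcer") for i in range(5)] + [("c9", "analgesic")]
usable = predictable_activities(records)

# Hypothetical prediction for one compound.
predicted = {"dopamine agonist", "dopamine antagonist", "analgesic"}
pairs = [("dopamine agonist", "dopamine antagonist")]
conflicts = conflicting_predictions(predicted, pairs)
print(usable, conflicts)
```

A flagged pair does not invalidate the prediction; following the last point above, it marks the compound as a candidate for an experiment that determines its intrinsic activity.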

Acknowledgments

We gratefully acknowledge MDL Information Systems, Inc. for providing ISIS/Host, ISIS/Base and the MDDR database used in this study.

This is an edited version of the original paper of Prof. Vladimir Poroikov, A. Kos 3.2.03, revised May 30, 2010.

References by numbers

[1] Wermuth C.G., ed., Medicinal chemistry in practice, Academic Press, London, 1996, 968 p.p.

[2] Van de Waterbeemd H., ed., Structure-property correlations in drug research, Landes, Austin, 1996, 210 p.p.

[3] Dean P.M., Molecular similarity in drug design, Blackie Academic, London, 1995,

[4] Livingstone D., Data analysis for chemists. Applications to QSAR and Chemical Product Design, Oxford Science Publ., Oxford, 1995, 239 p.p.

[5] Kubinyi H., ed., 3D QSAR in drug design, Escom, Leiden, 1993, 759 p.p.

[6] Avidon V., Criteria for similarity assessment of chemical structures and the basics of informational language for development of informational-logical system on biologically active compounds. Chem. & Pharmaceut. J. (Rus.), 1974, 8 (8), 22-25.

[7] Piruzyan L.A., Avidon V.V., Rozenblit A.B., et al. Statistical analysis of the information file on biologically active compounds. I. Data base on the structure and activity of biologically active compounds. Chem. & Pharmaceut. J. (Rus.), 1977, 11 (4), 35-40.

[8] Piruzyan L.A., Rudzit E.A. The methodical approaches to study biological activity of chemical compounds. Chem. & Pharmaceut. J. (Rus.), 1976, 10 (8), 21-27.

[9] Burov Yu.V., Korolchenko L.V., Poroikov V.V. National system for registration and biological testing of chemical compounds: facilities for new drugs' search. Bull. Natl. Center for Biologically Active Compounds (Rus.), 1990, No. 1, 4-25.

[10] Filimonov D.A., Poroikov V.V., Karaicheva E.I., et al. (1995). Computer-aided prediction of biological activity spectra of chemical substances on the basis of their structural formulae: computerized system PASS. Experimental and Clinical Pharmacology (Rus), 58 (2), 56-62.

[11] Filimonov D.A., Poroikov V.V. PASS: Computerized prediction of biological activity spectra for chemical substances. Bioactive Compound Design: Possibilities for Industrial Use, BIOS Scientific Publishers, Oxford, 1996, p.47-56.

[12] Poroikov V.V., Filimonov D.A. Computerized prediction of biological activity spectra for chemical substance - new approach to effective drug design. In: QSAR and Molecular Modelling Concepts, Computational Tools and Biological Applications. Barcelona: Prous Science Publishers, 1996, p.49-50.

[13] Poroikov V.V., Filimonov D.A., Stepanchikova A.V., et al. Optimization of synthesis and pharmacological testing of new compounds based on computerized prediction of their biological activity spectra. Chem. & Pharmaceut. J. (Rus), 1996, 30 (9), 20-23. (English translation by Consultants Bureau, New York: Pharmaceutical Chemistry Journal, 1996, 30 (9), 570-573).

[14] Poroikov V.V. PASS, a program for the prediction of activity spectra from molecular structure. Newsletter of The QSAR and Modelling Society, 1997, No. 8, 12-15.

[15] Gloriozova T.A., Filimonov D.A., Lagunin A.A., Poroikov V.V. Testing of computer system for prediction of biological activity spectra PASS on the set of new chemical compounds. Chem. & Pharmaceut. J. (Rus), 1996, In press.

[16] Filimonov D.A. Comparison of Algorithms for Computer Prediction of Biological Activity Spectra for Chemical Compounds on the Basis of Their Structural Formulae. II Rus. Natl. Congress "Man and Drugs", Moscow, Abstracts, 1995, 62-63.

[17] Summary of Cavinton (Vinpocetine) Gedeon Richter, Budapest-Hungary, 1994-06-07.

[18] Mashkovskii M.D. The Pharmaceuticals, Medicine, Moscow, 1997, v.1, 399-400.

[19] VIDAL. Pharmaceuticals in Russia. Moscow, AstraPharmService, 1997.

[20] Kiss B., Karpati E. Acta Pharm. Hung., 1996, 66 (5), 213-224.

[21] Plotnikova T.M., Plotnikov M.V., Bazhenova T.G. Bull. Exp. Biol. Med., 1991, 111 (2), 170-172.

[22] Karmazsin L., Olah V. A., Balla G., Makay A. Acta Paediatr. Hung. 1990, 30 (2), 217-224.

[23] Suno M., Nagaoka A. Nippon Yakurigaku Zasshi, 1988, 91 (5), 295-299.

[24] Boda J., Karsay K., Czako L., Fugi S., Kovacs A., Koncz I., Maczko P. A. Ther. Hung., 1989, 37 (3), 176-180.

[25] Molnar P., Gaal L. Eur. J. Pharmacol., 1992, 215 (1), 17-22.

[26] Kiss B., Karpati E. Acta Pharm. Hung., 1996, 66 (5), 213-224.

[27] Hadjiev D., Yancheva S. Arzneimittelforschung, 1976, 26 (10A), 1947-1950.

[28] Rischke R., Krieglstein J. Pharmacology, 1990, 41 (3), 153-160.

[29] Karpati E., Szporny L. Arzneimittelforschung, 1976, 26 (10A),1908-1912.

[30] Paulo T., Toth P.T., Nguyen T.T., Forgacs L., Torok T.L., Magyar K. J. Pharm. Pharmacol., 1986, 38 (9), 668-73.

[31] Poroikov V.V., Filimonov D.A., Stepanchikova A.V. Biological Activity Spectra Prediction as a Tool to Select the Most Prospective Compounds from Commercial and In-House Databases. Abstr. Intern. Med. Chem. Symp., Seoul, 1997, P.143.

[32] Poroikov V.V, Filimonov D.A, Boudunova A.P. Computer Assisted Prediction of Biological Activity Spectra: Estimating the Effectivity of Use in High Throughput Screening. Abstr: XIVth International Symposium on Medicinal Chemistry, Maastricht, the Netherlands, 1996, P-3.05.

[33] Trapkov V.A., Budunova A.P., Burova O.A., Filimonov D.A., Poroikov V.V. Discovery of New Antiulcer Agents by Computer Aided Prediction of Biological Activity. Problems in Medical Chemistry (Moscow), 1997, 43 (1), 41-57.

[34] Islyaikin M.K., Danilova E.A., Kudrik E.V., Smirnov R.P., Boudunova A.P., Kinzirskii A.S. Synthesis and study of antitumor action of macroheterocyclic compounds and their complexes with metals. Chemical & Pharmaceutical J. (Rus), 1997, 31 (8), 19-22.

[35] Maiboroda D.A., Babaev E.V., Goncharenko L.V. (1998). Synthesis and study of spectral and pharmacological properties of 1-amino-4-(5-aryloxazolyl-2)-butadienes-1,3. Chemical & Pharmaceutical J. (Rus), 32 (6), 24-28.

[36] Geronikaki A., Poroikov V., Hadjipavlou-Litina D., Mgonzo R., Filimonov D., Lagunin A. Synthesis, computer assisted prediction of biological activity spectra and experimental testing of new thiazole derivatives. Quantitative Structure-Activity Relationships, 1998, In press.

References by abbreviation

Anzali S., Barnickel G., Cezanne B., Krug M., Filimonov D., Poroikov V. (2001). Discriminating between drugs and nondrugs by Prediction of Activity Spectra for Substances (PASS). J. Med. Chem. 44: 2432-2437.

Avidon V.V. (1974). Criteria for the comparison of chemical structures and principles of construction of an information language for a logical information system for biologically active compounds. Pharm-Chem. J. (Rus). 8: 22-25.

Avidon V.V., Arolovich V.S., Kozlova S.P., Piruzian L.A. (1978a). Statistical study of information file on biologically active compounds. II. Choice of decision rule for biological activity prediction. Pharm-Chem. J. (Rus). 12: 88-93.

Avidon V.V., Arolovich V.S., Kozlova S.P., Piruzian L.A. (1978b). Statistical investigation of large volumes of data with respect to the biological activity of compounds III. Selection of a determinant for predicting biological activity. Pharm-Chem. J. (Rus). 12: 99–106.

Avidon V.V., Pomerantsev I.A., Rozenblit A.B., Golender V.E. (1982). Structure-activity relationship oriented languages for chemical structure representation. J. Chem. Inf. Comput. Sci. 22: 207-214.

Avidon V.V., Arolovich V.S., Blinova V.G., Freidina A.M. (1983). Statistical investigation of the data file on biologically active compounds. V. Allowance for the novelty of the chemical structure in the prediction of the biological activity by an improved method of substructural analysis. Pharm-Chem. J. (Rus). 17: 59-62.

Burov Yu.V., Poroikov V.V., Korolchenko L.V. (1990). National system for registration and biological testing of chemical compounds: facilities for new drugs search. Bull. Natl. Cent. Biol. Active Compnds (Rus.). No. 1: 4-25.

Delmas F., Di Giorgio C., Robin M., Azas N., Gasquet M., Detang C., Costa M., Timon-David P., Galy J.P. (2002). In vitro activities of position 2 substitution-bearing 6-nitro- and 6-aminobenzothiazoles and their corresponding anthranilic acid derivatives against Leishmania infantum and Trichomonas vaginalis. Antimicrob. Agents Chemother. 46: 2588–2594.

Di Giorgio C., Delmas F., Filloux N., Robin M., Seferian L., Azas N., Gasquet M., Costa M., Timon-David P., Galy J.P. (2003). In vitro activities of 7-substituted 9-chloro and 9-amino-2-methoxyacridines and their bis- and tetra-acridine complexes against Leishmania infantum. Antimicrob. Agents Chemother. 47: 174–180.

Di Giorgio C., Delmas F., Ollivier E., Elias R., Balansard G., Timon-David P. (2004). In vitro activity of the beta-carboline alkaloids harmane, harmine, and harmaline toward parasites of the species Leishmania infantum. Exp. Parasitol. 106: 67–74.

Dolzhenko A.V., Kolotova N.V., Koz'minykh V.O., Vasilyuk M.V., Kotegov V.P., Novoselova G.N., Syropyatov B.Ya., Vakhrin M.I. (2003). Substituted amides and hydrazides of dicarboxylic acids. Part 14. Synthesis and antimicrobial and antiinflammatory activity of 4-antipyrylamides, 2-thiazolylamides, and 1-triazolylamides of some dicarboxylic acids. Pharm-Chem. J. 37: 149–151.

Filimonov D.A., Poroikov V.V., Karaicheva E.I., Kazarian R.K., Budunova A.P., Mikhailovskii E.M., Rudnitskikh A.V., Goncharenko L.V., Burov Yu.V. (1995). Computer-aided prediction of biological activity spectra of chemical substances on the basis of their structural formulae: computerized system PASS. Exper. Clin. Pharmacol. (Rus). 58: 56-62.

Filimonov D.A., Poroikov V.V. (1996). PASS: computerized prediction of biological activity spectra for chemical substances. In: Bioactive Compound Design: Possibilities for Industrial Use, BIOS Scientific Publishers, Oxford (UK), pp.47-56.

Filimonov D., Poroikov V., Borodina Yu., Gloriozova T. (1999). Chemical Similarity Assessment through multilevel neighborhoods of atoms: definition and comparison with the other descriptors. J. Chem. Inf. Comput. Sci. 39: 666-670.

Filimonov D.A., Poroikov V.V. (2006). Prediction of biological activity spectra for organic compounds. Russian Chemical Journal, 50 (2), 66-75

Filimonov D.A., Poroikov V.V. (2008). Probabilistic approach in activity prediction. In: Chemoinformatics Approaches to Virtual Screening. Eds. Alexandre Varnek and Alexander Tropsha. Cambridge (UK): RSC Publishing, 182-216.

Filimonov D.A., Zakharov A.V., Lagunin A.A., Poroikov V.V. (2009). QNA based “Star Track” QSAR approach. SAR & QSAR Environ. Res. 20: 679-709.

Geronikaki A., Babaev E., Dearden J., Dehaen W., Filimonov D., Galaeva I., Krajneva V., Lagunin A., Macaev F., Molodavkin G., Poroikov V., Saloutin V., Stepanchikova A., Voronina T. (2004). Design of new anxiolytics: from computer prediction to synthesis and biological evaluation. Bioorg. Med. Chem. 12: 6559-6568.

Geronikaki A., Druzhilovsky D., Zakharov A., Poroikov V. (2008a). Computer-aided predictions for medicinal chemistry via Internet. SAR & QSAR Environ. Res. 19: 27-38.

Geronikaki A.A., Lagunin A.A., Hadjipavlou-Litina D.I., Elefteriou P.T., Filimonov D.A., Poroikov V.V., Alam I., Saxena A.K. (2008b). Computer-aided discovery of anti-inflammatory thiazolidinones with dual cyclooxygenase/lipoxygenase inhibition. J. Med. Chem. 51: 1601-1609.

Goel R.K., Kumar V., Mahajan M.P. (2005). Quinazolines revisited: search for novel anxiolytic and GABAergic agents. Bioorg. Med. Chem. Lett. 15: 2145–2148.

Golender V.E., Rozenblit A.E. (1978). Computer Methods for Drug Design. Riga: Zinatne, 232 pp.

Golender V.E., Rosenblit A.B. (1983). Logical and Combinatorial Algorithms for Drug Design, Research Studies Press, Wiley&Sons, 352 pp.

Labanauskas L., Brukstus A., Udrenaite E., Bucinskaite V., Susvilo I., Urbelis G. (2005). Synthesis and anti-inflammatory activity of 1-acylaminoalkyl-3,4-dialkoxybenzene derivatives. Il Farmaco. 60: 203–207.

Lagunin A., Stepanchikova A., Filimonov D., Poroikov V. (2000). PASS: prediction of activity spectra for biologically active substances. Bioinformatics. 16: 747-748.

Lagunin A.A., Gomazkov O.A., Filimonov D.A., Gureeva T.A., Dilakyan E.A., Kugaevskaya E.V., Elisseeva Yu.E., Solovyeva N.I., Poroikov V.V. (2003). Computer-aided selection of potential antihypertensive compounds with dual mechanisms of action. J. Med. Chem. 46: 3326-3332.

PASS program package, © Filimonov D.A., Poroikov V.V., Gloriozova T.A., Lagunin A.A. Russian State Patent Agency, N 2006613275 of 15.09.2006.

PharmaExpert program package, © Lagunin A.A., Poroikov V.V., Filimonov D.A., Gloriozova T.A. Russian State Patent Agency, N 2006613590 of 16.10.2006.

Poroikov V.V., Filimonov D.A., Boudunova A.P. (1993). Comparison of the Results of Prediction of the Spectra of Biological Activity of Chemical Compounds by Experts and the PASS System. Automat Document Math Linguistics. 27: 40-43.

Poroikov V.V., Filimonov D.A., Borodina Yu.V., Lagunin A.A., Kos A. (2000). Robustness of biological activity spectra predicting by computer program PASS for non-congeneric sets of chemical compounds. J. Chem. Inform. Comput. Sci. 40: 1349-1355.

Poroikov V., Akimov D., Shabelnikova E., Filimonov D. (2001). Top 200 medicines: can new actions be discovered through computer-aided prediction? SAR and QSAR in Environmental Research, 12 (4), 327-344.

Poroikov V.V., Filimonov D.A. (2002). How to acquire new biological activities in old compounds by computer prediction. J. Comput. Aid. Molec. Des., 16 (11), 819-824.

Poroikov V.V., Filimonov D.A., Ihlenfeldt W.-D., Gloriozova T.A., Lagunin A.A., Borodina Yu.V., Stepanchikova A.V., Nicklaus M.C. (2003). PASS Biological Activity Spectrum Predictions in the Enhanced Open NCI Database Browser. J. Chem. Inform. Comput. Sci. 43: 228-236.

Poroikov V., Filimonov D. (2005). PASS: Prediction of Biological Activity Spectra for Substances. In: Predictive Toxicology. Ed. by Christoph Helma. Taylor & Francis, 459-478.

Poroikov V., Lagunin A., Filimonov D. (2005). PharmaExpert: diseases, targets and ligands – three in one. QSAR and Molecular Modelling in Rational Design of Bioactive Molecules. Eds. Esin Aki Sener, Ismail Yalcin,  Ankara (Turkey), CADD & D Society, 514-515.

Poroikov V., Filimonov D., Lagunin A., Gloriozova T., Zakharov A. (2007). PASS: Identification of probable targets and mechanisms of toxicity. SAR & QSAR in Environmental Research, 18 (1-2), 101-110.

Sadym A., Lagunin A., Filimonov D., Poroikov V. (2003). Prediction of biological activity spectra via Internet. SAR & QSAR Environ. Res. 14: 339-347.

Stepanchikova A.V., Lagunin A.A., Filimonov D.A., Poroikov V.V. (2003). Prediction of biological activity spectra for substances: Evaluation on the diverse set of drugs-like structures. Cur. Med. Chem. 10: 225-233.

 
