Pharm Dev Regul 2003; 1 (3): 159-168 1175-9046/03/0003-0159/$30.00/0
LEADING ARTICLE
© Adis Data Information BV 2003. All rights reserved.
Predictive Software for Drug Design and Development: Recent Progress and Future Developments

Robert C. Jackson

Cyclacel Ltd, Dundee, UK
Abstract
The drug discovery and development process has become more quantitative and much more computationally intensive in recent years. For pharmaceutical and biotechnology companies, this has had two major implications. Firstly, there is now a much greater range of commercial software to support drug design and development. Secondly, a number of specialized companies have appeared that have written proprietary software, and which provide computational service support to the industry. In the area of drug design, available software falls into four main categories: (i) tools for structure-based ligand design when a 3-dimensional receptor structure is available (from X-ray crystallography or high-field nuclear magnetic resonance spectrometry); (ii) software products for in silico screening of chemical compound collections against a 3-dimensional receptor structure, where this is available; (iii) computational tools for the design of inhibitors in the absence of a 3-dimensional structure, by drawing inferences about receptor structure from the properties of known inhibitors; and (iv) computational techniques for prediction of drug-like properties, i.e. the physical and metabolic attributes characteristic of successful drugs such as solubility, ability to cross biological barriers, and stability to metabolism. A number of other trends are leading to greater computational intensity in drug development. Software is available that attempts to predict a range of toxicities, and also drug absorption, distribution, metabolism and elimination (ADME) properties from chemical structure. In another growth area, the established discipline of pharmacokinetics (prediction of drug concentrations in body compartments) is extending its range into pharmacodynamics (prediction of drug effects). 
Since drug effects are the result of interactions of xenobiotic agents with complex biological systems, this has led to attempts to create quantitative disease models, bringing the field of complex system theory into drug development. Ultimately the rate-limiting process in drug development, and the most expensive part, is the clinical trial. The promise that computational biology brings to drug development is the ability to bring these modeling tools to bear on the design and interpretation of clinical trials, to increase their success rate and cost effectiveness.
Historically, drug development has not been primarily a quantitative science, with the exception of pharmacokinetics (PK) whose roots go back to the 1930s. In the last two decades, two major trends have converged and are making the drug discovery and development process more quantitative and more computer-intensive. The first is that the historic balance between rational drug design approaches and screening approaches is beginning to tilt in favor of rational design. The second is that a number of technologies, chiefly high-throughput screening and genomics, have generated databases that are orders of magnitude greater than our
industry has previously seen. Techniques have been developed for storage, retrieval and analysis of this data, but these databases also present an unprecedented opportunity for modeling approaches, and for the development of knowledge-based expert systems that can extract the predictive power from these enormous volumes of information. A large body of scholarly research has contributed to this process, and cannot be reviewed here. For general references to different aspects of this literature see Babine and Bender,[1] Hopkins and Groom,[2] and de Jong.[3] Rather, the intent is to discuss
the degree to which the new techniques are now available to drug developers who are not themselves expert biomathematicians, through user-friendly commercial software or through specialized commercial contract services.

1. Introduction

The group of techniques referred to as ‘rational drug design and development’ is necessarily heavily dependent upon computer hardware and software. Protein crystallography, nurtured in Cambridge’s Cavendish Laboratory, England, relied upon the London University Atlas computer when its Fourier transforms were too much for the early Cambridge computers to handle. The use of protein crystal structures in drug design came two decades later, when academic supercomputers were becoming widespread; the use of distributed processing techniques means that members of the public can now contribute the idle time of their PCs to running drug screening software as a screensaver. Rational drug discovery is often contrasted with screening (frequently termed, rather pejoratively, ‘random screening’), though all pharmaceutical companies have long used both techniques. The concept of what is ‘rational’ has evolved with time: the first generation of antimetabolites were designed as structural analogs of physiological precursor subunits of DNA, RNA or protein, with the expectation that these antimetabolites might inhibit macromolecular synthesis, even though their actual targets were in many cases not known. When detailed 3-dimensional structures of drug target macromolecules started to become available (initially enzymes, but also DNA and, more recently, cell membrane-bound receptors), it became possible to design small-molecule ‘keys’ to fit the macromolecular ‘locks’, and this approach of ‘structure-based drug design’ (SBDD) became the focus of intensive software development, aimed at automating both the elucidation of structures and the design of inhibitors.
SBDD has been supplemented by a large number of computational techniques that attempt to predict the affinity of an inhibitor for its target in the absence of a target structure.[1] Of course, to be an effective drug, a molecule must not only bind effectively to its target; it must have appropriate physicochemical properties, and it must be able to reach and maintain a sufficient concentration, for the required period of time, at its intended site of action, i.e. it must have ‘drug-like properties’. A variety of computer programs have been devised to predict the drug-like properties of particular chemical structures.[2] The use of screening approaches to drug design has been stimulated in recent years by the introduction of combinatorial chemistry and high-throughput screening. In some cases these techniques are used in combination with SBDD; for example,
structural information may be used in the design of targeted combinatorial libraries of inhibitors.[4] Publication of the almost complete human genome sequence has two major implications for drug discovery. Firstly, it has emphasized that existing drugs are targeted against a small fraction of potential targets, and makes it desirable to evaluate the proteins coded by the remaining genes as novel drug targets. Secondly, as we gain an understanding of individual genetic variability, this raises the promise of personalized therapy. In principle, the current pharmacopoeia of agents, optimized against a target with average properties, will be replaced by a range of agents targeted against the principal variants of the target. The problem with this scenario is that modern drug development is already an extremely expensive process; it is estimated that taking a novel chemical entity from conception to market costs around half a billion US dollars. This cost is so great because the failure rate is so high. If our objective is to make new drugs more selective by designing them more precisely against individual receptor variants, we shall have to develop many more agents, each of which will have a fraction of the total market for that drug class. The economics of this situation can only work out if development costs can be made much lower, and for this to happen the success rate in development must be correspondingly higher. Interest in more accurate predictive tests, including computational prediction, has thus been greatly stimulated by the publication of the genome.

2. Structure-Based Drug Design

The process of designing drug molecules has been revolutionized in the past two decades by the ability to examine the 3-dimensional structure of drug receptors (especially enzymes) at atomic resolution, and to predict the spatial and electrostatic properties of potential small-molecule inhibitors.
Two technologies have made this possible: structural biology (chiefly X-ray crystallography of proteins) and computational chemistry. An important additional technology in structural biology has been high-field nuclear magnetic resonance (NMR) spectrometry, since this does not require the protein to be crystallized, and gives information on the solution structure. A third physical technique that has made possible more detailed characterization of drug-receptor interactions is mass spectrometry. All these technologies are highly computer-intensive. When a particular protein is selected as a drug design target, the first port of call is the crystallographic database. If the coordinates of the protein are available (ideally with a bound inhibitor to delineate the active site), the process of designing new inhibitors can begin. Often the required structure will not be available, but
that of a related protein may be, which may provide some clues on how to solve the required structure. Sometimes one or more structures are available that have high sequence homology to the required structure in the region of interest. In such cases a homology model may be created. For example, in the very active field of protein kinases, experimental structures are only available for a small fraction of the 500-plus kinases in the human genome. However, if one is attempting to design an inhibitor at the ATP-binding site, the assumption may be made that regions remote from this site have little influence on its 3-dimensional structure. A model can then be created by computationally altering those residues at the active site that differ from their counterparts in the known structure. When an experimental 3-dimensional protein structure or a homology model has been obtained, it may be used for the process of virtual screening (also called in silico screening). This process consists of examining the 3-dimensional structures of tens of thousands of potential ligands: do they fit into the active site of the protein? Is there electrostatic complementarity? Does the ligand form hydrogen bonds with the protein? A number of commercial programs are available for this ‘docking’ process. The current generation of these programs treats ligand conformations as flexible, since the bound conformation of a ligand may differ from its conformation in solution. The predicted protein-ligand binding affinities are scored (according to how well the ligand fills space at the active site, and how many electrostatic or hydrogen-bonding interactions it makes) and the highest-scoring ligands can then be synthesized and tested experimentally. Several large chemical libraries are commercially available, and the 3-dimensional structures of the library compounds are distributed on disk.
These libraries can be used as a starting point for virtual screening, because the computational ‘hits’ are then immediately available for experimental assay. Not all the predicted hits turn out to be inhibitors in practice, but compared with random screening, the virtual screening approach has been shown to enrich the screening success rate many-fold. The early hits, whether discovered by virtual screening or by combinatorial chemistry, often have rather low binding potency. At this early hit stage, structure-based drug design is used to design more potent analogs by an iterative process involving co-crystallization of the hit compound with the target protein, solution of the co-crystal structure, and use of the new structure to design a next-generation compound with improved binding properties. The new compound, in turn, is synthesized, tested as an inhibitor, and co-crystallized with the target protein. Often, several rounds of this iterative design process can improve the inhibitory potency of the series by many orders of magnitude.
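The many-fold enrichment claimed for virtual screening over random screening can be quantified as a simple enrichment factor: the hit rate among the top-ranked compounds divided by the hit rate of the library as a whole. A minimal sketch, with purely illustrative numbers:

```python
def enrichment_factor(hits_selected: int, n_selected: int,
                      hits_total: int, n_total: int) -> float:
    """Fold-enrichment of a virtual-screening selection over random picking."""
    hit_rate_selected = hits_selected / n_selected
    hit_rate_random = hits_total / n_total
    return hit_rate_selected / hit_rate_random

# Hypothetical example: a 100 000-compound library containing 200 true actives;
# the top 1000 docking-ranked compounds turn out to contain 40 of them.
print(enrichment_factor(40, 1000, 200, 100_000))  # 20-fold enrichment
```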
Several software tools are available that can assist in the design process. LUDI[5] is a fragment-based approach that can be used as a computational building kit to explore different ways of filling a binding site. SPROUT[6] explores the accessible surface of a protein and identifies possible binding pockets. It then uses fragments that dock to the target sites, and links them with spacers to generate suggested inhibitor molecules that satisfy the steric constraints of the binding pocket, and finally sorts and scores the proposed inhibitors. Many advances in the study of protein structure are making the rational drug design process more effective. Currently, it is not possible to predict the 3-dimensional structure of a protein from its sequence (if this were possible, X-ray crystallography would be unnecessary). However, as the database of solved protein structures becomes greater, it becomes more feasible to predict at least partial structures by analysis of the various motifs within the structure. As structure prediction becomes more accurate, so homology models become more useful. The other aspect of the design process that is not yet solvable from first principles is the scoring problem. Again, as the database of solved protein-ligand complexes becomes greater, together with accurate thermodynamic data, knowledge-based approaches for predicting protein-ligand binding affinities are becoming more effective.

3. Drug Design in Absence of a Structure

For many important drug targets, 3-dimensional structural information is not available, at least at the high resolution necessary for molecular design. This includes most membrane-bound proteins, such as the G-protein-coupled receptors. Various computational approaches to drug design are available that do not require a receptor structure. Probably the best established of these techniques is quantitative structure-activity relationships (QSAR).
This analyzes a set of inhibitors (often 20–40 compounds) covering a range of binding affinities, with respect to a set of molecular descriptors of size, shape, and electronic properties. As many as 60 or more descriptors have been used. A multiple regression analysis then correlates the measured activity with the set of descriptors, and attempts to predict the activity of novel molecules from the resulting coefficients.[7] Neural nets have been used in a similar way[8] and have similar power to QSAR, and a similar disadvantage: over-training the network results in it becoming over-determined, with a paradoxical decrease in its predictive power. The classical QSAR approach pioneered by Hansch and Klein[7] has been extended to include 3-dimensional shape descriptors in comparative molecular field analysis (CoMFA). With this technique, determinants of the 3-dimensional interaction of ligand and receptor may often be
inferred in the absence of 3-dimensional structural information; see, for example, the CoMFA analysis of the paclitaxel/tubulin interaction.[9] To date, the primary use of the QSAR approach has been in predicting the affinity of inhibitors for target enzymes or receptors, i.e. ‘activity’ is defined as efficacy. The relevant activity could equally well be defined as a toxic effect, or as susceptibility to metabolism, and recent years have seen increasing application of the approach to predicting toxicology and metabolism (see section 6).

4. The Future of Bioinformatics

The term ‘bioinformatics’, as usually employed, is viewed primarily in the context of gene sequence data, and the associated tools for sequence comparisons, homology quantification, phylogenetic trees, and domain searches. With the growth of systematic proteomics, the power of tools for analysis of protein sequences has increased correspondingly. A rather less obvious point is that as these databases become larger, their predictive power also increases, primarily through their forming the basis of knowledge-based expert systems. Although it will be a long time, if ever, before we have access to a high-resolution crystal structure for every human protein, the value of the protein database increases disproportionately every time a new structure is deposited. This is because as the number of solved structures grows, the accuracy of homology modeling of unknown structures increases, and the ability to predict protein function from sequence becomes better. Similarly, the growth of cheminformatics and high-throughput screening databases increases our ability to make accurate predictions about structure-activity relationships. Expert systems for prediction of drug metabolism and toxicology, though initially unimpressive, become more powerful every time a new piece of experimental data is deposited in them.
The future of bioinformatics is thus that the growth of tools for storage, retrieval, and analysis of data will be paralleled by growth of rule-based expert systems with predictive power. Increasingly, these systems will develop the power to infer their own rules, resulting in true knowledge-based systems. This means that as databases and theoretical knowledge grow, the balance of advantage between rational and screening approaches will inevitably shift.

5. Predicting Drug-Like Properties

Since combinatorial chemistry and high-throughput screening have accelerated the process of identifying ligands for new drug target molecules, many potent inhibitors have been identified
which are, however, not suitable for use as drugs. In some cases this is because they are too insoluble, or are unable to cross biological membranes, or contain chemical functions that confer severe toxicity. In other cases the molecules are cleared so rapidly from the bloodstream that they are unable to show a pharmacological effect in vivo. This experience has prompted an extended debate on what molecular properties make a molecule ‘drug-like’. Perhaps the most familiar generalization on this subject is Lipinski’s ‘rule of five’.[10,11] These criteria of drug-likeness are: molecular weight <500 Da; log octanol-water partition coefficient (log P) <5; fewer than 5 hydrogen-bond donors; and fewer than 10 hydrogen-bond acceptors. It is argued that a compound that violates more than one of these four criteria is unlikely to be a good drug. Three of the criteria can be assessed simply by inspecting the structure; the fourth (log P) can be measured experimentally or computed by a number of commercial software packages (e.g. ClogP™1, from BioByte in Claremont, CA, USA). Abraham et al.[12] use a QSAR approach based on five physicochemical descriptors (including hydrogen-bonding effects) from which they can predict aqueous solubility, log P, and over a hundred other properties. Other commercially available software products for predicting parameters relevant to ‘drug-likeness’ are Cerius2 (Accelrys, San Diego, CA, USA), HYBOT (TimTec Inc., Newark, DE, USA)[13] and Solubility DB (Advanced Chemical Development Inc., Toronto, Ontario, Canada). Ionization constants (pKa) may be computed by PETRA, from the University of Erlangen, Germany.[14] Taken together, these methods provide a valuable series of filters that can be used to prioritize compounds for development based upon their likelihood of reaching their site of action in the body, and remaining there for long enough to have pharmacological activity.
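The Lipinski criteria translate directly into a computational filter. A minimal sketch, assuming the descriptor values have already been computed (e.g. by the packages mentioned above):

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Count violations of the four 'rule of five' criteria."""
    violations = 0
    if mol_weight >= 500:    # molecular weight <500 Da
        violations += 1
    if log_p >= 5:           # log P <5
        violations += 1
    if h_donors >= 5:        # fewer than 5 hydrogen-bond donors
        violations += 1
    if h_acceptors >= 10:    # fewer than 10 hydrogen-bond acceptors
        violations += 1
    return violations

def is_drug_like(mol_weight, log_p, h_donors, h_acceptors):
    # A compound violating more than one criterion is flagged as
    # unlikely to be a good (orally available) drug.
    return lipinski_violations(mol_weight, log_p, h_donors, h_acceptors) <= 1

# Illustrative values, roughly typical of a small-molecule drug:
print(is_drug_like(mol_weight=350.0, log_p=2.5, h_donors=2, h_acceptors=5))  # True
```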
However, the need for drug-like properties must be put in perspective; if we are trying to hit an extracellular target, and are prepared to accept drug delivery by intravenous injection, even a protein can be a useful drug.

6. Prediction of Drug Absorption, Distribution, Metabolism and Elimination and Toxicology

ADME is a customary abbreviation for drug absorption, distribution, metabolism and elimination. All these processes have been subjected to computer modeling, with varying degrees of success. Drug distribution is the subject matter of pharmacokinetics (PK), which, unlike most other areas of drug development, has been a topic of rigorous quantitative study for many years. PK is discussed in section 7. Modeling and prediction of drug absorption
1 The use of tradenames is for product identification purposes only and does not imply endorsement.
and metabolism have been the subjects of a number of competing approaches. A great deal of effort has been devoted to prediction of oral bioavailability, since (ignoring for the moment the fact that some drugs are substrates for active transport carriers) it seems that the passage of a small molecule across biological membranes should be predictable from the compound’s physicochemical properties. Veber et al.,[15] analyzing data from a set of 1100 compounds that had been tested for oral bioavailability in rats by GlaxoSmithKline, concluded that the major determinants were ten or fewer rotatable bonds, and a polar surface area ≤140 Å² (or, equivalently, 12 or fewer hydrogen-bond donors and acceptors combined). Veber et al.[15] found that molecular weight and lipophilicity were not in themselves useful predictors (though larger molecules tended to have greater polar surface area and more rotatable bonds). These authors noted, though, that nearly 80% of the molecules in their data set with oral bioavailability of ≥20% met three out of four of Lipinski’s criteria (see section 5). Clearly the oral bioavailability of compounds that are actively transported will be determined more by specific chemical criteria than by physicochemical properties, and will be less predictable. Conversely, compounds that are substrates for the membrane P-glycoprotein pump may be actively removed from their target site. Penzotti et al.[16] have used a pharmacophore approach (i.e. a set of shape and electronic descriptors) to predict affinity for P-glycoprotein. In addition to oral bioavailability, permeation across other biomembranes may be important for drug delivery. Potts and Guy[17] present a predictive algorithm for calculation of skin penetration, and Prausnitz and Noonan[18] discuss the delivery of drugs to the eye.
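The Veber oral-bioavailability criteria can likewise be expressed as a simple computational filter. A sketch, assuming the descriptor values are precomputed:

```python
def passes_veber(rotatable_bonds, polar_surface_area, h_donors, h_acceptors):
    """Veber et al. criteria for likely oral bioavailability (rat data):
    ten or fewer rotatable bonds, and a polar surface area of at most
    140 A^2 (or, equivalently, at most 12 hydrogen-bond donors plus
    acceptors)."""
    flexibility_ok = rotatable_bonds <= 10
    polarity_ok = (polar_surface_area <= 140.0
                   or (h_donors + h_acceptors) <= 12)
    return flexibility_ok and polarity_ok

# Illustrative values for a compact, moderately polar molecule:
print(passes_veber(rotatable_bonds=6, polar_surface_area=90.0,
                   h_donors=2, h_acceptors=5))  # True
```

Note that, as the text cautions, filters of this kind apply only to passively absorbed compounds; active transport and P-glycoprotein efflux fall outside them.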
Efficient penetration of the blood-brain barrier is necessary for drugs intended to act within the central nervous system (CNS), but is probably best avoided for drugs intended to act outside the CNS, to minimize the risk of neurotoxicity. The prediction of blood-brain barrier penetration is discussed by Seelig et al.[19] The blood-brain barrier actively removes compounds that are P-glycoprotein substrates, so again penetration across this biomembrane cannot be predicted purely from physicochemical properties. Abraham et al.[12] also described the prediction of blood-brain barrier penetration and oral bioavailability with their Absolv™ software, a readily available, user-friendly Windows-based package that uses a QSAR approach based upon the five Abraham descriptors (see section 5). The software calculates linear regression coefficients based upon these five parameters and a constant term. QSAR has also been used for prediction of toxicity, but more emphasis has been placed upon expert system approaches.[20] A commercially available example is TOPKAT® (Accelrys), which
has been most extensively used for prediction of genotoxicity; Accelrys also has tools for hepatotoxicity prediction.[21] Several companies offer expert systems for prediction of ADME properties, including Accelrys, Amedis (Cambridge, UK), Cyprotex (Macclesfield, UK), Lion Bioscience (Heidelberg, Germany), and Arqule (Woburn, MA, USA). Some of these companies refer to prediction of drug absorption, distribution, metabolism, elimination, and toxicity (ADMET) as well as the usual ADME properties. In the prediction of drug metabolism, the systems are generally better at predicting sites of metabolism, and which drug-metabolizing enzymes will be involved, than at predicting relative rates of metabolism. However, as with all knowledge-based methods, the predictive power is growing rapidly as the knowledge base expands.

7. From Pharmacokinetics to Pharmacodynamics

Pharmacokinetics has always been one of the most quantitative aspects of drug development. Its origins were an outgrowth of some of the most basic concepts of quantitative biology: the idea of dose-response relationships, the concept of drug clearance, and the observation that for many drugs the total effect was a function of both drug concentration at the site of action and exposure time. When it became practical to measure drug concentrations in plasma, it was observed that the pharmacokinetics of many drugs were well approximated by a multiexponential process. The advent of affordable computing power was followed by the availability of increasingly user-friendly software for fitting experimental data to compartmental PK models. WinNonlin® (Pharsight Corporation, Mountain View, CA, USA) is a popular current example. Once the basic PK parameters have been established, it is possible to predict what blood levels and time courses will result from a particular dosage regimen, and how they will vary with body size.
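In the simplest case, a one-compartment model with first-order elimination, such a prediction reduces to C(t) = (dose/Vd)·e^(−k·t). A minimal sketch with invented parameter values:

```python
import math

def plasma_conc(dose_mg, v_d_litres, half_life_h, t_h):
    """Plasma concentration (mg/L) at time t after an intravenous bolus,
    for a one-compartment model with first-order elimination."""
    k_el = math.log(2) / half_life_h     # elimination rate constant (1/h)
    c0 = dose_mg / v_d_litres            # initial concentration, C0 = dose/Vd
    return c0 * math.exp(-k_el * t_h)

# Illustrative regimen: 100 mg IV bolus, Vd = 40 L, half-life = 4 h.
# Concentration halves every half-life: 2.5, 1.25, 0.625, 0.3125 mg/L.
for t in (0, 4, 8, 12):
    print(t, plasma_conc(100, 40, 4.0, t))
```

Fitting software such as WinNonlin® works in the opposite direction, estimating Vd and the half-life from measured concentration-time data, often for multiexponential (multi-compartment) models.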
A development of pharmacokinetics that has had considerable influence on drug development strategies has been population PK modeling. This deals with the fact that PK parameters are not identical in all individuals, but follow a statistical distribution. Thus, if we know that a particular level of drug exposure is required for activity, but that at another, higher exposure toxic side effects will emerge, we may use population PK modeling to ask: if we want to limit the incidence of adverse events to 0.5%, what response rate can we expect? Software is available for this kind of analysis, the best known being NONMEM, from the University of California, San Francisco, USA. Modeling tissue levels of drugs has been slower to catch on than modeling plasma levels, partly because it is much harder to collect experimental data, especially in humans, and partly because the resulting models are mathematically more complex.
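The kind of population question posed above can be explored by Monte Carlo simulation over a distribution of PK parameters. The sketch below is not a substitute for NONMEM-style mixed-effects modeling, and every number in it is invented; it assumes clearance is log-normally distributed and that exposure is summarized as AUC = dose/CL:

```python
import math
import random

random.seed(0)  # reproducible illustration

def simulate_exposures(n_subjects, dose_mg, mean_cl, cv):
    """Monte Carlo sketch of population PK: clearance (L/h) is log-normally
    distributed across subjects; each subject's exposure is AUC = dose/CL."""
    sigma = math.sqrt(math.log(1 + cv ** 2))   # log-normal sigma from CV
    mu = math.log(mean_cl) - sigma ** 2 / 2    # so the arithmetic mean is mean_cl
    return [dose_mg / random.lognormvariate(mu, sigma)
            for _ in range(n_subjects)]

# Hypothetical question: with a 100 mg dose, mean clearance 10 L/h and 30% CV,
# what fraction of subjects exceed an AUC of 15 mg*h/L (a notional threshold
# for adverse events)?
aucs = simulate_exposures(10_000, 100, 10.0, 0.30)
frac_above = sum(a > 15.0 for a in aucs) / len(aucs)
print(round(frac_above, 3))
```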
Nevertheless, models that have the power to predict drug concentrations at the site of action of a drug, whether the brain, an arthritic joint, or a tumor, are potentially more powerful, and this technique of physiologically-based PK modeling (PB-PK) represents a growing opportunity for PK software development. Cloe-PK™ is a commercially available PB-PK modeling tool that bases its predictions on physical properties of the test compounds and in vitro measurements of metabolism (Cyprotex plc, Macclesfield, UK).[22] An approach to prediction of PK parameters that has been widely explored is to use multivariate regression, analogous to QSAR, also known as quantitative structure-PK relationships, or QSPR. This is often used to predict the likely PK properties of novel compound designs, based upon the measured pharmacokinetics of existing related compounds within the series (the training set). In this way, the likely pharmacokinetics of novel analogs may be predicted before the analogs are actually synthesized.[10,15] Since the experimental data for the training set will usually have been collected in rats, the predicted values will also apply to this species. Since human PK parameters are not usually available for multiple members of an analog series, the technique is not usually directly applicable to predicting human pharmacokinetics. Instead of multiple regression, a neural net approach has been explored for predicting PK in humans.[8,23] Neural nets are multilayered assemblies of processing units capable of detecting multiple input signals and giving outputs that can be weighted functions of their inputs. Neural nets, named for their supposed resemblance to brain function, can be trained with data sets whose outputs are known functions of the inputs, and then used to make predictions in unknown situations.
A situation that often arises in early drug development is that we may have PK parameters for an animal species, often rats, but would like to predict human PK from the animal data. One way to do this is by the technique of allometry, which takes advantage of the fact that many biological processes, including blood flow, urine output, intestinal absorption, etc., scale with body weight according to a power law. A summary of the use of allometry in prediction of human pharmacokinetics is given in Jackson,[24] which describes examples of the use of the technique to predict volume of distribution, plasma clearance, plasma half-life, and area under the plasma concentration-time curve. Allometry often provides useful insights, but has two main limitations. Firstly, to give useful predictions of human PK parameters it is necessary to have preclinical data from several species; two at an absolute minimum, and three or more to use the process with any degree of confidence. In some instances the toxicokinetic analysis (i.e. the PK analysis of blood levels in the animals in which the new agent’s toxicology was studied) will provide PK parameters for mice, rats and dogs, and
occasionally also in monkeys. In these cases an allometric analysis will take very little extra effort, and will at least provide some indication of what to expect from the human pharmacokinetics. Secondly, there are certain aspects of drug clearance, especially the occurrence and distribution of particular drug metabolizing enzymes, which may vary erratically between species, rather than following a power law. In such cases, allometry is not much help. The technique of PB-PK modeling, referred to above, in which PK equations are used to describe tissue levels of drugs as a function of time, may also be used for cross-species predictions. Once the system has been described in detail, for example for rats, the rat organ weights and blood flow data in the model are simply replaced with the corresponding human values; the model will now predict human pharmacokinetics. If comparative drug metabolism data are available from in vitro studies with rat and human hepatocytes, then the hepatic clearance term in the system of equations may also reflect known species differences in metabolism. This technique, potentially very powerful, has until recently not been widely available in the form of user-friendly software. Cyprotex now provide PB-PK modeling as a commercial service. Examples of PB-PK modeling for prediction of human pharmacokinetics are given in Jackson.[24] The other growth area for pharmacokinetics is to extend the technique from prediction of drug concentrations to prediction of drug effects, i.e. from pharmacokinetics to pharmacodynamics (PD). At the simplest level, this can be done by linking a PK model to a dose-response relationship. How well this works depends upon the degree of complexity of the biological system being modeled. At one end of the scale, with antibacterial drugs, for example, we may obtain detailed dose-response information in vitro, and use it to relate plasma drug concentration to a predicted log kill of the pathogen. 
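This kind of PK-PD link can be sketched by coupling a one-compartment PK model to a saturable (Emax-type) kill function and integrating numerically. Every parameter value below is notional, chosen only to illustrate the coupling:

```python
import math

def simulate_log_kill(dose_mg, v_d_l, half_life_h, kmax, ec50, k_growth,
                      hours, dt=0.05):
    """Euler-integration sketch linking one-compartment PK to an Emax kill
    model for a bacterial population. Returns the log10 change in bacterial
    count over the simulated interval (negative = net kill)."""
    k_el = math.log(2) / half_life_h
    n = 1.0e6                                  # starting bacterial count
    t = 0.0
    while t < hours:
        conc = (dose_mg / v_d_l) * math.exp(-k_el * t)   # plasma conc (mg/L)
        kill = kmax * conc / (ec50 + conc)               # saturable kill rate (1/h)
        n += (k_growth - kill) * n * dt                  # net growth minus kill
        t += dt
    return math.log10(n / 1.0e6)

# Notional antibacterial: 500 mg dose, Vd 30 L, half-life 6 h, maximum kill
# rate 2/h, EC50 2 mg/L, bacterial growth rate 0.5/h, simulated over 12 h.
print(round(simulate_log_kill(500, 30, 6.0, 2.0, 2.0, 0.5, 12.0), 2))
```

Running the same model with dose zero recovers unchecked bacterial growth, which is what makes such simulations useful for comparing dosage regimens.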
This technique has been cautiously extended to more complex situations. In some cases the desired pharmacological effect may lag behind the plasma concentration-time curve; this situation has been addressed by adding an additional (notional) compartment, the ‘effect compartment’, to a classical compartmental PK analysis.[24,25] Several of the commercially available PK software packages include modules for PD analysis and modeling. WinNonlin® includes a PD module, and RIDO™, a software package from the European Centre of Pharmaceutical Medicine, has extensive PD modeling capability.[26]

8. Complex System Theory in Drug Development

The main reason that PD modeling has not become as universally applicable as PK modeling is that the relationship of drug effect to drug concentration is not usually as simple as it is in the
case of antibacterial drugs (see section 7). Even in the deceptively simple case of an antihypertensive drug, where we can often establish a smooth relationship between dose and effect, we are in fact perturbing a complex homeostatic system, and for a model to have useful predictive value it must encapsulate the essential regulatory properties of the system being modeled. This means that pharmacodynamics leads us unavoidably into systems biology and the mathematical modeling of complex systems.[27] Disease models attempt to describe a pathological process in sufficient detail that we can ask questions about the outcome of intervening at particular molecular sites. An example where this has had direct relevance to drug development is found in models of HIV infection. Wodarz and Nowak[28] summarized models that describe the dynamics of HIV infection and disease progression, and showed how mathematical models can be used to design treatment regimens that could boost antiviral immunity and induce long-term virus control. Jackson[29] described a similar model that included expressions for the effects of two classes of anti-HIV drugs, reverse transcriptase inhibitors and protease inhibitors, and for resistance to both classes of agents. The model predicted that, despite the rapid acquisition of drug-resistant mutations, multi-drug combinations given at sufficient treatment intensity could arrest disease progression for many years. This model can be used to compare the relative merits of early versus late treatment, and to compare the efficacy of simultaneous versus successive or alternating use of different agents. In the area of cancer chemotherapy, several new classes of drugs are being investigated that intervene at various points in the cell cycle.
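To illustrate how such disease models work, the following is a minimal sketch of basic viral-dynamics equations of the type reviewed by Wodarz and Nowak,[28] with a reverse transcriptase inhibitor reducing new infections and a protease inhibitor reducing production of infectious virions. All parameter values and drug efficacies below are illustrative assumptions, not values taken from the cited models.

```python
# Minimal HIV dynamics sketch: uninfected target cells T, productively
# infected cells I, and free virus V. A reverse transcriptase inhibitor
# reduces new infections (eff_rt); a protease inhibitor reduces production
# of infectious virions (eff_pi). Parameters are illustrative only.
lam, d = 1e4, 0.01      # target cell production and death (per mL, per day)
beta = 2e-7             # infection rate constant
delta = 0.5             # death rate of infected cells (per day)
p, c = 100.0, 3.0       # virion production and clearance rates (per day)

def simulate(eff_rt, eff_pi, days=30.0, dt=0.001):
    # Start from the untreated steady state, then switch the drugs on.
    T = c * delta / (beta * p)
    V = (lam - d * T) / (beta * T)
    I = beta * T * V / delta
    for _ in range(int(days / dt)):     # forward Euler integration
        dT = lam - d * T - (1 - eff_rt) * beta * T * V
        dI = (1 - eff_rt) * beta * T * V - delta * I
        dV = (1 - eff_pi) * p * I - c * V
        T, I, V = T + dT * dt, I + dI * dt, V + dV * dt
    return V

v_untreated = simulate(0.0, 0.0)
v_combo = simulate(0.9, 0.9)   # high-efficacy dual-class combination
print(f"viral load after 30 days: untreated {v_untreated:.0f}, "
      f"combination {v_combo:.0f}")
```

Even this toy version reproduces the qualitative point in the text: a sufficiently intense two-class combination drives the basic reproductive ratio below 1 and the viral load collapses, whereas either drug alone at modest efficacy merely shifts the steady state.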
The pharmacodynamics of these agents are very different from those of earlier classes of drugs; depending on which stage of the cycle the cell is in, the drug may have an immediate effect, a delayed effect, or no effect at all. Tumour cells in which checkpoint functions are mutated or deleted may respond in a qualitatively different way from cells with intact checkpoint function, e.g. by going into apoptosis rather than reversible cell cycle arrest. A combination of two agents acting at different points in the cycle will give effects that are highly dependent upon their sequence of administration. Predicting the pharmacodynamics of such drugs requires a detailed model of the mammalian cell cycle. Tyson et al.[30] have described a model that is being used for this kind of analysis. A model of this kind is in use at Cyclacel Ltd (Dundee, UK) and Physiomics plc (Oxford, UK) to predict optimal development strategies for cell cycle-targeted inhibitors. Also in the area of oncology, models of the angiogenesis process have been developed that may be used to predict the most effective use of antiangiogenic drugs.[31] Entelos Inc. (Foster City, CA, USA) have developed a set of modeling tools described as PhysioLab® that are particularly
directed at disease modeling.[32] Their type 2 diabetes mellitus model has reportedly been used in the design and analysis of clinical trials.[33] Disease models of asthma and rheumatoid arthritis are also reported to be under development. Cellnomica, Inc. (Fort Myers, FL, USA and Munich, Germany) has developed multicellular pharmacodynamics (MCPD) models that resemble compartmental models but have a higher level of resolution; these models are being used for ADMET prediction. They have developed integrated models that include cell signaling and genomic regulation in simulations of the dynamics of multicellular systems. These models, e.g. an in silico cancer model, are being used to model drug effects on tumor growth properties.[27] In many disease areas, new drugs are under development against targets that form part of signal transduction pathways. These pathways include extracellular receptors, communicating with multistep pathways of protein kinases and phosphatases, and finally transcription factors. They may have highly nonlinear kinetics, multiple convergent and divergent branches, redundancy in both extracellular ligands and receptors, and their end-effect is activation or inhibition (or both) of complex gene regulatory networks. This complexity is such that the systems behavior of signaling networks can probably only be understood by creating detailed models. Neves and Iyengar[34] concisely review the current state of this area, including software reviews. The final frontiers in understanding complex biological systems are the central nervous system, and, hardly less complex, the immune system. Both of these systems deal with information processing on a massive scale, and both show emergent behaviors such as learning, memory, and distinction of self from nonself. In both areas we have reasonably effective drugs to treat some malfunctions. 
However, in every case we probably do not understand enough about how the compounds work to devise an optimal strategy for moving forward to the next generation of drugs, except by optimizing interactions at the target receptors and exploring the whole-system properties of the agents empirically. These areas represent the clearest examples of the need for systems biology in drug development.

9. Drug-Drug Interactions

The use of drugs in combination, once deprecated as ‘polypharmacy’, has now become established in most disease areas, as it is increasingly accepted that combinations may have greater efficacy and selectivity than their constituent drugs used singly. In addition, combinations are more easily individualized to the needs of particular patients. Analyzing the results of a combination study is not trivial, since what constitutes ‘additivity’ or ‘synergism’ can
be defined in a number of ways,[35] and the degree of an antagonistic or synergistic interaction may depend upon the degree of inhibition. In fact, two drugs may be antagonistic over part of their concentration range and synergistic elsewhere. The method of Chou and Talalay[36] is widely used for analysis of combination data; it calculates a parameter, the combination index (CI), where CI = 1 corresponds to additivity, CI > 1 to antagonism, and CI < 1 to synergism. CI may vary depending upon the total degree of inhibition. The program ‘Dose-Effect Analysis Software’ is commercially available (BioSoft, Cambridge, UK) for computation of CI. The Chou and Talalay[36] method has the limitation that the two drugs must be studied at a constant ratio, and there have been criticisms of its theoretical assumptions.[35] Another approach has been to fit dose-response data to a 3-dimensional response surface. Bunow and Weinstein[37] described a computer program, COMBO, that forms part of the MLAB suite of programs (Civilized Software, Bethesda, MD, USA), which fits data to a double logistic equation with one or more interaction parameters. MLAB is a system for mathematical and statistical modeling originally developed at the US National Institutes of Health. Greco et al.[35] described a response-surface approach that can select from a variety of equations and weighting schemes. This method includes an interaction parameter, α, which is invariant over the entire response surface; α = 0 corresponds to additivity, α > 0 implies synergism, and α < 0 indicates antagonism. This technique (which also generates statistical confidence intervals to determine whether apparent synergism or antagonism is significant) has been automated by a computer program, SYNFIT. SYNFIT runs on a PC and produces graphical output, including isobol plots. Prediction of drug-drug interactions is more difficult.
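The combination index calculation itself is simple enough to sketch. The following Python fragment uses the median-effect equation of Chou and Talalay[36] in its mutually exclusive form; the median-effect parameters (Dm, m) for the two drugs and the observed combination data are hypothetical.

```python
def dose_for_effect(fa, dm, m):
    # Median-effect equation (Chou and Talalay): fa/fu = (D/Dm)^m,
    # inverted to give the single-agent dose producing fraction affected fa.
    return dm * (fa / (1.0 - fa)) ** (1.0 / m)

def combination_index(d1, d2, fa, dm1, m1, dm2, m2):
    # Mutually exclusive form: CI = d1/Dx1 + d2/Dx2, where Dx_i is the
    # dose of drug i alone required for the same fractional effect fa.
    return (d1 / dose_for_effect(fa, dm1, m1)
            + d2 / dose_for_effect(fa, dm2, m2))

# Hypothetical single-agent parameters (Dm = median-effect dose, m = slope):
dm1, m1 = 2.0, 1.0     # drug 1
dm2, m2 = 10.0, 2.0    # drug 2

# Suppose 0.5 units of drug 1 plus 4 units of drug 2 gave 50% inhibition:
ci = combination_index(0.5, 4.0, 0.5, dm1, m1, dm2, m2)
print(f"CI = {ci:.2f}")
```

Because CI may vary with the total degree of inhibition, as noted above, the same calculation should be repeated at several effect levels rather than reported as a single number.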
Interactions may be based upon metabolism, cytokinetics, or the kinetics of multi-enzyme systems, and all of these can be modeled. Computational prediction of these various kinds of drug interaction has thus been explored using the tools discussed above for modeling drug metabolism and complex systems (see sections 6 and 8), but these studies must be considered as research projects rather than as routine drug development exercises.[25,37] Nevertheless, this will become an increasingly important area. In cancer chemotherapy, for example, the design of combinations that are optimized against a particular tumor is much more practical than tailoring individual drugs to particular patients, and thus represents the most promising approach to individualization of therapy.

10. Clinical Trials Simulations

Clinical trials are so time-consuming and expensive that the possibility of using computational approaches to improve their
success rate is highly appealing. The intent is not to perform fewer clinical trials, but to identify in advance the likely consequences of different trial designs, so that the actual clinical studies will have the maximum information content. The ideal computational clinical trials software would incorporate a well-validated model of the disease process being studied, combined with pharmacokinetics (including population pharmacokinetics), pharmacodynamics, and biostatistics. In practice, most available disease models are more research tools than software that is sufficiently established to be relied upon as a clinical trial design tool. The validation of complex models of disease will be an important growth area over the next few years. In the meantime, a few software houses and pharmaceutical company in-house computational groups have brought together sets of tools that are making clinical trials modeling a reality. A leading example is the Trial Designer™ from Pharsight Corporation (Palo Alto, CA, USA). This enables the user to answer questions such as: What is the optimal treatment schedule of a drug for a particular indication? What is the expected range of response? How will a change in inclusion/exclusion criteria affect outcome? How frequently should the response be measured? What is the impact of poor compliance, and how can the trial design be improved? The Pharsight® Trial Designer™ consists of three modules: a Model Generator is used to build quantitative descriptions of the drug, the disease, and the subject population; a Design Editor is used to design the clinical trial and to generate predicted clinical trial results; and a Results Viewer generates statistical and graphical analyses of the simulated trial results. Data can be exported to industry-standard analysis packages such as SAS®.[38]

11.
Rational Drug Development

Unlike rational drug design, where a broad consensus exists as to the key technologies (X-ray crystallography, high-field NMR, molecular modeling, computational chemistry), the term ‘rational drug development’ is still rather fluid. However, there is a growing expectation that certain techniques will improve the success rate of clinical trials. These include the established technologies of biostatistics and pharmacokinetics, as well as the still-emerging areas of pharmacodynamics, pharmacogenomics, rational combination design, and clinical trials modeling. Of course, like any other applied science, drug development includes rational and empirical components. Clinical development, in particular, is at present primarily a statistical exercise. The ability to make greater use of pharmacogenomics (to characterize the target population) and pharmacodynamics (to characterize the response at the molecular level) will give drug development a greater ability to test hypotheses. This will require more computational support, which the software sector is beginning to address.

12. Conclusions

Predicting trends at the forefront of technology suffers from the obvious limitation that unforeseen inventions and discoveries inevitably continue to emerge. If even a fairly obvious development like the mobile telephone can change the way we live our everyday lives, it is all the more likely that high-tech industry will be transformed by the unpredictable. With this caveat, the recent past certainly suggests several trends. The first is that the increasing power of analytical technology will continue to make possible new kinds of measurement that will provide ever more detailed knowledge about how drugs work. A clear example of this from the past two decades is the way that more sensitive X-ray detectors have enormously increased the scope of protein crystallography. In pharmacokinetics, powerful new imaging techniques have made possible unprecedented advances in our ability to observe drug effects noninvasively in patients. The second clear generalization is that many of the new technologies would not be possible without greatly increased computational power; this is true, again, of protein crystallography and of magnetic resonance imaging, and the increased computational intensity of drug discovery technologies seems set to continue for the foreseeable future. The third, very interesting trend is that the databases used in drug discovery have increased in size by many orders of magnitude, and this process has barely begun. Here we are discussing information, rather than tools. The rise of these giant databases, and of software for their storage, retrieval, and analysis, has been driven by technologies such as combinatorial chemistry and high-throughput screening, and also, critically, by the genome projects.
With the accumulation of these mega-databases, and increasing access to them, has come the growing realization that they provide not only an archive, but also (in conjunction with modeling software) a resource with great predictive power. The conclusion to which the preceding trends are all pointing is that the use of modeling and simulation software will become an increasingly vital part of the drug developers’ armamentarium. Modeling tools, already widely employed in drug discovery, will be an increasingly important part of the drug development process, supplementing the biostatistical packages that represent current drug developers’ token nod to quantitative biology. Modeling techniques will be used to optimize clinical trial design, to correlate biomarker data with clinical outcome, to match particular drugs to patterns of single nucleotide polymorphisms in individual
patients, and to select delivery routes and formulations that are matched to the biological demands of the disease process. Only by utilizing the predictive value of every piece of available information can we reconcile the conflicting demands of tailoring drug therapy to the individual patient and containing the runaway costs of modern drug development.

Acknowledgements

Preparation of this manuscript was supported by Cyclacel Ltd, Dundee, UK. The author has no financial interest in any of the software discussed in this review.
References

1. Babine RE, Bender SL. Molecular recognition of protein-ligand complexes: applications to drug design. Chem Rev 1997; 97: 1359-1472
2. Hopkins AL, Groom CR. The druggable genome. Nat Rev Drug Discov 2002; 1: 727-30
3. de Jong H. Modeling and simulation of genetic regulatory systems: a literature review. J Comput Biol 2002; 9: 67-103
4. Ghose A, Viswanadhan V, Wendoloski J. A knowledge-based approach in designing combinatorial or medicinal chemistry libraries for drug discovery. J Comb Chem 1999; 1: 55-68
5. Böhm HJ. The computer program LUDI: a new method for the de novo design of enzyme inhibitors. J Comput Aided Mol Des 1992; 6: 61-78
6. Mata P, Gillet VJ, Johnson AP, et al. SPROUT: 3D structure generation using templates. J Chem Inf Comput Sci 1995; 35: 479-93
7. Hansch C, Klein TE. Quantitative structure-activity relationships and molecular graphics in evaluation of enzyme-ligand interactions. Methods Enzymol 1991; 202: 512-43
8. Zupan J, Gasteiger J. Neural networks for chemists: an introduction. Weinheim: VCH Verlag, 1993
9. Islam MN, Song Y, Iskander MN. Investigation of structural requirements of anticancer activity at the paclitaxel/tubulin binding site using CoMFA and CoMSIA. J Mol Graph Model 2003; 21: 263-72
10. Lipinski CA, Lombardo F, Dominy BW, et al. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv Drug Deliv Rev 1997; 23: 3-25
11. Lipinski CA. Chris Lipinski discusses life and chemistry after the rule of five. Drug Discov Today 2003; 8 (1): 12-6
12. Abraham MH, Ibrahim A, Zissimos AM, et al. Application of hydrogen bonding calculations in property based drug design. Drug Discov Today 2002; 7: 1056-63
13. Raevsky OA. Hydrogen bond strength estimation by means of HYBOT. In: van de Waterbeemd H, Testa B, Folkers G, editors. Computer-assisted lead finding and optimization: current tools for medicinal chemistry. Weinheim: VCH, 1997: 367-78
14.
Kleinöder T, Gasteiger J. PETRA: parameter estimation for the treatment of reactivity applications [online]. Available from URL: http://zabib.chemie.uni-erlangen.de/software/petra/intro.phtml [Accessed 2003 Sep 3]
15. Veber DF, Johnson SR, Cheng H-Y, et al. Molecular properties that influence the oral bioavailability of drug candidates. J Med Chem 2002; 45: 2615-23
16. Penzotti JE, Lamb ML, Evenson E, et al. A computational ensemble pharmacophore model for identifying substrates of P-glycoprotein. J Med Chem 2002; 45: 1737-40
17. Potts RO, Guy RH. A predictive algorithm for skin permeability: the effects of molecular size and hydrogen bond activity. Pharm Res 1995; 12: 1628-33
18. Prausnitz MR, Noonan JS. Permeability of cornea, sclera and conjunctiva: a literature analysis for drug delivery to the eye. J Pharm Sci 1998; 87: 1479-88
19. Seelig A, Gottschlich R, Devant RM. A method to determine the ability of drugs to diffuse through the blood-brain barrier. Proc Natl Acad Sci U S A 1994; 91: 68-72
20. Hodgson J. ADMET: turning chemicals into drugs. Nat Biotechnol 2001; 19: 722-6
21. Accelrys. Software for pharmaceutical, chemical, and materials research [online]. Available from URL: http://www.accelrys.com/ [Accessed 2003 Sep 3]
22. Cyprotex. Science and technology [online]. Available from URL: http://www.cyprotex.com/ [Accessed 2003 Sep 3]
23. Brüstle M, Beck B, Schindler T, et al. Descriptors, physical properties and drug-likeness. J Med Chem 2002; 45 (16): 3345-55
24. Jackson RC. Computer models in preclinical and clinical drug development. Boca Raton (FL): CRC Press, 1996
25. Conolly RB, Andersen ME. Biologically based pharmacodynamic models: tools for toxicological research and risk assessment. Annu Rev Pharmacol Toxicol 1991; 31: 503-23
26. Amstein R, Bühler FR, Gasser D, et al. RIDO™/RIDO PLUS: an interactive computer-based guide to improve and shorten clinical drug development. Basel: European Centre of Pharmaceutical Medicine, 1998
27. Werner E. Systems biology: the new darling of drug discovery? Drug Discov Today 2002; 7: 947-9
28. Wodarz D, Nowak MA. Mathematical models of HIV pathogenesis and treatment. Bioessays 2002; 24: 1178-87
29. Jackson RC. A pharmacokinetic-pharmacodynamic model of chemotherapy of human immunodeficiency virus infection that relates development of resistance to treatment intensity. J Pharmacokinet Biopharm 1997; 25: 713-30
30. Tyson JJ, Csikasz-Nagy A, Novak B. The dynamics of cell cycle regulation. Bioessays 2002 Dec; 24 (12): 1095-109
31. Chaplain MA. Mathematical modelling of angiogenesis. J Neurooncol 2000; 50: 37-51
32. Entelos. Science and technology [online]. Available from URL: http://www.entelos.com/science/index.html [Accessed 2003 Sep 3]
33. Entelos Inc. and Johnson & Johnson. Entelos® PhysioLab® technology used to evaluate compound for diabetes [online]. Available from URL: http://www.entelos.com/news/pressArchive/press47.html [Accessed 2003 Sep 3]
34. Neves SR, Iyengar R. Modelling of signalling networks. Bioessays 2002; 24 (12): 1110-7
35. Greco WR, Bravo G, Parsons JC. The search for synergy: a critical review from a response surface perspective. Pharmacol Rev 1995; 47: 331-85
36. Chou TC, Talalay P. Quantitative analysis of dose-effect relationships: the combined effects of multiple drugs or enzyme inhibitors. Adv Enzyme Regul 1984; 22: 27-55
37. Bunow B, Weinstein JN. COMBO: a new approach to the analysis of drug combinations in vitro. Ann N Y Acad Sci 1990; 616: 490-4
38. Pharsight®. Pharsight® Trial Simulator® quick tour [online]. Available from URL: http://www.pharsight.com/literature/pts_web_tour.pdf [Accessed 2003 Sep 3]
Correspondence and offprints: Dr Robert C. Jackson, Cyclacel Ltd, James Lindsay Place, Dundee, DD1 5JJ, UK. E-mail:
[email protected]