Perspectives in Drug Discovery and Design, 7/8: 159-172, 1997. KLUWER/ESCOM ©1997 Kluwer Academic Publishers. Printed in the Netherlands.
Challenges and prospects for computational aids to molecular diversity

YVONNE CONNOLLY MARTIN
Pharmaceutical Products Discovery, D-47E, AP10/2, 100 Abbott Park Road, Abbott Park, IL 60064-3500, U.S.A.

Received 4 March 1997; Accepted 27 June 1997
Summary. Although workers have devised many usable strategies, a validated method for the computational analysis and optimization of molecular diversity in compound collections or combinatorial libraries remains a challenge. Even the most ambitious programs consider less than 1 in 10^39 of all possible compounds. The various methods need to be validated against experimental data and compared with each other, which might require sharing the structures and biological activities of 10^5-10^6 molecules. We need molecular descriptors that more accurately reflect the biological properties of compounds; this will probably entail designing a strategy to realistically include the properties of the multiple conformers, tautomers, and ionization states of molecules. For true computer generation of diverse synthesizable compounds, we need a whole new generation of programs that organize the knowledge of synthetic organic chemistry. Additionally, if the goal is to design molecules to fit a macromolecular target of known 3D structure, we also need improved methods for estimating ligand affinity.
Statement of the problem
The aim of using computational tools for assessing molecular diversity is a practical one: to minimize the costs and increase the chances of discovering novel and patentable compounds that cure or alleviate the symptoms of a disease. We expect that by using computational tools we will select compounds to purchase or synthesize that will increase the molecular diversity of the compounds screened for biological activity. We also expect to use the tools to design diverse compounds based on the 3D structure of the target macromolecule(s) or a preliminary structure-activity relationship. The objective of both activities is to increase the number of types of compounds that possess the desired biological activity. This should increase the chances of finding a compound that will meet all the safety, efficacy, and economic considerations necessary to reach clinical practice. Hence, the assumption is that the larger the variety of compounds tested in a preliminary screen, the larger will be the variety of good leads for further development, and the better the chance for ultimate success.
How large is the universe of possible drug molecules? Will combinatorial methods or outside acquisitions approach this limit? If so, then no rational selection of compounds is necessary because our library could be exhaustive. We also need to know this number to assess how much of chemical compound space we have included in our diversity considerations. David Weininger has estimated that there are 10^180 possible compounds with drug-like structures [1]. Even if the correct value were 10^50, the number of compounds covered in some patents, this is more than the number of seconds since the big bang, 10^40-fold larger than the number of base pairs in the human genome, and 10^31-fold larger than the number of bases in all living humans.

The challenge to combinatorial chemists is enormous. If one could synthesize and test 4×10^8 molecules per year by making and testing a library of a million compounds a day, it would take 2.5×10^41 years to make the 10^50 molecules contained in one known patent. Hence, wise computational strategies to design libraries are essential. State-of-the-art computers take approximately 0.5 s to store a 2D structure in a chemical information database, generate a 3D structure from a structure diagram, or calculate the similarity of one molecule to 100 000 others. One computer can therefore do approximately 10^8 such operations per year. The result is that even if, as expected, computer speed increases 1000× over the next few years and one were to use 100 processors, the limit of 10^13 simple operations, of which several would be needed for any diversity analysis, would still prevent one from processing all the interesting molecules in all the interesting ways. Innovative algorithms are needed. Even if one does not want to consider all the possible drug molecules (just those that are being considered for a single combinatorial library), the number of compounds to process remains immense.
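Spread across so many orders of magnitude, this arithmetic is easy to misread; a few lines of Python reproduce it, using the figures quoted above (10^50 compounds in one patent, 4×10^8 syntheses per year, 10^8 simple operations per computer per year):

```python
# Throughput arithmetic from the text: even at a million compounds a day
# (~4e8 per year), exhausting the 1e50 compounds of one patent is hopeless.
patent_compounds = 1e50
compounds_per_year = 4e8
years_needed = patent_compounds / compounds_per_year   # ~2.5e41 years

# Computing limit quoted in the text: ~1e8 simple operations per computer
# per year, a hoped-for 1000x speedup, and 100 processors.
operations_limit = 1e8 * 1000 * 100                    # 1e13 operations
```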
A simple example illustrates this: for one specific type of library built from two amino acids and one aldehyde, there are ~10^10 ((2×10^3 amino acids)^2 × (3×10^3 aldehydes)) molecules that might be synthesized directly from commercially available precursors [2]. If one were to make libraries containing 1 million compounds, then 10 000 such libraries would be needed to explore the products of only this one reaction. Would making all such libraries be a productive strategy? How many would be needed to explore the molecular diversity presented by these easily synthesized compounds? Clearly, selection criteria must be applied to both the cores and substituents of a library if we are to efficiently seek new bioactive molecules. This means that we must have reliable computational methods that can optimize the choice of subsets of these 10^10 compounds within a week or at most a month. Once a library has been designed and tested, a further computational challenge emerges. The screening data should contain the information to
allow one to design combinatorial libraries that will provide many active molecules against the screen of interest. However, current methods to analyze quantitative structure-activity relationships handle at most 10^3 compounds. The challenge is to derive predictive structure-activity models from screening data on 10^6-10^8 compounds in a timely manner.
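The scale of the two-amino-acid/one-aldehyde library can be checked directly from the component counts, and for small fragment sets the products can be streamed lazily rather than stored, which is essential when full enumeration is impossible. A minimal sketch with hypothetical fragment names (note that 1.2×10^10 rounds to the 10^10 quoted above):

```python
from itertools import product, islice

# Sizes quoted in the text: ~2e3 available amino acids, ~3e3 aldehydes.
n_amino, n_aldehyde = 2_000, 3_000
library_size = n_amino * n_amino * n_aldehyde     # 1.2e10 possible products
libraries_needed = library_size // 10**6          # million-compound libraries

# For a tiny illustrative fragment set, stream products lazily with a
# generator instead of materializing the full combinatorial list.
aminos = ["Gly", "Ala", "Ser"]          # hypothetical fragment names
aldehydes = ["PhCHO", "MeCHO"]
first_products = list(islice(product(aminos, aminos, aldehydes), 4))
```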
Analyzing diversity

Library enumeration, registration, and searching

While it is not a grand conceptual challenge, the current software for combinatorial library enumeration [3-8] presents practical challenges for many real and virtual libraries. Most require nearly manual intervention for complex libraries, for example those for which the reaction creates rings of various sizes. For library registration, simple-to-use programs for bench scientists are needed, since the synthetic chemist is the best person to detect errors in transformations or enumeration. The commercial pressure for a useful product will probably lead to continual improvement in these programs. Methods suitable for the registration of synthesized libraries might not be suitable for the enumeration of virtual libraries that are millions of times larger. Substructure searching of mixture combinatorial libraries is also available [3,4,6-10]. Interesting new work aims to generate the molecular fingerprints of all compounds in a library without enumerating them [11]. This will also make it faster to cluster or partition a combinatorial library. Tripos scientists generated a database, ChemSpace, of easily prepared combinatorial libraries with two sites of diversity. It includes 1.2×10^11 small organic molecules. The structure-handling software, SpaceCrunch, generates structures 700 times faster and searches 1400 times faster than commercially available structural database engines [12]. Since the algorithms are inherently parallelizable, searching speed can be increased by using multiple processors. The developers calculated that a single substructure search of ChemSpace with conventional methods would take 50 years, whereas they observed that searching with their algorithms takes 2 weeks on conventional computers and overnight on a Power Challenge with 48 R10000 processors [12].
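The fingerprint operations underlying such searching and clustering can be illustrated with a toy hashed fingerprint and Tanimoto similarity. The fragment strings below are hypothetical, and the OR-composition of fragment fingerprints is only a rough sketch of the idea behind fingerprinting a library without enumerating its products (cf. ref. [11]), not the published algorithm:

```python
import zlib

def hashed_fingerprint(s, nbits=256, ngram=3):
    """Toy hashed binary fingerprint: one bit per character n-gram of a
    structure string.  A stand-in for a real 2D structural fingerprint."""
    bits = 0
    for i in range(max(len(s) - ngram + 1, 1)):
        bits |= 1 << (zlib.crc32(s[i:i + ngram].encode()) % nbits)
    return bits

def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints stored as Python ints."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 1.0

# Sketch of library fingerprinting without enumeration: approximate a
# product fingerprint as the OR of its fragment fingerprints, so a whole
# combinatorial library is covered from its fragment lists alone.
core, r1, r2 = "c1ccccc1C(=O)N", "CC(N)C", "OCCO"   # hypothetical fragments
product_fp = (hashed_fingerprint(core) | hashed_fingerprint(r1)
              | hashed_fingerprint(r2))
```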
The ChemSpace demonstration increases our optimism that we will be able to meet the computational challenge of the chemical information aspects of combinatorial chemistry.

Molecular descriptors

An earlier article showed that one can generate various descriptors for several million compounds in a reasonable time [13]. It is to be hoped that there will be continued validation of the biological relevance of the different descriptors
now in use. This might require the synthesis and testing of libraries based on different selection criteria. Perhaps this will take the form of a competition such as is used in the protein structure field to compare different structure prediction or ligand docking strategies [14].

Two distinct aspects of molecular diversity need to be considered. Differences in the structure diagrams of two compounds, 2D diversity, suggest differences in a number of properties; most importantly, the structure diagram forms the basis of most chemical patents and also determines the routes of synthesis and hence the cost of the compounds. Thus, an increase in the 2D diversity of compounds active in a screen increases the chances of finding a novel and patentable series that is also cost-effective to market. On the other hand, it is the 3D properties of the molecules - their propensity to participate in electrostatic or hydrophobic interactions and their shape and flexibility - that determine their biological properties. Increasing the 2D diversity of a database is expected to yield more diverse hits in some screening programs; increasing the 3D diversity of a database is expected to yield at least one hit in more programs. An important question is how much the structure diagram also encodes information about the 3D properties of a molecule - tentative results suggest that some information, but not all, is encoded [15]. A related question is how the descriptors that we use in our computational analyses encode both types of information. Since the macromolecular targets of potential drugs bind most strongly to ligands with complementary hydrogen-bonding, charge features, and shape, we programmed an expert system to recognize not just such features of the input structure, but also the possible features generated by ionization and tautomerization [16].
For example, it recognizes that either nitrogen of an imidazole can be a hydrogen-bond donor or a hydrogen-bond acceptor, that the ring can bear a positive charge, and that some, but not all, secondary amines are protonated at neutral pH. The 3D geometric properties describe the relationships between atoms and their projections to complementary sites on an interacting molecule. Considering 3D properties immediately raises the question of how to treat the complex relationships between the conformers, tautomers, and ionization states of a molecule: should one use Boltzmann weighting, simple averaging, or something else? Is it necessary to explicitly encode shape [17-23]? Is it better to store the various conformations and protonation states or to generate them on the fly? How should the entropic cost of fixing rotatable bonds enter into our 3D descriptions of molecules? Computational chemists need to develop and validate a more sophisticated and realistic strategy to include all the 3D properties of molecules.
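One candidate answer to the weighting question is Boltzmann averaging of a property over conformers (or tautomers, or protonation states). A minimal sketch, assuming relative energies in kcal/mol at 298 K; the three-conformer numbers are hypothetical:

```python
import math

def boltzmann_average(energies, properties, T=298.15):
    """Boltzmann-weighted average of a per-conformer (or per-tautomer)
    property.  Energies are relative values in kcal/mol."""
    R = 1.987e-3                        # gas constant in kcal/(mol*K)
    e0 = min(energies)
    weights = [math.exp(-(e - e0) / (R * T)) for e in energies]
    return sum(w * p for w, p in zip(weights, properties)) / sum(weights)

# Hypothetical three-conformer molecule: relative energies (kcal/mol) and
# a conformer-dependent property (say, a dipole moment in debye).
avg = boltzmann_average([0.0, 0.5, 2.0], [1.2, 3.4, 5.6])
```

With equal energies this reduces to the simple average; with a large energy gap it converges on the lowest-energy conformer's value, which is exactly the behavioral difference the question in the text turns on.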
Selection of subsets for purchase

Often we wish to suggest a subset of compounds to purchase or synthesize that, once tested, would lead us to all biologically active compounds in the larger original set. This implies that any grouping and selection strategy we devise must be relevant to the biological properties of the compounds. Selecting compounds for acquisition clearly needs computational support. One very practical example involves the identification of duplicates: of 1 million compounds considered for potential acquisition, we found that approximately 10%, or 10^5, were duplicates of each other or of available Abbott compounds [24]. If we had accidentally purchased only 10% of such duplicates, 1% of the offered compounds, we could have spent >$300 000 to purchase compounds for which we already have a sample. Earlier, we and others demonstrated that compounds with similar molecular descriptors are also biologically similar [25-27]. Depending on the diversity of the database, our method groups 2-10 molecules per bio-homogeneous cluster, suggesting the purchase of 10-50% of the offered compounds. A strategy that increases the cluster size with the same level of accuracy in predicting activity would produce more information per compound purchased or synthesized and tested. Preliminary results suggest that 3D descriptors including complementary (protein) site points and molecular shape are significantly more efficient, but dramatic further improvements are still needed. The question of how to optimally select a subset of molecules for purchase suggests that comparative tests of different strategies (clustering, diversity selection, partitioning methods) should be performed. The importance of considering the 3D properties of molecules is illustrated by the successful forecasts of potency using 3D QSAR [28,29] and by the identification of novel active structures using 3D database searching [30-33].
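The duplicate check described at the start of this section amounts to canonicalize-and-compare. A sketch, in which an uppercased, whitespace-stripped string stands in for a real canonical structure key (in practice a canonical SMILES or registry structure key):

```python
def find_duplicates(offered, owned):
    """Flag offered compounds whose canonical key matches an already-owned
    compound or an earlier entry in the offered list.  The toy canonical
    key here stands in for a real canonical structure representation."""
    canonical = lambda s: "".join(s.split()).upper()
    seen = {canonical(s) for s in owned}
    duplicates = []
    for s in offered:
        key = canonical(s)
        if key in seen:
            duplicates.append(s)
        seen.add(key)
    return duplicates
```

At the roughly $30-per-sample cost implied by the figures above, skipping the flagged entries before purchase is worth real money.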
This viewpoint also has the advantage that one can set bounds on the number of types of molecules. For example, there are ~16 000 possible triangles formed between atoms with positive, negative, hydrogen-bond donor, hydrogen-bond acceptor, or hydrophobic character spaced from 3.0 to 10.0 Å using 1.0 Å bins. By this criterion, a maximally diverse data set would contain at least 16 000 compounds and have equal numbers of compounds with each pharmacophore triangle. In reality, more than three points determine receptor recognition: expanding the pharmacophores to four points produces 1 500 625 combinations (requiring biological testing of at least that many compounds) and to five points produces 52 521 875 combinations. Does the 10^10 virtual library discussed above include all 5×10^7 five-point pharmacophores? This analysis shows that even if molecules are described by their 3D properties, the number of molecules required to investigate property space is very large. Again, basing compound selection on such a description needs validation with large data
sets. Are three-point pharmacophores sufficient? Or are four or five points necessary? Is it advantageous to distinguish strong hydrogen-bonding groups from weaker ones [34,35]? Again, how does one encode the complex relationships between the multiple conformers, tautomers, and ionization states of a compound? Structure-activity data from high-throughput screening provide an opportunity to compare different strategies. However, they come with the caveat that the structures of some of the compounds tested might be in error due to decomposition or incomplete characterization. Additionally, it will be difficult to compare different methods unless several large data sets are publicly available for researchers to investigate.

Design of mixtures for screening

In order to screen large numbers of compounds, biologists have capitalized on the fact that most of the compounds will be inactive. Hence, they screen mixtures and test the individual compounds only if a mixture shows activity. One can also test each compound in two orthogonal mixtures designed such that no two mixtures have more than one compound in common [36]. Deconvolution involves identifying which compound is in common between the two mixtures that show activity. This strategy succeeds only if there are not two active compounds in any mixture - no mixture should contain two similar compounds. For Abbott screening we maximized the diversity of each tray or set of mixtures by forming bio-homogeneous clusters and dispersing the members of each cluster over as many trays as possible. Again, the success of such a strategy depends on how well the chosen descriptors reflect the biological properties of molecules and how well the grouping strategy puts similar compounds together.

Selection of a representative subset

Many companies do not test their entire compound collection in high-throughput screening, but rather test what they consider to be a representative subset of the entire collection.
Others use such a subset for difficult or expensive screens or to calibrate a screen before testing all compounds. The second stage of the analysis typically involves testing all compounds similar to any of the actives identified in the first screen. This strategy raises the question of how representative this subset really is of the whole collection. Does it really lead to the discovery of all the active compounds? Our most efficacious clustering strategy typically groups 2-3 compounds together. For example, there are on average 2.3 molecules per bio-homogeneous cluster in the Abbott collection [37]. Using this criterion, one must select 43% of the compounds to sample the biological diversity present in that database! Some simulation studies have been reported on this subject [38], but it certainly deserves more attention.
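The orthogonal-mixture scheme described under 'Design of mixtures for screening' can be sketched as a grid: row pools and column pools share exactly one compound, and deconvolution is set intersection. The compound names are hypothetical; as noted above, the scheme assumes at most one active compound per pool (the single-element unpacking below fails otherwise):

```python
def make_orthogonal_pools(compounds, ncols):
    """Place compounds on a grid: row pools and column pools are orthogonal
    mixtures -- any row pool and column pool share exactly one compound."""
    rows, cols = {}, {}
    for idx, c in enumerate(compounds):
        r, k = divmod(idx, ncols)
        rows.setdefault(r, set()).add(c)
        cols.setdefault(k, set()).add(c)
    return rows, cols

def deconvolute(active_row_pool, active_col_pool):
    """The active compound is the one common member of the two active
    pools.  Raises if they share more or fewer than one compound."""
    (hit,) = active_row_pool & active_col_pool
    return hit

compounds = [f"C{i}" for i in range(9)]      # hypothetical compound IDs
rows, cols = make_orthogonal_pools(compounds, ncols=3)
```

For N compounds this needs only about 2·sqrt(N) mixtures when the grid is square, which is the economy that makes pooling attractive.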
Especially important will be comparing various molecular descriptors, and comparing the experience with typical corporate collections with that from combinatorial libraries, which are thought to contain more similar compounds. A common database of structure-activity data available to any interested investigator would greatly aid such comparisons.

Design of combinatorial libraries

We have reported a general method to design a diverse library subject to combinatorial and other constraints [39]. First we enumerate every compound that could be in the library, and then cluster [25,26,40] or partition them [41]. The objective of the design will be, for example, to reduce the average cluster or cell redundancy or to occupy as many clusters or cells as possible. Such a library would be simple to design, except that for mix-and-split combinatorial synthesis every substituent at any one position must be present with every substituent at every other position: the combinatorial constraint [42,43]. Additionally, one method of deconvolution of the sublibraries (the last split synthesis is not mixed) is to determine the mass spectrum of the biologically active product. The more compounds in a sublibrary with the same molecular formula, the more compounds will have to be synthesized to identify the biologically active compound. Hence, the ideal combinatorial library will minimize this redundancy. We showed that a genetic algorithm can accomplish such designs within a reasonable amount of time [39]. There are many important issues to be explored in combinatorial library design. Since exploring the scope and conditions of the chemistry can consume 3-4 times as long as synthesizing a library, when is it better to synthesize a second library with the same chemistry rather than move on to new chemistry? When are two libraries resulting from different chemistries so similar, by whatever chosen measure, that it is not worthwhile to make the second?
Can this be predicted from the structures of the cores of the molecules? Can we program the genetic algorithm to design second (or third or ...) libraries that are most different from those preceding? What computational resources would this consume? Is there any advantage to selecting diverse precursors before library enumeration and, if so, how would we decide which of several similar precursors to carry forward? Would procedures that search and cluster combinatorial libraries without enumerating them be part of this strategy? What database structure would perform best? It will be a challenge to compare libraries - should we count the number of overlapping compounds, the number of similar compounds, or some other measure? In some cases the available precursors are not numerous or diverse enough for library design. Simple chemical transformations of other available compounds might lead to a more diverse set of precursors. However, one would
need to devise and test strategies for measuring 3D diversity against some absolute standard of optimum coverage, because substantial resources might be needed to synthesize these additional precursors. Additionally, software (including the knowledge base) to automatically identify precursors available by easy transformations would need to be written. Again, there are interesting precedents [44], but no real tools. All procedures that handle virtual libraries should flag any compound that would use or form a controlled (narcotic) or toxic substance. The scientists involved could then decide if they wish to include this compound.
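The design objective described in this section, occupy as many clusters or cells as possible while honoring the mix-and-split constraint, can be sketched with a simple greedy heuristic. This is only an illustration of the objective; the published method is a genetic algorithm [39], and the integer 'substituents' and the property-cell function used below are purely hypothetical:

```python
from itertools import product

def occupied_cells(r1_list, r2_list, cell_of):
    """Property-space cells covered by the full combinatorial product:
    under mix-and-split synthesis every R1 appears with every R2."""
    return {cell_of(a, b) for a, b in product(r1_list, r2_list)}

def greedy_design(r1_pool, r2_pool, n1, n2, cell_of):
    """Grow the two substituent lists greedily, at each step adding the
    candidate whose full product occupies the most cells.  Assumes the
    pools are larger than the requested list sizes."""
    r1, r2 = [r1_pool[0]], [r2_pool[0]]
    while len(r1) < n1 or len(r2) < n2:
        best = None                     # (cells occupied, which list, candidate)
        if len(r1) < n1:
            for cand in r1_pool:
                if cand not in r1:
                    gain = len(occupied_cells(r1 + [cand], r2, cell_of))
                    if best is None or gain > best[0]:
                        best = (gain, 1, cand)
        if len(r2) < n2:
            for cand in r2_pool:
                if cand not in r2:
                    gain = len(occupied_cells(r1, r2 + [cand], cell_of))
                    if best is None or gain > best[0]:
                        best = (gain, 2, cand)
        (r1 if best[1] == 1 else r2).append(best[2])
    return r1, r2
```

Here `cell_of` is a hypothetical function assigning each product to a cell of a low-dimensional property partition [41]; the combinatorial constraint is respected because occupancy is always evaluated on the full R1×R2 product.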
Targeted diversity

Frequently the problem is not merely to identify diverse compounds, but rather to discover diverse compounds that possess some particular biological activity, knowing that certain compounds show this activity and/or the 3D structure of the macromolecular target. Various strategies have been devised to include such information in the diversity design, but a number of unsolved issues remain. For example, how does one balance the need to make a diverse library with the need to make compounds with high forecast potency?

Genetic algorithm with biological response as the scoring function

Two groups presented results from a strategy in which an original set of <100 compounds is selected at random from a virtual library built around a core structure chosen so that at least some compounds will be active [45,46]. The second and subsequent libraries are selected by a genetic algorithm that uses as the fitness function the observed biological potency of the compounds. Impressive increases in potency were found in later-generation libraries. One could imagine automating this process such that the compound selection as well as the robotic synthesis and testing would occur with no operator control. Although the scientists would spend some time to set up the system, a month of automatic synthesis, testing, and redesign would produce a set of highly active compounds for further detailed investigation. Before this scenario could come to fruition, the chemistry steps would have to be perfectly predictable and programmable. More validation of the parameters for the genetic algorithm would also be appropriate. We expect that the more diverse the virtual library from which the compounds are selected, the better the chance for success.
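The select-test-evolve loop can be sketched with a toy genetic algorithm in the spirit of refs. [45,46], where the fitness of each 'compound' (a pair of substituent choices) is its measured activity. Here a hypothetical smooth response surface stands in for the assay, and the population size, mutation rate, and generation count are illustrative, not the published parameters:

```python
import random

def evolve_library(r_pool, assay, pop=24, gens=8, seed=0):
    """Select-test-evolve loop: fitness is the *measured* activity of each
    synthesized compound (here a simulated assay).  The top half of each
    generation survives unchanged (elitist); the rest are children made by
    crossover of R-groups plus occasional mutation."""
    rng = random.Random(seed)
    population = [(rng.choice(r_pool), rng.choice(r_pool)) for _ in range(pop)]
    for _ in range(gens):
        parents = sorted(population, key=assay, reverse=True)[: pop // 2]
        children = []
        while len(children) < pop - len(parents):
            a, b = rng.sample(parents, 2)
            child = (a[0], b[1])                  # crossover of R-groups
            if rng.random() < 0.2:                # point mutation
                child = (rng.choice(r_pool), child[1])
            children.append(child)
        population = parents + children
    return max(population, key=assay)

# Hypothetical assay: a smooth response surface peaking at substituents (7, 3).
assay = lambda c: -((c[0] - 7) ** 2 + (c[1] - 3) ** 2)
best = evolve_library(list(range(20)), assay)
```

The fully automated scenario in the text corresponds to replacing the simulated `assay` call with a round of robotic synthesis and screening, which is exactly where the requirement for perfectly predictable chemistry enters.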
Diversity design biased by prior structure-activity relationships

For a generation, computers have been used to derive quantitative structure-activity relationships that have demonstrated their ability to forecast the potency of new compounds [47]. While the original QSAR methods often focused
on analogues [48,49], the newer 3D QSAR can accommodate more structural diversity [28,29]. For a typical focused library of individual compounds [50], this is a good synergy, since the time to develop a QSAR model would not exceed the time required to investigate the relevant chemistry. However, the largest number of compounds covered in traditional QSAR is 10^3 - no more than a morning's work for one person in high-throughput screening, but several months' work for a computational chemist. Hence, to develop QSARs based on high-throughput screening, one challenge is to discover descriptors for QSAR that can be generated at a rate of 10^5 compounds per day, that can be calculated for most organic structures, and that are relevant to biological activity. Beyond this, the traditional statistical methods of multiple linear regression and partial least squares are also slow, sensitive to ill-conditioned or noisy data, and unable to detect the presence of submodels within the data set. When the structure-activity relationships are derived from screening either combinatorial libraries or historical collections, the computational strategies must be tolerant of the fact that the structures of some compounds might not be correct and that the biological activity of some might be due to an impurity. Newer computational strategies may be useful to automatically detect 2D or 3D structural patterns associated with bioactivity in a particular screen [51-55]. In a design-synthesis-analysis-design cycle, such hypotheses could then be used to enrich a subsequent directed library in the relevant features to refine the hypotheses or select between possibilities [56,57]. Again, for this we would need a QSAR method that is based on 10^5 rather than 10^2 compounds, that is robust in the face of errors in structures, missing compounds, and errors in biological data, and that can be generated in a few days. Such a method does not yet exist.
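As a sense of scale for 'fast enough', a one-pass linear method costs O(n·p^2) for n compounds and p descriptors, so compound count enters only linearly. The sketch below fits a ridge-regression model to a synthetic 10 000-compound, 5-descriptor table; it is only a toy stand-in for the robust, submodel-aware methods called for above, and every number in it is invented:

```python
import random

def ridge_fit(X, y, lam=1e-3):
    """Ridge regression via the normal equations (X^T X + lam*I) w = X^T y,
    accumulated in one pass over the data, then solved by Gaussian
    elimination with partial pivoting.  Cost is linear in compound count."""
    p = len(X[0])
    A = [[lam if i == j else 0.0 for j in range(p)] for i in range(p)]
    b = [0.0] * p
    for xi, yi in zip(X, y):                 # single pass over the data
        for i in range(p):
            b[i] += xi[i] * yi
            for j in range(p):
                A[i][j] += xi[i] * xi[j]
    for c in range(p):                       # forward elimination
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        b[c], b[piv] = b[piv], b[c]
        for r in range(c + 1, p):
            f = A[r][c] / A[c][c]
            for j in range(c, p):
                A[r][j] -= f * A[c][j]
            b[r] -= f * b[c]
    w = [0.0] * p
    for c in range(p - 1, -1, -1):           # back substitution
        w[c] = (b[c] - sum(A[c][j] * w[j] for j in range(c + 1, p))) / A[c][c]
    return w

# Synthetic 'screening set': 10 000 compounds, 5 descriptors, noisy linear
# activity.  Purely illustrative numbers.
rng = random.Random(1)
true_w = [1.5, -2.0, 0.0, 0.7, 3.1]
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(10_000)]
y = [sum(wi * xi for wi, xi in zip(true_w, row)) + rng.gauss(0, 0.1) for row in X]
w = ridge_fit(X, y)
```

The ridge penalty is one simple answer to the ill-conditioning complaint about plain multiple regression; it does nothing, however, for the submodel-detection and error-tolerance requirements, which is why the text concludes that the needed method does not yet exist.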
Often, high-throughput screening identifies only a few structurally diverse active compounds. This suggests that the more qualitative pharmacophore mapping method may be the appropriate one to choose [58]. Again, current methods are on the border of being too slow for use in the new screening paradigm. A follow-up would be to search 3D databases of known and hypothesized compounds to identify others that fit the pharmacophore pattern [31-33,59]. The concern with this strategy is a familiar one: the database construction and searching might be far too slow to be practical for library design. Alternatively, one could use the de novo design programs discussed below to suggest new compounds that fit the pharmacophore hypothesis.

Diversity design based on protein structure

The number of solved 3D protein structures now approaches 6000 [60]. A typical molecular diversity problem might be to propose which compounds
of a proposed combinatorial library would bind to a target [61], a docking exercise [62]. Is it sufficient to examine the potential substituents at each position separately, can we assume that the 'core' will stay as found in the original experimental structure, or do we need to evaluate every potential library product? Although computer programs that design compounds to fit a protein binding site are now available [63,64], the problem of forecasting binding affinity remains a major challenge.
Generating diversity de novo

Ultimately we would like the computer to suggest the whole structure and synthetic pathway for the molecules in a combinatorial library. Usually we would add other constraints: the library should be diverse, it should be different from those already available, it should contain compounds with good forecast biological activity, etc. Suggesting synthesizable compounds would require access to detailed information on synthetic paths, the scope of reactions, and available precursors.

Databases to store reaction information

Currently, computers can store and search reactions and transformations as well as individual compounds [65,66]. The first such commercial system, REACCS [67], has proved to be extremely useful to synthetic organic chemists. With the ability to search Beilstein electronically [68], an important milestone in handling chemical transformations by computer has been reached. However, this source covers only reactions reported in the primary literature, and no attempt is made to quantify the scope of a particular transformation or even to organize the many examples of one transformation type. Could some of the tools developed for the retrosynthetic analysis of organic syntheses be applied [69]? The long history of encoding reactions for retrosynthetic analysis illustrates the difficulty of manually discovering and encoding what is known about synthetic organic chemistry [70-74]. Will the current algorithm for identification of reaction similarity [44] be useful? Could an artificial intelligence program discover the rules of a reaction by parsing the Beilstein or a similar database? Combinatorial chemists often spend days to weeks experimentally investigating the scope and conditions of a reaction before they synthesize a library. Should these data be saved for future use? How about the instructions for the robots?
Will or should all the work done in support of combinatorial chemistry be deposited into a publicly available database? Would it be useful? What would we do with such a mass of information?
De novo design based on 3D protein structures or QSAR

Several programs design compounds based on fit to 3D protein structures or a pharmacophore hypothesis [64,75,76]. However, none integrates such a design with knowledge of the scope of 'all' known reactions with available precursors to suggest compounds that can be synthesized easily. A serious problem in de novo design based on macromolecular structure is our inability to accurately predict relative protein-ligand affinity [62]. Perhaps some of the newer approaches will be fruitful, but this is still to be proven [77]. The method for predicting binding affinity must be accurate enough to identify the best compounds in a virtual library and yet fast enough to process hundreds of millions of compounds. Such approaches will be challenged even more if the protein structure is derived from homology modeling rather than experimental observation. This is especially likely to be the case for proteins identified by genomics: the reason for making the combinatorial library might be to identify a ligand that will stabilize the structure enough that it can be solved by crystallography or NMR. In other instances, libraries will be designed based on previous structure-activity relationships or SAR by NMR [78]. Again, the challenge is to efficiently access the compounds that are possible to synthesize by combining information from reaction and precursor databases, while simultaneously applying structural, SAR, or other design criteria.
Prospects

Although the next years will see great increases in computer capability and in information on the biological properties of many molecules, much work remains to be done. Descriptors and compound selection strategies need to be evaluated; new algorithms that process orders of magnitude more structures must be invented; fast and accurate methods to forecast the binding affinity of ligands to proteins of known 3D structure need to be developed; and new strategies must be devised to couple known organic chemistry with the design of combinatorial libraries based on diversity or predicted affinity. It is impossible to predict the contents of an article written in 10 years on the subject of molecular diversity.
References 1.
2. 3.
Czarnik, A.W., Chemtracts-Org. Chern., 8 (1995) 13. Available Chemicals Directory, MDL Information Systems Inc., San Leandro, CA, 1997. Siani, M.A., Weininger, D. and Blaney, I.M., 1. Chern. Inf. Comput. Sci., 34 (1994) 588.
170 4.
Siani, M.A, Weininger, D., James, C.A. and Blaney, J.M., J. Chern. Inf. Comput. Sci., 35 (1995) 1026. 5. Pearlman, R., Combindbmaker, University of Texas, Austin, TX, 1996. 6. Legion and Unity, Tripos Inc., St. Louis, MO, 1997. 7. Project Library and Central Library, MDL Information Systems Inc., San Leandro, CA, 1997. 8. Chern-X, Chemical Design Ltd., Oxon, 1996. 9. Merlin and Thor, Daylight Chemical Information Systems Inc., Mission Viejo, CA, 1997. 10. Warr, w., Perspect. Drug Discov. Design, 7/8 (1997) 115 (this issue). 11. Downs, G.M. and Barnard, J.M., J. Chern. Inf. Comput. Sci., 37 (1997) 59. 12. ChemSpace and SpaceCrunch, Tripos Inc., St. Louis, MO, http://www.tripos.comlspacecrunch/ChemSpace.html, 1996. 13. Brown, R.D., Perspect. Drug Discov. Design, 7/8 (1997) 31 (this issue). 14. Strynadka, N., Eisenstein, M., Katchalskikatzir, E., Shoichet, B.K., Kuntz, I.D., Abagyan, R., Totrov, M., Janin, J., Cherfils, J., Zimmerman, E, Olson, A, Duncan, B., Rao, M., Jackson, R., Sternberg, M. and James, M., Nat. Struct. BioI., 3 (1996) 233. 15. Brown, R.D. and Martin, Y.C., J. Chern. Inf. Comput. Sci., 37 (1997) 1. 16. Martin, Y.e., Brown, R.D., Lico, I. and Delazzer, J., Submitted to Internet Journal of Chemistry. 17. Anzali, S., Barnickel, G., Krug, M., Sadowski, J., Wagener, M., Gasteiger, J. and Polanski, J., J. Comput.-Aided Mol. Design, 10 (1996) 521. 18. Nilakantan, R., Bauman, N. and Venkataraghavan, R., J. Chern. Inf. Comput. Sci., 33 (1993) 79. 19. Bemis, G.W. and Kuntz, I.D.A., J. Comput.-Aided Mol. Design, 6 (1992) 607. 20. Good, Ae. and Richards, w.G., J. Chern. Inf. Comput. Sci., 33 (1993) 112. 21. Meyer, AM. and Richards, w.G., J. Comput.-Aided Mol. Design, 5 (1991) 426. 22. Bath, P.A., Poirrette, AR. and Willett, P., J. Chern. Inf. Comput. Sci., 35 (1995) 714. 23. Bath, P.A, Poirette, AR., Willett, P. and Allen, EH., J. Chern. Inf. Comput. Sci., 34 (1994) 141. 24. Bures, M., unpublished observations, 1997. 25. 
25. Willett, P., Similarity and Clustering Techniques in Chemical Information Systems, Research Studies Press, Letchworth, 1987.
26. Brown, R.D. and Martin, Y.C., J. Chem. Inf. Comput. Sci., 36 (1996) 572.
27. Patterson, D.E., Cramer, R.D., Ferguson, A.M., Clark, R.D. and Weinberger, L.E., J. Med. Chem., 39 (1996) 3049.
28. Kubinyi, H. (Ed.) 3D QSAR in Drug Design: Theory, Methods and Applications, ESCOM, Leiden, 1993.
29. Martin, Y.C., Kim, K.-H. and Lin, C.T., In Charton, M. (Ed.) Advances in Quantitative Structure Property Relationships, JAI Press, Greenwich, CT, 1996, pp. 1-52.
30. Kuntz, I.D., Science, 257 (1992) 1078.
31. Wang, S.M., Milne, G.W.A., Yan, X.J., Posey, J.J., Nicklaus, M.C., Graham, L. and Rice, W.G., J. Med. Chem., 39 (1996) 2047.
32. Mason, J.S., McLay, I.M. and Lewis, R.A., In Dean, P.M., Jolles, G. and Newton, C.G. (Eds.) New Perspectives in Drug Design, Academic Press, London, 1995, pp. 225-253.
33. Martin, Y.C., J. Med. Chem., 35 (1992) 2145.
34. Raevsky, O.A., Grigor'ev, V.J., Kireev, D.B. and Zefirov, N.S., Quant. Struct.-Act. Relat., 11 (1992) 49.
35. Abraham, M.H., Duce, P.P., Prior, D.V., Garratt, D.G., Morris, J.J. and Taylor, P.J., J. Chem. Soc., Perkin Trans. 2, (1989) 1355.
36. Burres, N.S. and Clement, J.J., In Zambias, R. and Kolb, A. (Eds.) American Chemical Society, Washington, DC, 1997, in press.
37. Martin, Y.C., Brown, R.D. and Bures, M.G., In Kerwin, J.F. and Gordon, E.M. (Eds.) Combinatorial Chemistry and Molecular Diversity, Wiley, New York, NY, 1997, in press.
38. Taylor, R., J. Chem. Inf. Comput. Sci., 35 (1995) 59.
39. Brown, R.D. and Martin, Y.C., J. Med. Chem., 40 (1997) 2304.
40. Downs, G.M., Willett, P. and Fisanick, W., J. Chem. Inf. Comput. Sci., 34 (1994) 1094.
41. Pearlman, R.S., Network Science, http://www.awod.com/netsci/Issues/, June 1996.
42. Gordon, E.M., Gallop, M.A. and Patel, D.V., Acc. Chem. Res., 29 (1996) 144.
43. Choong, I.C. and Ellman, J.A., In Bristol, J.A. (Ed.) Annual Reports in Medicinal Chemistry, Academic Press, San Diego, CA, 1996, pp. 309-318.
44. Gasteiger, J., Ihlenfeldt, W.-D., Fick, R. and Rose, J.R., J. Chem. Inf. Comput. Sci., 32 (1992) 700.
45. Weber, L., Wallbaum, S., Broger, C. and Gubernator, K., Angew. Chem., Int. Ed. Engl., 34 (1995) 2280.
46. Singh, J., Ator, M.A., Jaeger, E.P., Allen, M.P., Whipple, D.A., Soloweij, J.E., Chowdhary, S. and Treasurywala, A.M., J. Am. Chem. Soc., 118 (1996) 1669.
47. Boyd, D.B., In Lipkowitz, K.B. and Boyd, D.B. (Eds.) Reviews in Computational Chemistry, VCH, New York, NY, 1990, pp. 355-371.
48. Hansch, C. and Leo, A., Exploring QSAR: Fundamentals and Applications in Chemistry and Biology, American Chemical Society, Washington, DC, 1995.
49. Martin, Y.C., Quantitative Drug Design, Marcel Dekker, New York, NY, 1978.
50. DeWitt, S.H., Kiely, J.S., Stankovic, C.J., Schroeder, M.C., Cody, D.M.R. and Pavia, M.R., Proc. Natl. Acad. Sci. USA, 90 (1993) 6909.
51. Gombar, V.K. and Enslein, K., J. Chem. Inf. Comput. Sci., 36 (1996) 1127.
52. Klopman, G. and Li, J.-Y., J. Comput.-Aided Mol. Design, 9 (1995) 283.
53. Klopman, G., J. Am. Chem. Soc., 106 (1984) 7315.
54. Sheridan, R.P., Miller, M.D., Underwood, D.J. and Kearsley, S.K., J. Chem. Inf. Comput. Sci., 36 (1996) 128.
55. Kearsley, S.K., Sallamack, S., Fluder, E.M., Andose, J.D., Mosley, R.T. and Sheridan, R.P., J. Chem. Inf. Comput. Sci., 36 (1996) 118.
56. Sheridan, R.P. and Kearsley, S.K., J. Chem. Inf. Comput. Sci., 35 (1995) 310.
57. Agrafiotis, D.K., Bone, R.F., Salemme, F.R. and Soll, R.M., U.S. Patent 5,463,564, October 31, 1995.
58. Martin, Y.C., In Martin, Y.C. and Willett, P. (Eds.) Design of Bioactive Molecules Using 3D Structural Information, American Chemical Society, Washington, DC, 1997, in press.
59. Willett, P., J. Mol. Recog., 8 (1995) 290.
60. Brookhaven National Laboratory, Brookhaven, NY, http://www.pdb.bnl.gov/, 1997.
61. Kick, E.K., Roe, D.C., Skillman, A.G., Liu, G., Ewing, T.J., Sun, Y., Kuntz, I.D. and Ellman, J., Chem. Biol., 4 (1997) 297.
62. Dixon, S. and Blaney, J., In Martin, Y.C. and Willett, P. (Eds.) Design of Bioactive Molecules Using 3D Structural Information, American Chemical Society, Washington, DC, 1997, in press.
63. Böhm, H.J., J. Comput.-Aided Mol. Design, 10 (1996) 265.
64. Gillet, V.J. and Johnson, A.P., In Martin, Y.C. and Willett, P. (Eds.) Design of Bioactive Molecules Using 3D Structural Information, American Chemical Society, Washington, DC, 1997, in press.
65. Ott, M.A. and Noordik, J.H., Recl. Trav. Chim. Pays-Bas, 111 (1992) 239.
66. Gasteiger, J., Ihlenfeldt, W.-D. and Rose, P., Recl. Trav. Chim. Pays-Bas, 111 (1992) 270.
67. MDL Information Systems Inc., San Leandro, CA, 1997.
68. CrossFire, Beilstein Informationssysteme GmbH, Frankfurt, 1995.
69. Moll, R., J. Chem. Inf. Comput. Sci., 37 (1997) 131.
70. Pensak, D.A. and Corey, E.J., In Computer-Assisted Organic Synthesis, American Chemical Society, Washington, DC, 1977, pp. 1-32.
71. Wipke, W.T. and Rogers, D., J. Chem. Inf. Comput. Sci., 24 (1984) 71.
72. Johnson, A.P., Marshall, C. and Judson, P.N., J. Chem. Inf. Comput. Sci., 32 (1992) 411.
73. Corey, E.J., Long, A.K., Lotto, G.I. and Rubenstein, S.D., Recl. Trav. Chim. Pays-Bas, 111 (1992) 304.
74. Ihlenfeldt, W.-D. and Gasteiger, J., Angew. Chem., Int. Ed. Engl., 34 (1995) 2613.
75. Caflisch, A. and Karplus, M., Perspect. Drug Discov. Design, 3 (1995) 51.
76. Cosgrove, D.A. and Kenny, P.W., J. Mol. Graph., 14 (1996) 1.
77. Welch, W., Ruppert, J. and Jain, A.N., Chem. Biol., 3 (1996) 449.
78. Shuker, S., Hajduk, P., Meadows, R. and Fesik, S., Science, 274 (1996) 1531.