Molecules in Silico: The Generation of Structural Formulae and Its Applications

Adalbert KERBER, Reinhard LAUE, Markus MERINGER and Christoph RÜCKER


1 Introduction

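Molecules are entities consisting of interacting atoms, and their descriptions are approximations on different levels of exactness. We can easily distinguish the following levels of increasing accuracy: the arithmetic level (molecular formula), the topological level (structural formula), and the geometric level (configuration and conformation).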
At present we are able to handle the arithmetic and topological levels reasonably well. On the level of configuration there is some progress, though there are still several problems to be solved. Automatic and systematic generation and classification of all conformers is still largely unsolved.

2 The Arithmetic Level

A molecule is described on this level by its molecular formula, a list of atoms of which it is made. Thus, C6H6 is the molecular formula for benzene and its 216 constitutional isomers. C6H6 is thus a valid molecular formula, in contrast to an invalid formula, one that does not correspond to a molecule capable of existence, for example, C6H5. Obviously there are restrictions on molecular formulae, and we will consider these in the next section. We mention that along with the usual well-defined molecular formulae there are fuzzy formulae, those that consist of intervals for the occurrence numbers of elements.
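To illustrate the difference between well-defined and fuzzy formulae, the following Python sketch represents a fuzzy formula as a mapping from element symbols to intervals of occurrence numbers and expands it into the well-defined formulae it covers. The function name `expand_fuzzy` and the dictionary representation are assumptions made for this example; they are not taken from MOLGEN.

```python
from itertools import product

def expand_fuzzy(fuzzy):
    """Expand a fuzzy formula {element: (min, max)} into well-defined formulae.

    Each result is a dict {element: count}; elements with count 0 are dropped.
    """
    elements = sorted(fuzzy)
    ranges = [range(lo, hi + 1) for lo, hi in (fuzzy[el] for el in elements)]
    formulae = []
    for counts in product(*ranges):
        formulae.append({el: c for el, c in zip(elements, counts) if c > 0})
    return formulae

# Six C atoms, five or six H atoms, at most one O atom.
fuzzy = {"C": (6, 6), "H": (5, 6), "O": (0, 1)}
for formula in expand_fuzzy(fuzzy):
    print(formula)
```

The expansion is purely arithmetic: it also lists C6H5, which is not a valid molecular formula, so a validity test still has to be applied to each result.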

3 The Topological Level

On this level of approximation a molecule is described by an interaction model, which means that we emphasize interactions between pairs of atoms in the molecule. Mathematical structures that can be interpreted as interaction models are in particular the unlabelled multigraphs. The vertices of a graph indicate the objects involved, while the edges connect the interacting ones, and different strengths of interactions are expressed by different multiplicities of the edges.
3.1 Definition A structural formula is a (usually connected) multigraph, the vertices of which are colored by element symbols. Moreover, the degree of each vertex, i.e. the number of edges to which this vertex belongs, agrees with the prescribed valence of the corresponding atom.
For example, the structural formulae for C6H6 are the connected multigraphs consisting of six vertices of degree 4 and six vertices of degree 1. The former are colored by the letter C, the latter by H. Here are two such formulae (out of 217), those of benzene and of Dewar benzene:

All 217 are interaction models of constitutional isomers of benzene, and they can easily be obtained (in a fraction of a second) using the molecule generator MOLGEN. Clearly most of them will not be structural formulae of existing molecules, but there are no strict rules known to distinguish between molecules merely not yet synthesized and molecules not capable of existence. Attempts were made to define the latter in terms of "forbidden" (too high in energy) substructures, but this approach met with failure in that on inspection of the Beilstein database almost any "forbidden" substructure was eventually found to occur in a known compound. So whenever completeness matters, e.g. in structure elucidation, all mathematically possible graphs should be constructed.
Accepting the notion of connected multigraph as an interaction model for molecules, we can impose restrictions on molecular formulae. If g is a multigraph, then we can use the sequence of its vertex degrees

$$\lambda_g := (\lambda_g(0), \lambda_g(1), \lambda_g(2), \ldots),$$

where $\lambda_g(i)$ means the number of vertices of degree i in g. It is a partition of the number n of vertices in g. We abbreviate this fact by

To begin with, the following expression gives the number e(g) of edges of g,

$$e(g) = \frac{1}{2} \sum_i i \cdot \lambda_g(i). \qquad (3.2)$$

This is a formulation of the simple fact that each bond connects two atoms. Assume that the degree sequence of an interaction model of a molecule with molecular formula C6H5 is $\lambda_g = (0, 5, 0, 0, 6, 0, \ldots)$. Then the degree sum 5 · 1 + 6 · 4 = 29 is odd, so that e(g) would not be an integer, which violates 3.2. Hence C6H5 is not a valid molecular formula.
Moreover, if we assume that the interaction models are always connected graphs, then we can use the following necessary and sufficient condition for the existence of at least one connected multigraph in terms of the partition $\lambda_g$:

where the inequality on the right says that there are at least n-1 edges required in order not to leave a vertex isolated. So we obtain the following test for the validity of a given molecular formula:
3.4 Corollary A given molecular formula and a corresponding sequence $\lambda$ of degrees together describe a valid molecular formula only if they satisfy condition 3.3.
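The validity test can be sketched in a few lines of Python. The parity requirement and the lower bound of n-1 edges are taken from the text. The third check, that no vertex needs more bonds than all other vertices together can supply, is the standard condition for loop-free multigraphs and is an assumption about how condition 3.3 is stated. The valence table and the function names are likewise illustrative assumptions.

```python
# Assumed ground-state valences for a few elements (illustrative only).
VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2}

def degree_sequence(formula):
    """Return lam as a list: lam[i] = number of atoms of valence i."""
    lam = [0] * (max(VALENCE[el] for el in formula) + 1)
    for el, count in formula.items():
        lam[VALENCE[el]] += count
    return lam

def is_valid_formula(formula):
    """Test whether a connected loop-free multigraph with these degrees can exist."""
    lam = degree_sequence(formula)
    n = sum(lam)
    if n == 0:
        return False
    degree_sum = sum(i * k for i, k in enumerate(lam))   # twice the edge number
    max_degree = max(i for i, k in enumerate(lam) if k > 0)
    return (degree_sum % 2 == 0                # each bond connects two atoms
            and degree_sum >= 2 * max_degree   # assumed loop-free condition
            and degree_sum >= 2 * (n - 1))     # at least n-1 edges

print(is_valid_formula({"C": 6, "H": 6}))   # True
print(is_valid_formula({"C": 6, "H": 5}))   # False, odd degree sum
print(is_valid_formula({"C": 1, "H": 6}))   # False, too few edges to connect
```

The sketch assigns exactly one valence to each element, so it covers only atoms in their ground state with a unique valence.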

4 Molecular Graphs

The above definition of structural formula needs to be refined to the notion of molecular graph that we are going to introduce now.
Chemical compounds are described by multigraphs consisting of particular vertices representing atoms and edges representing covalent bonds. These bonds may be single, double or triple bonds. The vertices are colored by the name of a chemical element and an atomic state.
A chemical element is identified by its atomic number, which is the number of protons, the positively charged elementary particles contained in the atom. In its elementary state the atom contains the same number of electrons, so that it does not bear a charge.
A certain number of electrons of the atom are able to interact with electrons of other atoms of the molecule in question. Electrons with this property are called valence electrons. Their number depends on the element, and the interactions are called chemical (covalent) bonds. An interaction between two electrons (from two atoms) is called a single bond and is drawn as a single line; an interaction between four or six electrons (from two atoms) is called a double or triple bond and is drawn as a double or triple line, respectively. There are also forms of interactions not amenable to this simple scheme (e.g. mesomerism). A single valence electron that does not participate in a bond is an unpaired electron; two valence electrons on one atom that are not involved in a bond form a free electron pair (a lone pair).
The sum of the number of electrons engaged in covalent bonds, of those in lone pairs and of an unpaired electron (if any) for an atom in a molecule may differ from the number of valence electrons in the isolated atom. The difference is the charge of the atom. For this reason, we define the state of the atom as follows:
4.1 Definition An atomic state is a quadruple

Such a state is called a ground state if qs = 0 and rs = false.
The valence of an atom in a molecule is the number of covalent bonds in which it is involved, each bond counted with its multiplicity, and so it is the degree of the corresponding vertex in the multigraph. For example, the valence of H atoms is 1, for O atoms it is 2, for N atoms it is 3 and for C atoms we have 4. But we should carefully note that this is true only if these atoms are in their ground state, i.e. if there is neither a charge nor an unpaired electron. There are elements such as phosphorus and sulfur that even in the ground state can exhibit more than one valence. For example, there are molecules with 3- or 5-valent phosphorus atoms. Sulfur may have valences 2, 4 or 6, differing in the number of free electron pairs. If we skip the assumption that the atoms are in their ground state, further valences can show up.
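The electron bookkeeping behind atomic states can be made concrete with a small Python sketch. The four field names below (valence, number of lone pairs, charge, radical flag) are an assumed reading of the quadruple in Definition 4.1, chosen to match the quantities discussed in the text; the table of valence electrons and the method names are also assumptions of this example. Each bond order is counted as one electron contributed by the atom itself.

```python
from dataclasses import dataclass

# Number of valence electrons of the isolated atom (main-group values).
VALENCE_ELECTRONS = {"H": 1, "C": 4, "N": 5, "O": 6, "P": 5, "S": 6}

@dataclass(frozen=True)
class AtomicState:
    valence: int      # number of covalent bonds, counted with multiplicity
    lone_pairs: int   # number of free electron pairs
    charge: int       # q_s in the text
    radical: bool     # r_s in the text: True if there is an unpaired electron

    def is_ground_state(self):
        return self.charge == 0 and not self.radical

    def is_consistent_with(self, element):
        """Own electrons in bonds, lone pairs and radical, plus charge."""
        electrons = self.valence + 2 * self.lone_pairs + int(self.radical)
        return electrons + self.charge == VALENCE_ELECTRONS[element]

# Sulfur with valences 2, 4 and 6 differs only in the number of lone pairs.
for state in (AtomicState(2, 2, 0, False),
              AtomicState(4, 1, 0, False),
              AtomicState(6, 0, 0, False)):
    print(state, state.is_consistent_with("S"))

# A 4-valent, positively charged nitrogen (as in ammonium) is not a ground state.
ammonium_n = AtomicState(4, 0, +1, False)
print(ammonium_n.is_consistent_with("N"), ammonium_n.is_ground_state())
```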
For this reason we introduce for each chemical element X a set SX of admissible atomic states. Its definition clearly depends on the particular situation of the molecule in question. For example, the most important elements in organic chemistry are gathered in the following set of elements:

We shall refer in the following to this set, and also to its extension

Table 1 contains, for the elements X ∈ E11, their atomic number T EX, the number of valence electrons V EX and a list of atomic states [1]. The states listed are those relevant for structure elucidation using mass spectroscopy.
The set SX of admissible states of the element X depends on the chemistry that we are willing to use in a particular situation. A hierarchical classification of the corresponding topological models, introduced in [2], can be described in terms of these states.

Table 1. Admissible states for the elements in E11 occurring in mass spectroscopy

4.2 Definition

Summarizing, we obtain the following chain of inclusions:

In terms of these notions, the structure generator MOLGEN (up to version 3.5, see [3]) is able to generate chemical compounds from RC. From version 4.0, cf. [4], it is possible to generate molecules for IC. Using an algorithm that identifies aromatic systems MOLGEN covers the most important part of MC, aromatic compounds, as well.
4.3 Definition (molecular graph) Let E denote a set of chemical elements and assume that SE indicates the set of all the admissible atomic states of the elements in E. In formal mathematical terms,

$$S_E := \bigcup_{X \in E} S_X.$$

A molecular graph for a molecule consisting of n atoms from E is a triple

$$(e, z, g),$$

where e is a sequence of length n consisting of element symbols, i.e.,

$$e = (e(1), \ldots, e(n)) \in E^n.$$

The second component z is a sequence of n atomic states, where the i-th component is an admissible state of the i-th atom,

$$z = (z(1), \ldots, z(n)), \quad z(i) \in S_{e(i)}.$$
The third component g is a connected multigraph consisting of n vertices and edges that are at most 3-fold, for short,

Its vertices are numbered from 1 to n and are colored by the atom names e(i), the components of e. The degree of vertex i of the graph is equal to the valence of the i-th atom e(i),

Let Mn denote the set of molecular graphs on n atoms.
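The conditions of Definition 4.3 can be checked mechanically. The Python sketch below does so under simplifying assumptions: each atomic state is reduced to its valence, atoms are numbered from 0 rather than from 1, and the multigraph is stored as a dictionary from unordered atom pairs to bond multiplicities. The function name and data layout are hypothetical.

```python
from collections import deque

def is_molecular_graph(e, valences, g):
    """Check the conditions on (e, z, g), with z reduced to the valences.

    e        -- list of element symbols, one per atom
    valences -- list with the valence prescribed by each atom's state
    g        -- dict mapping frozenset({i, j}) to a bond multiplicity 1..3
    """
    n = len(e)
    if n == 0 or len(valences) != n:
        return False
    degree = [0] * n
    neighbours = [[] for _ in range(n)]
    for bond, mult in g.items():
        if len(bond) != 2 or not 1 <= mult <= 3:   # no loops, at most triple bonds
            return False
        i, j = tuple(bond)
        degree[i] += mult
        degree[j] += mult
        neighbours[i].append(j)
        neighbours[j].append(i)
    if degree != list(valences):                   # degree must equal valence
        return False
    # Breadth-first search from atom 0 to test connectedness.
    seen = {0}
    queue = deque([0])
    while queue:
        v = queue.popleft()
        for w in neighbours[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == n

# Formaldehyde: C=O double bond and two C-H single bonds.
e = ["C", "O", "H", "H"]
g = {frozenset({0, 1}): 2, frozenset({0, 2}): 1, frozenset({0, 3}): 1}
print(is_molecular_graph(e, [4, 2, 1, 1], g))   # True
```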
A problem with molecular graphs is that there are usually many molecular graphs that represent the same molecule, differing in vertex numbering. Here are two examples of differently numbered molecular graphs:

Vertex numbering is unavoidable, since the entries e(i) of the element distribution e and the entries z(i) of the atomic states are numbered as well. Hence, two such molecular graphs (e,z,g) and (e',z',g') describe the same molecule if and only if they are the same up to renumbering, which means that there is a permutation π such that


defined by

g({i, j}) denotes the multiplicity of the covalent bond that connects atoms i and j, i.e. g({i, j}) ∈ {0, 1, 2, 3}. In mathematical terms, we are faced with the following action of the symmetric group:

This action, like every action of a group on a set, induces an equivalence relation, the classes of which are called orbits, for example

is the orbit of (e,z,g). Properties (1) and (2) of definition 4.3 are preserved by this operation. Therefore Sn also operates on Mn.
4.4 Corollary A structural formula of a molecule with n atoms contained in E corresponds to an orbit of Sn on the set Mn, i.e. the set of structural formulae of molecules built from n atoms in E is the set of orbits

Hence the problem of constructing all structural formulae, i.e. all constitutional isomers of the molecular formula in question, amounts to the evaluation of a complete system of representatives of these orbits of the symmetric group. This is obviously an algebraic problem, and the efficient application of group theoretic methods is the method of choice for solving it. Double cosets can be used, as G. Pólya already pointed out in his seminal paper. Another useful tool is orderly generation, as indicated by R. C. Read; see [5 - 9].
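The idea of choosing one representative per orbit can be illustrated by a deliberately naive Python sketch: it renumbers the atoms in all n! possible ways and keeps the lexicographically smallest description. This brute-force search is not the method used by MOLGEN, which relies on double cosets and orderly generation; it only demonstrates that two numberings of the same molecule lead to the same canonical representative. Atomic states are omitted, atoms are numbered from 0, and the function name is hypothetical.

```python
from itertools import permutations

def canonical_form(e, g):
    """Smallest (labels, bond list) over all renumberings of the atoms.

    e -- list of element symbols
    g -- dict mapping atom pairs (i, j) to bond multiplicities
    """
    n = len(e)
    best = None
    for perm in permutations(range(n)):      # perm[i] = new number of atom i
        labels = [None] * n
        for i, new in enumerate(perm):
            labels[new] = e[i]
        bonds = sorted(
            (min(perm[i], perm[j]), max(perm[i], perm[j]), mult)
            for (i, j), mult in g.items()
        )
        candidate = (tuple(labels), tuple(bonds))
        if best is None or candidate < best:
            best = candidate
    return best

# Two differently numbered molecular graphs of formaldehyde.
a = canonical_form(["C", "O", "H", "H"], {(0, 1): 2, (0, 2): 1, (0, 3): 1})
b = canonical_form(["H", "C", "H", "O"], {(1, 3): 2, (0, 1): 1, (1, 2): 1})
print(a == b)   # True
```

Because the minimum is taken over the whole orbit, the result does not depend on the input numbering. The factorial running time restricts this sketch to very small molecules.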

5 Applications

Having described the mathematical tools that form the basis of an efficient, systematic, complete and redundancy-free generation of all structural formulae that correspond to a given molecular formula and (optional) further restrictions, we are now in a position to list a few software packages that use these methods.
To begin with, we mention three kinds of structure generation problems:
These will be discussed in the following subsections.

5.1 Generation Based on a Molecular Formula

MOLGEN is a generator of structural formulae corresponding to a given (well-defined or fuzzy) molecular formula and (optional) further restrictions imposed by the user (see e.g. [10]). For example, depending on the version of MOLGEN used, the following numbers can be restricted by upper and lower bounds, i.e. we can force MOLGEN to generate just those constitutional isomers for which the following numbers belong to user-defined intervals:
Substructures can be prescribed, e.g. hydroxyl groups etc., or forbidden by the user in various ways:
Moreover, the user can force the generator to
Further features of MOLGEN allow restrictions to be checked after the generation, e.g.
MOLGEN applies a lot of algebra (groups and double cosets of groups, for example), as well as of combinatorics (orderly generation etc.). One of the crucial points is the following one, extremely important for applications (e.g. for the generation of large combinatorial libraries, for the generation of patent libraries, or for the use of inhouse databases, see [11]):
We shall return to this in the relevant subsections. Further details can be found on the MOLGEN home page. The structure generator can also be used online (in a restricted version). MOLGEN is quite fast, depending of course on the conditions the user imposes. For example, if just the molecular formula is given, MOLGEN produces several thousand isomers of moderate size within a second on a standard PC. Table 2 lists, for all molecular formulae based on E4 with mass 146 and at least one C atom, the number of structural formulae (RC) together with the CPU time (in seconds) on a 2.53 GHz Pentium 4 PC.

5.2 Chemical Education

For the purpose of chemical education, we developed an interactive online course on molecular symmetry and isomerism including stereoisomerism, called UNIMOLIS ([12]). The course is freely accessible on the internet in English or German. A somewhat limited version of MOLGEN is available within UNIMOLIS for the generation of constitutional isomers for a molecular formula entered by the student. The course is also available on CD; in this case an internet connection is required to use the generator.
In the most recent version of UNIMOLIS, to be released soon, the student is for the first time given the opportunity to apply molecular mechanics calculations to a molecule of his or her choice, and to observe within that framework the structural and energetic effect of any molecular deformation. Moreover, for a given constitution, repeated molecular mechanics calculations starting from random atom coordinates will generate and visualize stereoisomers.

5.3 Generation of Combinatorial Libraries

Suppose a combinatorial library is described in terms of a set of building blocks and a set of chemical reactions that link building blocks by means of their functional groups. This corresponds to linking molecular graphs by means of well-defined procedures acting on well-defined subgraphs (see [13, 14]). So we can generate the complete library quickly, exhaustively and free of redundancy. Prominent examples are the libraries described by Carell et al. [15], in which a central molecule bearing carboxylic acid chloride functions is amidated with various amine starting materials.
Already here, during the generation of the library, the importance of the canonical form becomes obvious. If we admit 20 different amines to be attached to 4 carboxylic acid chloride functions arranged tetrahedrally on the cubane skeleton, then a purely combinatorial approach would result in 20^4 = 160000 seemingly distinct products. However, due to the high symmetry of the central molecule, no more than 13700 of these are in fact distinct. Such a generation is an algebraic problem rather than simply a combinatorial one, and group theory is used intensively (see e.g. [7 - 9]).
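The reduction from 160000 to 13700 can be checked with the Cauchy-Frobenius (Burnside) lemma. The Python sketch below assumes that the symmetry relevant here acts on the four attachment positions as the rotation group of the tetrahedron, i.e. as the alternating group on four points with 12 elements; under this assumption the count reproduces the number quoted in the text. The function names are illustrative.

```python
from itertools import permutations

def cycle_count(perm):
    """Number of cycles (fixed points included) of a permutation given as a tuple."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            v = start
            while v not in seen:
                seen.add(v)
                v = perm[v]
    return cycles

def is_even(perm):
    # A permutation of n points with c cycles has sign (-1)**(n - c).
    return (len(perm) - cycle_count(perm)) % 2 == 0

def count_orbits(group, colours):
    """Cauchy-Frobenius lemma: average number of colourings fixed by a group element."""
    total = sum(colours ** cycle_count(p) for p in group)
    assert total % len(group) == 0
    return total // len(group)

# Rotations of the tetrahedron acting on the 4 positions (assumed symmetry).
rotations = [p for p in permutations(range(4)) if is_even(p)]
print(len(rotations))                # 12
print(20 ** 4)                       # 160000 labelled assignments
print(count_orbits(rotations, 20))   # 13700 distinct products
```

The identity fixes all 20^4 assignments, and each of the other 11 rotations has two cycles on the four positions and so fixes 20^2 assignments, which gives (160000 + 11 · 400) / 12 = 13700.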

5.4 Evaluation of Molecular Libraries, QSPR

The basic problem of QSPR (quantitative structure property relationship) work is to describe the numerical values of some experimental property of compounds in terms of their molecular structures. The aim is to predict, by means of such a relationship, the property values for some other compounds in the same compound class, or even for all compounds in a certain structure space. The software package MOLGEN-QSPR was developed to assist the scientist in all steps of this endeavor. MOLGEN-QSPR allows the user to import, generate or manually edit the structures of the learning set of compounds (the real library), to import or manually enter property values, and to calculate numerical values of a large number of molecular descriptors. Using various methods of statistical learning, it derives mathematical models for the property of interest (QSPR equations). Such a model can then be applied to a list of structures or to all structures in a defined class of compounds (the virtual library), which again are produced completely and free of redundancy by the generator. For applications see [16] and [17].

5.5 Patent Libraries

Patents in chemistry often claim a whole library of compounds, a patent library, defined by a generic structural formula, a Markush formula. We present a particularly simple example of two patent libraries to be compared, in order to illustrate the problems to be overcome [18].
The first formula is taken from [19]:

In order to obtain a finite library, we restricted substituent R2 to include 1-6 C atoms.
The second Markush formula was constructed by us in order to demonstrate in an easy way the crucial points:

MOLGEN-COMB constructs the corresponding libraries L1 and L2 in a few seconds, using a reaction-based generation. These libraries are of the order

which are the numbers of compounds contained therein. The first point we should like to emphasize is that in the L2 case a purely combinatorial approach yields the number 3 · 33 · 5 · 3 · 4 = 5940 of possible combinations of the admissible substituents. However, due to the symmetry of the benzene skeleton, one structure appears twice (R5 = H, R1 = R4 = OH and R2 = C2H5, R3 = CH3 or R2 = CH3, R3 = C2H5), and MOLGEN automatically eliminates the duplicate.
The second important and absolutely crucial point is the canonical form in which the structural formulae of the members of these libraries are produced. Automatic comparison of canonical forms shows in a few seconds that there is an overlap. If either Markush formula represented a claim in one of two patents, the two patent assignees would face a problem, since the intersection of these two libraries is not empty; it consists of four elements:

Here is the overlap found:

5.6 Structure Elucidation

Another important application of structure generation is in structure elucidation. Here a chemical structure best fitting a given set of spectroscopic data for an unknown compound is to be found (see e.g. [20]).
A database search for the spectra of an unknown is more or less likely to find hits if the unknown was obtained and examined previously. Columns 4 and 5 of Table 2 give the number of compound entries for a particular molecular formula in the Beilstein database [21], and the number of corresponding mass spectra in the NIST MS database [22]. The Beilstein database is the largest collection of known organic compounds worldwide, and the NIST MS database is one of the most comprehensive of its kind. Comparison of entries in the two columns shows that for most known compounds a mass spectrum is not available in the NIST database, even for a molecular mass as low as 146.

Table 2. Molecular formulae with mass 146 and at least one C atom, numbers of structural formulae together with the CPU time for structure generation (in seconds), numbers of structural formulae included in the Beilstein and the NIST MS database

Comparison of the "Beilstein" and the "Structural formulae" columns of Table 2 shows how small a fraction of mathematically possible structural formulae exist as known organic compounds. In fact, since column 2 refers to RC, it gives a lower bound of possible structures (nitro compounds or nitrates, for example, are not included). Beilstein, on the other hand, does of course register nitro compounds and nitrates, and furthermore registers stereoisomers and isotopomers separately. So column 4 gives an upper bound of known structural formulae (constitutions). Thus the ratio of known existing constitutions to possible constitutions is even lower than would be expected from the numbers in columns 4 and 2.
Most structure elucidations, in particular the non-trivial ones, thus deal with new compounds. Classically, structure elucidation is done in three steps [23]:

  i) Structural features are extracted from spectral data.
  ii) All structural formulae compatible with these structural properties are generated.
  iii) For the generated structures virtual spectra are calculated and compared to the experimental spectrum; the spectra/structures are then ranked according to goodness of fit.
While step ii) is essentially solved by structure generators such as MOLGEN, steps i) and iii) still pose challenging problems.
Bearing in mind the numbers from column 2 in Table 2, it is obvious that in step i) we have to find restrictions that are highly selective, so as to efficiently downsize the library of potential hits. At the same time restrictions must not be overselective, so as not to exclude the correct structure.
For example, a restriction highly efficient in the case of polycyclic compounds is the limitation to graph-theoretically (gt) planar compounds. A survey of the Beilstein database found that very few known compounds are gt-nonplanar [24], whereas many or even most of the candidate structures generated for a polycyclic unknown are gt-nonplanar. However, if the unknown happens to be gt-nonplanar, its correct structure will be missed under this restriction.
In the case of a synthetic product, the chemist is often able to provide some guidance for step i) (starting materials, experimental conditions, etc.), but in the case of a new natural product equivalent information is obviously not available. Spectroscopic methods providing information are numerous; it is, however, anything but easy to translate this information automatically and reliably into useful restrictions for the generation process.
As was demonstrated above, the most important input for a generator is a molecular formula. The method of choice to obtain this information nowadays is MS or the combination GC/MS, thanks to its high sensitivity and resolution. The method is applicable automatically even for large combinatorial libraries. If only a low-resolution mass spectrum is available, the software package MOLGEN-MS [25, 26] can suggest candidates for the molecular ion peak, and it then provides possible molecular formulae for that molecular mass.
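How candidate molecular formulae follow from a nominal mass can be sketched in Python. The example assumes that E4 consists of the elements C, H, N and O, uses integer (nominal) masses, and filters the candidates with a connectivity test of the kind discussed in Section 3 for ground-state valences. It is an illustration only and does not reproduce the procedure implemented in MOLGEN-MS; all names are hypothetical.

```python
MASS = {"C": 12, "H": 1, "N": 14, "O": 16}    # nominal (integer) masses
VALENCE = {"C": 4, "H": 1, "N": 3, "O": 2}    # assumed ground-state valences

def is_valid(counts):
    """Necessary conditions for a connected multigraph with these valences."""
    n = sum(counts.values())
    degree_sum = sum(VALENCE[el] * k for el, k in counts.items())
    max_degree = max(VALENCE[el] for el, k in counts.items() if k > 0)
    return (degree_sum % 2 == 0
            and degree_sum >= 2 * max_degree
            and degree_sum >= 2 * (n - 1))

def formulae_for_mass(mass):
    """All valid CHNO formulae of the given nominal mass with at least one C atom."""
    result = []
    for c in range(1, mass // MASS["C"] + 1):
        for n in range((mass - 12 * c) // 14 + 1):
            for o in range((mass - 12 * c - 14 * n) // 16 + 1):
                h = mass - 12 * c - 14 * n - 16 * o   # remaining mass is hydrogen
                counts = {"C": c, "H": h, "N": n, "O": o}
                if is_valid(counts):
                    result.append(counts)
    return result

candidates = formulae_for_mass(146)
print({"C": 6, "H": 10, "N": 0, "O": 4} in candidates)   # True
print({"C": 1, "H": 134, "N": 0, "O": 0} in candidates)  # False
```

Each surviving formula would then serve as input to the structure generator, possibly together with further restrictions derived from the spectrum.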
Further, using the tools developed by Varmuza [27, 28], MOLGEN-MS is able to identify from the mass spectroscopic peak patterns substructures that are either present or absent.
As to step iii), MS simulation is presented in [29], and first results on the quality of structure ranking according to MS fit are reported in [30]. These procedures are also incorporated in MOLGEN-MS.

6 The Geometric Level

Molecules live in 3D space, and so the final aim is to construct all distinct stereoisomers (configurations) corresponding to a given constitutional formula. Unfortunately, a stereo generator able to automatically construct stereoisomers efficiently, exhaustively, and free of redundancy is not yet available. At hand are energy models such as Allinger's molecular mechanics programs ([31 - 33]), which allow one to find some local minima of the particular energy function, corresponding to some conformers. Other software packages such as Gasteiger's CORINA [34] arrive at similar results by a different procedure. However, these packages do not systematically find all conformers; there is not even a guarantee that the conformer of lowest energy is found in every case. More importantly, the chemist is often interested not in the conformers but in the stereoisomers, as said above. Work on this problem is going on in this laboratory.

Financial support by the BMBF under contract 03C0318C is gratefully acknowledged.


[ 1] W. Werther, Versuch einer Systematik der Reaktionsmöglichkeiten in der Elektronenstoss-Massenspektrometrie (EI-MS), unpublished (1996).
[ 2] J. Dugundji and I. Ugi, An Algebraic Model of Constitutional Chemistry as a Basis for Chemical Computer Programs, Top. Curr. Chem., 39, 19-64 (1973).
[ 3] C. Benecke, R. Grund, R. Hohberger, R. Laue, A. Kerber, and T. Wieland, MOLGEN+, a Generator of Connectivity Isomers and Stereoisomers for Molecular Structure Elucidation, Anal. Chim. Acta, 314, 141-147 (1995).
[ 4] T. Grüner, A. Kerber, R. Laue, and M. Meringer, MOLGEN 4.0, MATCH - Commun. Math. Comput. Chem., 37, 205-208 (1998).
[ 5] G. Pólya, Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und chemische Verbindungen, Acta Mathematica, 68, 145-253 (1937).
[ 6] R. C. Read, Everyone a Winner, Annals of Discrete Mathematics, 2, 107-120 (1978).
[ 7] A. Kerber, Applied Finite Group Actions, Springer (1991).
[ 8] S. Fujita, Symmetry and Combinatorial Enumeration in Chemistry, Springer (1991).
[ 9] R. Laue, Construction of Combinatorial Objects - A Tutorial, Bayreuther Mathematische Schriften, 43, 53-96 (1993).
[10] R. Laue, T. Grüner, M. Meringer, and A. Kerber, Constraint Generation of Molecular Graphs, Graphs and Discovery, DIMACS series in Discrete Math. and Theor. Comp. Science, (in print).
[11] J. Braun, R. Gugisch, A. Kerber, R. Laue, M. Meringer, and C. Rücker, MOLGEN-CID - A Canonizer for Molecules and Graphs Accessible through the Internet, J. Chem. Inf. Comput. Sci., 44, 542-548 (2004).
[12] C. Rücker and J. Braun, UNIMOLIS - A Computer-Aided Course on Molecular Symmetry and Isomerism, MATCH - Commun. Math. Comput. Chem., 47, 173-174 (2003).
[13] T. Wieland, Mathematical Simulations in Combinatorial Chemistry, MATCH - Commun. Math. Comput. Chem., 34, 179-206 (1996).
[14] R. Gugisch, A. Kerber, R. Laue, M. Meringer, and J. Weidinger, MOLGEN-COMB, a Software Package for Combinatorial Chemistry, MATCH - Commun. Math. Comput. Chem., 41, 189-203 (2000).
[15] T. Carell, E. A. Wintner, A. Bashir-Hashemi, and J. Rebek Jr., Novel Method for Preparation of Libraries of Small Organic Molecules, Angew. Chem., Int. Ed. Engl., 33, 2059-2061 (1994).
[16] A. Kerber, R. Laue, M. Meringer, and C. Rücker, MOLGEN-QSPR, a Software Package for the Search of Quantitative Structure Property Relationships, MATCH - Commun. Math. Comput. Chem., 51, 187-204 (2004).
[17] C. Rücker, M. Meringer, and A. Kerber, QSPR Using MOLGEN-QSPR: The Example of Haloalkane Boiling Points, J. Chem. Inf. Comput. Sci., (in press).
[18] A. Kerber, R. Laue, and M. Meringer, An Application of the Structure Generator MOLGEN to Patents in Chemistry, MATCH - Commun. Math. Comput. Chem., 47, 169-172 (2003).
[19] J. M. Barnard and G. M. Downs, Use of Markush Structure Techniques to Avoid Enumeration in Diversity Analysis of Large Combinatorial Libraries, (1997).
[20] R. Laue, C. Benecke, T. Grüner, A. Kerber, and T. Wieland, Vorhersage von Molekülstrukturen mit MOLGEN, Chemie und Informatik, ed. by B. Koppenhoefer and U. Epperlein, Shaker Verlag (1997), pp. 207-218.
[21] Beilstein database BS0302PR with MDL CrossFire Commander Server-Software, Version 6.0, MDL Information Systems.
[22] NIST/EPA/NIH Mass Spectral Library, NIST '98 Version, U.S. Department of Commerce, National Institute of Standards and Technology.
[23] R. K. Lindsay, B. G. Buchanan, E. A. Feigenbaum, and J. Lederberg, Applications of Artificial Intelligence for Organic Chemistry: The DENDRAL Project, McGraw-Hill Book Company, New York (1980).
[24] C. Rücker and M. Meringer, How Many Organic Compounds are Graph-Theoretically Nonplanar?, MATCH - Commun. Math. Comput. Chem., 45, 153-172 (2002).
[25] T. Grüner, A. Kerber, R. Laue, M. Meringer, K. Varmuza, and W. Werther, MASSMOL, MATCH - Commun. Math. Comput. Chem., 38, 173-180 (1998).
[26] A. Kerber, R. Laue, M. Meringer, and K. Varmuza, MOLGEN-MS: Evaluation of Low Resolution Electron Impact Mass Spectra with MS Classification and Exhaustive Structure Generation, Advances in Mass Spectrometry, volume 15, Wiley, pp. 939-940 (2001).
[27] K. Varmuza, P. He, and K.-T. Fang, Boosting Applied to Classification of Mass Spectra, J. Data Sci., 1, 391-404 (2003).
[28] K. Varmuza and W. Werther, Mass Spectral Classifiers for Supporting Systematic Structure Elucidation, J. Chem. Inf. Comput. Sci., 36, 323-333 (1996).
[29] J. Gasteiger, W. Hanebeck, and K.-P. Schulz, Prediction of Mass Spectra from Structural Information, J. Chem. Inf. Comput. Sci., 32, 264-271 (1992).
[30] M. Meringer, Mathematische Modelle für die kombinatorische Chemie und die molekulare Strukturaufklärung, PhD thesis, University of Bayreuth, (2004) (in German).
[31] N. L. Allinger, MM2. A Hydrocarbon Force Field Utilizing V1 and V2 Torsional Terms, J. Am. Chem. Soc., 99, 8127-8134 (1977).
[32] N. L. Allinger, Y. H. Yuh, and J.-H. Lii, Molecular Mechanics. The MM3 Force Field for Hydrocarbons, 1, J. Am. Chem. Soc., 111, 8551-8566 (1989).
[33] N. L. Allinger, K. Chen, and J.-H. Lii, An improved force field (MM4) for saturated hydrocarbons, J. Comput. Chem., 17, 642-668 (1996).
[34] J. Sadowski and J. Gasteiger, From Atoms and Bonds to Three-dimensional Atomic Coordinates: Automatic Model Builders, Chem. Rev., 93, 2567-2581 (1993).