Development of an Atomistic Brownian Dynamics Algorithm with Implicit Solvent Model for Long Time Simulation

Tadashi ANDO, Toshiyuki MEGURO and Ichiro YAMATO


Return

1 Introduction

Recent progress of genomic analysis necessitates the development of prediction methods of protein structure and function [1]. Methods of protein structure prediction are classified into three main categories: homology modeling, fold recognition and ab initio folding. The first two methods use structures already solved as templates. The third method does not require such information and is based on physical chemistry in principle. In this respect, the ab initio method is useful not only to predict protein structures but also to design novel functional proteins or drugs, thus should be developed.
Molecular dynamics (MD) and Monte Carlo (MC) simulations are mainly used for the ab initio method at present. For protein structure prediction by these methods, it is necessary to develop an accurate energy function for the protein-solvent system and an efficient conformational search algorithm [2 - 5]. By the MD method, actual smooth motion of atoms is reproduced. However, since the folding of even small proteins takes more than a few milliseconds [6], it is difficult to simulate the complete folding process with explicit solvent molecules with present computer power, because the limit of time step in MD is 2 fsec. By the MC method, it is possible to obtain thermodynamic parameters without long time simulation, if efficient sampling is applied. However, the sampling efficiency is not satisfactory when this method is applied to a condensed polymer system like a protein, since random displacements of atoms in a polymer cause frequent collisions with other atoms. In this respect, new sampling methods, such as multicanonical, simulated tempering, replica exchange and so on have been developed for MD or MC simulations [7].
Stochastic dynamics simulation is another solution for conformational search efficiency; it enables long time simulation because mild instabilities of certain long-time steps are masked [8, 9]. In Langevin dynamics simulation [8], one can treat the effects of solvent as a dissipative random force. Since the motion of polypeptide in aqueous solvent is slow, the solvent damping is large and the inertial memory is lost in a very short time. The relevant approximate equation is called the Brownian equation [9]. So far, each residue in a polypeptide has been mainly represented by one or two spheres connected by virtual bonds in Brownian dynamics of proteins [10, 11]. In order to obtain detailed information on the folding process, it is necessary to use more realistic models. So far, only a few attempts have been made using the atomistic model for 20-mer polyalanine [12] and 4-mer peptide [13]. In this report, we present an atomistic Brownian dynamics (BD) algorithm with implicit solvent model for protein folding simulation. We test the effectiveness of this algorithm using 28-mer bba peptide as a model. This is the first report of the stable simulation of an actual protein using an atomistic BD algorithm with implicit solvent model.

2 Methods

2. 1 Brownian Dynamics Algorithm

By treating the effects of solvent as a dissipative random force, the Langevin equation can be expressed as

Here, ri and mi represent the mass and position of atom i, respectively. zi is a frictional coefficient and is determined by the Stokes' law, that is, zi = 6paiStokesh in which aiStokes is a Stokes radius of atom i and h is the viscosity of water (= 0.205 kcal/mol/A3.psec at 280 K). Fi is the systematic force on atom i. Ri is a random force on atom i having a zero mean <Ri(t)> = 0 and a variance <Ri(t)Rj(t)> = 2zikBTdijd(t)I, where I is 3×3 unit tensor; this derives from the effects of solvent.
For the overdamped limit (the solvent damping is large and the inertial memory is lost in a very short time), we set the left side of Eq. 1 to zero,

The integrated equation of Eq. 2 is called Brownian dynamics [9];

where h is a time step and zi is a random noise vector obtained from Gaussian distribution.

2. 2 United-Atom Model and Force Field

The AMBER91 united-atom force field [14] was used; non-polar hydrogens are included in the carbon atoms they attach, while polar hydrogens are explicitly treated. The Stokes radius of each atom is its van der Waals radius plus 1.4 A.
In this simulation, the effective energy of the system W(rN) for a protein containing N atoms with Cartesian coordinates rN = (r1, r2, r3, .....) consists of two parts, intramolecular energy Vintra(rN) given by AMBER91 force field and mean solvation energy term Gsolv(rN),

Intramolecular energy Vintra(rN) is composed of bonds, angles, dihedrals (including impropers), van der Waals, hydrogen bonding and electrostatic contributions. In the original AMBER91 force field, the hydrogen-bond term was just a function of distance. However, since there have been many successful reports for protein designing by using distance-dependent and angle-dependent functions for the hydrogen-bond term [15 - 19], we used an angle-dependent, 12-10 hydrogen-bond potential:

The angle-dependent term, F(qA-H-D,qAA-A-H), varies depending on the type of hybridized orbital of the acceptor atom:

where qA-H-D is the acceptor-hydrogen-donor angle and qAA-A-H is the base-acceptor-hydrogen angle (where the base is the atom that attaches to the acceptor).

2. 3 Solvation Energy

It has been reported that no continuum solvent model can completely account for the effect of explicit inclusion of solvent [20, 21]. But explicit solvent simulation expands the time cost enormously in order to appropriately sample the solvent configurations. Therefore, various implicit solvent models for protein simulation have been developed to represent the effect of explicit solvent; distance-dependent dielectric models [22], surface area models [23 - 26], Gaussian solvent exclusion model [27], generalized Born/surface area (GB/SA) models [28 - 31], generalized Born/analytical continuum electrostatic (GB/ACE) model [32], reference interaction site model (RISM) theory [33] and so on. In this study, we used an implicit solvent model combining two models; Distance-dependent Dielectric (DD) model and Surface Area (SA) model. DD model assumes the dielectric constant e = Crij where C is a constant value and rij is the distance between atom i and j. This model is approximating the long-range electrostatic screening effects [22]. In the SA model, the solvation energy is given by:

where SAi(rN) is the solvent accessible surface area of atom i and si is a solvation parameter depending on the atom type. In this study, we used e = 2rij and sN, O = -60 cal/mol/A2, sC = 12 cal/mol/A2 and sH, S = 0 cal/mol/A2 for DD/SA model [25]. SAi(rN) was calculated using the approximate analytical expression of Hasel et al. [34]. The solvation term was calculated at every time step of dynamics simulation.

2. 4 Bond Length Constraints and Adaptive Time Step

BD algorithm makes it possible to use longer time steps in the simulations. The LINCS algorithm [35] was used to constrain bond lengths. The time step used was 10 fsec. However, angle energy divergence occurred sometimes (especially in PRO residue) when the long time step was used. To avoid this problem, we used the adaptive time step method [13] in the simulation: A normal time step Dt was used mostly and the time step was split into n giving smaller time step Dt/n when the angle energy became larger than a predetermined threshold. In our simulations, the normal time step was 10 fsec, the threshold was 20 kcal/mol and n = 5. Small time steps were used less than 1% of the simulation time. 9 A non-bonded cut-off was used and the non-bonded pair list was updated every 10 steps.

2. 5 BD Simulation

The effectiveness of the BD algorithm with the implicit solvent model was tested for a 5 nsec simulation of designed bba fold 28-mer peptide pda8d (PDB code 1psv) [15]. The sequence is KPYTARIKGRTFSNEKELRDFLETFTGR with 304 atoms [15]. We used the reported structure as the starting structure (Figure 1). Simulation temperature was 280 K. Coordinates and energies were recorded every 10 psec during the simulation. BD simulation without solvation energy calculation (i.e. Gsolv(rN) = 0 kcal/mol and e = 1) was also performed for comparison of computation time.


Figure 1. Ribbon representations of the pda8d NMR structure [15]. (a) A view from one side, with N and C termini labeled. (b) A view of the peptide rotated by 90o with respect to (a) around the vertical axis. The figures are generated with RasMol [43].

The BD simulation program was written in Fortran. All calculations were performed on an 800 MHz Duron processor.

2. 6 MD Simulation

We also performed MD simulation in explicit water molecules and in vacuo (e = 1) to compare with the results of BD simulation with implicit solvent model. Both simulations were performed using the MD program AMBER 4.1 [36] with united-atom force field, which was the same as that used in BD simulation. The starting structures were the same as used for BD simulation. The explicit water simulation was performed in the constant NPT ensemble using the peptide (initial structure model determined by NMR) that was solvated with TIP3P water in a box extending at least 10 A in all directions, producing the system of 2,459 water molecules and total 7,681 atoms. The periodic boundary condition was used for explicit water model simulation. The system was energy minimized with standard procedure. The temperature of the system containing water molecules was maintained by coupling solute and solvent (only for explicit water model simulation), separately, to an external heat bath fixed at the reference temperature [37]. The temperature coupling constant was 0.2 psec. The pressure of the system containing water molecules was maintained by coupling the system to an external pressure bath at 1 atm with a coupling constant of 0.2 psec [37]. All covalent bonds were constrained with SHAKE algorithm [38]. The time step was 2 fsec. The 9 A non-bonded cut-off was used and the non-bonded pair list was updated every 10 steps. Simulation temperature and sampling were the same as used for BD simulation.

3 Results

3. 1 Computational Cost

Table 1 lists the time required to run 1 nsec simulation of pda8d with various solvent models on an 800 MHz Duron processor in a personal computer. It also lists the relative times compared to in vacuo (e = 1) MD simulation. in vacuo MD simulation required 1.6 (85.1/52.5) times more simulation time than the BD simulation without solvation energy. Therefore, the increase of time step from 2 fsec (MD) to 10 fsec (BD) was not so effective for time-saving. On the contrary, BD simulation with DD/SA solvent model required 53 (4,230/80.0) times less time than the MD simulation with explicit water molecules with the same cut-off radius (9 A).

Table 1. Computation Time for 1 nsec Dynamics of 28-mer bba Peptide
Solvent modelCut-off radius(A)Time(min)Relative time
Vacuum MDa985.11.0
BD without solvation energyb952.50.6
DD/SA BDb980.00.9
Explicit water MDc94,23049.7
aThe simulation was performed using the MD program AMBER with united-atom force field. Dielectric constant was 1. All covalent bonds were constrained with SHAKE algorithm. The time step was 2 fsec. Number of atoms was 304.bThe simulations were performed using the BD program developed here. All covalent bonds were constrained with LINCS algorithm. The time step was 10 fsec. Number of atoms was 304.cThe simulation was performed using the MD program AMBER with united-atom force field. The peptide was solvated using a box extending at least 10 A in all directions. All covalent bonds were constrained with SHAKE algorithm. The time step was 2 fsec. Number of atoms was 7,681.

3. 2 Dynamics

As described above, the BD simulation with DD/SA model was proved useful for long time simulation. Next we asked whether the BD simulation gave satisfactory simulation trajectories concerning the structure and dynamics. We compared the simulation results on structure and dynamics obtained by the various simulation procedures.
Figure 2 shows Ca RMS deviations of the peptide from the NMR structure as a function of time. Snapshots of each trajectory at various periods are shown in Figure 3. The DD/SA model and explicit water model simulations appeared to give relatively stable trajectories, as judged by the smaller RMS deviations compared to that of in vacuo MD simulation. By in vacuo MD simulation the structure was almost collapsed quickly and the collapsed structure was stable until 5 nsec. On the other hand, the simulations using the DD/SA solvent model and explicit solvent model gave stable trajectories of the secondary and tertiary structures of the native peptide over the 5 nsec simulation.


Figure 2. The Ca RMS deviations from NMR structure as a function of simulation time. From left to right: vacuum, DD/SA model, explicit water model.


Figure 3. Ribbon representations of the snapshots at 1 nsec (a, d, g), at 3 nsec (b, e, h), and at 5 nsec (c, f, i) in various solvent model simulations. First row is from vacuum simulation (a, b, c), second row is for DD/SA model (d, e, f) and third row is for explicit water model (g, e, f). The figures are generated with RasMol [43].

Figure 4 shows the radius of gyration (Rg) of the peptide as a function of time. The DD/SA solvent model gave stable Rgs at 9.5 A during 5 nsec simulation. The Rgs by DD/SA simulation were almost similar to those obtained by the explicit solvent simulation. in vacuo simulation gave much smaller Rgs.


Figure 4. The radii of gyration as a function of simulation time. From left to right: vacuum, DD/SA model, explicit water model.

We examined the Ca positional RMS fluctuations of the peptide during simulation using various solvent models (Figure 5). The RMS fluctuations observed in the simulation using DD/SA model were only a little smaller throughout the peptide than those observed in the simulation using the explicit solvent model, in which the fluctuations around the C terminus region of the peptide and the hairpin turn (residues 7 and 8) were specifically much larger. On the other hand, the fluctuations observed in the in vacuo simulation were very small throughout the peptide.


Figure 5. The RMS fluctuations of Ca atom positions obtained from simulations using different solvent models. The RMS fluctuation values were calculated from the average values over 2.5 nsec to 5 nsec period. From left to right: vacuum, DD/SA model, explicit water model.

4 Discussion

The BD simulation using the implicit solvent model was effective for stable simulation in two senses. 1) BD simulation enables us to use longer time step, which saves much computer time for longer simulation (i.e. efficient conformational search). In this study we used the time step of 10 fsec for 1 step in BD simulation which should be 5 times more time-saving than MD with 2 fsec for 1 step, but actually BD simulation was only 1.6 fold time-saving (Table 1). This is partly due to the intrinsic difference of dynamics algorithm and partly due to the immature optimization of our BD program, which can be improved by further refinement. 2) BD simulation with the implicit solvent model provides stable simulation similar to the MD with explicit solvent molecules (Figures 2, 4). In BD simulation, the number of atoms was 304 for pda8d (Table 1, legend), while in MD simulation with explicit water, the number was 7,681. The implicit solvent model was effective for saving computation time by decreasing the number of atoms considered and the effect is estimated as 53/1.6 33 times.
Especially the smaller radii of gyration often observed by in vacuo simulations [39], which were claimed as main artifacts, considerably improved to similar Rg values to those obtained with explicit solvent molecules. A little bit smaller atomic fluctuations observed in the BD simulation were different from those with the explicit solvent model (Figure 5). This difference may be attributed to the energy function using the mean solvation energy. In addition, the solvation parameters for DD/SA solvent model in this study have been originally parameterized for GROMOS87 force field (ionized side chains were neutralized) with e = 1 by Fraternali and Gunsteren [26]. Then, those parameters have been applied in GROMOS96 force field with e = rij [31] and almost the same ones have been used for CHARMM19 force field with e = 2rij [40 - 42]. The difference in solvation parameters between GROMOS96 and CHARMM19 force fields is only for sS, the former has 0 cal/mol/A2 and the latter 12 cal/mo/A2. However, there has been no study that used such solvation parameters for AMBER91 force field. Therefore, the original solvation parameters for SA model may not be suitable for direct usage with AMBER91 force field, which might have resulted in smaller atomic fluctuations. The parameterization of those parameters for AMBER91 force field is necessary to solve this issue.
We used LINCS algorithm [35] to constrain bond lengths instead of SHAKE algorithms [38] in our BD simulations. Both algorithm saved simulation time similarly, thus were useful. However, when we used SHAKE algorithm in BD simulation, the simulation was sometimes suspended since SHAKE did not converge for large atomic displacements, especially at high temperatures. Therefore, we implemented the more stable constraint algorithm, LINCS, in this study.
Although a simple surface area model was used for solvation energy calculation in this initial development of the atomistic Brownian dynamics method, the Generalized Born [30] or Poisson-Boltzmann solvation model can be utilized to improve accuracy of solvation energy calculation. Furthermore, the replica exchange method [7] can be combined to increase sampling efficiency in future development. The results shown in this paper indicate that BD simulation with the implicit solvent model is a promising approach that replaces MD simulation with explicit water molecules.

5 Conclusions

In this work we have developed an atomistic Brownian dynamics simulation with implicit solvent model for long time simulation. We tested it by simulating 28-mer peptide. The atomistic Brownian dynamics algorithm with implicit solvent model was applied for the first time to an actual protein having its own structure bba. The BD simulation using DD/SA solvent model was 53 times faster than the corresponding MD simulation using the explicit solvent model. The main artifacts typically encountered in in vacuo simulations, namely smaller radii of gyration and atomic fluctuations, were considerably reduced when applying the DD/SA solvent model to BD simulation. Therefore, the BD simulation with the implicit solvent model is potentially useful for studying thermodynamics and kinetics of protein folding.

6 Agreement for using the program

This atomistic Brownian dynamics program is freeware. Please contact us by e-mail when you want to use the program.

This work was supported (I. Y.) by a Grant-in-Aid for Scientific Research on Priority Area (C) Genome Information Science from the Ministry of Education, Culture, Sports, Science and Technology of Japan.

References

[ 1] D. T. Jones, Curr. Opinion Struct. Biol., 10, 371-379 (2000).
[ 2] U. H. Hansmann and Y. Okamoto, Curr. Opinion Struct. Biol., 9, 177-183 (1999).
[ 3] T. Lazaridis and M. Karplus, Curr. Opinion Struct. Biol., 10, 139-145 (2000).
[ 4] D. J. Osguthorpe, Curr. Opinion Struct. Biol., 10, 146-152 (2000).
[ 5] C. Hardin, T. V. Pogorelov and Z. Luthey-Schulten, Curr. Opinion Struct. Biol., 12, 176-181 (2002).
[ 6] B. Nolting, Protein Folding Kinetics, Springer-Verlag, Berlin (1999), 1-4.
[ 7] A. Mitsutake, Y. Sugita and Y. Okamoto, Biopolymers, 60, 96-123 (2001).
[ 8] W. F. van Gunsteren, Molecular dynamics and stochastic dynamics simulation: A primer in Computer Simulation of Biomolecular System: Theoretical and Experimental Applications Vol. 2, ed. by W. F. van Gunsteren, P. K. Weiner and A. J. Wilkinson, ESCOM Science Publishers B. V., Leiden (1993), 3-36.
[ 9] D. L. Ermak and J. A. McCammon, J. Chem. Phys., 69, 1352-1360 (1978).
[10] J. A. MaCammon, S. H. Northrup, M. Karplus and R. M. Levy, Biopolymers, 19, 2033-2045 (1980).
[11] S. Takada, Z. Luthey-Schulten and P. G. Wolynes, J. Chem. Phys., 110, 11616-11629 (1999).
[12] N. Gronbech-Jensen and S. Doniach, J. Comput. Chem., 15, 997-1012 (1994).
[13] T. Shen, C. F. Wong and J. A. McCammon, J. Am. Chem. Soc., 123, 9107-9111 (2001).
[14] S. J. Weiner, P. A. Kollman, D. A. Case, U. C. Singh, C. Ghio, G. Alagona, S. J. Profeta and P. Weiner, J. Am. Chem. Soc., 106, 765-784 (1984).
[15] B. I. Dahiyat, C. A. Sarisky and S. L. Mayo, J. Mol. Biol., 273, 789-796 (1997).
[16] B. I. Dahiyat, D. B. Gordon and S. L. Mayo, Protein Sci., 6, 1333-1337 (1997).
[17] B. I. Dahiyat and S. L. Mayo, Proc. Natl. Acad. Sci. USA, 94, 10172-10177 (1997).
[18] B. I. Dahiyat and S. L. Mayo, Science, 278, 82-87 (1997).
[19] S. M. Malakauskas and S. L. Mayo, Nat. Struct. Biol., 5, 470-475 (1998).
[20] S. W. Rick and B. J. Berne, J. Am. Chem. Soc., 116, 3949-3959 (1994).
[21] R. M. Levy and E. Gallicchio, Annu. Rev. Phys. Chem., 94, 10172-10177 (1997).
[22] B. R. Gelin and M. Karplus, Biochemistry, 18, 1256-1268 (1979).
[23] J. Vila, R. L. Williams, M. Vasquez and H. A. Scheraga, Proteins, 10, 199-218 (1991).
[24] L. Wesson and D. Eisenberg, Protein Sci., 1, 227-235 (1992).
[25] B. von Freyberg, T. J. Richmond and W. Braun, J. Mol. Biol., 233, 275-292 (1993).
[26] F. Fraternali and W. F. van Gunsteren, J. Mol. Biol., 256, 939-948 (1996).
[27] T. Lazaridis and M. Karplus, Proteins, 35, 133-152 (1999).
[28] W. C. Still, A. Tempczyk, R. C. Hawley and T. Hendrickson, J. Am. Chem. Soc., 112, 6127-6129 (1990).
[29] B. N. Dominy and C. L. Brooks III, J. Phys. Chem., 103, 3765-3773 (1999).
[30] Y. Liu and D. L. Beveridge, Proteins, 46, 128-146 (2002).
[31] J. Zhu, Y. Shi and H. Liu, J. Phys. Chem., 106, 4844-4853 (2002).
[32] M. Schaefer and M. Karplus, J. Phys. Chem., 100, 1578-11599 (1996).
[33] M. Kinoshita, Y. Okamoto and F. Hirata, J. Am. Chem. Soc., 120, 1855-1863 (1998).
[34] W. Hasel, T. F. Hendrickson and W. C. Still, Tetrahedron Comput. Meth., 1, 103-116 (1988).
[35] B. Hess, H. Beker, H. J. C. Berendsen and J. G. E. M. Fraaije, J. Comp. Chem., 18, 1463-1472 (1997).
[36] W. D. Cornell, P. Cieplak, C. I. Bayly, I. R. Gould, K. M. Merz Jr., D.M. Ferguson, D. C. Spellmeyer, T. Fox, J. W. Caldwell and P. A. Kollman, J. Am. Chem. Soc, 117, 5179-5197 (1995).
[37] H. J. C. Berendsen, J. P. M. Postma, W. F. van Gunsteren, A. DiNola and J. R. Haak, J. Chem. Phys., 81, 3684-3690 (1984).
[38] J. P. Ryckaert, G. Ciccotti and H. J. C. Berendsen, J. Comput. Phys., 23, 327-341 (1977).
[39] C. L. Brooks III, M. Karplus and B.M. Pettitt, Proteins: a theoretical perspective of dynamics, structure, and thermodynamics, John Wiley & Sons, Inc., New York (1988), 137-174.
[40] P. Ferrara and A. Caflisch, Proc. Natl. Acad. Sci. USA, 97, 10780-10785 (2000).
[41] P. Ferrara, J. Apostolakis and A. Caflisch, Proteins, 39, 252-260 (2000).
[42] P. Ferrara, J. Apostolakis and A. Caflisch, Proteins, 46, 24-33 (2002).
[43] R. A. Sayle and E. J. Milner-White, Trends Biochem. Sci., 20, 374-376 (1995).


Return