Table 1. Configuration of the PC cluster.
|CPU||8 CPUs, Pentium 4 2.4 GHz, 512 KB cache|
|Memory||1 GB DDR266 per board|
|Network||1000BASE-T Ethernet|
|Hard Disk||60 GB, 5400 rpm|
|Operating System||Linux kernel 2.4 / Red Hat 8.0|
|Fortran Compiler||GNU Fortran 77|
|Parallel Library||MPICH ver. 1.2.4|
The computational speed is measured for a series of minor tranquilizers with benzodiazepine and thienodiazepine frameworks: flutoprazepam (1, C19H16ClFN2O), triazolam (2, C17H12Cl2N4), clotiazepam (3, C16H15ClN2OS), etizolam (4, C17H16ClN4S), flutazolam (5, C19H18ClFN2O3), and lorazepam (6, C15H10Cl2N2O2) (Scheme 1). The 3-21G basis set is used throughout this study. For each molecule, we run a single SCF and gradient calculation ten times and take the fastest time among the ten runs.
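The best-of-ten timing protocol described above can be sketched as follows. This is a minimal illustration, not the script used in the study: `best_of` and `dummy_scf_and_gradient` are hypothetical names, and the dummy workload only stands in for an actual SCF and gradient job. The sketch takes the minimum of the wall clock and CPU times independently, which is one possible reading of "the fastest time".

```python
import time


def best_of(run, repeats=10):
    """Call run() `repeats` times; return the fastest (wall, cpu) times in seconds."""
    best_wall = float("inf")
    best_cpu = float("inf")
    for _ in range(repeats):
        wall_start = time.perf_counter()   # wall clock time
        cpu_start = time.process_time()    # user + system CPU time of this process
        run()
        cpu = time.process_time() - cpu_start
        wall = time.perf_counter() - wall_start
        best_wall = min(best_wall, wall)
        best_cpu = min(best_cpu, cpu)
    return best_wall, best_cpu


def dummy_scf_and_gradient():
    # Hypothetical stand-in for one SCF + gradient calculation.
    sum(i * i for i in range(20000))


if __name__ == "__main__":
    wall, cpu = best_of(dummy_scf_and_gradient, repeats=10)
    print(f"fastest wall clock time: {wall:.6f} s, fastest CPU time: {cpu:.6f} s")
```

Note that `time.process_time` covers only the calling process, so timing an external program or an MPI job this way would need the wall clock measurement or the job's own accounting instead.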
Table 2. Molecular formula, number of atoms, number of basis functions, and amount of two-electron integrals for each molecule.
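Two of the quantities listed in Table 2 follow directly from the molecular formula. The sketch below counts atoms and 3-21G basis functions, using the standard 3-21G sizes of 2 functions for H, 9 for first-row atoms, and 13 for second-row atoms. It also gives the upper bound on the number of symmetry-unique two-electron integrals for n basis functions, m(m+1)/2 with m = n(n+1)/2. The function names are illustrative assumptions. The upper bound is not the tabulated amount, because the calculations store only integrals above the cutoff.

```python
import re

# Number of 3-21G basis functions per atom:
# H: two s functions; C, N, O, F: 1s + two sp shells = 9;
# S, Cl: 1s + 2sp + two valence sp shells = 13.
BASIS_3_21G = {"H": 2, "C": 9, "N": 9, "O": 9, "F": 9, "S": 13, "Cl": 13}


def parse_formula(formula):
    """Turn a formula such as 'C15H10Cl2N2O2' into {'C': 15, 'H': 10, ...}."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def n_atoms(formula):
    return sum(parse_formula(formula).values())


def n_basis(formula):
    return sum(BASIS_3_21G[s] * n for s, n in parse_formula(formula).items())


def max_unique_integrals(n):
    """Upper bound on unique (rs|tu) using the 8-fold permutational symmetry."""
    pairs = n * (n + 1) // 2
    return pairs * (pairs + 1) // 2


if __name__ == "__main__":
    formula = "C15H10Cl2N2O2"  # lorazepam, as given in the text
    n = n_basis(formula)
    print(formula, n_atoms(formula), n, max_unique_integrals(n))
```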
Table 3 shows the CPU and wall clock times for the P and NOP methods. In all cases for N=1, the wall clock time of the NOP method is clearly shorter, at 0.35-0.65 times that of the P method. Furthermore, the sum of CPU and system time is also shorter for the NOP method, except for flutazolam, for which the two methods give almost the same result.
Table 3. CPU, system, and wall clock times of the P and NOP methods. N is the number of CPUs.
For all molecules, the difference in wall clock time becomes smaller as the number of CPUs increases. Once all two-electron integrals are buffered in main memory, the wall clock time of the P method decreases more quickly than that of the NOP method, which narrows the difference between the two methods. For lorazepam, the smallest calculation in the present work, the difference between the two methods disappears when 4 CPUs are used. For the other molecules, the difference disappears when 8 CPUs are used, again except for flutazolam. For flutazolam, the wall clock time of the NOP method is still 0.59 times that of the P method even with 8 CPUs, so the difference does not disappear within the present study. However, we infer from Table 3 that the difference between the two methods will vanish as the number of CPUs increases further.
Table 4. Numbers of two-electron integrals for the NOP and P methods.
The difference in the wall clock times between the two methods is caused by the difference in the size of the two-electron integral files. We usually handle only those two-electron integrals that are larger than a certain threshold value (10^-8 in the present study). Table 4 shows the numbers of two-electron integrals used by the P and NOP methods in the present calculations. It is worth noting that the number for the P method is almost twice that for the NOP method. From the definition of the P method, I_rstu has a non-negligible value if the integral <rs|tu> is smaller than the cutoff value but either <rt|su> or <ru|st> is larger than the cutoff value. As a result, the number of two-electron integrals increases in the P method. In calculations on relatively large molecules such as the present ones, almost all of the two-electron integrals are below the cutoff value, and this increase in the number of stored integrals becomes significant. In Table 3, the system time of the P method is always larger than that of the NOP method, which indicates that the overhead of file I/O is larger for the P method.

For a smaller molecule, this is not true because a large fraction of the two-electron integrals have values larger than the cutoff threshold. The time required by the P method is therefore smaller than that required by the NOP method. Table 5 shows the results for the C2F6 molecule as an example of a small molecule. The number of two-electron integrals in the NOP method is 0.87 times that in the P method when the 6-31G** basis set is used, and 0.76 times when the 3-21G basis set is used. With both basis sets, the calculations finish faster with the P method. It should also be noted that the degree of acceleration is larger in the 6-31G** case than in the 3-21G case, as is easily seen from the NOP/P ratio of the number of two-electron integrals.
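The effect of the cutoff on the two storage schemes can be illustrated with a small sketch. It assumes the usual form of Raffenetti's combined element, (rs|tu) - [(rt|su) + (ru|st)]/4. The integral values are made up for illustration and are not taken from the present calculations, and `p_element` and `is_stored` are hypothetical helper names. The sketch shows only the per-element effect and does not model the full index bookkeeping of either method.

```python
CUTOFF = 1e-8  # threshold value quoted in the text


def p_element(rs_tu, rt_su, ru_st):
    """Combined element, assuming the form (rs|tu) - ((rt|su) + (ru|st)) / 4."""
    return rs_tu - 0.25 * (rt_su + ru_st)


def is_stored(value, cutoff=CUTOFF):
    """An element is written to file only if its magnitude exceeds the cutoff."""
    return abs(value) > cutoff


# Made-up triples ((rs|tu), (rt|su), (ru|st)), for illustration only.
samples = [
    (2.0e-3, 5.0e-4, 1.0e-4),     # all large: stored in both schemes
    (3.0e-10, 4.0e-6, 0.0),       # (rs|tu) negligible, (rt|su) is not
    (1.0e-11, 2.0e-12, 7.0e-7),   # (rs|tu) negligible, (ru|st) is not
    (5.0e-12, 1.0e-12, 3.0e-13),  # all negligible: dropped in both schemes
]

# Number of index quadruples (r, s, t, u) kept when screening (rs|tu) alone.
nop_stored = sum(is_stored(rs_tu) for rs_tu, _, _ in samples)
# Number kept when screening the combined element instead.
p_stored = sum(is_stored(p_element(*triple)) for triple in samples)

if __name__ == "__main__":
    print("stored with (rs|tu) screening:", nop_stored)
    print("stored with combined-element screening:", p_stored)
```

In this toy set, the second and third entries are kept only in the combined scheme, because one of the exchange-type integrals is above the cutoff even though (rs|tu) itself is not.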
Table 5. Computational times and numbers of two-electron integrals for the C2F6 molecule. N is the number of CPUs.
|Number of Integrals||20868299||23940759||Ratio||0.87|
|Number of Integrals||2124399||277210||Ratio||0.76|
In the present paper, we have studied the CPU time and the wall clock time required for ab initio Hartree-Fock molecular orbital calculations with and without Raffenetti's P supermatrix algorithm in a parallel environment on a PC cluster. As realistic examples, six different minor-tranquilizer drug molecules were calculated with the 3-21G basis set. In almost all of these large calculations, the P method is not faster than the NOP method, whereas for a small molecule such as C2F6 it is faster. We therefore conclude that the P method is not always the faster choice. For large-scale calculations, we suggest performing a test calculation beforehand to confirm which method is faster.
We are grateful to Dr. Kazumasa Shinjo and Dr. Shinsuke Shimogawa of ATR Adaptive Communication Laboratories for stimulating discussions and suggestions. This work was partially supported by Telecommunication Advancement Organization of Japan (TAO).