Development of XyMTeX2PS for the PostScript Typesetting of Chemical Documents Containing Structural Formulas

Shinsaku FUJITA


1 Introduction

To typeset chemical documents containing structural formulas within the TeX/LaTeX processing environment, we developed and distributed the XyMTeX system (Version 1.00) in 1993 [1], where the LaTeX picture environment was used as a tool for drawing. Thereafter, the XyMTeX Version 2.00 (1998) supported the XyM Notation which we proposed as a linear notation of structural formulas [2, 3]. The XyMTeX Version 3.00 (2000) supported the size reduction of structural formulas, which expanded the scope of the XyMTeX system [4].
Up to Version 3.00, we laid stress on portability within the scope of TeX/LaTeX, where the XyMTeX system was designed to depend on the LaTeX picture environment [5, 6] and the epic package [7].
The advance of information technology, however, has provided another approach in which the portability within the scope of TeX/LaTeX is no longer a prerequisite. In particular, the Internet system based on HTML (Hypertext Markup Language) and XML (Extensible Markup Language) has been widely accepted during the 1990s and the present decade.
To catch up with the spread of the Internet system, we have developed XyMJava [8], XyMML (XyM Markup Language) [9], and XyM-XLST [10], as shown in Figure 1. These systems, as well as XyMTeX, are based on the XyM Notation, which works as a key technique in the background of the total system shown in Figure 1 [2]. Thereby, the total system (Figure 1) covers a traditional field (printing) and a new field (Internet communication), both of which are concerned with chemical documentation.

Figure 1. XyMTeX as part of a communication system of chemical documents. The acronym "DVI" means a "device-independent" file produced by TeX/LaTeX processing.

During the last decade (1990s), on the other hand, desktop publishing (DTP) based on the PostScript language has emerged as an alternative methodology that covers both conventional publishing systems and Internet communication systems. In particular, PDF (Portable Document Format), which is based on PostScript, has attracted keen attention because it offers a sound method for covering both.
The TeX/LaTeX typesetting system has been influenced by this trend in DTP. Particularly in treating graphic data [11], the TeX/LaTeX system is now recognized as a programming language for producing PostScript code in place of TeX's original DVI (device-independent) code. This usage of the TeX/LaTeX system stems from the \special function, which was originally provided to accept such graphic data [12]. On the basis of the \special function, versatile tools such as PSTricks [13] have been developed to output PostScript code. Moreover, tools for translating DVI code (including the code due to \special) into PostScript code and tools for converting PostScript code into PDF have become easily available.

Figure 2. XyMTeX2PS as an alternative methodology in a communication system of chemical documents

In contrast to the general status of the TeX/LaTeX typesetting system, chemical documentation by TeX/LaTeX has lagged behind this trend in DTP, because XyMTeX (up to Version 3.00) has not fully utilized the graphic utilities of PostScript and PDF. Hence, the XyMTeX system should be improved to be compatible with PostScript and PDF, so that the PDF technique can be applied to the two fields of chemical documentation (i.e., printing and Internet communication).
As clarified in the preceding paragraphs, the present article aims at improving the XyMTeX system to be compatible with PostScript and PDF. Thereby, the improved XyMTeX system (XyMTeX2PS) will cover the two fields of chemical documentation (i.e. printing and Internet communication), as shown in Figure 2. A subsidiary aim is to develop more elaborate stereochemical expressions such as wedged bonds, because the improvement permits us to be free from the restriction of the LaTeX picture environment.

2 Package Files of XyMTeX2PS

The XyMTeX2PS system, which has been distributed in the name of XyMTeX Version 4.02 to emphasize the succession to the previous versions of XyMTeX, consists of the package files listed in Table 1 [14]. Among them, the two packages, xymtx-ps and chmst-ps, have been developed for PostScript printing. Macros for PostScript printing are contained in xymtx-ps. They are substituted for several drawing macros contained in the chemstr package. The chmst-ps package for PostScript printing corresponds to the chemist package for the LaTeX picture environment. Moreover, the package aliphat.sty of the XyMTeX2PS system is enhanced to draw wedged bonds for stereochemistry.

3 Use of XyMTeX2PS

The XyMTeX2PS system works in two modes:
  1. TeX/LaTeX-compatible mode: When the utility package xymtex.sty is input in the TeX document file shown below, all of the package files of the XyMTeX system except xymtx-ps.sty are loaded. This mode draws β-bonds as thick lines and α-bonds as dotted lines.
    %Any code cited in this article 
    %is inserted here. 

    To reduce formula sizes, epic.sty is automatically loaded.
  2. PostScript-compatible mode: When the utility package xymtexps.sty is input in the TeX document file shown below, all of the package files of the XyMTeX system (including xymtx-ps.sty) are loaded. This mode draws β/α-bonds in one of three formats: a pair of wedged bonds/hashed dash bonds (the default, or on the declaration of \wedgehasheddash), a pair of wedged bonds/hashed wedged bonds (\wedgehashedwedge), or a pair of dash bonds/hashed dash bonds (\dashhasheddash).
    %Any code cited in this article 
    %is inserted here.

After these TeX files are compiled by the TeX system, the resulting DVI files are converted into PostScript files (by means of a converter such as dvips), which can be browsed and printed by PostScript tools (e.g., Ghostscript). Further, the PostScript files can be converted into PDF files (by means of a converter such as Adobe Acrobat Distiller), which can be browsed and printed by such tools as Adobe Acrobat Reader.
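As a minimal sketch of the PostScript-compatible mode and the processing chain described above, a document file might look like the following. The file name sample.tex, the test formula, and the use of ps2pdf (the Ghostscript converter, named here as a stand-in for Acrobat Distiller) are assumptions made for illustration and are not taken from the XyMTeX2PS distribution.

```latex
% sample.tex (hypothetical file name)
% Processing chain, assuming a standard TeX installation with Ghostscript:
%   latex sample.tex              -> sample.dvi
%   dvips sample.dvi -o sample.ps -> sample.ps
%   ps2pdf sample.ps              -> sample.pdf
\documentclass{article}
\usepackage{xymtexps}   % PostScript-compatible mode;
                        % replace by \usepackage{xymtex} for the
                        % TeX/LaTeX-compatible mode
\begin{document}
% Any code cited in this article is inserted here, for example
% (illustrative formula, not taken from the article):
\cyclohexanev{1B==OH;2A==CH$_{3}$}
\end{document}
```

Replacing the single \usepackage line is intended to be the only change needed to switch between the two modes; files that use only xymtex.sty do not require a PostScript driver.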

Table 1. Package Files of XyMTeX2PS and Related Files
package name     included functions

XyMTeX Files
chemstr.sty      basic commands for atom- and bond-typesetting
hetarom.sty      macros for drawing vertical types of carbocyclic and heterocyclic compounds
hetaromh.sty     macros for drawing horizontal types of carbocyclic and heterocyclic compounds
carom.sty        macros for drawing vertical and horizontal types of carbocyclic compounds
lowcycle.sty     macros for drawing five-or-less-membered carbocycles
ccycle.sty       macros for drawing bicyclic compounds etc.
hcycle.sty       macros for drawing pyranose and furanose derivatives
aliphat.sty      macros for drawing aliphatic compounds
locant.sty       commands for printing locant numbers
polymers.sty     commands for drawing polymers
fusering.sty     commands for drawing units for ring fusion
methylen.sty     commands for drawing zigzag polymethylene chains
sizeredc.sty     commands for size reduction

XyMTeX2PS File
xymtx-ps.sty     macros for PostScript printing (XyMTeX2PS, XyMTeX Version 4.02); these macros are substituted for several macros contained in the chemstr package

XyMTeX Utilities
xymtex.sty       a package for calling all package files except xymtx-ps.sty (no PostScript)
xymtexps.sty     a package for calling all package files (PostScript, i.e., with xymtx-ps.sty)

Related Files
chemist.sty      commands for using `chem' version and chemical environments
chmst-ps.sty     macros for PostScript printing; these macros are substituted for several macros contained in the chemist package

4 PostScript-Compatible Mode vs. TeX/LaTeX-Compatible Mode

4. 1 Wedged Bonds for Stereochemistry

Three profiles of the PostScript-compatible mode are summarized in Figure 3, which also contains structural formulas by the TeX/LaTeX-compatible mode for comparison. Figure 3 is obtained by the following codes:
PostScript-compatible mode \\
(wedge and hashed dash): \\
\CompareSample \\
PostScript-compatible mode \\
(wedge and hashed wedge): \\ 
\CompareSample \\
PostScript-compatible mode \\
(dash and hashed dash): \\
\CompareSample \\
\TeX/\LaTeX{}-compatible mode: \\
\CompareSample \\

Figure 3. Comparison between PostScript-compatible mode and TeX/LaTeX-compatible mode.

Each row of Figure 3 contains three formulas drawn by the same mode, which are different in size (unit lengths: 0.1pt, 0.08pt, and 0.06pt). As for the PostScript-compatible mode, the switch \wedgehasheddash or a default condition produces a pair of wedged bonds/hashed dash bonds (the first row), the switch \wedgehashedwedge produces a pair of wedged bonds/hashed wedged bonds (the second row), and the switch \dashhasheddash generates a pair of dash bonds/hashed dash bonds (the third row).
By means of the sizeredc package (distributed after Version 3.00), the original LaTeX picture environment can be used by a switching declaration \reducedsizepicture in order to reduce the sizes of formulas, as shown in the bottom row of Figure 3.
According to "Basic Terminology of Stereochemistry" of IUPAC Recommendations 1996 [15], a bond from an atom in the plane of drawing to an atom above the plane (i.e., a so-called β-bond) is shown with a bold wedge, which starts from the atom in the plane at the narrow end of the wedge; and a bond below the plane (i.e., a so-called α-bond) is shown with a hashed bold dash (short parallel lines). Hence, the combination of wedges and hashed dashes is selected as the default setting for XyMTeX2PS.
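The switching declarations described in this subsection might be applied as in the following sketch. It assumes that xymtexps.sty is loaded, that \CompareSample is the sample macro used for Figure 3 (its definition is not reproduced here), and that each declaration acts locally inside a TeX group; the grouping braces are an assumption and are not confirmed by the text above.

```latex
% Sketch only: one declaration per group, so that each setting is
% (assumed to be) confined to the formulas that follow it.
\noindent PostScript-compatible mode (wedge and hashed dash):\par
{\wedgehasheddash   \CompareSample}\par
\noindent PostScript-compatible mode (wedge and hashed wedge):\par
{\wedgehashedwedge  \CompareSample}\par
\noindent PostScript-compatible mode (dash and hashed dash):\par
{\dashhasheddash    \CompareSample}\par
\noindent TeX/LaTeX-compatible mode:\par
{\reducedsizepicture \CompareSample}\par
```

Because wedge and hashed dash is the default, the first declaration could presumably be omitted without changing the first row.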

4. 2 Techniques for Switching Modes

The LaTeX picture environment, which has been adopted in the TeX/LaTeX-compatible mode of XyMTeX2PS, draws a long line by shifting and joining several line fonts of fixed length, while it draws a short line by using the epic package. The joint of two lines may become visible under unfavorable conditions. For example, each sloped bond of the first and second six-membered rings depicted in the bottom row of Figure 3 may appear split into two jointed lines at a low resolution of a CRT display.
On the other hand, PostScript, which has been adopted in the PostScript-compatible mode of XyMTeX2PS (e.g., the first, second, and third rows of Figure 3), can draw a line of high printing quality. Moreover, polygonal shapes can be drawn easily, which makes it possible to generate wedged bonds etc.
To illustrate the mechanism for switching the modes, the definition of the \Put@@@Line command for drawing bonds is cited from the package xymtx-ps.sty of the XyMTeX2PS:
\ifnum#3>0\relax \@tempcntXa=#5\relax
\advance\@tempcntXa by#1\relax
\multiply\@tempcntYa by#4\relax
\multiply\@tempcntYa by10\relax
\divide\@tempcntYa by#3\relax
\divide\@tempcntYa by10\relax\fi
\advance\@tempcntYa by#2\relax
\ifmolfront%bold dash bond for skeletal 
           %bond for pyranose etc.
\if@skbondlist%bold dash bond skeletal 
              %bond for general cases
\else%wedged bond
\@tempcntXa=0\relax \@tempcntYa=0\relax
\endgroup}%end of Put@@@Line
According to this definition, bold dash bonds in a cyclic skeleton (e.g., front bonds of \furanose) are drawn by using the PSTricks command \psline, while wedged bonds for stereochemistry are drawn as long triangles by using the PSTricks command \pspolygon. The commands \wedgehashedwedge and \wedgehasheddash, whose output is illustrated in Figure 3, set the switching flag to true by means of \@wedgeswtrue. Thereby, the conditional commands \if@wedgesw, \else, and \fi (used in the \Put@@@Line command) distinguish the case of drawing a bold dash bond from that of drawing a wedged bond.
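The switching idea can be illustrated by a small self-contained sketch. It is not the code of xymtx-ps.sty: the names \ifdemo@wedge, \demowedge, \demodash, and \DemoBond are hypothetical, the bond is restricted to a horizontal line for simplicity, and only the PSTricks commands \psline and \pspolygon are taken from the description above.

```latex
\documentclass{article}
\usepackage{pstricks}
\makeatletter
% Hypothetical switching flag, analogous in spirit to \if@wedgesw
\newif\ifdemo@wedge
\demo@wedgetrue                          % default: wedged bond
\newcommand{\demowedge}{\demo@wedgetrue} % select wedged bonds
\newcommand{\demodash}{\demo@wedgefalse} % select bold dash bonds
% \DemoBond{length}: horizontal bond starting at the origin
\newcommand{\DemoBond}[1]{%
  \ifdemo@wedge
    % wedge: filled triangle with its narrow end at the origin
    \pspolygon*(0,0)(#1,0.1)(#1,-0.1)%
  \else
    % bold dash: a single thick line
    \psline[linewidth=2pt](0,0)(#1,0)%
  \fi}
\makeatother
\begin{document}
\begin{pspicture}(0,-0.5)(2,0.5)
  \DemoBond{2}                 % drawn as a wedge (default)
\end{pspicture}
\qquad
\begin{pspicture}(0,-0.5)(2,0.5)
  \demodash\DemoBond{2}        % drawn as a bold dash
\end{pspicture}
\end{document}
```

Since PSTricks emits PostScript through \special, such a file is processed with latex and dvips. The flag set by \demodash stays local to the second pspicture environment, so later drawings return to the default.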

4. 3 Techniques for Switching Fonts

The font for drawing substituents and atoms in a default mode (the TeX/LaTeX-compatible mode or the PostScript-compatible mode) is selected by the following setting:
According to this specification, the font and its size can be changed by substituting \substfont and \substfontsize as follows:
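A change of this kind might look like the following sketch. It assumes that \substfont and \substfontsize are ordinary macros that XyMTeX expands when it typesets substituents and atoms, so that they can be redefined after the package is loaded; the chosen font and size are illustrative only.

```latex
% Sketch only, placed after \usepackage{xymtexps} (or \usepackage{xymtex}).
% Assumption: both names are already defined by XyMTeX as plain macros.
\renewcommand{\substfont}{\sffamily}          % sans serif substituents
\renewcommand{\substfontsize}{\footnotesize}  % smaller substituent labels
```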

5 Drawing Complicated Formulas

5. 1 Carbocycles

Because the command \steroidchain for drawing steroids is included in XyMTeX2PS, cholesterol (cholest-5-en-3β-ol) can be drawn by the following code:
where the \lmoiety command is used to change the direction for printing out the methyl substituent. Thereby, we can obtain the following diagram:

where wedged bonds and hashed dash bonds are depicted by virtue of the functions of XyMTeX2PS.

5. 2 Heterocycles

As examples of drawing heterocycles, the following codes:
draw a fused heterocycle and α-D-ribofuranose-5-phosphoric acid as follows:

In the above codes, the XyMcompd environment defined by the chemist package is used to fix the drawing domain of each formula as well as to attach its ID number with a reference key. To show the domain, a frame surrounding the formula is depicted by means of the \fbox command. Each ID number can be referred to by the reference key declared in the argument of the XyMcompd environment. For example, the declaration \cref{cpd:1} gives the ID number (1), because \cref is also defined in the chemist package.
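The numbering and cross-referencing described here might be used as in the following sketch. The argument order of XyMcompd, written here as (width,height)(x-shift,y-shift){key}{}, all numeric values, the key cpd:demo, and the test formula are assumptions for illustration and should be checked against the chemist manual.

```latex
% Sketch only. Assumptions: XyMTeX and chemist.sty (or chmst-ps.sty)
% are loaded; the argument layout of XyMcompd is as stated in the text
% preceding this listing and is not confirmed by the article.
\begin{XyMcompd}(600,700)(250,200){cpd:demo}{}
\cyclohexanev{1B==OH}
\end{XyMcompd}

Compound \cref{cpd:demo} is referred to by its reference key.
```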

6 Three Types of Derivation

The XyM Notation, which the XyMTeX2PS system uses as commands, contains three types of derivation, i.e., "substitution derivation" for nested substitution, "atom derivation" for generating spiro compounds, and "bond derivation" for ring fusion [2]. Rather than explaining them in general terms, the present section illustrates them with concrete examples.

6. 1 Nested Substitution

A skeleton can be changed into a substituent by declaring a function (yl) in the substitution list of the corresponding command. To draw the structural formula of ribavirin, for example, the function (yl) is declared in the command \fiveheterov. The resulting substituent is attached to a furanose skeleton by declaring it in the substitution list of the command \furanose. This technique has been named "substitution derivation" [2].

Duplicated nesting is permissible. For example, the following code for drawing a structural formula of adonitoxin contains a substituent derived from a steroid skeleton, which, in turn, contains a five-membered substituent. First, the function (yl) is declared in the command \fiveheterov.
The substituent is attached to a steroid skeleton by declaring it in the command \steroid. The resulting steroid skeleton is further converted into a substituent by declaring a (yl) function in the command \steroid. The steroidal substituent is attached to a pyranose skeleton. Thereby, the nested code typesets the following formula:

Further nesting enables us to draw a complicated formula of a photographic coupler. Thus, the code:
in which nine (yl) functions are declared, typesets the following structure:

The variable-bond-length function, which is supported by redefining the command \tetrahedral in XyMTeX2PS (after XyMTeX Version 4.01), provides an elegant solution for drawing compounds of this type. The vertical bond between the carbon on the main chain and the nitrogen on the five-membered ring is a prolonged bond depicted by the code <,,250,>.

6. 2 Spiro Compounds

A substituent generated by a (yl) function can be declared in an atom list of a skeleton so as to generate the formula of a spiro compound. This technique has been named "atom derivation" [2]. For example, illudin S (an anti-tumor antibiotic substance) is drawn in this way as follows:

where the three-membered spiro ring is drawn by the atom derivation.

6. 3 Ring Fusion

Fused rings can be drawn on the basis of so-called "bond derivation" [2], where a fusing unit such as the command \fivefusevi is used, as exemplified by the drawing of Penicillin V:

In this drawing, a five-membered ring created by the command \fiveheterovi is attached to a four-membered ring created by the command \fourhetero in such a manner that they share an edge designated by the letter b (\fourhetero) and the letter d (\fiveheterovi). The command \lyl generates a phenyl substituent with a linking group (OCH2CONH).

7 Drawing Tetrahedral Molecules with Wedged Bonds

7. 1 Configurations

Tetrahedral molecules with wedged bonds can be drawn by using such commands as \tetrahedral, \utetrahedralS, and \UtetrahedralS so as to show their absolute configurations.

Fischer projection diagrams are used to show the absolute configurations of sugars. They can be depicted by using the command \tetrahedral in a multiply nested fashion. For example, the codes:
typeset the Fischer projection of D-glucose and its wedge-form representation as follows:

7. 2 Conformations

An eclipsed conformer and a staggered one are drawn by the codes:
which generate the following formulas:

7. 3 Reaction Schemes

Reaction schemes containing tetrahedral molecules with wedged bonds can be drawn by using such commands as \ltetrahedralS and \dtrigpyramid. For example, a Walden inversion reaction is drawn by the following code:
HO^{-} & + & 
\reactrarrow{0pt}{1cm}{}{} \qquad 
3B==C$_{2}$H$_{5}$}} \nonumber \\
&& \qquad\reactrarrow{0pt}{1cm}{}{} 
+ Cl^{-} 

Note that eq. (1) is drawn by the chemeqnarray environment defined by the chemist package. The \reactrarrow command for drawing reaction arrows is also defined by the chemist package.
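The combination of chemeqnarray and \reactrarrow can be illustrated by a reduced sketch in which the species are written as plain text formulas instead of XyMTeX structures. The reaction shown (substitution of chloromethane by hydroxide) is an illustrative stand-in, and the assumption is that chemeqnarray follows the three-column layout of the standard eqnarray environment; the four arguments of \reactrarrow are copied from the usage in the code above.

```latex
% Sketch only. Assumptions: chemist.sty (or chmst-ps.sty) is loaded and
% chemeqnarray accepts the same column structure as eqnarray.
\begin{chemeqnarray}
\mbox{HO}^{-} & + & \mbox{CH$_{3}$Cl}
  \reactrarrow{0pt}{1cm}{}{} \mbox{CH$_{3}$OH} + \mbox{Cl}^{-}
\end{chemeqnarray}
```

In the full example, each \mbox would be replaced by a structure drawn with a command such as \ltetrahedralS or \dtrigpyramid.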

8 Wedged Skeletal Bonds

Although the diagram of a furanose depicted above (e.g., ribavirin) is of sufficient quality to be printed, one may require a more sophisticated format in which the three front bonds are expressed by the combination of wedge-dash-wedge. This type of format can be drawn by using the command \WedgeAsSubst as well as the PSTricks command \psline.
To simplify an input code, a tentative macro named \myfuranose is defined as follows:
Thereby, the formula of ribavirin can be typeset by writing a more simplified code:

9 Application to Publication

To exemplify the versatility of the XyMTeX2PS system, I should refer to a monograph entitled "Organic Chemistry of Photography" which I have recently published [16]. In fact, all of the structural formulas contained in this monograph have been typeset by using the XyMTeX2PS system. For example, a cyan dye releaser for instant color photography, the development of which I was engaged in as one of the inventors [17], has been typeset by the following code:
4==\ryl(0==N\dbond N){%
\end{XyMcompd} \\

10 Conclusion

The XyMTeX2PS system, which has been developed and distributed in the name "XyMTeX Version 4.02", is capable of typesetting chemical documents containing structural formulas so as to give PostScript files of high quality. By converting the resulting files into PDF files, the XyMTeX2PS system covers Internet communication as well as traditional printing. Because it is free from the restriction of the LaTeX picture environment, the XyMTeX2PS system is capable of giving more elaborate stereochemical expressions such as wedged bonds.

This work was supported in part by the Japan Society for the Promotion of Science: Grant-in-Aid for Scientific Research B(2) (No. 14380178, 2002-2004).


[ 1] Fujita, S., Comput. Chem., 18, 109 (1994).
Fujita, S., XyMTeX-Typesetting Chemical Structural Formulas, Addison-Wesley, Tokyo (1997).
The word "XyM" is the uppercase form of χυμ, which is the Greek counterpart of the stem "chem" of the word "chemistry".
[ 2] Fujita, S., Tanaka, N., J. Chem. Inf. Comput. Sci., 39, 903 (1999).
[ 3] Fujita, S., Tanaka, N., TUGboat, 21(1), 7 (2000).
[ 4] Fujita, S., Tanaka, N., TUGboat, 22(4), 285 (2001).
[ 5] Lamport, L., LaTeX. A Document Preparation System, 2nd ed. for LaTeX2e, Addison-Wesley, Reading (1994).
Lamport, L., LaTeX. A Document Preparation System, Addison-Wesley, Reading (1986).
[ 6] Goossens, M., Mittelbach, F., Samarin, A., The LaTeX Companion, Addison-Wesley, Reading (1994).
[ 7] For epic macros, see Podar S., "Enhancements to the picture environment of LaTeX", Manual for Version 1.2 dated July 14, 1986.
[ 8] Tanaka, N., Fujita, S., J. Computer Aided Chem., 3, 37-47 (2002).
[ 9] Fujita, S., J. Chem. Inf. Comput. Sci., 39, 915-927 (1999).
[10] Tanaka, N., Ishimaru, T., Fujita, S., J. Computer Aided Chem., 3, 81-89 (2002).
[11] For graphic applications of TeX, LaTeX, and relevant systems, see
Goossens, M., Rahtz, S., Mittelbach, F., LaTeX Graphics Companion, Addison Wesley Longman, Reading.
[12] For the TeX system, see
Knuth D. E., The TeXbook, Addison-Wesley, Reading (1984).
[13] van Zandt, T., Girou, D., TUGboat, 15(3), 239 (1995).
[14] The XyMTeX2PS system (XyMTeX Version 4.02) can be downloaded from my homepage:
On-line manuals of XyMTeX Versions 1.01, 2.00, 3.00, 4.01, and 4.02 are also available from the same homepage.
[15] IUPAC Recommendations 1996, "Basic Terminology of Stereochemistry" (1996).
[16] Fujita, S., Organic Chemistry of Photography, Springer-Verlag, Berlin-Heidelberg (2004).
[17] Fujita, S., Koyama, K., Inagaki, Y., Waki, K., US Patent 4 336 322 (1982).