Verification of a Chemical Reaction Database
-Is It Sufficient for Practical Use in Chemical Research?

Hiroko SATOH (a)* and Tadashi NAKATA (b)

(a) Intelligent Systems Research Division, National Institute of Informatics
2-1-2 Hitotsubashi, Chiyoda, Tokyo 101-8430, Japan
(b) Synthetic Organic Chemistry Laboratory, RIKEN
2-1 Hirosawa, Wako, Saitama 351-0198, Japan

(Received: June 27, 2003; Accepted for publication: July 18, 2003; Published on Web: September 8, 2003)

The accuracy of the reaction data in a chemical reaction database is verified to determine whether the database is valid for practical use in chemical research. Reaction databases have traditionally been used for data searching, but recent approaches derive knowledge from them for synthetic design and reaction prediction systems. Design and prediction based on these approaches are becoming valid scientific technologies that could enable a new style of chemical research in which design and prediction precede experiment. Such technologies are needed to address serious environmental issues: they are expected to reduce the number of experiments, predict synthetic routes that produce no useless side-products, and design environmentally friendly catalysts and reagents to replace hazardous and toxic ones. In reaction prediction and synthetic design systems built on a reaction database, the quality and content of the database are critically important, because low quality or missing content may lead to wrong outputs. Highly accurate reaction data are essential for both database searching and knowledge derivation, so a reaction database is verified here to determine the correctness of its data. The verification uses 329 reactions sampled from the 600,000 in a commercially available database, and 151 erroneous data are found. The errors concern planar and/or stereochemical structures, the number of reaction steps, reaction schemes, reaction sites, reaction conditions, product ratios, and article information. This paper reports the results of the verification and discusses the problems that currently available reaction databases pose for practical use.

Keywords: Chemical reaction data, Database, Chemical reaction prediction, Organic synthetic design
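
The sampling figures quoted in the abstract (329 sampled reactions, 151 erroneous data) can be turned into a rough error proportion with an uncertainty range. The following is only a minimal sketch under two assumptions that the abstract does not state: that the 329 reactions were drawn at random, and that each of the 151 erroneous data corresponds to one distinct sampled reaction. The helper name `wilson_interval` and the choice of a Wilson score interval are illustrative, not the authors' method.

```python
import math


def wilson_interval(errors: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    p = errors / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Figures taken from the abstract.
SAMPLED = 329   # reactions examined
ERRORS = 151    # erroneous data found

# ASSUMPTION: one erroneous datum per sampled reaction, random sampling.
rate = ERRORS / SAMPLED
low, high = wilson_interval(ERRORS, SAMPLED)

print(f"sample error proportion: {rate:.1%}")
print(f"approximate 95% interval: {low:.1%} to {high:.1%}")
```

If a single reaction can carry several errors, the proportion of affected reactions would be lower than this sketch suggests, so the result should be read as an upper-end illustration rather than a finding of the paper.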
