(Received: November 24, 2004; Accepted for publication: February 7, 2005; Published on Web: April 1, 2005)
This paper describes a novel reduced representation of three-dimensional protein structures and its application to data mining in proteins. In this representation, a protein structure is described by a set of pseudo-atoms that correspond to its glycine residues and carry the 3D coordinates of their alpha-carbons. To evaluate the performance of this reduced representation, the authors modified the AIM program to search for 3D structural features common to three or more proteins. Three dehydrogenase proteins were converted to this representation, and the modified AIM program was used to search for their common structural features. The substructures related to the NAD binding domain were successfully identified in all three proteins. In a second trial, a reduced representation consisting of seven glycine residues from the NifH/frxC motif site of a nitrogenase (1NIPA) was used as a query to search for common substructures in 1,300 peptide chains. This search identified similar substructures located close to iron-sulfur clusters in several hybrid cluster proteins. These results show the potential applicability of the method to 3D structural data mining of proteins.
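As an illustration of the reduced representation described in the abstract, the following is a minimal sketch, not the authors' implementation and not part of the AIM program. It assumes the input is given as PDB-format `ATOM` records with the standard fixed-column layout; the names `PseudoAtom` and `glycine_filter` are hypothetical and chosen only for this example.

```python
from typing import Iterable, List, NamedTuple


class PseudoAtom(NamedTuple):
    """One glycine residue, reduced to its alpha-carbon position (hypothetical type)."""
    chain: str
    resseq: int
    x: float
    y: float
    z: float


def glycine_filter(pdb_lines: Iterable[str]) -> List[PseudoAtom]:
    """Keep only the alpha-carbons of glycine residues.

    Assumes standard PDB fixed-column ATOM records:
    atom name in columns 13-16, residue name in 18-20, chain ID in 22,
    residue number in 23-26, and x, y, z in 31-38, 39-46, 47-54.
    """
    pseudo_atoms = []
    for line in pdb_lines:
        # Only ATOM records are considered, so HETATM entries such as
        # calcium ions (also named "CA") are never picked up.
        if not line.startswith("ATOM"):
            continue
        atom_name = line[12:16].strip()
        res_name = line[17:20].strip()
        if atom_name == "CA" and res_name == "GLY":
            pseudo_atoms.append(
                PseudoAtom(
                    chain=line[21],
                    resseq=int(line[22:26]),
                    x=float(line[30:38]),
                    y=float(line[38:46]),
                    z=float(line[46:54]),
                )
            )
    return pseudo_atoms
```

A call such as `glycine_filter(open("structure.pdb"))`, where the file name is a placeholder, would return one pseudo-atom per glycine residue. How the resulting point sets are compared across proteins is left to the search program and is not sketched here.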
Keywords: Glycine filter, Protein motif, 3D common structural feature, Structural data mining, NAD binding domain, Iron-sulfur cluster
Text in Japanese