Comparative Genome Analysis Focused on Periodicity from Prokaryote to Higher Eukaryote Genomes Based on Power Spectrum

Atsushi FUKUSHIMA^a, Toshimichi IKEMURA^a,b and Shigehiko KANAYA^b,c,*

^a Department of Population Genetics, National Institute of Genetics
Yata 1111, Mishima, Shizuoka 411-8540, Japan
^b ACT-JST (Applying Advanced Computational Science and Technology, Japan Science and Technology Corp.)
Kawaguchi, Saitama-ken, 332-0012, Japan
^c Graduate School of Information Science, Nara Institute of Science and Technology
8916-5, Takayama, Ikoma, Nara 630-0101, Japan

(Received: June 16, 2003; Accepted for publication: July 31, 2003; Published on Web: September 22, 2003)

We studied periodic patterns in the nucleotide sequences of Caenorhabditis elegans, Arabidopsis thaliana, Drosophila melanogaster, Anopheles gambiae, and Homo sapiens, and characterized the sequences that confer these periodicities, using the power spectrum method and nucleotide sequence frequencies. To locate periodic regions within each genome, we examined the distribution of periodic nucleotides using a parameter, Fk. In the worm genome, we found a 68-bp periodicity in chromosome I, a 59-bp periodicity in chromosome II, and a 94-bp periodicity in chromosome III. In A. thaliana, we found three periodicities (248, 167, and 126 bp) in chromosome 3, three (174, 88, and 59 bp) in chromosome 4, and four (356, 174, 88, and 59 bp) in chromosome 5. These periodicities are associated with ORFs that encode Gly-rich amino acid sequences. In human chromosomes 21 and 22, a 167- or 84-bp periodicity was detected along the entire length of each chromosome. The 167-bp period is identical to the length of DNA that forms two complete helical turns in the nucleosome. In the insect genomes (D. melanogaster and A. gambiae), the G and C spectral curves have flat regions at middle frequencies, which may be associated with randomness of base sequence composition. This property has not yet been observed in Saccharomyces cerevisiae, C. elegans, A. thaliana, or H. sapiens.

Keywords: Periodicity, Repeat, Power spectrum analysis, Long-range correlation
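
The power spectrum method named in the abstract can be illustrated roughly as follows. This is a minimal sketch and not the authors' implementation: it assumes each base is mapped to a binary indicator sequence and uses a naive discrete Fourier transform, which is only practical for short sequences. The function names and the toy sequence are hypothetical, and the parameter Fk is not reproduced because the abstract does not define it.

```python
import cmath


def indicator(seq, base):
    """Binary indicator sequence: 1.0 where seq has `base`, else 0.0."""
    return [1.0 if ch == base else 0.0 for ch in seq.upper()]


def power_spectrum(signal):
    """Naive DFT power |X(k)|^2 / N for k = 1 .. N//2, after mean removal."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    spectrum = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(centered))
        spectrum[k] = abs(coeff) ** 2 / n
    return spectrum


def dominant_period(seq, base):
    """Period in bp of the strongest spectral peak for one base."""
    spec = power_spectrum(indicator(seq, base))
    k_max = max(spec, key=spec.get)  # frequency index with the largest power
    return len(seq) / k_max


# Toy sequence (assumption, not real genome data): a G-rich block every 10 bp.
toy = ("GGGGG" + "ATCAT") * 20  # 200 bp
print(dominant_period(toy, "G"))  # prints 10.0
```

A frequency index k in a sequence of length N corresponds to a period of N/k bp, so a peak at k = 20 in this 200-bp toy sequence indicates a 10-bp periodicity. For chromosome-scale data an FFT routine would be the usual substitute for the naive transform used here.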

