Diversity of preferred nucleotide sequences around the translation initiation codon in eukaryote genomes

So Nakagawa, Yoshihito Niimura, Takashi Gojobori, Hiroshi Tanaka*, Kin-ichiro Miura

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

151 Scopus citations


Understanding regulatory mechanisms of protein synthesis in eukaryotes is essential for the accurate annotation of genome sequences. Kozak reported that the nucleotide sequence GCCGCC(A/G)CCAUGG (AUG is the initiation codon) was frequently observed in vertebrate genes and that this 'consensus' sequence enhanced translation initiation. However, later studies using invertebrate, fungal and plant genes reported different 'consensus' sequences. In this study, we conducted extensive comparative analyses of nucleotide sequences around the initiation codon by using genomic data from 47 eukaryote species including animals, fungi, plants and protists. The analyses revealed that preferred nucleotide sequences are quite diverse among different species, but differences between patterns of nucleotide bias roughly reflect the evolutionary relationships of the species. We also found strong biases of A/G at position -3, A/C at position -2 and C at position +5 that were commonly observed in all species examined. Genes with higher expression levels showed stronger signals, suggesting that these nucleotides are responsible for the regulation of translation initiation. The diversity of preferred nucleotide sequences around the initiation codon might be explained by differences in relative contributions from two distinct patterns, GCCGCCAUG and AAAAAAAUG, which implies the presence of multiple molecular mechanisms for controlling translation initiation.
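The kind of positional bias described above can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes that each gene has already been cut to a fixed-length window with the same number of nucleotides upstream of the initiation codon, and it uses the usual convention that the A of AUG is position +1 and the nucleotide before it is position -1 (there is no position 0). The function name `positional_frequencies` and the four toy windows are hypothetical and chosen only for illustration; they are not data from the study.

```python
from collections import Counter


def positional_frequencies(windows, upstream):
    """Return {position: {nucleotide: frequency}} for aligned mRNA windows.

    windows  -- equal-length strings over A/C/G/U, aligned on the start codon
    upstream -- number of nucleotides in each window before the A of AUG
    Positions are labelled so that the A of AUG is +1 and there is no 0.
    """
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")

    freqs = {}
    for i in range(length):
        offset = i - upstream                      # 0 at the A of AUG
        label = offset + 1 if offset >= 0 else offset
        column = Counter(w[i] for w in windows)    # nucleotide counts in this column
        total = sum(column.values())
        freqs[label] = {nt: column.get(nt, 0) / total for nt in "ACGU"}
    return freqs


# Hypothetical toy windows: 9 nt upstream, AUG, then 2 nt downstream.
toy_windows = [
    "GCCGCCACCAUGGC",
    "GCCGCCGCCAUGGC",
    "AAAAAAAAAAUGGC",
    "UUUAAAAACAUGUC",
]

freqs = positional_frequencies(toy_windows, upstream=9)
for position in (-3, -2, 5):
    print(position, freqs[position])
```

On real data the same table would be built from annotated coding sequences per species, and each column would normally be compared against the genome's background nucleotide composition before calling a nucleotide "preferred"; that normalization step is omitted here.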

Original language: English (US)
Pages (from-to): 861-871
Number of pages: 11
Journal: Nucleic acids research
Issue number: 3
State: Published - Feb 2008
Externally published: Yes

ASJC Scopus subject areas

  • Genetics

