Self-Identification of Protein-Coding Regions in Microbial Genomes
Stephane Audic and Jean-Michel Claverie
Proceedings of the National Academy of Sciences of the United States of America
Vol. 95, No. 17 (Aug. 18, 1998), pp. 10026-10031
Published by: National Academy of Sciences
Stable URL: http://www.jstor.org/stable/45600
Page Count: 6
Topics: Genomes, Markov models, Datasets, Data coding, Open reading frames, Genomics, Matrices, Nucleotides, Bacterial genomes, Sequencing
A new method for predicting protein-coding regions in microbial genomic DNA sequences is presented. It uses an ab initio iterative Markov modeling procedure to automatically partition genomic sequences into three subsets shown to correspond to coding, coding on the opposite strand, and noncoding segments. In contrast to current methods, such as GENEMARK [Borodovsky, M. & McIninch, J. D. (1993) Comput. Chem. 17, 123-133], no training set or prior knowledge of the statistical properties of the studied genome is required. This new method tolerates error rates of 1-2% and can process unassembled sequences. It is thus ideal for the analysis of genome survey and/or fragmented sequence data from uncharacterized microorganisms. The method was validated on 10 complete bacterial genomes (from four major phylogenetic lineages). The results show that protein-coding regions can be identified with an accuracy of up to 90% with a totally automated and objective procedure.
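The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of iterative, training-free Markov partitioning, not the authors' procedure. It makes several assumptions that do not come from the paper: the sequence is cut into fixed-length, non-overlapping windows, the Markov order is 2, the starting partition is an arbitrary round-robin assignment, and the names `train_markov`, `score`, and `iterative_partition` are hypothetical. Each pass trains one Markov model per subset and then moves every window to the subset whose model scores it highest, stopping when no window changes subset.

```python
import math
from collections import defaultdict


def train_markov(windows, order=2, alphabet="ACGT", pseudo=1.0):
    """Estimate smoothed order-k transition log-probabilities from windows."""
    counts = defaultdict(lambda: defaultdict(float))
    for w in windows:
        for i in range(order, len(w)):
            counts[w[i - order:i]][w[i]] += 1.0

    def logp(context, base):
        # Additive (pseudo-count) smoothing so unseen transitions stay finite.
        row = counts.get(context, {})
        total = sum(row.values()) + pseudo * len(alphabet)
        return math.log((row.get(base, 0.0) + pseudo) / total)

    return logp


def score(window, logp, order=2):
    """Log-likelihood of one window under a trained model."""
    return sum(logp(window[i - order:i], window[i])
               for i in range(order, len(window)))


def iterative_partition(seq, n_classes=3, window=96, order=2, max_iter=50):
    """Split seq into windows and iteratively reassign them to n_classes subsets.

    Assumption (not from the paper): fixed non-overlapping windows and an
    arbitrary round-robin starting partition.
    """
    windows = [seq[i:i + window]
               for i in range(0, len(seq) - window + 1, window)]
    labels = [i % n_classes for i in range(len(windows))]
    for _ in range(max_iter):
        # Retrain one model per subset from its current members.
        models = [
            train_markov([w for w, l in zip(windows, labels) if l == c], order)
            for c in range(n_classes)
        ]
        # Move each window to the subset whose model explains it best.
        new_labels = [
            max(range(n_classes), key=lambda c: score(w, models[c], order))
            for w in windows
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return windows, labels
```

A call such as `iterative_partition(genome, n_classes=3)` returns the windows and one subset index per window. Which index, if any, corresponds to coding, opposite-strand coding, or noncoding sequence would still have to be worked out afterwards, and the sketch ignores reading frame entirely.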
Proceedings of the National Academy of Sciences of the United States of America © 1998 National Academy of Sciences