
Briefings in Bioinformatics 2002 3(2):181-194; doi:10.1093/bib/3.2.181

© Henry Stewart Publications

Exact mapping of prokaryotic gene starts

Mikhail V. Baytaluk
PhD student at the Institute of Molecular Biology, RAS. His research is in the area of gene recognition.

Mikhail S. Gelfand
Director for Science, Integrated Genomics, Moscow. His research interests are comparative genomics, genome annotation, analysis of regulation of gene expression and gene recognition.

Andrey A. Mironov
Director for Technology, Integrated Genomics, Moscow. His research interests are the creation of algorithms for sequence and structure alignments, software development and genome annotation.


M. S. Gelfand, Integrated Genomics, Moscow, PO Box 348, Moscow 117333, Russia. Tel: +7 (095) 135 20 41; Fax: +7 (095) 132 60 80; E-mail: gelfand{at}integratedgenomics.ru

Programs used to find genes in prokaryotic genomes reliably map protein-coding regions, but they often fail to determine gene starts exactly. The problem is aggravated by sequencing errors, most notably insertions and deletions that lead to frame-shifts. The exact mapping of gene starts and the identification of frame-shifts are therefore important problems in the computer-assisted functional analysis of newly sequenced genomes. Here we review methods of gene recognition and describe a new algorithm for correcting gene starts and identifying frame-shifts in prokaryotic genomes. The algorithm compares the nucleotide and protein sequences of homologous genes from related organisms, on the assumption that protein-coding regions evolve more slowly than non-coding regions. A dynamic programming algorithm aligns the protein sequences obtained by formal translation of the genomic nucleotide sequences, taking the possibility of frame-shifts into account. The algorithm was tested on several groups of related organisms: gamma-proteobacteria, the Bacillus/Clostridium group, and three Pyrococcus genomes. The tests showed that, depending on the genome, 1–10 per cent of genes have incorrect starts or contain frame-shifts. The algorithm is implemented in the program package Orthologator-GeneCorrector.
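
To illustrate why gene starts are ambiguous, the following minimal Python sketch performs a formal translation of a nucleotide sequence and lists the in-frame candidate start codons upstream of a given stop codon. It is not the Orthologator-GeneCorrector algorithm and does no alignment or frame-shift handling. The function names `translate` and `candidate_starts`, and the choice of ATG, GTG and TTG as start codons, are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumption for this sketch: the common prokaryotic start codons.
START_CODONS = {"ATG", "GTG", "TTG"}


def translate(dna, frame=0):
    """Formally translate dna in the given forward frame (0, 1 or 2).

    Stop codons become '*'; codons with unknown letters become 'X'.
    """
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def candidate_starts(dna, stop_index):
    """Return positions of in-frame start codons upstream of a stop codon.

    stop_index is the 0-based position of the first base of the stop codon.
    The scan walks upstream codon by codon and ends at the previous
    in-frame stop codon or at the beginning of the sequence.
    """
    dna = dna.upper()
    starts = []
    for i in range(stop_index - 3, -1, -3):
        codon = dna[i:i + 3]
        if CODON_TABLE.get(codon) == "*":
            break  # previous in-frame stop: the open reading frame ends here
        if codon in START_CODONS:
            starts.append(i)
    return sorted(starts)
```

When `candidate_starts` returns more than one position, the sequence alone does not determine the true start. This is the ambiguity that the comparison with homologous genes from related genomes is meant to resolve.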

Keywords: gene, genomics, gene recognition, reading frame, start of translation, computer analysis, prokaryotes

