Briefings in Bioinformatics Advance Access originally published online on April 21, 2009
Briefings in Bioinformatics 2009 10(5):475-489; doi:10.1093/bib/bbp022
An Ariadne's thread to the identification and annotation of noncoding RNAs in eukaryotes
Corresponding author. Giulia Soldà, Department of Biology and Genetics for Medical Sciences, University of Milano, Via Viotti 3/5, 20133 Milan, Italy. Tel: +390250315852; Fax: +390250315864; E-mail: giulia.solda{at}unimi.it
Non-protein-coding RNAs (ncRNAs) have emerged as a vast and heterogeneous portion of eukaryotic transcriptomes. Several ncRNA families, either short (<200 nucleotides, nt) or long (>200 nt), have been described and implicated in a variety of biological processes, from translation to gene expression regulation and nuclear trafficking. Other families most likely remain to be discovered. Computational methods for ncRNA research require approaches different from those normally used to predict protein-coding genes. Indeed, primary sequence alone is often insufficient to infer ncRNA functionality, whereas secondary structure and local conservation of portions of the transcript can provide useful information for both the prediction and the functional annotation of ncRNAs. Here we present an overview of the computational methods and bioinformatics resources currently available for studying ncRNA genes, introducing the common themes as well as the different approaches required for the identification and annotation of long and short ncRNAs.
Keywords: small and long noncoding RNA, gene prediction, genome annotation, bioinformatics analysis, regulatory RNA, bioinformatics programming
Submitted: December 5, 2008. Received (in revised form): March 12, 2009.