Briefings in Bioinformatics Advance Access originally published online on June 17, 2009
Briefings in Bioinformatics 2009 10(5):569-578; doi:10.1093/bib/bbp030
Architecture, function and prediction of long signal peptides
Corresponding author. Gisbert Schneider, Johann Wolfgang Goethe-University, Chair for Chem- and Bioinformatics, Siesmayerstr. 70, D-60323 Frankfurt am Main, Germany. Fax: +49 69 798 24880; E-mail: gisbert.schneider{at}modlab.de
Protein targeting in eukaryotic cells is vital for cell survival and development. N-terminal signal peptides guide proteins to the membrane of the endoplasmic reticulum (ER) and initiate translocation into the ER lumen. Here, we review the current knowledge of signal peptide architecture and prediction, with an emphasis on exceptionally long signal peptides, which often escape detection by currently available prediction methods. We benchmark publicly available prediction methods for their ability to correctly identify exceptionally long signal peptides. A set of 136 annotated eukaryotic signals served as reference data. The best prediction tool detected only 63% of them. A potential reason for this poor performance is the domain architecture of long signal peptides, whose structural peculiarities are insufficiently considered by current prediction algorithms. To overcome this limitation, we propose a general domain view of long signal peptides, which becomes detectable when both the overall length and the secondary structure of long signal peptides are taken into consideration. This concept provides a structural framework for identifying and understanding multiple targeting and post-targeting functions.
Keywords: bioinformatics, machine learning, organelle, protein targeting, signal sequence, transit peptide
Submitted: April 5, 2009. Received (in revised form): May 18, 2009.
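The benchmark summarized in the abstract comes down to a detection rate: the fraction of annotated long signal peptides that each tool reports as a signal peptide. The following is a minimal sketch of that calculation, not the authors' evaluation code. The function names, the tool names and the toy calls are hypothetical and chosen only for illustration; no real predictor output is reproduced here.

```python
from typing import Dict, List, Tuple


def sensitivity(calls: List[bool]) -> float:
    """Fraction of annotated signal peptides that a tool flagged as positive."""
    if not calls:
        raise ValueError("no predictions supplied")
    # True counts as 1 and False as 0, so the sum is the number of hits.
    return sum(calls) / len(calls)


def rank_tools(results: Dict[str, List[bool]]) -> List[Tuple[str, float]]:
    """Return (tool, sensitivity) pairs, best-performing tool first."""
    scored = [(name, sensitivity(calls)) for name, calls in results.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: one boolean per reference sequence, True if the
# tool predicted a signal peptide for that sequence.
toy_results = {
    "tool_a": [True, True, False, True],
    "tool_b": [True, False, False, False],
}

ranking = rank_tools(toy_results)
for name, score in ranking:
    print(f"{name}: {score:.0%} of reference signal peptides detected")
```

Because every sequence in the reference set is an annotated positive, this measure captures only sensitivity. Estimating false-positive rates would require a separate set of proteins without signal peptides.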