Briefings in Bioinformatics Advance Access originally published online on August 27, 2007
Briefings in Bioinformatics 2007 8(5):277-278; doi:10.1093/bib/bbm041
Current progress in bioinformatics 2007
Briefings in Bioinformatics is pleased to present our third annual Current Progress in Bioinformatics special issue. As in previous years, we have attempted to identify exciting or emerging fields of bioinformatics, and have asked leaders in these fields to present a brief summary of progress over the last 18–24 months and an annotated bibliography drawing attention to papers of particular significance.
Each year, we have a logistical task of setting the order of the articles to appear in this volume. Typically, we organize them based on the linear logic of biology's central dogma: from DNA to RNA to protein to function and phenotype. The central dogma has undergone a transformation in the last 10 years, however. Biologists have demonstrated that the simple, linear model must be augmented with multiple feedback loops. For example, RNA feeds back to affect gene regulation, and feeds forward to modulate protein function. RNA itself is modified by proteins that can alter the message. The linear central dogma has become the networked central dogma! Our field of bioinformatics is also starting to become a network. The relationships between different subdisciplines are getting increasingly complex, and promise to keep us all busy for some time. So what is the appropriate order of seven reviews on (i) metabolomics, (ii) structured RNA, (iii) proteomics, (iv) gene product networks, (v) proteins and disease, (vi) biodiversity and (vii) text mining? Well, we chose that order. Hopefully it makes sense, but probably it doesn't matter!
In the first review, David Wishart provides a summary on Current Progress in Computational Metabolomics. He first provides a useful introduction to metabolomics, the study of the small molecules that interact with biological macromolecules, as one of the major omics pillars. As expected, many of the challenges to metabolomics are analogous to challenges in other branches of bioinformatics: the need to catalog small molecules in databases, and to search, compare and classify them. In addition, there are important vocabulary and standards issues. There are particular challenges relating to the experimental reality of metabolomics, where analytic measurements require special purpose software for laboratory information management and interpretation. Metabolomics is a field where chemoinformatics touches genomics. Thus, we get a glimpse of a future where informatics tools from neighboring disciplines are interoperable and create a potent infrastructure for discovery and engineering.
Alain Laederach next reports on Informatics challenges in Structured RNA. Understanding the protean (!) functions of RNA is a new challenge for computational biology and bioinformatics. In particular, the field is approaching challenges associated with understanding the physical properties of 3D RNA molecules, which (like proteins) fold into precise 3D shapes, catalyze many important reactions, and participate in the control of gene expression. Unlike proteins, however, they are made of four subunits (not 20), are dominated by electrostatics (RNA itself is extremely electronegative), and form their secondary structure almost entirely before assembling into their final 3D structure. Special purpose methods are therefore required to analyze the experimental data about their folding kinetics, to characterize the nature of their thermodynamic stability, and to work out how RNA 3D structure can explain function when the 1D sequence alone is insufficient.
Bobbie-Jo Webb-Robertson writes about Current Trends in Computational Inference from Mass Spectrometry-based Proteomics. She focuses, in particular, on recent achievements in mass spectrometry (MS)-based proteomics. The power of MS for interrogating cellular protein populations is immense: it can be used to identify proteins, characterize new proteins by de novo sequencing, characterize posttranslationally modified proteins, quantify proteins in the cellular milieu and assess protein–protein interactions. As high-throughput biology extends from genome to transcriptome to proteome, the richness of information with direct relevance to phenotype is exciting. Of course, systems biological models should benefit greatly from proteomic technologies, which provide parameters for the models and are useful for validation.
Balaji Srinivasan's contribution, Toward Reference Networks for Key Model Organisms, reviews the key data sources used to build interaction networks, and then discusses progress in the algorithms for comparing the resulting networks. In particular, he discusses methods for aligning networks to identify conserved network modules as well as those that confer species-specific capabilities. The challenges involve integrating multiple data sets, dealing with the differential noise in these data sets, creating robust data structures for representation and visualizing these networks. Interestingly, these activities highlight the need for a network ontology (NO!) that provides standard terms for annotating interaction networks.
Maricel Kann provides an overview of work in Protein Interactions and Disease: Computational Approaches to Uncover the Etiology of Diseases. She focuses on the role of proteins in disease, ranging from individual protein mutations, to protein ensembles, to networks of interacting proteins. In each of these there are difficult challenges, in part stemming from the hierarchical relationship of individual molecular mutations to emergent functional properties of the cell (and organism). Until recently, protein interaction data were not amenable to high-throughput experimental measurement, but the emergence of such measurements promises a connection between structural bioinformatics and systems biology.
Indra Sarkar provides an introduction to a fascinating emerging field in Biodiversity Informatics: Organizing and Linking Information Across the Spectrum of Life. Rising out of multiple intellectual threads, this field currently focuses on methods for generating reliable species identifiers. With increasing interest in complete characterization of the species found in ecological niches (consider, for example, the metagenomic sequencing of entire bacterial ecosystems), nomenclatures for species have become critical. At the same time, we need to integrate hundreds of years of taxonomic and biological phenotypic descriptions into the digital corpus. The first critical achievements are the creation of systems for naming species, resolving alternative names and querying large heterogeneous data sources with them. Species nomenclatures are only the beginning, as we also need to characterize their niches: climates, geography, disease and other interacting species. The diversity of challenges is truly impressive.
Finally, Pierre Zweigenbaum and colleagues provide a comprehensive review of recent biological natural language processing (BioNLP) in New frontiers for biomedical text mining: current progress. Biological text analysis is an important activity in bioinformatics. Despite the obvious drawbacks, scientists continue to publish their scientific findings using natural language—with all its ambiguities and subtleties. Worse yet—scientists are creating and reporting useful knowledge at rates far beyond what can be read even by the most motivated. There has been outstanding progress in some of the traditional areas of BioNLP—to the point where the authors suggest that some problems are solved or nearly solved. They introduce fascinating new areas, such as mining the figures in scientific papers, supporting the activities of database curators and mapping text to ontologies.
Together, these seven reviews offer an exciting view of a field that continues to grow and extend its influence to all areas of biomedicine, creating a network of useful methods and data structures that will enable the next generation of discovery and engineering.
Stanford University
Stanford, CA 94305
USA