Briefings in Bioinformatics Advance Access originally published online on October 13, 2006
Briefings in Bioinformatics 2006 7(4):318-330; doi:10.1093/bib/bbl036
Resources for integrative systems biology: from data through databases to networks and dynamic system models
Corresponding author. Marketa Zvelebil, Bioinformatics and Systems Biology Group, Ludwig Institute for Cancer Research, University College London Branch, 91 Riding House Street, London W1W 7BS, UK. Tel: +44-20-7878-4012; Fax: +44-20-7878-4040; E-mail: marketa{at}ludwig.ucl.ac.uk
| ABSTRACT |
|---|
In systems biology, biologically relevant quantitative modelling of physiological processes requires the integration of experimental data from diverse sources. Recent developments in high-throughput methodologies enable the analysis of the transcriptome, proteome, interactome, metabolome and phenome on an unprecedented scale, thus contributing to the deluge of experimental data held in numerous public databases. In this review, we describe some of the databases and simulation tools that are relevant to systems biology and discuss a number of key issues affecting data integration and the challenges these pose to systems-level research.
Keywords: bioinformatics, systems biology, database and knowledgebase, dynamic modelling, networks and modules, data integration
| INTRODUCTION |
|---|
Although systems biology has been defined in multiple ways, it is in essence a discipline that seeks to promote a comprehensive approach towards understanding biological processes as integrated systems in which cellular components and networks interact dynamically within temporal, spatial and physiological contexts.
It would therefore be helpful to think about systems approaches in terms of integrating genome- and proteome-wide analyses with context-specific inquiry through hypothesis-driven experimentation. The objective is to develop quantitative models that realistically describe or predict the flow of information and the control mechanisms that determine cellular function in biological systems under different pathophysiological conditions.
Cellular and physiological processes are complex systems [1] that are modulated by signals from the extracellular environment and coordinated by intracellular interaction and transcriptional or gene regulatory networks assembled into functional modules [2]. Understanding cellular processes as interconnected and interdependent systems, and in the context of a biological phenomenon, requires an integrative approach that draws upon as many diverse data sources as possible, including the literature, public databases, biochemical and kinetic experiments, phenotype studies and high-throughput analyses of the genome, transcriptome, proteome, interactome and metabolome (Figure 1).
Here, we review some of the many public databases, simulation tools, and standards that could serve as a useful resource for systems biology. We will discuss a number of key issues affecting data integration and the challenges these pose to systems biology.
| DATA AND RESOURCES FOR SYSTEMS BIOLOGY |
|---|
We have compiled a list of more than 150 publicly available databases, tools and resources, which we feel would be of relevance to systems biology (Table S1 in Supplementary Material, and also accessible from http://libra.licr.org/links.jsp). A selection of these resources and the data types they contain are described below.
Gene expression data
With the rapid growth in gene expression data, several public repositories were established to facilitate the annotation, management and storage of this data, thus providing free distribution and shared access to the data. ArrayExpress [3], Gene Expression Omnibus (GEO) [4] and CIBEX [5] are the three main repositories recommended by the Microarray Gene Expression Data (MGED) society [6] for the submission of data to support manuscript publication. These and other gene expression data repositories adopt standards developed by the MGED, which include the Minimum Information About a Microarray Experiment (MIAME) [7] and Microarray Gene Expression Markup Language (MAGE-ML) [8]. ArrayExpress stores well-annotated raw and normalised data from 12 000 hybridisations across 35 species, while GEO holds over 30 000 submissions for more than 100 organisms. GEO also contains data from other microarray-related approaches including chromatin immunoprecipitation (ChIP)-chip and tiling arrays. Like CIBEX, GEO also accepts and makes available non-array-based high-throughput data from serial analysis of gene expression (SAGE) and mass spectrometry (MS) proteomic analyses. Other repositories include the Stanford Microarray Database (SMD) [9], which contains 50 000 microarrays from 34 organisms, as well as those which feature specialised collections, e.g. Oncomine (a cancer microarray database) [10], Mouse Genome Informatics Gene Expression Database (MGI GXD, a mouse gene expression database) [11] and SGD (a yeast gene expression database) [12]. A useful link [13] and reviews [14, 15] about established software tools for gene expression data analysis are available.
Proteomic data
The term proteome is used to describe the entire protein complement of a genome [16]. Unlike the genome, the proteome is not static, but changes according to the state of the cell and external environmental conditions. In addition, the functional states of many proteins are altered by post-translational modifications such as glycosylation and phosphorylation. It is only by direct measurements of proteins under different experimental conditions that the proteome can be investigated. There are essentially three aspects to proteomic analysis: (i) separation of the protein mixtures, (ii) quantification of separated components and (iii) identification of distinct proteins and their post-translational modifications. Databases have been created to store and track the data created from each of the above steps.
High-resolution two-dimensional gel electrophoresis (2DE) enables identification and characterisation of proteins in mixtures, and is used to compare expression patterns and modification states of proteins from, for example, different cell types across different conditions. Several databases such as SWISS-2DPAGE [17], DCTB 2D-PAGE [18], GELBANK [19] and 2D-PAGE/DIFF [20] were developed to capture the growing amount of data generated by 2DE. The SWISS-2DPAGE database (ExPASy Proteomics Server) contains both 2-D PAGE and SDS-PAGE reference maps with a high level of annotation and integration with other relevant databases including SwissProt. The DCTB 2D-PAGE database features galleries of 2-D gels of cells, tissues and fluids, as well as 2-D immunoblots and immunohistochemistry images. The DIFF database presents differentially regulated (microbial) proteins and their abundance obtained from quantitative gel image analysis.
Developments in MS have accelerated large-scale identification of proteins from 2DE gels. Profiles from MS or tandem MS analysis are searched against protein fragment databases such as OPD [21] (which contains some 1.2 million spectra from four organisms) using specialised web-based search tools like ProteinProspector [22], Mascot [23] and X!Tandem [24]. The PRIDE database [25] presents an integrated platform by drawing information from protein databases, peptide identifications with literature annotations, supporting mass spectra data and information on post-translational modifications. Peptide-Atlas [26] is another very useful repository that serves to archive and integrate proteomic data with genomic data.
Protein interaction data
Characterising the physical interactions and functional associations in which genes and proteins participate can facilitate the extraction of meaningful network information for systems-level analysis. Comprehensive databases such as BIND [27], DIP [28], HiMAP [29], IntAct [30], MINT [31], MIPS/MPPI [32], pSTIING [33, 34] and STRING [35] contain protein interaction data assembled from a variety of sources including data generated from large-scale methods such as two-hybrid [36], coimmunoprecipitation, protein chip analyses [37], affinity purification [38, 39] and MS [40]. pSTIING also incorporates transcriptional regulatory associations with protein interaction information. This integration allows distinct protein interaction networks or pathways to be connected via transcriptional links. While databases like iHOP [41] use automated literature text-mining methods to infer and build networks, HPRD [42] and Reactome [43] contain interaction information that is manually extracted from the literature. A number of protein interaction databases such as HiMAP, IntAct, MINT, pSTIING and STRING have incorporated graphical tools for visualising dynamically generated network maps. Alternatively, interaction networks could also be visualised using external graphical programs such as BioLayout [44], Cytoscape [45], Osprey [46], Pajek [47] and VisANT [48]. Network representation of interactions serves as a very useful base from which functional modules may be defined, particularly when interaction information is combined with connectivity-based computational methods [49–51] and other functionally relevant experimental approaches such as expression correlation or time-series analysis [52].
Metabolomic and metabonomic data
Metabolomics seeks to identify and measure the concentration of all metabolites (or the metabolome) of the cell, while metabonomics provides a multivariate extension of this study in multicellular systems by measuring the dynamic metabolic responses toward environmental, pathophysiological or genetic perturbations [53]. The metabolome is directly influenced by the extent to which the transcriptome, proteome and interactome change in response to a stimulus, and therefore provides a good indication of the cellular state of the biological system under study. Analysis of the metabolome involves high-performance electrospray ionisation MS (ESI-MS), either on its own or linked to a separation method such as liquid chromatography (LC) or gas chromatography (GC).
The Human Metabolite Database [54] was developed as part of the Human Metabolome Project and contains more than 1400 metabolite entries with cross-references and links to biochemical, enzymatic and clinical data from other public databases including KEGG, PubChem, MetaCyc, ChEBI, PDB, SwissProt and GenBank. Several other databases were created to store quantitative metabolite data generated from different technology platforms. The METLIN database [55] has a catalogue of high-resolution Fourier transform mass spectrometry (FTMS) spectra, tandem MS spectra, and LC-MS data linked to known metabolite structural information. The Golm Metabolome Database (GMD) [56] is a repository for metabolite profiling experiments including GC-MS and mass spectra libraries. It also serves as an open exchange platform for data and information sharing. A specialised database on nutrient-metabolome interaction is currently under development by the European Nutrigenomics Organisation [57].
Phenomic and phenotypic data
Phenomics involves the use of high-throughput methodologies to determine the phenotypic response of whole biological systems to environmental, pathophysiological or genetic perturbations. Large-scale screening strategies such as dsRNA-mediated interference (RNAi), in which libraries of RNAi reagents can be designed to induce a partial loss of function (i.e. a knock-down) for every gene in the genome, provide a rapid way to obtain in vivo functional information such as genotype–phenotype associations. This approach was pioneered in Caenorhabditis elegans and then extended to Drosophila melanogaster. The deluge of data generated by these genome-scale screens resulted in a growing number of databases and repositories.
RNAiDB [58, 59] is a database containing phenotypic data from large-scale RNAi analyses in C. elegans. The database also contains information on experimental methods and phenotypic results, including raw data in the form of images and streaming time-lapse movies. A more specialised database focusing on mitotic cell division is PhenoBank [60], which contains time-lapse microscopy of the early embryo and phenotype data from genome-wide RNAi screens in C. elegans.
FlyBase [61] is a Drosophila-specific repository containing a variety of curated data including annotated genomes, expression patterns, mutant phenotypes, genetic interactions, aberrant chromosomes, genetic stock collections and anatomy images. Other databases containing data from large-scale RNAi screens in Drosophila include the Drosophila RNAi Screening Center (DRSC) database [62] and FLIGHT [63]. Additionally, FLIGHT provides information about cell lines, protocols, dsRNA primer sequences and microarray expression data for many of the Drosophila cell lines commonly used in RNAi screens.
Phenotypic data from mammalian systems is available from the Mouse Phenome Database (MPD/MGI) [64] and the Online Mendelian Inheritance in Man (OMIM) [65], which is a knowledgebase of human genes and genetically determined phenotypes and disorders. MPD/MGI is a repository for raw data and detailed protocols from the Mouse Phenome Project, which aims to collect baseline phenotypic data on genetically diverse and commonly used inbred mouse strains.
The various databases and resources containing phenotypic data tend to be species-specific. Enabling cross-species comparative phenomic analysis is the goal of PhenomicDB [66], which is an integrated multi-species genotype-phenotype database created by integrating data from various species-specific databases including OMIM, MPD/MGI and FlyBase. This is achieved through coarse-grained semantic mapping of the phenotypic data and the use of associated orthology relationships, enabling phenotypes for a given gene or a set of genes from different organisms to be compared simultaneously. A publicly accessible version is available [67].
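The idea of comparing phenotypes across species through orthology relationships can be sketched as follows. This is a minimal illustration only and not PhenomicDB's actual implementation: the gene names, orthology groups and phenotype strings are all hypothetical, and the coarse-grained semantic mapping step is left out.

```python
from collections import defaultdict

# Hypothetical orthology assignments: (species, gene) -> orthology group.
ORTHOLOGY_GROUP = {
    ("human", "GENE1"): "group1",
    ("mouse", "Gene1"): "group1",
    ("fly", "gene1"): "group1",
    ("mouse", "Gene2"): "group2",
}

# Hypothetical phenotype annotations from species-specific sources.
PHENOTYPES = [
    ("human", "GENE1", "disorder X"),
    ("mouse", "Gene1", "reduced body weight"),
    ("fly", "gene1", "lethal"),
    ("mouse", "Gene2", "normal"),
]


def phenotypes_by_group(annotations, group_of):
    """Collect phenotypes from all species under each orthology group."""
    grouped = defaultdict(list)
    for species, gene, phenotype in annotations:
        group = group_of.get((species, gene))
        if group is not None:
            grouped[group].append((species, phenotype))
    return dict(grouped)


combined = phenotypes_by_group(PHENOTYPES, ORTHOLOGY_GROUP)
print(combined["group1"])
```

Grouping by orthology is what lets phenotypes recorded for a gene in different organisms be viewed side by side.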
Pathways and quantitative models
Biological pathways and functional modules provide the conceptual framework from which models may be constructed (Figure 1). A wealth of pathway information is available from various web-accessible resources. For example, MetaCyc [68] contains some 700 curated metabolic pathways, while others like AfCS [69], BioCarta [70], PANTHER [71], pSTIING [34] and STKE [72] tend to focus on signal transduction pathways. pSTIING allows related canonical pathways to be connected as integrated networks, which could be further extended by selectively incorporating protein interaction and transcriptional information from the database into the growing network. Comprehensive databases like KEGG [73], GenMAPP [74] and Reactome [43] contain both metabolic and signalling pathways. These resources provide very useful qualitative mappings of functional associations between key components in canonical pathways. However, mathematical or computational modelling also requires quantitative data (such as concentrations or kinetic parameters), which are context-dependent and specific to the conditions under which they were generated. Very often, because data specific to the system under investigation are not available, estimates are taken from published measurements in related systems. For example, in a mitogen-activated protein (MAP) kinase cascade model [75], concentrations of MAP kinases were estimated from Xenopus oocytes and Km values for phosphorylation of MAP kinase were taken from phosphorylation data for mammalian ERK2. Therefore, there is a need for closer integration of quantitative modelling with experimentation. Efforts to truly integrate these are only just beginning [76], yielding testable quantitative models that can be experimentally verified or refined [77, 78].
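To illustrate why borrowed kinetic parameters matter, the sketch below evaluates the standard Michaelis–Menten rate law, v = Vmax·[S]/(Km + [S]), with two different Km values. It is not taken from the cited MAP kinase model; every number is hypothetical and in arbitrary units, chosen only to show how a Km estimated from a related system can change the predicted rate.

```python
def michaelis_menten_rate(vmax, km, substrate):
    """Standard Michaelis-Menten rate: v = vmax * [S] / (km + [S])."""
    if km <= 0 or substrate < 0:
        raise ValueError("km must be positive and substrate non-negative")
    return vmax * substrate / (km + substrate)


# Hypothetical values (arbitrary units), for illustration only.
VMAX = 1.0
SUBSTRATE = 2.0
km_borrowed = 2.0  # e.g. a value taken from a related system
km_actual = 6.0    # what a direct measurement might have given

rate_borrowed = michaelis_menten_rate(VMAX, km_borrowed, SUBSTRATE)
rate_actual = michaelis_menten_rate(VMAX, km_actual, SUBSTRATE)

print(f"rate with borrowed Km: {rate_borrowed:.2f}")
print(f"rate with actual Km:   {rate_actual:.2f}")
```

With these made-up numbers the predicted rate differs by a factor of two, which is the kind of discrepancy that model refinement against direct measurements is meant to expose.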
Several repositories (Table 1 and Table S1, Supplementary Material) have been established for a growing number of quantitative models, which ideally should incorporate quantitative data generated from or validated with directed biochemical studies. CellML [79], which stores its models in Cell Markup Language (CellML) format, is currently the largest repository with 188 models representing cellular processes in, for example, electrophysiology, metabolism, signal transduction and mechanics. Other model repositories use the Systems Biology Markup Language (SBML) format. These include the BioModels Database [80, 81], which contains some 50 models with annotations featuring the use of controlled vocabularies and links to other relevant data resources such as literature references and databases of compounds and pathways. In addition to being a model repository, SigPath [82] also provides an interactive interface for users to assemble their pathway model using user-defined components, reactions and quantitative information. Some model repositories like JWS Online Cellular System Modeling [83] also provide tools for simulation, while others like CellML and SigPath are equipped with visualisation utilities.
Simulation tools
Most simulation tools are packaged as stand-alone utilities, although they may also be developed as libraries or as component-oriented software. Crucially, any simulation tool must be able to perform a mathematical simulation on a model retrieved from a file, database or graphical user interface (GUI), and output the simulation results. Libraries such as SOSlib [84] provide application programming interfaces (APIs) for simulation and analysis. Libraries can also be written specifically for general-purpose mathematical software. For example, the SBMLToolbox [85] is developed for Matlab, while MathSBML [86] is created for Mathematica. Both libraries use models in SBML format. Although most simulation tools were developed with SBML support in mind, the CellML format is also supported by software such as JSim [87].
Stand-alone simulation tools like BioTapestry [88] and PathwayLab [89] have additional features like model-building functions or graphical visualisation capabilities. More sophisticated tools such as CellWare [90] and SimCell [91] can also run in a distributed environment, thus overcoming the limitation of local computational resources.
The development of stand-alone tools inevitably results in a certain degree of functional overlap between related tools, i.e. a duplication of basic features and also a duplication of effort. A solution to this problem would be to use a component-oriented design strategy, which is a modular approach that builds on existing infrastructure. For example, if we wish to build a simulation tool employing a new algorithm, we only need to develop a module or component that can be integrated or plugged into an existing software framework via a defined protocol. This approach circumvents the need to build the entire software from scratch. An example of such a software framework is the Systems Biology Workbench (SBW) [92], which provides the modular infrastructure with which components can connect via a simple network protocol. A number of visualisation and modelling tools including CellDesigner [93] and PNK 2e [94] use simulation and other analysis packages (or modules) through SBW.
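The component-oriented strategy described above can be sketched with a small plug-in registry. This is an assumption-laden toy and not the SBW protocol or API: SBW connects components over a network protocol, whereas here the "framework" is a single in-process dictionary, and the names `register`, `run` and `euler_decay` are hypothetical.

```python
from typing import Callable, Dict, List

# Registry mapping a solver name to a callable; all names are hypothetical.
Solver = Callable[[float, float, int], List[float]]
SOLVERS: Dict[str, Solver] = {}


def register(name: str) -> Callable[[Solver], Solver]:
    """Decorator that plugs a solver component into the framework."""
    def wrap(func: Solver) -> Solver:
        SOLVERS[name] = func
        return func
    return wrap


@register("euler_decay")
def euler_decay(x0: float, rate: float, steps: int) -> List[float]:
    """Toy component: explicit Euler steps of dx/dt = -rate * x, dt = 0.1."""
    dt = 0.1
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + dt * (-rate * xs[-1]))
    return xs


def run(solver_name: str, x0: float, rate: float, steps: int) -> List[float]:
    """Framework entry point: look up a component by name and call it."""
    if solver_name not in SOLVERS:
        raise KeyError(f"no solver registered under {solver_name!r}")
    return SOLVERS[solver_name](x0, rate, steps)


trajectory = run("euler_decay", 1.0, 1.0, 3)
print(trajectory)
```

Adding a new algorithm then only means writing one more decorated function; the entry point and any tools built on it stay unchanged.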
Another example of an open-source framework for systems biology is Bio-SPICE [95]. Bio-SPICE combines software from different vendors by integrating each tool into a pipeline, establishing workflows for modelling, analysis and simulation through its core application, Dashboard.
Differential equations are widely used in many simulation tools including SimBiology [96], MathSBML, SBMLToolbox, PyBioS [97] and The Virtual Cell [98]. Some tools employ other methods such as stochastic simulation (Dizzy [99] and STOCKS [100]), Bayesian inference (WinBUGS [101]) and cellular automata (SimCell and MCell [102]). There are also other tools that provide more than one simulation algorithm. For example, E-Cell [103] uses differential equations, stochastic, power-law and discrete time analyses, while Copasi [104] employs differential equations and stochastic methods.
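The difference between the deterministic and stochastic approaches can be seen on the simplest possible system, a single decay reaction A → ∅ with rate constant k. The sketch below is a generic illustration, not the algorithm of any tool named above: it integrates dn/dt = −k·n with explicit Euler steps and, separately, runs one realisation of the Gillespie direct method. The parameter values and the random seed are arbitrary.

```python
import math
import random


def euler_decay(n0, k, t_end, dt=0.001):
    """Deterministic: integrate dn/dt = -k * n with explicit Euler steps."""
    n = float(n0)
    for _ in range(round(t_end / dt)):
        n += dt * (-k * n)
    return n


def gillespie_decay(n0, k, t_end, seed=0):
    """Stochastic: Gillespie direct method for the single reaction A -> nothing."""
    rng = random.Random(seed)
    n, t = n0, 0.0
    while n > 0:
        # Waiting time to the next decay event is exponential with rate k * n.
        t += rng.expovariate(k * n)
        if t > t_end:
            break
        n -= 1
    return n


N0, K, T_END = 100, 1.0, 1.0
print("exact mean: ", N0 * math.exp(-K * T_END))
print("Euler ODE:  ", euler_decay(N0, K, T_END))
print("one SSA run:", gillespie_decay(N0, K, T_END))
```

The differential equation returns a smooth, non-integer value close to the exact mean, while each stochastic run returns a whole number of molecules that varies with the seed; the distinction matters most when copy numbers are low.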
| INTEGRATING DIVERSE DATA TO FACILITATE SYSTEMS BIOLOGY INVESTIGATIONS |
|---|
Although each database contains a different subset of biological information, a user should not have to visit multiple databases repeatedly to extract and manually assemble the various pieces of information needed. Ideally, data from different database sources should be integrated into a separate database, either locally or in a distributed manner, to facilitate flexible and selective mining, cross-correlation and analysis. Integration of diverse data enables multiple features of a biological system to be analysed and incorporated into a testable dynamic model. The model is then refined iteratively with further hypothesis-driven investigations and perturbation analyses to achieve realistic and biologically relevant results.
There are several ways to integrate biological data. One approach is to assemble various biological data sources into a single biological database, known as data warehousing. Data from different data sources is imported into a unified structure in a local database. Another method is to build an environment around distributed data sources. An ideal solution is when each distributed database provides a standard interface (e.g. web service) for querying, so that when a search is triggered, the relevant information is assembled through a distributed querying system, consolidated and presented in an integrated format. The simplest and therefore the most popular variation of distributed integration is link integration, where links to relevant entries in other databases are created. While this is very useful for web applications as it provides easy navigation between different web pages, link integration is unfortunately not suitable for data analysis, when actual data instead of links to web pages is required.
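A minimal sketch of the data warehousing approach is shown below, using Python's built-in `sqlite3` module with an in-memory database. The two "sources", the table layout and the gene names are hypothetical; the point is only that once records are imported into one local schema, a single query can cross-correlate data types that link integration would leave on separate web pages.

```python
import sqlite3

# Hypothetical records as they might arrive from two separate source databases.
expression_source = [("geneA", 2.5), ("geneB", 0.4)]
interaction_source = [("geneA", "geneC"), ("geneB", "geneC")]

# Data warehousing: import both sources into one local, unified schema.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE expression (gene TEXT PRIMARY KEY, fold_change REAL)")
db.execute("CREATE TABLE interaction (gene TEXT, partner TEXT)")
db.executemany("INSERT INTO expression VALUES (?, ?)", expression_source)
db.executemany("INSERT INTO interaction VALUES (?, ?)", interaction_source)

# A single query can now cross-correlate the two data types directly.
rows = db.execute(
    """
    SELECT e.gene, e.fold_change, i.partner
    FROM expression AS e
    JOIN interaction AS i ON i.gene = e.gene
    WHERE e.fold_change > 1.0
    ORDER BY e.gene
    """
).fetchall()
print(rows)
db.close()
```

The import step is also where the maintenance cost of a warehouse arises: each `INSERT` depends on the structure of its source staying the same.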
Data integration issues
Each data integration method has its inherent advantages and disadvantages. The distributed database approach is very robust and flexible, but requires strict coordination among developers and compliance to standards. The single database method provides speed, reliable access and a convenient way to analyse data, but requires extensive maintenance, as there is no guarantee of the extent to which the structure of the external source databases will remain unaltered, thus requiring frequent fixes to current versions of data import scripts. In fact, the constant evolution of databases remains a significant data integration issue, particularly when unique identifiers change.
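One way an import script might cope with changing identifiers is to follow a table of replacement records until a current identifier is reached. The sketch below assumes the source database publishes such a history, which is not always the case; the identifiers and the `replaced_by` mapping are hypothetical.

```python
def resolve_identifier(identifier, replaced_by):
    """Follow 'replaced by' records until a current identifier is reached."""
    seen = set()
    while identifier in replaced_by:
        if identifier in seen:
            raise ValueError(f"cyclic replacement chain at {identifier!r}")
        seen.add(identifier)
        identifier = replaced_by[identifier]
    return identifier


# Hypothetical history of retired identifiers in a source database.
replaced_by = {"ID0001": "ID0007", "ID0007": "ID0042"}

print(resolve_identifier("ID0001", replaced_by))  # retired twice over
print(resolve_identifier("ID0099", replaced_by))  # never retired
```

Identifiers with no replacement record are returned unchanged, so the function is safe to apply to every record during import.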
Other issues impeding effective data integration include inconsistencies in nomenclature and differences in conceptual understanding and use of terminology across databases developed by different research communities [105], resulting in naming clashes and semantic ambiguities in the annotations.
Probably the most acute data integration problem affecting systems biology is the heterogeneity of experimental data, because such data is highly context-dependent [76]. For example, data from biochemical and kinetic measurements or gene expression analysis is meaningful only in the context of a detailed record of the conditions under which it was generated, such as the state of the system or the perturbations introduced [7].
| STANDARDS FOR DATA INTEGRATION AND EXCHANGE IN SYSTEMS BIOLOGY |
|---|
Accurate systems modelling requires standardisation of description formats for models and data, which should contain sufficient ancillary information to be reproduced or used for downstream analysis. The development of well-structured controlled vocabularies, the adoption of standardised data exchange formats and the use of model representation languages are some of the ways to facilitate flexible data integration and model exchange for systems analysis.
Ontologies
Ontologies are well-structured controlled vocabularies which capture and convey commonly agreed definitions and concepts. Replacing the use of free-text descriptions with ontologies can facilitate data integration through semantic references. The hierarchical organisation of biological ontologies permits linking through shared ontology.
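How a hierarchy permits linking through shared terms can be sketched with a toy `is_a` graph. The term names below are illustrative only and are not taken from any real ontology release; the function names are likewise hypothetical.

```python
# Toy 'is_a' hierarchy (child -> parents); term names are illustrative only
# and are not taken from any real ontology release.
IS_A = {
    "tyrosine kinase activity": ["kinase activity"],
    "lipid kinase activity": ["kinase activity"],
    "kinase activity": ["catalytic activity"],
    "catalytic activity": [],
}


def ancestors(term):
    """Return the term itself plus every term reachable through is_a links."""
    found = {term}
    stack = [term]
    while stack:
        for parent in IS_A.get(stack.pop(), []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def shared_terms(term_a, term_b):
    """Terms through which two annotations can be linked."""
    return ancestors(term_a) & ancestors(term_b)


print(sorted(shared_terms("tyrosine kinase activity", "lipid kinase activity")))
```

Two records annotated with different specific terms can therefore still be retrieved together by querying on a common, more general term, which free-text descriptions do not allow.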
The Open Biomedical Ontologies (OBO) [106] is currently being developed for shared use across different biological and medical domains. Several domain-specific ontology projects are currently ongoing. A subset of these is associated with the OBO Foundry, which standardises the development of ontologies by adopting a set of principles specifying best practices in ontology development. Current members of the OBO Foundry include:
- Gene Ontology (GO) [107], which defines gene and gene product attributes,
- Cell Ontology (CL) [108], which defines cell types,
- Sequence Ontology (SO) [109], which provides a structured representation for annotations of nucleic acids, genomic databases and mutations,
- Chemical Ontology (ChEBI) [110] for small chemical compounds,
- Phenotype Ontology (PATO) [111] for describing phenotypic data,
- Functional Genomics Investigation Ontology (FuGO) [112] for annotating functional genomics experiments,
- Foundational Model of Anatomy (FMA) [113] for describing the structure of the human anatomy, and
- Relation Ontology (OBO_REL) [114], which defines relationships between entities and concepts.
Data exchange standards
For gene expression
MIAME [7] is a standard developed by the MGED society, which provides a framework for defining the type of data that should be stored for each microarray experiment. There are at least six attributes or minimum information types required by MIAME: experimental design, array design, details of samples, hybridisation conditions, image/spot measurements and normalisation methods used. This ensures that microarray data conforms to a common standard, which facilitates data exchange using the communication syntax of MAGE-ML [8].
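A repository or local pipeline could check a submission against the six information types listed above before accepting it. The sketch below is a hedged illustration: the dictionary keys are this example's own naming and are not defined by MIAME or MAGE-ML, and a real compliance check would inspect the content of each section rather than just its presence.

```python
# The six information types listed above; the key names are this sketch's
# own choice and are not defined by MIAME or MAGE-ML.
REQUIRED_SECTIONS = (
    "experimental_design",
    "array_design",
    "samples",
    "hybridisation_conditions",
    "measurements",
    "normalisation",
)


def missing_sections(record):
    """Return required sections that are absent or empty in a submission."""
    return [name for name in REQUIRED_SECTIONS if not record.get(name)]


# Hypothetical, incomplete submission.
submission = {
    "experimental_design": "time course, 3 replicates",
    "array_design": "two-colour cDNA array",
    "samples": "",
    "measurements": "spot intensities",
}

print(missing_sections(submission))
```

An empty list means every section is present; anything else names exactly what the submitter still has to supply.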
For proteomics
There are at least two extensible markup language (XML)-based standards for proteomic data: the Human Proteome Markup Language or HUP-ML (www1.biz.biglobe.ne.jp/~jhupo/HUP-ML/hup-ml.htm), and the Annotated Gel Markup Language or AGML [115]. These provide a standardisation framework for the annotation of 2DE gels and also formats for storing MS data related to specific spots. AGML also defines the Minimum Information about 2D Gel electrophoresis experimental protocols (MI2DG) and the Minimum Information About a Proteomics Experiment (MIAPE) [116], which will enable 2D electrophoresis protocols and proteomic experimental details to be machine readable.
For molecular interactions
The Proteomics Standards Initiative (PSI) Molecular Interaction (PSI MI) [117] is a data exchange format for molecular interactions. PSI MI is currently limited to protein–protein interactions, but may be extended to include small molecules and nucleic acids in the future. PSI MI is supported by a growing number of protein interaction databases including BIND, DIP, IntAct, MINT and HPRD.
For biochemical models
MIRIAM (or Minimum Information Requested in the Annotation of biochemical Models) [118] is a standard for curating quantitative models of biological systems. MIRIAM lays down a set of rules that describe how a compliant model should be structured, e.g. using a standard markup language and capturing the necessary initial conditions and parameters. MIRIAM also details the annotation scheme, such as the use of appropriate unique identifiers that link models to documentation detailing full descriptions of the models.
For biological pathways
Biological Pathway Exchange (BioPAX) [119, 120] is a collaborative effort to create a data exchange format for biological pathway data. BioPAX specifies a standard file format to facilitate the integration and exchange of pathway data. The BioPAX format also provides a standard format for software tools to access pathway data, thus encouraging distributed curation and data sharing. Currently there are two levels in BioPAX. Level one covers metabolic networks and level two adds molecular interaction networks. Future levels will cover gene and DNA interactions, signal transduction and genetic interactions. Databases that support BioPAX include KEGG, BioCyc and Reactome.
For graphical representation of networks
The Systems Biology Graphical Notation (SBGN) [121] project aims to facilitate consistency and uniformity in the diagrammatic representation of biological networks (i.e. analogous to the standardisation of electronic circuit diagrams).
As there are currently no established standards, the molecular interaction map (MIM) [122] project is also hoping to propose its graphical notation as the standard for regulatory network representation.
Model representation languages
SBML [123] is a machine-readable format for describing qualitative and quantitative models of biochemical networks, including metabolic networks, cell-signalling pathways and regulatory networks. The aim of SBML is to enable the exchange of models between software tools. SBML is widely adopted with almost 100 software systems and databases supporting it.
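To give a feel for what "machine-readable" means here, the sketch below parses a hand-written, heavily simplified document in the style of SBML Level 2 using Python's standard `xml.etree.ElementTree` module. The model content is hypothetical, kinetic laws are omitted, and the document has not been validated against the SBML schema; in practice a dedicated SBML library would normally be used instead of generic XML parsing.

```python
import xml.etree.ElementTree as ET

# Hand-written, heavily simplified document in the style of SBML Level 2.
# It omits kinetic laws and has not been validated against the SBML schema.
SBML_TEXT = """
<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cell"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="A" compartment="cell" initialConcentration="1.0"/>
      <species id="B" compartment="cell" initialConcentration="0.0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="conversion">
        <listOfReactants><speciesReference species="A"/></listOfReactants>
        <listOfProducts><speciesReference species="B"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

NS = {"sbml": "http://www.sbml.org/sbml/level2"}
root = ET.fromstring(SBML_TEXT.strip())

# Map each species id to its initial concentration.
species = {
    s.get("id"): float(s.get("initialConcentration"))
    for s in root.findall(".//sbml:species", NS)
}
reactions = [r.get("id") for r in root.findall(".//sbml:reaction", NS)]

print(species)
print(reactions)
```

Because species, compartments and reactions are explicit elements rather than free text, any tool that understands the format can recover the same model structure.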
CellML [124] is a language for describing and exchanging models of cellular and subcellular processes, their structure, the underlying mathematics and any associated metadata. CellML2SBML [125] is a recently developed tool that converts models expressed in CellML into SBML without significant loss of information.
| FROM EXPERIMENTAL DATA TO NETWORKS AND DYNAMIC SYSTEMS |
|---|
By extending hypothesis-driven experimentation to incorporate network and systems-wide information, integrative systems biology aims to provide additional insights not implicit from data generated by conventional experimental approaches or theoretical predictions alone. While this remains an ideal to strive towards, there are already some fine examples where certain elements of this integrative process have produced very encouraging and interesting results.
For example, although much is known about the induction of single genes in the microbial-specific immune response of macrophages activated via Toll-like receptors (TLRs), the overall understanding of the regulatory control mechanisms of this complex transcriptional programme involving more than 1000 genes is still very much lacking [126]. Gilchrist et al. [126] adopted a systems approach by integrating genome-wide gene expression microarray data with interactome and network analysis, ChIP-chip and promoter analysis, which identified the induction and involvement of another transcription factor, ATF3, in TLR-mediated signalling, in addition to the well-characterised NF-κB family of transcription factors. A kinetic model incorporating the newly identified contribution of ATF3 along with NF-κB in the transcriptional control of cytokine [interleukin-6 (IL6) and IL12β] expression predicted that ATF3 is a negative regulator of these cytokines, which was validated experimentally using Atf3−/− mice. Therefore, this integrated approach provided new insights into a previously unknown negative feedback loop regulating the TLR-mediated inflammatory response.
Work by Swameye et al. [78] provides another example where the close coupling of dynamic modelling with experimental data can facilitate functional understanding at the systems level. A mathematical model of the core module of the JAK-STAT pathway was established using time-resolved quantitative measurements of EpoR and STAT5 phosphorylation to estimate the dynamical parameters and determine the quantitative behaviour of the various STAT5 populations, some of which are not easily accessible from experimental measurement. Previous understanding of the JAK-STAT pathway tends to focus on the unidirectional flow of information from the cell surface to the cytoplasm and the nucleus. In contrast, their model reveals that STAT5 undergoes rapid nucleocytoplasmic cycling, suggesting a new function for STAT5 where it forms a remote sensor between the nucleus and receptor. This was verified experimentally, confirming that the inhibition of nuclear export results in a reduced transcriptional yield and that nucleocytoplasmic cycling is an important systems property of the JAK-STAT pathway.
These examples represent some of the many new and exciting developments adopting an integrative systems approach, incorporating various elements of hypothesis-driven experimentation, genome-scale investigations, data mining from public databases, bioinformatics, biological network analysis and dynamic modelling.
| CONCLUDING REMARKS |
|---|
The exciting and promising field of integrative systems biology comes with a number of practical challenges that will confront practitioners of systems-level research. The key issues we have outlined involve data integration, model exchange and software interoperability. While systems-level investigations rely substantially on a wide range of experimental data types provided by an ever-growing number of public databases and resources, the datasets are often not standardised or annotated sufficiently to facilitate data integration for systems-level analysis. This calls for the continued support and widespread adoption of standards (e.g. MIAME, MIAPE, PSI MI) and well-structured controlled vocabularies (e.g. OBO), which should be implemented at various stages, including data acquisition. Related to this is the recognition that experimental data for systems modelling are highly context dependent: datasets need full descriptions of experimental conditions, and quantitative modelling needs to be more closely intertwined with experimental investigations.
As the number of quantitative models grows, the development and adoption of a standard format (e.g. SBML) for model representation will allow models to be shared more easily, so that others in the research community can build upon these for further development. In a similar manner, software utilities could be built as modular units that could communicate seamlessly with other components within a common software infrastructure or framework (e.g. the SBW).
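As a rough illustration of why a shared representation helps, the sketch below writes a toy model to an SBML-like XML document and reads it back using only the Python standard library. It is an assumption-laden outline rather than a conformant SBML writer: it uses the SBML Level 2 namespace and element names, but omits kinetic laws, units and validation, and the helper functions, species identifiers and reaction identifier are all hypothetical. Real projects would normally rely on a dedicated library such as libSBML.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level2"  # SBML Level 2 namespace


def q(tag):
    """Qualify a tag name with the SBML namespace."""
    return "{%s}%s" % (SBML_NS, tag)


def build_model_xml(model_id, species, reactions):
    """Serialise species (id -> initial concentration) and reactions
    (id, reactant, product) into a minimal SBML-like document."""
    ET.register_namespace("", SBML_NS)
    root = ET.Element(q("sbml"), {"level": "2", "version": "1"})
    model = ET.SubElement(root, q("model"), {"id": model_id})

    compartments = ET.SubElement(model, q("listOfCompartments"))
    ET.SubElement(compartments, q("compartment"), {"id": "cell", "size": "1"})

    species_list = ET.SubElement(model, q("listOfSpecies"))
    for species_id, concentration in species.items():
        ET.SubElement(species_list, q("species"), {
            "id": species_id,
            "compartment": "cell",
            "initialConcentration": str(concentration),
        })

    reaction_list = ET.SubElement(model, q("listOfReactions"))
    for reaction_id, reactant, product in reactions:
        reaction = ET.SubElement(reaction_list, q("reaction"),
                                 {"id": reaction_id, "reversible": "false"})
        reactants = ET.SubElement(reaction, q("listOfReactants"))
        ET.SubElement(reactants, q("speciesReference"), {"species": reactant})
        products = ET.SubElement(reaction, q("listOfProducts"))
        ET.SubElement(products, q("speciesReference"), {"species": product})

    return ET.tostring(root, encoding="unicode")


def read_species(xml_text):
    """Recover species ids and initial concentrations from the document."""
    root = ET.fromstring(xml_text)
    return {sp.get("id"): float(sp.get("initialConcentration"))
            for sp in root.iter(q("species"))}


def read_reactions(xml_text):
    """Recover (id, reactant, product) triples from the document."""
    root = ET.fromstring(xml_text)
    triples = []
    for reaction in root.iter(q("reaction")):
        reactant = reaction.find(q("listOfReactants")).find(q("speciesReference"))
        product = reaction.find(q("listOfProducts")).find(q("speciesReference"))
        triples.append((reaction.get("id"), reactant.get("species"),
                        product.get("species")))
    return triples


xml_text = build_model_xml(
    "toy_model",
    {"STAT5_cyt": 1.0, "STAT5_nuc": 0.0},
    [("nuclear_import", "STAT5_cyt", "STAT5_nuc")],
)
print(xml_text)
```

Because the structure is explicit and tool-independent, a second program needs only the agreed element names to recover the species and reactions, which is the property that makes model sharing and reuse practical.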
These challenges are not insurmountable but require the concerted effort of the entire research community across multiple disciplines. If done right, integrative systems biology will transform the way data from diverse sources is harnessed to provide comprehensive systems-level understanding of pathophysiological processes.
Key Points
|
| Acknowledgements |
|---|
The authors thank Anne Ridley and Buzz Baum for very helpful discussions, David Sims and Konstantinos Lykostratis for suggesting several useful websites for inclusion into the list of systems biology resources, and the manuscript reviewers for their very helpful insights and suggestions.
| FOOTNOTES |
|---|
Aylwin Ng is a postdoctoral research fellow at the Ludwig Institute. His background spans computational biology, immunology and virology. His research is in integrative systems biology and involves defining networks and functional modules in inflammatory processes.
Borisas Bursteinas is a postdoctoral research fellow at the Ludwig Institute. His background is in distributed data mining and software engineering. His current research is in computational systems biology.
Qiong Gao is a scientific officer at the Ludwig Institute. She has a dual background in chemistry and computer science. She is involved in the design and development of a variety of analytical tools and bioinformatics software.
Ewan Mollison has a background in Microbiology, Biotechnology and Bioinformatics. He is a scientific officer at the Ludwig Institute and is currently involved in data curation and integration.
Marketa Zvelebil is the group leader of Bioinformatics and Systems Biology at the Ludwig Institute. Her background is in structural bioinformatics. She is involved in bioinformatic analysis of proteomics and structural studies of PI-3 kinases and other drug-ligand interactions.
Submitted: June 6, 2006. Received (in revised form): September 6, 2006.
| References |
|---|
1. Kitano H. Computational systems biology. Nature 2002; 420:206-10.
2. Hartwell LH, Hopfield JJ, Leibler S, et al. From molecular to modular cell biology. Nature 1999; 402:C47-52.
3. Parkinson H, Sarkans U, Shojatalab M, et al. ArrayExpress - a public repository for microarray gene expression data at the EBI. Nucleic Acids Res 2005; 33:D553-5.
4. Boyle J. Gene-Expression Omnibus integration and clustering tools in SeqExpress. Bioinformatics 2005; 21:2550-1.
5. Ikeo K, Ishi-i J, Tamura T, et al. CIBEX: center for information biology gene expression database. C R Biol 2003; 326:1079-82.
6. Microarray Gene Expression Data (MGED) society http://www.mged.org (1 June 2006, date last accessed).
7. Brazma A, Hingamp P, Quackenbush J, et al. Minimum information about a microarray experiment (MIAME)-toward standards for microarray data. Nat Genet 2001; 29:365-71.
8. Spellman PT, Miller M, Stewart J, et al. Design and implementation of microarray gene expression markup language (MAGE-ML). Genome Biol 2002; 3:RESEARCH0046.
9. Ball CA, Awad IA, Demeter J, et al. The Stanford Microarray Database accommodates additional microarray platforms and data formats. Nucleic Acids Res 2005; 33:D580-2.
10. Rhodes DR, Yu J, Shanker K, et al. ONCOMINE: a cancer microarray database and integrated data-mining platform. Neoplasia 2004; 6:1-6.
11. Hill DP, Begley DA, Finger JH, et al. The mouse Gene Expression Database (GXD): updates and enhancements. Nucleic Acids Res 2004; 32:D568-71.
12. Christie KR, Weng S, Balakrishnan R, et al. Saccharomyces Genome Database (SGD) provides tools to identify and analyze sequences from Saccharomyces cerevisiae and related sequences from other organisms. Nucleic Acids Res 2004; 32:D311-4.
13. A useful link for gene expression data analysis software tools http://www.biodirectory.com/directory/Microarrays/Software_2.html (1 June 2006, date last accessed).
14. Bowtell DD. Options available - from start to finish - for obtaining expression data by microarray. Nat Genet 1999; 21:25-32.
15. Bassett DE Jr, Eisen MB, Boguski MS. Gene expression informatics - it's all in your mine. Nat Genet 1999; 21:51-5.
16. Wilkins MR, Williams KL, Appel RD, et al. Proteome research: new frontiers in functional genomics. Berlin: Springer-Verlag 1997.
17. Hoogland C, Mostaguir K, Sanchez JC, et al. SWISS-2DPAGE, ten years later. Proteomics 2004; 4:2352-6.
18. DCTB 2D-PAGE http://proteomics.cancer.dk/cgi-bin/CelisWeb.exe?MsetList.htm (1 June 2006, date last accessed).
19. Babnigg G, Giometti CS. GELBANK: a database of annotated two-dimensional gel electrophoresis patterns of biological systems with completed genomes. Nucleic Acids Res 2004; 32:D582-5.
20. Pleissner KP, Schmelzer P, Wehrl W, et al. Presentation of differentially regulated proteins within a web-accessible proteome database system of microorganisms. Proteomics 2004; 4:2987-90.
21. Prince JT, Carlson MW, Wang R, et al. The need for a public proteomics repository. Nat Biotechnol 2004; 22:471-2.
22. Clauser KR, Baker P, Burlingame AL. Role of accurate mass measurement (+/-10 ppm) in protein identification strategies employing MS or MS/MS and database searching. Anal Chem 1999; 71:2871-82.
23. Hirosawa M, Hoshida M, Ishikawa M, et al. MASCOT: multiple alignment system for protein sequences based on three-way dynamic programming. Comput Appl Biosci 1993; 9:161-7.
24. Craig R, Beavis RC. TANDEM: matching proteins with tandem mass spectra. Bioinformatics 2004; 20:1466-7.
25. Jones P, Cote RG, Martens L, et al. PRIDE: a public repository of protein and peptide identifications for the proteomics community. Nucleic Acids Res 2006; 34:D659-63.
26. Desiere F, Deutsch EW, King NL, et al. The PeptideAtlas project. Nucleic Acids Res 2006; 34:D655-8.
27. Alfarano C, Andrade CE, Anthony K, et al. The Biomolecular Interaction Network Database and related tools 2005 update. Nucleic Acids Res 2005; 33:D418-24.
28. Salwinski L, Miller CS, Smith AJ, et al. The Database of Interacting Proteins: 2004 update. Nucleic Acids Res 2004; 32:D449-51.
29. Human Interactome Map (HiMAP) http://www.himap.org/index.jsp (1 June 2006, date last accessed).
30. Hermjakob H, Montecchi-Palazzi L, Lewington C, et al. IntAct: an open source molecular interaction database. Nucleic Acids Res 2004; 32:D452-5.
31. Zanzoni A, Montecchi-Palazzi L, Quondam M, et al. MINT: a Molecular INTeraction database. FEBS Lett 2002; 513:135-40.
32. Pagel P, Kovac S, Oesterheld M, et al. The MIPS mammalian protein-protein interaction database. Bioinformatics 2005; 21:832-4.
33. pSTIING http://pstiing.licr.org (1 June 2006, date last accessed).
34. Ng A, Bursteinas B, Gao Q, et al. pSTIING: a systems approach towards integrating signalling pathways, interaction and transcriptional regulatory networks in inflammation and cancer. Nucleic Acids Res 2006; 34:D527-34.
35. von Mering C, Jensen LJ, Snel B, et al. STRING: known and predicted protein-protein associations, integrated and transferred across organisms. Nucleic Acids Res 2005; 33:D433-7.
36. Li S, Armstrong CM, Bertin N, et al. A map of the interactome network of the metazoan C. elegans. Science 2004; 303:540-3.
37. Zhu H, Bilgin M, Bangham R, et al. Global analysis of protein activities using proteome chips. Science 2001; 293:2101-5.
38. Gavin AC, Bosche M, Krause R, et al. Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 2002; 415:141-7.
39. Bouwmeester T, Bauch A, Ruffner H, et al. A physical and functional map of the human TNF-alpha/NF-kappa B signal transduction pathway. Nat Cell Biol 2004; 6:97-105.
40. Ho Y, Gruhler A, Heilbut A, et al. Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 2002; 415:180-3.
41. Hoffmann R, Valencia A. A gene network for navigating the literature. Nat Genet 2004; 36:664.
42. Peri S, Navarro JD, Kristiansen TZ, et al. Human protein reference database as a discovery resource for proteomics. Nucleic Acids Res 2004; 32:D497-501.
43. Joshi-Tope G, Gillespie M, Vastrik I, et al. Reactome: a knowledgebase of biological pathways. Nucleic Acids Res 2005; 33:D428-32.
44. Goldovsky L, Cases I, Enright AJ, et al. BioLayout(Java): versatile network visualisation of structural and functional relationships. Appl Bioinformatics 2005; 4:71-4.
45. Ideker T, Ozier O, Schwikowski B, et al. Discovering regulatory and signalling circuits in molecular interaction networks. Bioinformatics 2002; 18:Suppl 1, S233-40.
46. Breitkreutz BJ, Stark C, Tyers M. Osprey: a network visualization system. Genome Biol 2003; 4:R22.
47. Pajek http://vlado.fmf.uni-lj.si/pub/networks/pajek/default.htm (1 June 2006, date last accessed).
48. Hu Z, Mellor J, Wu J, et al. VisANT: data-integrating visual framework for biological networks and modules. Nucleic Acids Res 2005; 33:W352-7.
49. Girvan M, Newman ME. Community structure in social and biological networks. Proc Natl Acad Sci USA 2002; 99:7821-6.
50. Milo R, Shen-Orr S, Itzkovitz S, et al. Network motifs: simple building blocks of complex networks. Science 2002; 298:824-7.
51. Spirin V, Mirny LA. Protein complexes and functional modules in molecular networks. Proc Natl Acad Sci USA 2003; 100:12123-8.
52. Calvano SE, Xiao W, Richards DR, et al. A network-based analysis of systemic inflammation in humans. Nature 2005; 437:1032-7.
53. Nicholson JK, Wilson ID. Opinion: understanding global systems biology: metabonomics and the continuum of metabolism. Nat Rev Drug Discov 2003; 2:668-76.
54. Human Metabolite Database (HMDB) http://www.metabolomics.ca (1 June 2006, date last accessed).
55. Smith CA, O'Maille G, Want EJ, et al. METLIN: a metabolite mass spectral database. Ther Drug Monit 2005; 27:747-51.
56. Kopka J, Schauer N, Krueger S, et al. GMD@CSB.DB: the Golm Metabolome Database. Bioinformatics 2005; 21:1635-8.
57. European Nutrigenomics Organisation http://www.nugo.org/metabolomics/13184 (1 June 2006, date last accessed).
58. RNAiDB http://www.rnai.org (1 June 2006, date last accessed).
59. Gunsalus KC, Yueh WC, MacMenamin P, et al. RNAiDB and PhenoBlast: web tools for genome-wide phenotypic mapping projects. Nucleic Acids Res 2004; 32:D406-10.
60. PhenoBank http://www.worm.mpi-cbg.de/phenobank2/cgi-bin/MenuPage.py (1 June 2006, date last accessed).
61. Grumbling G, Strelets V. FlyBase: anatomical data, images and queries. Nucleic Acids Res 2006; 34:D484-8.
62. Flockhart I, Booker M, Kiger A, et al. FlyRNAi: the Drosophila RNAi screening center database. Nucleic Acids Res 2006; 34:D489-94.
63. Sims D, Bursteinas B, Gao Q, et al. FLIGHT: database and tools for the integration and cross-correlation of large-scale RNAi phenotypic datasets. Nucleic Acids Res 2006; 34:D479-83.
64. Bogue MA, Grubb SC. The Mouse Phenome Project. Genetica 2004; 122:71-4.
65. Hamosh A, Scott AF, Amberger JS, et al. Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res 2005; 33:D514-7.
66. Kahraman A, Avramov A, Nashev LG, et al. PhenomicDB: a multi-species genotype/phenotype database for comparative phenomics. Bioinformatics 2005; 21:418-20.
67. phenomicDB http://www.phenomicDB.de (1 June 2006, date last accessed).
68. Caspi R, Foerster H, Fulcher CA, et al. MetaCyc: a multiorganism database of metabolic pathways and enzymes. Nucleic Acids Res 2006; 34:D511-6.
69. Alliance for Cellular Signaling (AfCS) http://www.signaling-gateway.org (1 June 2006, date last accessed).
70. BioCarta http://www.biocarta.com/genes/index.asp (1 June 2006, date last accessed).
71. Mi H, Lazareva-Ulitsky B, Loo R, et al. The PANTHER database of protein families, subfamilies, functions and pathways. Nucleic Acids Res 2005; 33:D284-8.
72. Signal Transduction Knowledge Environment (STKE) http://stke.sciencemag.org (1 June 2006, date last accessed).
73. Kanehisa M, Goto S, Kawashima S, et al. The KEGG resource for deciphering the genome. Nucleic Acids Res 2004; 32:D277-80.
74. Dahlquist KD, Salomonis N, Vranizan K, et al. GenMAPP, a new tool for viewing and analyzing microarray data on biological pathways. Nat Genet 2002; 31:19-20.
75. Huang CY, Ferrell JE. Ultrasensitivity in the mitogen-activated protein kinase cascade. Proc Natl Acad Sci USA 1996; 93:10078-83.
76. Cassman M. Barriers to progress in systems biology. Nature 2005; 438:1079.
77. Sasagawa S, Ozaki Y, Fujita K, et al. Prediction and validation of the distinct dynamics of transient and sustained ERK activation. Nat Cell Biol 2005; 7:365-73.
78. Swameye I, Muller TG, Timmer J, et al. Identification of nucleocytoplasmic cycling as a remote sensor in cellular signaling by databased modeling. Proc Natl Acad Sci USA 2003; 100:1028-33.
79. Crampin EJ, Halstead M, Hunter P, et al. Computational physiology and the Physiome Project. Exp Physiol 2004; 89:1-26.
80. BioModels Database http://www.ebi.ac.uk/biomodels (1 June 2006, date last accessed).
81. Le Novere N, Bornstein B, Broicher A, et al. BioModels Database: a free, centralized database of curated, published, quantitative kinetic models of biochemical and cellular systems. Nucleic Acids Res 2006; 34:D689-91.
82. Campagne F, Neves S, Chang CW, et al. Quantitative information management for the biochemical computation of cellular networks. Sci STKE 2004; 2004:l11.
83. Olivier BG, Snoep JL. Web-based kinetic modelling using JWS Online. Bioinformatics 2004; 20:2143-4.
84. SOSlib http://www.tbi.univie.ac.at/~raim/odeSolver/ (1 June 2006, date last accessed).
85. Keating SM, Bornstein BJ, Finney A, et al. SBMLToolbox: an SBML toolbox for MATLAB users. Bioinformatics 2006; 22:1275-7.
86. Shapiro BE, Hucka M, Finney A, et al. MathSBML: a package for manipulating SBML-based biological models. Bioinformatics 2004; 20:2829-31.
87. Raymond GM, Butterworth E, Bassingthwaighte JB. JSIM: Free software package for teaching physiological modeling and research. Exper Biol 2003; 280:102.
88. Longabaugh WJ, Davidson EH, Bolouri H. Computational representation of developmental genetic regulatory networks. Dev Biol 2005; 283:1-16.
89. PathwayLab http://www.innetics.com (1 June 2006, date last accessed).
90. Dhar PK, Meng TC, Somani S, et al. Grid cellware: the first grid-enabled tool for modelling and simulating cellular processes. Bioinformatics 2005; 21:1284-7.
91. Wishart DS, Yang R, Arndt D, et al. Dynamic cellular automata: an alternative approach to cellular simulation. In Silico Biol 2005; 5:139-61.
92. Hucka M, Finney A, Sauro HM, et al. The ERATO Systems Biology Workbench: enabling interaction and exchange between software tools for computational biology. Pac Symp Biocomput 2002; 450-61.
93. Funahashi A, Tanimura N, Morohashi M, et al. CellDesigner: a process diagram editor for gene-regulatory and biochemical networks. BioSilico 2003; 1:159-62.
94. PNK 2e http://page.mi.fu-berlin.de/~trieglaf/PNK2e/index.html (1 June 2006, date last accessed).
95. Sauro HM, Hucka M, Finney A, et al. Next generation simulation tools: the Systems Biology Workbench and BioSPICE integration. Omics 2003; 7:355-72.
96. SimBiology http://www.mathworks.com/products/simbiology/ (1 June 2006, date last accessed).
97. PyBioS http://pybios.molgen.mpg.de/ (1 June 2006, date last accessed).
98. Moraru II, Schaff JC, Slepchenko BM, et al. The virtual cell: an integrated modeling environment for experimental and computational cell biology. Ann NY Acad Sci 2002; 971:595-6.
99. Ramsey S, Orrell D, Bolouri H. Dizzy: stochastic simulation of large-scale genetic regulatory networks. J Bioinform Comput Biol 2005; 3:415-36.
100. Kierzek AM. STOCKS: STOChastic Kinetic Simulations of biochemical systems with Gillespie algorithm. Bioinformatics 2002; 18:470-81.
101. WinBUGS http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml (1 June 2006, date last accessed).
102. Stiles JR, Bartol TM. Monte Carlo methods for simulating realistic synaptic microphysiology using MCell. In De Schutter E (Ed.). Computational Neuroscience: Realistic Modeling for Experimentalists. Boca Raton, FL: CRC Press 2001, pp. 87-127.
103. Takahashi K, Ishikawa N, Sadamoto Y, et al. E-Cell 2: multi-platform E-Cell simulation system. Bioinformatics 2003; 19:1727-9.
104. Copasi http://www.copasi.org (1 June 2006, date last accessed).
105. Stein LD. Integrating biological databases. Nat Rev Genet 2003; 4:337-45.
106. Open Biomedical Ontologies (OBO) http://obo.sourceforge.net/ (1 June 2006, date last accessed).
107. Gene Ontology (GO) http://www.geneontology.org/ (1 June 2006, date last accessed).
108. Cell Ontology (CL) http://obo.sourceforge.net/cgi-bin/detail.cgi?cell (1 June 2006, date last accessed).
109. Sequence Ontology (SO) http://song.sourceforge.net/ (1 June 2006, date last accessed).
110. Chemical Ontology (ChEBI) http://www.ebi.ac.uk/chebi/ (1 June 2006, date last accessed).
111. Phenotype Ontology (PATO) http://obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value (1 June 2006, date last accessed).
112. Functional Genomics Investigation Ontology (FuGO) http://fugo.sourceforge.net/ (1 June 2006, date last accessed).
113. Foundational Model of Anatomy (FMA) http://sig.biostr.washington.edu/projects/fm/AboutFM.html (1 June 2006, date last accessed).
114. Relation Ontology (OBO_REL) http://obo.sourceforge.net/relationship/ (1 June 2006, date last accessed).
115. Stanislaus R, Chen C, Franklin J, et al. AGML Central: web based gel proteomic infrastructure. Bioinformatics 2005; 21:1754-7.
116. Orchard S, Hermjakob H, Julian RK Jr, et al. Common interchange standards for proteomics data: Public availability of tools and schema. Proteomics 2004; 4:490-1.
117. Hermjakob H, Montecchi-Palazzi L, Bader G, et al. The HUPO PSI's molecular interaction format - a community standard for the representation of protein interaction data. Nat Biotechnol 2004; 22:177-83.
118. Le Novere N, Finney A, Hucka M, et al. Minimum information requested in the annotation of biochemical models (MIRIAM). Nat Biotechnol 2005; 23:1509-15.
119. Biological Pathway Exchange (BioPAX) http://www.biopax.org (1 June 2006, date last accessed).
120. Luciano JS. PAX of mind for pathway researchers. Drug Discov Today 2005; 10:937-42.
121. Systems Biology Graphical Notation (SBGN) http://sbgn.org (1 June 2006, date last accessed).
122. Kohn KW, Aladjem MI, Weinstein JN, et al. Molecular interaction maps of bioregulatory networks: a general rubric for systems biology. Mol Biol Cell 2006; 17:1-13.
123. Finney A, Hucka M. Systems biology markup language: Level 2 and beyond. Biochem Soc Trans 2003; 31:1472-3.
124. Lloyd CM, Halstead MD, Nielsen PF. CellML: its future, present and past. Prog Biophys Mol Biol 2004; 85:433-50.
125. Schilstra MJ, Li L, Matthews J, et al. CellML2SBML: conversion of CellML into SBML. Bioinformatics 2006; 22:1018-20.
126. Gilchrist M, Thorsson V, Li B, et al. Systems biology approaches identify ATF3 as a negative regulator of Toll-like receptor 4. Nature 2006; 441:173-8.