Information extraction in molecular biology
Post-doctoral fellow in the Protein Design Group at the Spanish National Centre for Biotechnology. Dr. Blaschke's main scientific interest is the extraction of biologically relevant information from the scientific literature with statistical and linguistic methods and the application of machine learning techniques to reduce the cost of adapting these systems to new domains.
Chief Scientist for the Information Technology Division at the MITRE Corp. in Bedford, Massachusetts, USA. Her recent research has focused on the intersection of natural language processing and biomedical informatics.
Senior scientist of the Spanish research council (CSIC), Coordinator of the Spanish Network of Bioinformatics, an editorial board member of Bioinformatics, and Vice-president of the International Society for Computational Biology (ISCB). In 1994 Dr. Valencia created a multidisciplinary group of 20 researchers at the Spanish National Centre for Biotechnology. His main scientific interest is the use of genomic and proteomic information for the study of molecular evolution and for the development of new biotechnological resources.
Alfonso Valencia, Protein Design Group, National Center for Biotechnology, CNB-CSIC, Cantoblanco, Madrid E-28049, Spain. Tel: +34 91 585 45 70; Fax: +34 91 585 45 06; E-mail: valencia{at}cnb.uam.es
Information extraction has recently become a very active field in bioinformatics, and a number of interesting papers have been published. Most efforts have concentrated on a few specific problems, such as the detection of protein-protein interactions and the analysis of DNA expression arrays, although there are clearly many other areas of potential application (document retrieval, protein functional description, and detection of disease-related genes, to name a few). Paradoxically, these exciting developments have not yet crystallised into general agreement on a set of standard evaluation criteria, like those developed in fields such as protein structure prediction, which makes it very difficult to compare performance across different systems. In this review we introduce the general field of information extraction, outline the status of its applications in molecular biology, and then discuss some ideas about possible evaluation standards that are needed for the future development of the field.
Keywords: information extraction, molecular biology, ontologies, protein-protein interactions, document retrieval