Briefings in Bioinformatics Advance Access published online on December 6, 2008

Briefings in Bioinformatics, doi:10.1093/bib/bbn043

© The Author 2008. Published by Oxford University Press. For Permissions, please email: journals.permissions@oxfordjournals.org

Facts from text: can text mining help to scale-up high-quality manual curation of gene products with ontologies?

Rainer Winnenburg, Thomas Wächter, Conrad Plake, Andreas Doms and Michael Schroeder

Corresponding author. Michael Schroeder, Biotechnology Center, Technische Universität Dresden, Tatzberg 47-49, 01307 Dresden, Germany. Tel: +49 351 463 40062; Fax: +49 351 463 40061; E-mail: ms{at}biotec.tu-dresden.de

The biomedical literature can be seen as a large, integrated but unstructured data repository. Extracting facts from the literature and making them accessible is approached from two directions. Manual curation efforts develop ontologies and vocabularies to annotate gene products based on statements in papers. Text mining aims to automatically identify entities and their relationships in text using information retrieval and natural language processing techniques. Manual curation is highly accurate but time-consuming, and does not scale with the ever-increasing growth of the literature. Text mining, as a high-throughput computational technique, scales well but is error-prone due to the complexity of natural language. How can the two be combined to achieve both scalability and accuracy? Here, we review the state-of-the-art text mining approaches that are relevant to annotation and discuss available online services that analyse biomedical literature by means of text mining techniques and could also be utilised by annotation projects. We then examine how far text mining has already been utilised in existing annotation projects and conclude how these techniques could be tightly integrated into the manual annotation process through novel authoring systems to scale up high-quality manual curation.

Keywords: text mining, data curation, ontology generation, entity recognition, GO annotation, authoring systems

Submitted: May 23, 2008. Received (in revised form): September 10, 2008.




This article has been cited by other articles:


E. Antezana, M. Kuiper, and V. Mironov
Biological knowledge management: the emerging role of the Semantic Web technologies
Brief Bioinform, July 1, 2009; 10(4): 392 - 407.


