Special Issue Papers
A survey of current work in biomedical text mining
Aaron Michael Cohen is a postdoctoral fellow in the medical informatics programme at OHSU. Dr Cohen works in the area of text mining, focusing on issues and applications important to biomedical researchers. He was chairman of the W3C working group that produced version 2 of the Synchronized Multimedia Integration Language (SMIL 2.0).
Dr Hersh is Professor and Chair of the Department of Medical Informatics & Clinical Epidemiology in the School of Medicine at Oregon Health & Science University (OHSU) in Portland, Oregon. His research focuses on the development and evaluation of information retrieval systems for biomedical practitioners and researchers.
Aaron Michael Cohen, Postdoctoral Fellow, Department of Medical Informatics and Clinical Epidemiology, School of Medicine, Oregon Health & Science University, 3181 S.W. Sam Jackson Park Road, Portland, OR 97239309, USA Tel: +1 503 494 0046 Fax: +1 503 494 4551 E-mail: cohenaa{at}ohsu.edu
The volume of published biomedical research, and therefore the underlying biomedical knowledge base, is expanding at an increasing rate. Among the tools that can aid researchers in coping with this information overload are text mining and knowledge extraction. Significant progress has been made in applying text mining to named entity recognition, text classification, terminology extraction, relationship extraction and hypothesis generation. Several research groups are constructing integrated flexible text-mining systems intended for multiple uses. The major challenge of biomedical text mining over the next 5-10 years is to make these systems useful to biomedical researchers. This will require enhanced access to full text, better understanding of the feature space of biomedical literature, better methods for measuring the usefulness of systems to users, and continued cooperation with the biomedical research community to ensure that their needs are addressed.
Keywords: text-mining, bioinformatics, natural language processing





