Briefings in Bioinformatics Advance Access originally published online on February 29, 2008
Briefings in Bioinformatics 2008 9(2):102-118; doi:10.1093/bib/bbn005
Approaches to dimensionality reduction in proteomic biomarker studies
Corresponding author. Melanie Hilario, Computer Science Department, University of Geneva, Battelle Bât. A, 7 route de Drize, CH-1227 Carouge, Switzerland. Tel: +41-22-379 0222; Fax: +41-22-379 0250; E-mail: Melanie.Hilario{at}cui.unige.ch
Mass-spectra-based proteomic profiles have received widespread attention as potential tools for biomarker discovery and early disease diagnosis. A major data-analytical problem is the extremely high dimensionality (i.e. number of features or variables) of proteomic data, particularly when the sample size is small. This article reviews dimensionality reduction methods that have been used in proteomic biomarker studies. It then focuses on the problem of selecting the most appropriate method for a specific task or dataset, and proposes method combination as a potential alternative to single-method selection. Finally, it points out the potential of novel dimensionality reduction techniques, in particular those that incorporate domain knowledge through the use of informative priors or causal inference.
Keywords: proteomics, mass spectra, biomarkers, dimensionality reduction, feature transformation, feature selection
Submitted: August 25, 2007. Received (in revised form): January 18, 2008.