Briefings in Bioinformatics Advance Access published online on October 4, 2008
Briefings in Bioinformatics, doi:10.1093/bib/bbn042
Gene-set analysis and reduction
Corresponding author. Irina Dinu, PhD, School of Public Health, University of Alberta, 13-106J Clinical Sciences Building, Edmonton, Alberta T6G 2G3, Canada. Tel: +1-780-492-8336; Fax: +1-780-492-0364; E-mail: idinu{at}ualberta.ca
Gene-set analysis aims to identify gene sets (pathways) that are differentially expressed between phenotypes in DNA microarray studies. We review important methodological aspects of gene-set analysis and illustrate them with the varying performance of several methods proposed in the literature. We emphasize the importance of distinguishing between self-contained and competitive methods, following Goeman and Bühlmann. We also discuss reducing a gene set to a subset of core members that chiefly contribute to the statistical significance of the differential expression of the initial gene set by phenotype. Significance analysis of microarray for gene-set reduction (SAM-GSR) can be used for an analytical reduction of gene sets to their core subsets. We apply SAM-GSR to a microarray dataset to identify biological gene sets (pathways) whose gene expression is associated with p53 mutation in cancer cell lines. Code to implement SAM-GSR in the statistical package R can be downloaded from http://www.ualberta.ca/~yyasui/homepage.html.
Keywords: DNA microarray, gene sets, gene set n, multivariate means, pathways, significance analysis of microarray, two-sample test
Submitted: May 29, 2008. Received (in revised form): July 29, 2008.