Briefings in Bioinformatics Advance Access originally published online on March 11, 2009
Briefings in Bioinformatics 2009 10(3):289-294; doi:10.1093/bib/bbn054
Potential Bias in GO::TermFinder
Corresponding author. Peter D. Wentzell, Department of Chemistry, Dalhousie University, Halifax, Nova Scotia B3H 4J3, Canada. Tel: 902-494-3708; Fax: 902-494-1310; E-mail: peter.wentzell{at}dal.ca
The increased need for multiple statistical comparisons under conditions of non-independence in bioinformatics applications, such as DNA microarray data analysis, has led to the development of alternatives to the conventional Bonferroni correction for adjusting P-values. The use of the false discovery rate (FDR), in particular, has grown considerably. However, the calculation of the FDR frequently depends on drawing random samples from a population, and inappropriate sampling will bias the calculated FDR. In this work, we demonstrate a bias due to incorrect random sampling in the widely used GO::TermFinder package. Both T2 and permutation tests are used to confirm the bias for a test set of data; the bias leads to an overestimation of the FDR of about 10%. A simple fix to the random sampling method is proposed to remove the bias.
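To make the dependence of the FDR on random sampling concrete, the following is a minimal Python sketch of a simulation-based FDR estimate for term enrichment. It is not the GO::TermFinder implementation, and the abstract does not state what the sampling flaw was. The function names (`hypergeom_pvalue`, `term_pvalues`, `estimate_fdr`), the hypergeometric enrichment test, the default of 50 simulations and the FDR definition used here (mean number of terms passing the cutoff in random gene lists, divided by the number passing in the real list) are all assumptions made for illustration. The point it shows is that each random gene list is drawn uniformly and without replacement from the background population, so any departure from that sampling scheme feeds directly into the estimate.

```python
import random
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which carry the annotation (one-sided hypergeometric tail)."""
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def term_pvalues(gene_list, annotations, population_size):
    """Enrichment P-value of every term for one list of selected genes.
    `annotations` maps a term name to the set of genes annotated to it."""
    selected = set(gene_list)
    return {
        term: hypergeom_pvalue(
            len(selected & members), len(selected), len(members), population_size
        )
        for term, members in annotations.items()
    }


def estimate_fdr(observed, population, annotations, cutoff, n_sim=50, seed=0):
    """Assumed FDR definition: mean number of terms with P <= cutoff in
    random gene lists, divided by the number with P <= cutoff in the real list."""
    rng = random.Random(seed)
    real = term_pvalues(observed, annotations, len(population))
    n_real = sum(p <= cutoff for p in real.values())
    if n_real == 0:
        return 0.0
    false_hits = 0
    for _ in range(n_sim):
        # random.Random.sample draws without replacement, so every gene in
        # the population is equally likely and none appears twice in a list.
        random_list = rng.sample(population, len(observed))
        simulated = term_pvalues(random_list, annotations, len(population))
        false_hits += sum(p <= cutoff for p in simulated.values())
    return min(1.0, (false_hits / n_sim) / n_real)
```

A call would look like `estimate_fdr(observed_genes, background_genes, annotations, cutoff=0.01)`, where `background_genes` is a list of all genes in the population and `observed_genes` is the list under study. Replacing `rng.sample` with a sampler that draws with replacement, or from the wrong population, would change the simulated P-value distribution and therefore the estimated FDR.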
Keywords: false discovery rate, bias, gene ontology, GO::TermFinder, enrichment
Submitted: October 8, 2008. Received (in revised form): November 7, 2008.