Briefings in Bioinformatics Advance Access originally published online on July 10, 2008
Briefings in Bioinformatics 2008 9(5):355-366; doi:10.1093/bib/bbn028
Detecting short tandem repeats from genome data: opening the software black box
Corresponding author. Angelika Merkel, School of Biological Sciences, University of Canterbury, Private Bag 4800, Christchurch 8041, New Zealand. Tel: +64 (0)3 364 2987 ext 7048; Fax: +64 (0) 3 364 2509; E-mail: ame52{at}student.canterbury.ac.nz
Short tandem repeats, specifically microsatellites, are widely used as genetic markers, are associated with human genetic diseases, and play an important role in various regulatory mechanisms and in evolution. Despite their importance, much remains unknown about their mutational dynamics. The increasing availability of genome data has led to several in silico studies of microsatellite evolution, which have produced a vast range of algorithms and software for tandem repeat detection. Documentation of these tools is often sparse, or provided in a format that is impenetrable to most biologists without an informatics background. This article introduces the major concepts behind repeat-detecting software that are essential for informed tool selection. We reflect on issues such as parameter settings and program bias, as well as redundancy filtering and efficiency, using examples from the currently available range of programs, to provide an integrated comparison of, and practical guide to, microsatellite-detecting programs.
Keywords: microsatellite, tandem repeat, genome, algorithm, software, method, comparison
Submitted: March 21, 2008. Received (in revised form): June 6, 2008.