Computational analysis of mutation spectra
Research fellows at the National Center for Biotechnology Information NLM/NIH (Bethesda, MD, USA) and senior research scientists at the Institute of Cytology and Genetics RAS (Novosibirsk, Russia).
Group leader at the Istituto di Tecnologie Biomediche Avanzate CNR (Milano, Italy).
Research fellow at the National Institute of Environmental Health Sciences (Research Triangle Park, NC, USA).
I. B. Rogozin, NCBI/NLM/NIH, 8600 Rockville Pike, Building 38A, Bethesda, MD 20894, USA. Tel: +1 301 594 4271; Fax: +1 301 480 9241; E-mail: rogozin{at}ncbi.nlm.nih.gov
Mutation frequencies vary along a nucleotide sequence, and nucleotide positions with an exceptionally high mutation frequency are called hotspots. Mutation hotspots in DNA often reflect intrinsic properties of the mutation process, such as the specificity with which mutagens interact with nucleic acids and the sequence specificity of DNA repair/replication enzymes. They might also reflect structural and functional features of the target protein or RNA sequences in which they occur. The determinants of mutation frequency and specificity are complex, and many analytical methods exist for their study. This paper discusses computational approaches to analysing mutation spectra (distributions of mutations along the target genes) that include many detectable (mutable) positions. The following methods are reviewed: mutation hotspot prediction; pairwise and multiple comparisons of mutation spectra; derivation of a consensus sequence; and analysis of correlation between nucleotide sequence features and mutation spectra. Spectra of spontaneous and induced mutations are used to illustrate the complexities and pitfalls of such analyses. In general, the DNA sequence context of mutation hotspots is a fingerprint of interactions between DNA and DNA repair/replication/modification enzymes, and the analysis of hotspot context provides evidence of such interactions.
Keywords: hotspot, mutation spectra, classification, DNA sequence context, mutable motif, somatic hypermutation, correlation
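As a rough illustration of what "hotspot prediction" can mean in practice, the sketch below flags positions whose mutation count is unexpectedly high under a simple null model in which mutations fall on all detectable sites at the same Poisson rate. This null model, the function names and the `alpha` cutoff are assumptions made for the example; it is not the method reviewed in this paper, and the example counts are invented.

```python
import math


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def hotspot_threshold(counts, alpha=0.05):
    """Smallest count k such that fewer than `alpha` sites are expected
    to reach k mutations by chance under the uniform Poisson null."""
    if not counts:
        raise ValueError("counts must contain at least one site")
    n_sites = len(counts)
    lam = sum(counts) / n_sites  # mean mutations per detectable site
    k = 1
    while n_sites * poisson_tail(k, lam) >= alpha:
        k += 1
    return k


def find_hotspots(counts, alpha=0.05):
    """Return 0-based positions whose count reaches the threshold."""
    k = hotspot_threshold(counts, alpha)
    return [pos for pos, c in enumerate(counts) if c >= k]


# Hypothetical spectrum: mutation counts at 20 detectable positions.
spectrum = [1, 0, 2, 1, 0, 1, 12, 0, 1, 2, 0, 1, 1, 0, 9, 1, 0, 2, 1, 0]
hotspots = find_hotspots(spectrum)
print(hotspots)
```

Because the hotspots themselves inflate the estimated mean rate, a threshold derived this way tends to be conservative; a real spectrum is also rarely homogeneous outside its hotspots, so the single-rate null should be read only as a starting point.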