Briefings in Bioinformatics Advance Access originally published online on July 18, 2007
Briefings in Bioinformatics 2007 8(6):393-395; doi:10.1093/bib/bbm035

© The Author 2007. Published by Oxford University Press. For Permissions, please email: journals.permissions@oxfordjournals.org

Statistical software for gene mapping by admixture linkage disequilibrium

Giovanni Montana and Clive Hoggart

Corresponding author. Giovanni Montana, Imperial College London, Department of Mathematics, South Kensington Campus, 180 Queen's Gate, London SW7 2AZ. Tel: +44 (0) 207-594-8577; E-mail: g.montana{at}imperial.ac.uk


    ABSTRACT
Admixture mapping is a statistical methodology that detects genetic variants in recently admixed populations that are responsible for ethnic differences in disease risk. Three software packages are now available for admixture mapping and we provide a brief overview of the statistical methods and other principal features they implement.

Keywords: gene mapping, admixture linkage disequilibrium, hidden Markov models


    INTRODUCTION
Empirical evidence shows that, in human populations formed by relatively recent mixing of distinct ancestral groups, linkage disequilibrium (LD) is observed over greater distances than in other, less heterogeneous populations. For example, in African-Americans, weak LD may persist over distances as large as 20 cM [1]. For diseases which vary in prevalence between two or more ancestral populations, the long range LD observed in admixed populations can be exploited to search for genetic variants responsible for the ethnic difference in disease risk. The resulting methodology is referred to as admixture mapping or mapping by admixture linkage disequilibrium (MALD).

The principle underlying a MALD study is simple: among affected individuals, loci in LD with a locus responsible for an ethnic difference in disease risk will show a greater than expected proportion of ancestry from the high-risk population. In these studies, a sample of diseased individuals is collected from the admixed population under study. The probability that each marker derives from each ancestral population, and each individual's proportion of ancestry from each ancestral population, are then estimated statistically. An unexpectedly large excess of ancestry from one population in a localized genomic region suggests that the region may harbour a gene affecting disease risk.

Compared with more traditional association studies, MALD requires far fewer markers and has the added advantage of protecting against allelic heterogeneity. Extensive reviews illustrating the advantages and disadvantages of this promising gene mapping strategy have been written [2–5]. Further guidelines are also available [6–8]. In this review we describe the three main publicly available software packages specifically designed for admixture mapping: the suite of programs STRUCTURE2/MALDsoft [8,9], ADMIXMAP [6] and ANCESTRYMAP [7].


    DOWNLOAD AND DATA INPUT
STRUCTURE2 and MALDsoft are executable programs available for Windows, Linux and Solaris and can be downloaded with documentation at http://pritch.bsd.uchicago.edu/software.html. They are meant to be used jointly and share the same data format. Both programs perform careful error checking to make sure that the data set is in the correct format.

Documentation for ADMIXMAP is available at http://www.ucd.ie/genepi/software.html; source code and executable code for Linux and Windows are available at http://sourceforge.net/projects/admixmap. Documentation, source code and executable code (Linux and Unix) for ANCESTRYMAP are available at http://genepath.med.harvard.edu/~reich/.

When importing the data, ANCESTRYMAP checks for duplicate individuals and misspecified marker positions. Both ANCESTRYMAP and ADMIXMAP can test for Hardy–Weinberg equilibrium at each marker and for discrepancies between the ancestry-specific allele frequencies in the admixed and unadmixed populations.
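
As an illustration of the kind of per-marker check mentioned above, the following is a minimal sketch of the textbook one-degree-of-freedom chi-square test for Hardy–Weinberg equilibrium at a biallelic marker. It is not the procedure implemented by ANCESTRYMAP or ADMIXMAP, whose tests may differ; the function name `hwe_chi_square` and its arguments are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (chi-square statistic, p-value) for a 1-df Hardy-Weinberg test.

    Hypothetical helper: the arguments are observed genotype counts at one
    biallelic marker (AA, AB, BB). This is the textbook test, not the test
    used by any of the reviewed programs.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # sample frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Monomorphic markers give zero expected counts; those cells are skipped.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A marker with genotype counts in exact Hardy–Weinberg proportions, such as `hwe_chi_square(25, 50, 25)`, gives a statistic of zero, whereas a marker with no heterozygotes at intermediate allele frequency gives a very large statistic.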

The programs use different input formats. However, code to translate between the ADMIXMAP and ANCESTRYMAP formats, and from the STRUCTURE2 format to the ADMIXMAP format, is available from http://www.ucd.ie/genepi/admixmap/tools.html. All programs are executed from the command line, but a graphical front-end is available for STRUCTURE2.


    STATISTICAL MODELS
The three programs implement similar statistical methodologies. In particular, the same hierarchical model for population admixture, individual admixture and locus ancestry is implemented. They also share the same probability model for the stochastic variation of ancestry along the chromosomes; it assumes that chromosomes can be broken into blocks of common ancestry, breakpoints between adjacent blocks occur as a Poisson process and transitions between adjacent ancestral blocks are governed by a Markov model.
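
The Markov model for ancestry along a chromosome can be sketched as follows. This is a minimal illustration assuming the usual form in which breakpoints arrive at a constant rate per Morgan and, after a breakpoint, the new ancestry is drawn according to the admixture proportions. The names `ancestry_transition_matrix`, `admixture`, `rho` and `distance_morgans` are hypothetical and do not correspond to any option of the reviewed programs.

```python
import math

def ancestry_transition_matrix(admixture, rho, distance_morgans):
    """Transition probabilities for locus ancestry between two markers.

    admixture        -- ancestry proportions of the gamete (sum to 1)
    rho              -- assumed breakpoint rate per Morgan (Poisson process)
    distance_morgans -- genetic distance between the two markers

    Entry [i][j] is the probability of ancestry j at the second marker
    given ancestry i at the first.
    """
    # Probability that no breakpoint falls between the two markers.
    p_no_break = math.exp(-rho * distance_morgans)
    k = len(admixture)
    return [
        [
            (p_no_break if i == j else 0.0)
            + (1.0 - p_no_break) * admixture[j]
            for j in range(k)
        ]
        for i in range(k)
    ]
```

Under these assumptions, markers at zero distance always share ancestry, and markers far apart have ancestry drawn independently from the admixture proportions. The state at each marker is hidden, which is why the keywords of this review mention hidden Markov models.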

Using MCMC simulations, the three programs can produce posterior summaries of allele frequencies, individual ancestry proportions, and a measure of the time since admixture. ADMIXMAP can model any number of ancestral populations, whereas MALDsoft and ANCESTRYMAP are limited to modelling admixture between two ancestral populations only. ANCESTRYMAP allows only for biallelic markers, whereas the other programs can handle markers with any number of alleles. ADMIXMAP allows for allelic association between markers by modelling the unobserved haplotypes and thus tightly linked loci can be included in the analysis.

All programs allow prior distributions for allele frequencies to be specified. In ANCESTRYMAP and ADMIXMAP these priors are specified in a separate file containing allele counts from unadmixed individuals, whereas in STRUCTURE2/MALDsoft admixed and unadmixed individuals are included in the same file and their status is specified by an indicator variable. The default setting of STRUCTURE2/MALDsoft assumes that markers have not been selected to be informative for ancestry, and therefore specifies a prior that imposes correlation between the allele frequencies in each subpopulation at each marker. ADMIXMAP and ANCESTRYMAP do not in general assume any correlation between the allele frequencies in each subpopulation. ANCESTRYMAP can model X chromosome marker data from males and females, whereas the other programs are restricted to autosomal data.


    TESTING FOR LINKAGE
All three programs can test for linkage in both affected-only and case-control study designs. A case-control design is generally less powerful than an affected-only design [2, 6, 8], but the affected-only design can be sensitive to misspecified allele frequencies. ADMIXMAP and ANCESTRYMAP implement a one-stage test for linkage in which the test is calculated within the MCMC sampler for admixture, conditional on the current realized values of the model parameters, whereas MALDsoft calculates tests for linkage using the final output of STRUCTURE2; the process of importing STRUCTURE2 output into MALDsoft is automated. All three programs produce an output file containing a test statistic for each locus.

The affected-only test statistic used in MALDsoft is defined as the observed minus expected locus ancestry summed across all individuals, where the expected locus ancestry is given by the estimated individual admixture. If controls are available, the user has the option of running a case-control test, which tests whether the observed minus expected locus ancestry differs between cases and controls at each locus. The standard deviation required for the computation of the test statistics is estimated via parametric bootstrap, based on a modified Baum–Welch algorithm. Under the null hypothesis of no association, each test statistic has a normal distribution. In order to control the experiment-wise significance level, MALDsoft produces the empirical distribution of the most extreme tests.
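
The observed-minus-expected statistic described above can be sketched in a few lines. This is only an illustration of the arithmetic at a single locus: the function names are hypothetical, the inputs are assumed to be already estimated, and the null standard deviation, which MALDsoft obtains by parametric bootstrap, is simply passed in as a number here.

```python
import math

def affected_only_statistic(locus_ancestry, individual_admixture):
    """Sum over affected individuals of observed minus expected ancestry.

    locus_ancestry       -- estimated proportion of high-risk ancestry at the
                            locus, one value per individual (assumed given)
    individual_admixture -- estimated genome-wide proportion of high-risk
                            ancestry for the same individuals
    """
    if len(locus_ancestry) != len(individual_admixture):
        raise ValueError("inputs must have one entry per individual")
    return sum(obs - expd
               for obs, expd in zip(locus_ancestry, individual_admixture))

def normal_test(statistic, null_sd):
    """Standardize the statistic and return (z, two-sided normal p-value).

    null_sd is an assumed input; the bootstrap used to estimate it is not
    reproduced in this sketch.
    """
    z = statistic / null_sd
    return z, math.erfc(abs(z) / math.sqrt(2.0))
```

For three cases with locus ancestry 0.9, 0.7 and 0.5 and individual admixture 0.2 each, the statistic is 1.5; with an assumed null standard deviation of 0.5 this corresponds to a z-score of 3.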

ADMIXMAP tests for linkage using score tests. The affected-only test is similar to that used in MALDsoft: the test statistic is the same, but because a score test (which assumes normality) is applied, the variance of the test statistic is given by minus the second derivative of the log-likelihood. The test for linkage in a case-control study design is implemented by testing for no effect of locus ancestry in a logistic regression model conditional on individual admixture. This setup allows the test to be conditioned on other covariates. A test for linkage to a quantitative trait can be calculated as a score test in a linear regression model. ADMIXMAP can also test for residual LD not accounted for by the admixture model. No correction for multiple testing is provided.
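
To make the idea of a score test in a logistic regression concrete, here is a deliberately simplified sketch. It tests for no effect of a single covariate (locus ancestry) in a model whose null contains only an intercept. Unlike the ADMIXMAP test, it does not condition on individual admixture, treats locus ancestry as known rather than averaging over MCMC draws, and uses a hypothetical function name.

```python
import math

def logistic_score_test(ancestry, is_case):
    """Score test z-statistic for one covariate in a logistic regression.

    ancestry -- locus ancestry value per individual (treated as known)
    is_case  -- 1 for cases, 0 for controls
    Simplifying assumption: the null model has an intercept only, so the
    fitted null probability is the sample proportion of cases.
    """
    n = len(ancestry)
    if n == 0 or n != len(is_case):
        raise ValueError("inputs must be non-empty and of equal length")
    p_hat = sum(is_case) / n
    x_bar = sum(ancestry) / n
    # Score: derivative of the log-likelihood at the null value of the effect.
    score = sum(x * (y - p_hat) for x, y in zip(ancestry, is_case))
    # Information for the effect, adjusted for the fitted intercept.
    information = p_hat * (1.0 - p_hat) * sum((x - x_bar) ** 2 for x in ancestry)
    if information == 0.0:
        raise ValueError("no variation in outcome or ancestry")
    return score / math.sqrt(information)
```

The statistic is positive when cases carry more of the ancestry being tested than controls, and it is compared with a standard normal distribution. Conditioning on individual admixture or other covariates would require fitting the null model with those covariates first, which this sketch omits.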

In an affected-only analysis ANCESTRYMAP tests for linkage at each locus using a likelihood ratio test. The likelihood ratio test is calculated for the null hypothesis of no marker ancestry effect versus alternative hypotheses specified by the user. The hypotheses are specified in terms of the risk ratio associated with the number of alleles from the high risk population. The user can specify one or more alternative hypotheses. The case-control test is fundamentally the same as that used in MALDsoft, but a t-statistic is calculated for a difference in means between cases and controls. A Bayesian genome-wide test for significance is provided.
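
A likelihood ratio of this kind can be sketched for a single locus as follows. The sketch assumes that, for each case, the probabilities of carrying 0, 1 or 2 alleles from the high-risk population have already been estimated under the null model, and that both gametes of an individual share the same admixture proportion. The function names and the `risk` tuple are hypothetical and are not ANCESTRYMAP options.

```python
import math

def ancestry_prior(admixture):
    """Prior probabilities of 0, 1 or 2 high-risk-ancestry alleles.

    Assumes both gametes have the same admixture proportion.
    """
    m = admixture
    return ((1.0 - m) ** 2, 2.0 * m * (1.0 - m), m ** 2)

def case_likelihood_ratio(posterior, prior, risk):
    """Likelihood ratio (alternative versus null) for one affected individual.

    posterior -- null-model probabilities of 0, 1, 2 high-risk alleles
                 given the marker data
    prior     -- probabilities implied by individual admixture alone
    risk      -- assumed risk ratios for 0, 1, 2 alleles, e.g. (1.0, 2.0, 4.0)
    """
    numerator = sum(p * r for p, r in zip(posterior, risk))
    denominator = sum(p * r for p, r in zip(prior, risk))
    return numerator / denominator

def locus_lod(posteriors, admixtures, risk):
    """Sum of log10 likelihood ratios over all affected individuals."""
    return sum(
        math.log10(case_likelihood_ratio(post, ancestry_prior(m), risk))
        for post, m in zip(posteriors, admixtures)
    )
```

When the marker data leave the ancestry probabilities at their prior values, or when all risk ratios equal one, each ratio is 1 and the locus contributes no evidence. Cases with more high-risk ancestry than their admixture predicts push the ratio above 1.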


Key Points

  • This article reviews the main features of all existing software packages for gene mapping by admixture linkage disequilibrium.
  • It provides an overview of the underlying statistical methods and testing procedures implemented by each package.
  • It provides a concise introduction to the principle of admixture mapping.

 


    FOOTNOTES
Giovanni Montana is a Lecturer in Statistics in the Mathematics Department, Imperial College London, UK. He is interested in data mining and statistical pattern recognition.

Clive Hoggart is a Research Fellow in Statistical Genetics in the Department of Epidemiology and Public Health, Imperial College London, UK. He is interested in statistical genetics and Bayesian methodology.

Received (in revised form): June 7, 2007.


    References

  1. Parra EJ, Marcini A, Akey J, et al. Estimating African American admixture proportions by use of population-specific alleles. Am J Hum Genet (1998) 63:1839–51.
  2. McKeigue PM. Prospects for admixture mapping of complex traits. Am J Hum Genet (2005) 76:1–7.
  3. Darvasi A, Shifman S. The beauty of admixture. Nat Genet (2005) 37:118–19.
  4. Reich D, Patterson N. Will admixture mapping work to find disease genes? Philos Trans R Soc Lond B Biol Sci (2005) 360:1605–7.
  5. Smith MW, O’Brien SJ. Mapping by admixture linkage disequilibrium: advances, limitations and guidelines. Nat Rev Genet (2005) 6:623–32.
  6. Hoggart CJ, Shriver MD, Kittles RA, et al. Design and analysis of admixture mapping studies. Am J Hum Genet (2004) 74:965–78.
  7. Patterson N, Hattangadi N, Lane B, et al. Methods for high-density admixture mapping of disease genes. Am J Hum Genet (2004) 74:979–1000.
  8. Montana G, Pritchard JK. Statistical tests for admixture mapping with case-control and cases-only data. Am J Hum Genet (2004) 75:771–89.
  9. Falush D, Stephens M, Pritchard JK. Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics (2003) 164:1567–87.
