Bayesian Analysis of Case-control Genetic Association Studies in the Presence of Population Stratification or Genetic Model Uncertainty
Genetic association studies, which examine the association between genetic markers and diseases, have become an important area of human health research. The population-based case-control study is one of the most common designs in genetic association studies. Traditionally, the strength of a genetic association is measured by the p-value of a statistical test. Bayesian analysis, which is increasingly used in genetic studies, provides a different perspective on how to interpret the evidence of association in the data. An important measure of association in Bayesian analysis is the Bayes factor, which unifies the p-value and power. Bayesian analysis of case-control genetic association studies is susceptible to two pitfalls. First, the presence of hidden subpopulations, also known as population stratification, can distort the distribution of the Bayes factor and hence lead to invalid conclusions. Second, uncertainty about the underlying genetic model (e.g. recessive, additive, dominant) makes a robust Bayesian measure of association difficult to obtain. The objective of this dissertation is to develop appropriate approaches to resolve these issues. We study how hidden population stratification affects the asymptotic distribution of the odds ratio estimate and its confidence interval, and we develop a correction for its impact using a principal components analysis method. We then use the corrected estimates and their asymptotic variances to correct for hidden population stratification in Bayesian analysis. This method is evaluated by simulation studies and by an application to HapMap data. We also develop a Bayes factor that measures the evidence of association contained in a maximum statistic (MAX), which is robust under genetic model uncertainty. The robustness of the proposed Bayes factor is evaluated by simulation studies and by application to real data from genetic studies.
We also discuss the relationship between the proposed Bayes factor and the corresponding p-value, as well as some implications.
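As a rough illustration of how an odds ratio estimate and its asymptotic variance can feed a Bayes factor, the sketch below uses a standard normal-approximation Bayes factor. This is an assumption made for illustration only and is not necessarily the construction used in the dissertation. The function names, the 2x2 table layout, and the prior variance are hypothetical, and no stratification correction or MAX statistic is included.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its asymptotic (Woolf) variance from a 2x2 table.

    Hypothetical layout: a, b = carrier / non-carrier cases,
    c, d = carrier / non-carrier controls. All counts must be positive.
    """
    theta_hat = math.log((a * d) / (b * c))
    var = 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d
    return theta_hat, var


def approx_bayes_factor(theta_hat, var, prior_var):
    """Approximate Bayes factor for association (H1) versus no association (H0).

    Assumes theta_hat ~ N(theta, var), with theta = 0 under H0 and
    theta ~ N(0, prior_var) under H1 (prior_var is an assumed value).
    The result is the ratio of the N(0, var + prior_var) density to the
    N(0, var) density, both evaluated at theta_hat.
    """
    z2 = theta_hat ** 2 / var                 # squared Wald statistic
    shrink = prior_var / (var + prior_var)    # W / (V + W)
    return math.sqrt(1.0 - shrink) * math.exp(0.5 * z2 * shrink)


# Made-up counts, for illustration only.
theta_hat, var = log_odds_ratio(60, 40, 45, 55)
bf = approx_bayes_factor(theta_hat, var, prior_var=0.04)
```

Under this approximation the Bayes factor depends on the data through the squared Wald statistic and the variance of the estimate, so two studies with the same p-value but different variances can yield different Bayes factors.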