Abstract
In the wake of new sequencing and genotyping technologies, whole-genome studies are now being undertaken to understand the genetic basis of phenotypes. Many of the principles underlying the measurement of genotype-phenotype relationships, and the computation of related population genetic parameters, are relatively well understood. However, the upcoming technologies dramatically change the scale and scope of these studies, which already encompass tens of thousands of individuals genotyped across the whole genome. The analysis of these data requires novel algorithmic and statistical techniques. This project focuses on a subset of the problems that arise in a typical whole-genome association study. These include:
- Phasing of genotypes into haplotypes using overlapping sequence data, and the application of this algorithm to phasing individual human sequences. The availability of high-coverage long sequence data is expected to make this approach the method of choice for phasing in the near future.
- Fast filtering for pairs of loci that interact to influence a phenotype, and its application to multiple-locus testing of common disease phenotypes. The proposed work reduces the computational bottleneck in multiple-locus testing.
- Detection of regions under balancing selection. Existing tests focus on detecting regions under positive selection. The proposed research looks for evidence of balancing selection in the genome, with particular attention to genes associated with bipolar disorder.
- Reconstruction of regulatory pathways using associations between genetic variation and gene expression.
All software from this research is freely available as source code or as web tools for academic, research, and non-commercial purposes, in accordance with University policy.
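To make the multiple-locus testing bottleneck in the list above concrete, the following is a minimal sketch of a naive exhaustive pairwise screen. It is not the project's software or its filtering algorithm; it only illustrates the quadratic baseline that a fast filter is meant to avoid. The function names, the 0/1/2 genotype coding, the 0/1 phenotype coding, and the use of a Pearson chi-square statistic on the joint-genotype table are all assumptions made for illustration.

```python
from itertools import combinations


def pair_chi2(g1, g2, phenotype):
    """Pearson chi-square for joint genotype (up to 3x3 cells) vs. case/control.

    Assumes genotypes are coded 0/1/2 and phenotype is coded 0 (control) or 1 (case).
    """
    counts = [[0, 0] for _ in range(9)]  # one row per joint genotype cell
    for a, b, y in zip(g1, g2, phenotype):
        counts[3 * a + b][y] += 1
    n = len(phenotype)
    col = [sum(row[0] for row in counts), sum(row[1] for row in counts)]
    stat = 0.0
    for row in counts:
        r = row[0] + row[1]
        if r == 0:
            continue  # skip joint genotypes that never occur
        for j in (0, 1):
            expected = r * col[j] / n
            if expected > 0:
                stat += (row[j] - expected) ** 2 / expected
    return stat


def screen_pairs(genotypes, phenotype, threshold):
    """Test every pair of loci and keep those whose statistic reaches the threshold.

    The loop visits m*(m-1)/2 pairs for m loci, which is the cost a filter tries to cut.
    """
    hits = []
    for i, j in combinations(range(len(genotypes)), 2):
        s = pair_chi2(genotypes[i], genotypes[j], phenotype)
        if s >= threshold:
            hits.append((i, j, s))
    return sorted(hits, key=lambda h: -h[2])


# Toy data (hypothetical): the phenotype is the XOR of loci 0 and 1,
# so neither locus shows a marginal effect but the pair does.
toy_genotypes = [
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
]
toy_phenotype = [0, 1, 1, 0, 0, 1, 1, 0]
print(screen_pairs(toy_genotypes, toy_phenotype, threshold=1.0))
```

With m loci the screen evaluates m(m-1)/2 pairs, so at genome-wide scale the pair count alone motivates filtering candidate pairs before any formal test is applied.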