Our Research

We study functional genetic variation in human populations and the mechanisms by which it affects human traits and disease. Our work combines computational genomics, human population genetics approaches, and high-throughput experimental work. We focus on genetic effects on the transcriptome and other molecular traits, which have further implications at the cellular and individual level. While some of our projects are closely related to specific diseases, our overall goal is to uncover general rules governing the genomic sources of variation in human traits.


Genetic associations for molecular traits

Our lab has a strong track record in integrating large-scale genome and transcriptome sequencing data sets to characterize the genetic architecture of variants that affect the transcriptome. We study both rare and common variants in noncoding and coding regions of the genome, using molecular QTL and other approaches. This work has applications in interpreting disease-associated loci, and in improving our understanding of the regulatory code and the interpretation of the personal genome. While genome and transcriptome data from RNA sequencing are the main data types that we analyze, in several projects we apply similar approaches to epigenomic and other cellular data sets as well.

From Aguet et al. 2022

CRISPR-based characterization of GWAS loci

In addition to studying natural genetic “perturbations” in human populations, we use experimental perturbations of genome function to gain a highly complementary perspective on the molecular and cellular function of the genome and of GWAS loci. We have taken part in developing a scalable and sensitive toolkit for CRISPR inhibition at enhancers that harbor trait/disease-associated loci. This work has elucidated the functional architecture of GWAS loci and helped define an optimal, comprehensive toolkit for studying disease mechanisms. Much of our work in this space is in collaboration with Dr. Neville Sanjana’s lab.

From Morris et al. 2022

Gene dosage as a driver of downstream function

Several lines of evidence, including our work, point to a powerful paradigm in which functional gene dosage (the amount of functional protein produced) is a major point of convergence for how genetic and epigenetic effects drive downstream function in cells. With this framework, rare and common genetic variants in the coding and noncoding regions of the genome can all be studied together. However, we have limited understanding of what the gradual dosage-to-function relationship of each gene is, what the underlying mechanisms are across phenotypes and cellular contexts, and how this relationship relates to the functional architecture of human traits. This is the premise of my recently awarded ERC Consolidator Grant, in which we will use the CRISPRi/a system to titrate gene dosage and quantify its relationship to molecular and cellular dosage and variation in blood cell traits.

Multi-omics for precision medicine

Precision medicine and biobank projects across the world are collecting rich data sets of human phenotypes, environmental exposures, and genetics. The layer that is nearly always missing is molecular phenotyping, due to a key practical bottleneck: the lack of informative biospecimens beyond blood. We have developed wet-lab solutions for noninvasive sampling of biospecimens for RNA sequencing, and are pursuing these applications in clinical cohorts. Furthermore, we have developed computational methods for integrating multimodal data sets from complex and diverse disease cohorts.

Human genomics consortia

Our lab has a strong track record in integrating large-scale genome and transcriptome sequencing data sets to characterize the genetic architecture of variants that affect the transcriptome. These include both rare and common variants in noncoding and coding regions of the genome. This work has applications in interpreting disease-associated loci, and in improving our understanding of the regulatory code and the interpretation of the personal genome. We are part of many important consortium projects in this domain, including the Genotype-Tissue Expression (GTEx) project, developmental GTEx (dGTEx), and the TOPMed and MoTrPAC studies. While genome and transcriptome data from RNA sequencing are the main data types that we analyze, in several projects we apply similar approaches to epigenomic and other cellular data sets as well. Better understanding of regulatory mechanisms and multi-omics data integration is a major goal of the lab.


Our past and present funding