Baran Y, Subramaniam M, Biton A, Tukiainen T, Tsang EK, Rivas MA, Pirinen M, Gutierrez-Arcelus M, Smith KS, Kukurba KR, Zhang R, Eng C, Torgerson DG, Urbanek C; GTEx Consortium, Li JB, Rodriguez-Santana JR, Burchard EG, Seibold MA, MacArthur DG, Montgomery SB, Zaitlen NA*, Lappalainen T* (2015) The landscape of genomic imprinting across diverse adult human tissues. Genome Research DOI: 10.1101/gr.192278.115
* Equal contribution
Our genomes carry about one hundred genes that rebel against one of the basic rules of genetics: equal contribution of both parents. This small group of genes is expressed and functional only from the copy inherited from our mother or only from the copy inherited from our father, in contrast to the over 20,000 genes expressed from alleles inherited from both parents. This happens by epigenetic silencing of the other copy, where specific marks on the DNA carry a memory of whether a gene sitting on a copy of a chromosome came from the sperm or the egg. These marks can lead the cell to inactivate the gene on one of the parental copies. This phenomenon, called genomic imprinting, is the topic of our new paper in Genome Research.
Why do we care about a phenomenon that is clearly a rare exception from the normal pattern? First of all, imprinting is famous for its role in an intriguing set of human diseases, where the effect of a genetic mutation depends on whether the affected individual inherited it from the mother or from the father. The best known example is an imprinted locus on chromosome 15, where the exact same genetic mutations cause either Prader-Willi syndrome if the mutation comes from the father, or Angelman syndrome if it comes from the mother – with essentially opposite sets of symptoms. The same parental effect can apply to weak genetic risk for common diseases. These effects are part of an interesting general phenomenon where the impact of an individual’s genetic variants cannot be predicted from the genetic code alone, without knowing their genetic context or environment. In the case of imprinted genes, the context of crucial importance is the parental origin of those variants, and in recent years there has been increasing emphasis on our need to understand how imprinting modifies genetic associations to disease.
Additionally – putting my geek hat on – this little quirk of nature is just a tremendously cool biological phenomenon and we don’t even know why it exists. Imprinting happens in mammals and some plants, but the evolutionary fitness benefits of giving up two functional chromosomes in specific loci are not known. There are a few theories, from parental conflict to mother-offspring coadaptation, but the situation is terribly unclear. A persistent problem has been that the field has been lacking comprehensive, systematic data sets to empirically test the mathematical models of the evolution of imprinting.
In our new paper, we have characterized imprinting across a diverse set of human tissues, using a systematic genome-wide approach. This was made possible by the data set of the Genotype-Tissue Expression (GTEx) project pilot phase, with genotype and RNA-sequencing data across 33 tissues and 178 individuals. From these data, we first measured allele-specific expression, where a heterozygous site in a gene can be used to distinguish gene expression from the two copies of a gene. Usually, both copies are expressed in roughly equal amounts, but under imprinting, one gene copy is silenced and thus we will see expression of only one allele.
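To make the idea of allele-specific expression concrete, here is a minimal Python sketch of how read counts at a single heterozygous site could be turned into an allelic ratio and tested against the 50/50 expectation. It is an illustration only, not the pipeline used in the paper: the function names `allelic_ratio` and `binomial_two_sided_p`, and the choice of a plain exact binomial test, are assumptions made for this example.

```python
from math import comb


def allelic_ratio(ref_reads, alt_reads):
    """Fraction of RNA-seq reads at a heterozygous site that carry the reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return ref_reads / total


def binomial_two_sided_p(ref_reads, alt_reads, p=0.5):
    """Exact two-sided binomial p-value for the null of balanced expression.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one. Hypothetical helper for illustration; real analyses also
    have to handle mapping bias, overdispersion and genotyping error.
    """
    n = ref_reads + alt_reads
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance so that ties are counted despite float rounding.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))


# A balanced site and a site where only one allele is seen.
print(allelic_ratio(48, 52), binomial_two_sided_p(48, 52))
print(allelic_ratio(100, 0), binomial_two_sided_p(100, 0))
```

A ratio near 0.5 with a large p-value is what we expect for most genes, while a ratio near 0 or 1 with a tiny p-value flags monoallelic expression. Monoallelic expression in a single sample is not yet evidence of imprinting, since genetic regulatory variants and technical artifacts can produce the same picture.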
The challenge in analyzing imprinting from GTEx data comes from it being a population sample, without any information about the parents of each individual. Thus, we can’t directly observe expression of only the maternally or paternally inherited allele in a given gene. However, a strong signal of imprinting will lead to expression of only one allele in all individuals in a manner that is independent of genetic effects. Detecting this signal is not an easy task due to various technical and biological confounding factors, but together with Yael Baran and Noah Zaitlen we developed a sophisticated statistical method that finds genes where the pattern of monoallelic expression is consistent with imprinting and not with other processes. These results are supported by several validation data sets and careful curation. When possible, previously published resources were used to classify maternally and paternally expressed genes.
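The population-level signal can be illustrated with a toy heuristic in Python. This is emphatically not the statistical method from the paper, only a sketch of the intuition, under the simplifying assumption that in an imprinted gene every informative individual is monoallelic while the expressed allele is the reference allele in some individuals and the alternative allele in others, because it tracks the parent of origin rather than the DNA sequence. The function names and the cutoffs `mono_cutoff` and `min_reads` are hypothetical.

```python
def summarize_gene(counts, mono_cutoff=0.1, min_reads=10):
    """Tally individuals by allelic pattern for one gene.

    counts: list of (ref_reads, alt_reads) pairs, one per heterozygous individual.
    mono_cutoff and min_reads are illustrative thresholds, not values from the paper.
    """
    summary = {"ref_only": 0, "alt_only": 0, "biallelic": 0}
    for ref, alt in counts:
        total = ref + alt
        if total < min_reads:
            continue  # too few reads to say anything about this individual
        minor_fraction = min(ref, alt) / total
        if minor_fraction > mono_cutoff:
            summary["biallelic"] += 1
        elif ref > alt:
            summary["ref_only"] += 1
        else:
            summary["alt_only"] += 1
    return summary


def looks_imprinted(counts, **thresholds):
    """Toy rule: everyone is monoallelic, and both alleles are seen as the expressed one."""
    s = summarize_gene(counts, **thresholds)
    return s["biallelic"] == 0 and s["ref_only"] > 0 and s["alt_only"] > 0


imprinted_like = [(50, 1), (0, 40), (30, 0), (2, 60)]  # direction varies by person
genetic_like = [(50, 1), (40, 0), (30, 0)]             # always the same allele
print(looks_imprinted(imprinted_like), looks_imprinted(genetic_like))
```

The second gene is monoallelic in everyone but always favors the same allele, which points to a genetic or technical effect rather than imprinting. A real method has to weigh these patterns statistically instead of applying hard cutoffs, which is what makes the problem difficult.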
Applying our method to the GTEx data set, we discovered 42 genes with a solid pattern of imprinted gene expression. This included 30 genes with some previous evidence of imprinting, and also 12 new genes, mostly discovered in rarely studied tissues. This alone highlights the importance of analyzing diverse tissues. This is still not the full list of imprinted genes – we intentionally chose a conservative approach to avoid false positives – but for the first time, we can now take a look at how imprinting of this confident set of genes varies between human tissues.
About half of these genes are imprinted in all the tissues where they are expressed, but several genes show tissue-specificity of imprinting by being expressed from both alleles in some tissues. Most tissues have similar overall levels of imprinting, but testis came out as an outlier, having significantly less imprinting. This makes perfect sense: up to 60-70% of the cells in the testis samples are from the male germline, and in these cells the imprinting marks of these individuals’ parents are erased and replaced with paternal imprinting marks for the next generation.
The real surprise of our study came from asking whether tissue-specificity of imprinting could also manifest as a gene having maternal expression in some tissues and paternal in others. Usually this is not the case, but we found two exceptions to this rule – a previously known case of GRB10, and something that made me fall off the chair: IGF2. If you google this gene, you will see that it’s one of the best known imprinted genes, has a role in many diseases, and is expressed from the paternal allele. Except that in the brain it’s not. Our data show that the brain expresses the opposite allele from other tissues, i.e. the maternal one. The functional consequences, evolutionary origins, and molecular mechanisms of this flip of the imprinted allele in IGF2 need to be figured out by future studies. But our discovery demonstrates tremendous plasticity in imprinting across tissues, and the power of our approach to detect novel phenomena even in very well studied genes.
Many human traits – from gene expression levels to disease risk – vary between individuals. But what about imprinting? We find several genes with evidence of inter-individual differences in how tightly imprinted the gene is, suggesting that imprinting can be a variable trait not only between tissues but also between individuals. Interestingly, some of this variation seems to be driven by sex in skeletal muscle – a sexually dimorphic tissue – where a few maternally imprinted growth repressors have a lower level of imprinting in females. One can’t help but speculate whether this is an attempt by mothers to suppress muscle growth in their daughters but not sons. An interesting question for the future will be the search for potential genetic variants that may regulate imprinting levels – iQTLs, analogous to the eQTLs that affect gene expression levels.
We’ve made an effort to make everything accessible to the community by making the paper open-access, releasing the software, R code for the analysis and figures, and publishing a comprehensive set of statistics and data in the supplement and soon also in the GTEx portal. We are convinced of the power and robustness of our method, and look forward to applying it to future data sets. We hope that our method and results will be a useful resource for imprinting researchers and for the wider genomics community.
But our study is much more than just a resource paper. We take an important step towards systematic, statistically rigorous high-throughput analysis of imprinting in humans. In many other areas of genomics, the shift from candidate gene analysis to genome-wide approaches has enabled a major boost in biological and medical discovery, and the same is now happening for imprinting. While family-based study designs have many benefits in analysis of imprinting, these samples are often very difficult to collect from humans, and population-based study designs such as GTEx provide access to an unparalleled diversity of tissues that is otherwise not available. By providing a map of imprinting across human tissues, we have refined previous catalogs of imprinted genes and discovered new patterns of how imprinting varies between tissues and individuals. We look forward to even deeper analyses with the growing GTEx data set, as well as integration of our findings with research on molecular mechanisms and disease relevance.