1. A Project-Archive of Complexity

The GTEx (Genotype-Tissue Expression) project began in 2010. In its founding paper, the consortium that would carry it out introduced it like this: "Genome-wide association studies have identified thousands of loci for common diseases, but for the majority of these, the mechanisms underlying disease susceptibility remain unknown. Most associated variants are not correlated with protein-coding changes, suggesting that polymorphisms in regulatory regions are likely to contribute to many disease phenotypes. The careful examination of gene expression and its relationship to genetic variation has thus become a critical next step in the elucidation of the genetic basis of common disease."

This association is established with eQTLs (expression quantitative trait loci), a term that captures the relationship between the sequence of letters in the DNA and the greater or lesser chance that the gene it affects will be expressed, first as RNA and later as a protein. It is something like the relationship between a book and the act of reading it: some variants (different letters) make reading easier and others complicate it, as if some enlarged a few letters and others translated them into a foreign language. But variants can act differently in different tissues, depending on the cells and their environment, so it is not enough to sequence the genome, take a blood sample and measure the RNA or proteins. The DNA is shared, but how it is expressed has to be studied separately for each organ and each type of tissue. The relationship has to be worked out with the precision of fine metalwork. This is the only way the information from the first association studies can be understood.
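The idea of a tissue-specific eQTL can be illustrated with a small simulation. The sketch below is not the consortium's method; it assumes the simplest possible model, in which expression rises linearly with the number of copies of a variant (0, 1 or 2), and it uses made-up data. The function names, the tissue labels, the baseline level and the effect sizes are all hypothetical choices for illustration.

```python
import random

def eqtl_slope(dosages, expression):
    """Least-squares slope of expression on allele dosage (0, 1 or 2 copies)."""
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    cov = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    var = sum((g - mean_g) ** 2 for g in dosages)
    return cov / var

def simulate_tissue(dosages, effect, rng, noise=0.5):
    """Hypothetical expression: baseline + effect * dosage + random noise."""
    return [10.0 + effect * g + rng.gauss(0.0, noise) for g in dosages]

rng = random.Random(0)  # fixed seed so the simulation is repeatable
dosages = [rng.choice([0, 1, 2]) for _ in range(500)]  # 500 simulated donors

# Assumed effects: the variant raises expression in muscle but not in blood.
muscle = simulate_tissue(dosages, effect=1.0, rng=rng)
blood = simulate_tissue(dosages, effect=0.0, rng=rng)

slope_muscle = eqtl_slope(dosages, muscle)
slope_blood = eqtl_slope(dosages, blood)
print(f"muscle slope: {slope_muscle:.2f}, blood slope: {slope_blood:.2f}")
```

Under these assumptions the estimated slope is close to the simulated effect in muscle and close to zero in blood, which is why a blood sample alone would miss this variant's influence.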

In 2015, the first results from the project were published. For these, 1,500 tissue samples were analyzed from 175 donors, taken just hours after death. That data already held some curious results. For example, the variation in gene expression was much greater between different organs in a single person than between different individuals, as explained by Roderic Guigó, head of Bioinformatics and Genomics at the Center for Genomic Regulation (CRG) in Barcelona and one of the leaders of this B·Debate. Moreover, most of the variation was attributable to gender, ethnicity or age. The researchers found differences associated with being male or female in more than 750 genes, mostly in mammary tissue. And up to 2,000 genes (10% of the total) changed activity level with age.

Now the project is much further along. As explained by Kristin Ardlie, lead researcher on the project at the Broad Institute in Massachusetts, they have created “an atlas of gene expression and the eQTLs for 960 donors.” The sample includes “53 tissues, with samples from up to 11 different areas of the brain.” They have collected more than 20,000 samples, and the analysis results are stored in an open archive available to any researcher who wants to use them.

Expanding the samples and data has allowed scientists to show, for example, that nearly all tissues differ in expression between men and women. As demonstrated by Shmuel Pietrokovski, researcher at the Weizmann Institute of Science in Israel, most of the differences are found in mammary tissue, but they also appear in muscle and adipose tissue (fat), and even in the anterior cingulate cortex, a region of the brain involved in many cognitive functions.

Apart from gender differences, this huge amount of data is starting to be analyzed for information on the risks and mechanisms of different diseases, as well as on the inner workings of the genome, with its extremely complex architecture and cell messaging.