In Abdellaoui et al. (2019), we looked at the geographic distribution of human DNA in Great Britain using the UK Biobank dataset (N ≈ 450,000 people of European descent). This dataset gave us the statistical power to look beyond the expected strong relationship between geography and ancestry to the thus far unexplored relationship between geography and complex trait variation. We analyzed and discussed the geographic distribution of many genome-wide aggregate measures, more than we could display in the article. We have now visualized them in the animations below.
Ancestry differences are known to show striking geographic patterns, because the closer people live to each other, the more ancestors they are likely to share. Patterns of shared ancestry can be captured by conducting a principal component analysis (PCA) on genome-wide single nucleotide polymorphisms (SNPs). PCA is a statistical approach that summarizes the largest patterns of genetic variation in a set of principal components (PCs). Animation 1 below shows the geographic distributions of the first 100 PCs from the PCA conducted on UK Biobank participants of European descent.
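To make the idea of PCA on SNP data concrete, here is a minimal sketch in Python. It is not the pipeline used in the paper: the function name `genotype_pcs`, the simulated two-group data, and the choice of a plain SVD on standardised allele counts are all assumptions made for illustration. At biobank scale, dedicated software is normally needed because a dense SVD of the full genotype matrix is not practical.

```python
import numpy as np

def genotype_pcs(genotypes, n_pcs):
    """Return the top principal components of a (people x SNPs) genotype matrix.

    genotypes holds allele counts (0, 1 or 2) per person and SNP.
    """
    g = np.asarray(genotypes, dtype=float)
    # Centre each SNP and scale it to unit variance; drop SNPs with no variation.
    sd = g.std(axis=0)
    keep = sd > 0
    z = (g[:, keep] - g[:, keep].mean(axis=0)) / sd[keep]
    # SVD of the standardised matrix: the columns of u * s are the PC scores.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    pcs = u[:, :n_pcs] * s[:n_pcs]
    explained = (s ** 2) / np.sum(s ** 2)
    return pcs, explained[:n_pcs]

# Simulated toy data (made up): two groups with somewhat different allele frequencies.
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.1, 0.9, size=200)
freq_b = np.clip(freq_a + rng.normal(0, 0.2, size=200), 0.05, 0.95)
group_a = rng.binomial(2, freq_a, size=(30, 200))
group_b = rng.binomial(2, freq_b, size=(30, 200))

pcs, explained = genotype_pcs(np.vstack([group_a, group_b]), n_pcs=2)
```

Each column of `pcs` gives one value per person. Averaging such values per region and plotting them on a map is the kind of summary the PC animations display.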
To study the geographic distribution of complex trait variation, we constructed polygenic scores for 33 complex traits, including traits related to physical and mental health as well as non-disease traits such as personality and educational attainment. A polygenic score is the sum of the alleles a person carries, each weighted by its estimated effect size. The effect sizes are estimated in genome-wide association studies (GWASs) that did not include the people for whom the polygenic scores are built (so, in our case, GWASs that excluded UK Biobank).
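The definition above amounts to a weighted sum, which can be sketched in a few lines of Python. The function name `polygenic_score` and all numbers are hypothetical; real scores are built from far more SNPs, and the effect sizes are often adjusted before use, which this sketch leaves out.

```python
import numpy as np

def polygenic_score(allele_counts, effect_sizes):
    """Weighted sum of effect-allele counts for each person.

    allele_counts: (people x SNPs) array with 0, 1 or 2 copies of the effect allele.
    effect_sizes:  per-SNP weights estimated in a GWAS on an independent sample.
    """
    counts = np.asarray(allele_counts, dtype=float)
    betas = np.asarray(effect_sizes, dtype=float)
    if counts.shape[1] != betas.shape[0]:
        raise ValueError("one effect size is needed per SNP")
    return counts @ betas

# Made-up numbers for three people and four SNPs.
counts = [[0, 1, 2, 1],
          [2, 0, 1, 0],
          [1, 1, 0, 2]]
betas = [0.10, -0.05, 0.20, 0.02]

scores = polygenic_score(counts, betas)
# scores is approximately [0.37, 0.40, 0.09]
```

Keeping the GWAS sample separate from the people being scored matters because effect sizes estimated in the same people would be fitted to their trait values, which would overstate how well the score predicts the trait.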
Animation 2 shows the geographic distribution of the polygenic scores before and after correcting for the 100 PCs, alongside the change in Moran’s I, which is a measure of geographic clustering. After controlling for ancestry captured by the 100 PCs, educational attainment (EA) shows the strongest regional differences. The geographic distribution of EA is largely in line with the geographic distribution of socio-economic status (SES), and is likely driven by SES-related migration. Using the birth place and current address of the UK Biobank participants, we have looked further into the role of migration in this process. We see, for example, that people who leave the poorer regions of the country have a higher polygenic score for EA on average than the rest of the country. Animation 3 shows the UK Biobank participants moving from their birth place to their current address and back.
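As a rough illustration of the "before and after" comparison for the polygenic scores, the sketch below computes Moran's I for a regional score, regresses out a single ancestry PC, and computes Moran's I again. Everything here is an assumption for illustration: the helper names `morans_i` and `residualise`, the six regions on a line with adjacent regions as neighbours, and the made-up values. The paper's own spatial weights and its 100-PC correction are not reproduced.

```python
import numpy as np

def morans_i(values, weights):
    """Moran's I: (N / sum of weights) * sum_ij w_ij d_i d_j / sum_i d_i^2,
    where d_i is the deviation of region i from the overall mean."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    dev = x - x.mean()
    return (len(x) / w.sum()) * (dev @ w @ dev) / (dev @ dev)

def residualise(values, covariates):
    """Remove the part of values that is linearly predicted by the covariates."""
    y = np.asarray(values, dtype=float)
    c = np.asarray(covariates, dtype=float)
    design = np.column_stack([np.ones(len(y)), c])  # intercept plus covariates
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef

# Toy setup (hypothetical): six regions on a line, adjacent regions are neighbours.
n = 6
weights = np.zeros((n, n))
for i in range(n - 1):
    weights[i, i + 1] = weights[i + 1, i] = 1.0

pc1 = np.array([-2.5, -1.5, -0.5, 0.5, 1.5, 2.5])    # made-up ancestry gradient
noise = np.array([0.3, -0.2, 0.1, -0.3, 0.2, -0.1])  # made-up regional noise
score = 0.8 * pc1 + noise                            # regional mean polygenic score

before = morans_i(score, weights)
after = morans_i(residualise(score, pc1), weights)
```

With these made-up numbers, `before` is clearly positive because the score follows the ancestry gradient, and `after` is lower because that gradient has been regressed out. Values near zero indicate little spatial clustering, positive values indicate that neighbouring regions resemble each other, and negative values indicate that neighbours tend to differ.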
For more details, see our paper:
Abdel Abdellaoui, David Hugh-Jones, Loic Yengo, Kathryn E. Kemper, Michel G. Nivard, Yan Holtz, Laura Veul, Brendan P. Zietsch, Timothy M. Frayling, Naomi Wray, Jian Yang, Karin J.H. Verweij, Peter M. Visscher (2019). Genetic Correlates of Social Stratification in Great Britain. Nature Human Behaviour, 3, 1332–1342.