On phylogenetic instrumentalism

ADMIXTURE and STRUCTURE tests aren’t formal mixture tests. Yes! In fact, in the “open science” community this issue is repeated over and over and over, because people routinely get confused (our audience does not, by and large, consist of population geneticists and phylogeneticists). So sometimes it is necessary to lay it out in detail, as in the post above. The key point to always remember is that population genetic and phylogenetic statistics and visualizations are a reduction and summary of reality in human-palatable form. They tell us something, but they do not tell us everything. A common issue is that for purposes of mental digestion it is useful to label ancestral elements “European,” or on a PCA plot to refer to a “European-Asian” cline, as if the population genetic abstractions themselves were the measure of what European or Asian is. But European and Asian are themselves human constructions, and subject to debate (e.g., do Turks count as Europeans? Indians as Asians?). The population genetic statistics are not themselves subjective, but the meanings we give them are.

Let’s illustrate this with a concrete example. The Cape Coloured population of South Africa is a compound of Khoisan, Bantu, South Asian, Southeast Asian, and Northern European ancestry. But if you use a basic summary statistic that measures genetic distance, such as Fst, they turn out to exhibit the lowest value with South Asians. What’s going on here? This is a real result, but Fst is blind to demographic history: it measures only how different the allele frequencies of two populations are, not how they came to be that way. If you used ADMIXTURE or STRUCTURE with only African and European populations you would overestimate the European ancestry of the Cape Coloureds. Why? Because the non-European and non-African component would probably collapse into the “European” element. The algorithms work fine, given the conditions you start them out with. Adding in South and Southeast Asians as reference populations allows these components to fall out. We expect such a division based on history, but recall that South Asians themselves are an admixed population! For the purposes of understanding the ethnogenesis of the Cape Coloureds, though, which dates to the past 400 years, an admixture event ~3,000 years before the present is not relevant. In other words, how misleading the result from a given tool is depends on the questions we’re asking. If we are trying to extract answers which are inappropriate to the tools, then we’ll get inappropriate answers.
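To make the logic concrete, here is a toy simulation in Python. It is a sketch under loudly stated assumptions, not an analysis of real data: the population names are just labels, the tree shape, drift sizes, and admixture proportions are invented for illustration, and the two-reference least-squares fit is a crude stand-in for what a model-based clustering program does, not ADMIXTURE or STRUCTURE itself. Fst is computed with the ratio-of-averages form of Hudson's estimator applied directly to population allele frequencies, with no sample-size correction.

```python
import random

# Hypothetical admixture proportions, chosen only for illustration.
TRUE_WEIGHTS = {"African": 0.4, "European": 0.3, "SouthAsian": 0.1, "EastAsian": 0.2}

def drift(freqs, sd, rng):
    """Perturb allele frequencies with Gaussian noise as a crude model of drift."""
    return [min(0.99, max(0.01, p + rng.gauss(0.0, sd))) for p in freqs]

def simulate(n_snps=5000, seed=1):
    """Toy tree: root -> African, root -> Eurasian -> three Eurasian populations."""
    rng = random.Random(seed)
    root = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]
    eurasian = drift(root, 0.08, rng)
    pops = {
        "African": drift(root, 0.10, rng),
        "European": drift(eurasian, 0.06, rng),
        # Assumption: "SouthAsian" sits close to the shared Eurasian node.
        "SouthAsian": drift(eurasian, 0.02, rng),
        "EastAsian": drift(eurasian, 0.06, rng),
    }
    # The admixed population is a weighted average of its sources at every SNP.
    admixed = [sum(TRUE_WEIGHTS[k] * pops[k][i] for k in pops) for i in range(n_snps)]
    return pops, admixed

def hudson_fst(p1, p2):
    """Ratio-of-averages Fst from population frequencies (no sampling correction)."""
    num = sum((a - b) ** 2 for a, b in zip(p1, p2))
    den = sum(a * (1 - b) + b * (1 - a) for a, b in zip(p1, p2))
    return num / den

def two_way_european_fraction(admixed, african, european):
    """Least-squares w in admixed ~ w*European + (1-w)*African, clipped to [0, 1]."""
    num = sum((m - a) * (e - a) for m, a, e in zip(admixed, african, european))
    den = sum((e - a) ** 2 for a, e in zip(african, european))
    return min(1.0, max(0.0, num / den))

pops, admixed = simulate()
fst = {name: hudson_fst(admixed, freqs) for name, freqs in pops.items()}
closest = min(fst, key=fst.get)
eur_hat = two_way_european_fraction(admixed, pops["African"], pops["European"])

for name, value in sorted(fst.items(), key=lambda kv: kv[1]):
    print(f"Fst(admixed, {name}) = {value:.4f}")
print("Lowest Fst with:", closest)
print(f"European fraction: true {TRUE_WEIGHTS['European']:.2f}, "
      f"two-reference estimate {eur_hat:.2f}")
```

In this toy setup the population contributing the least ancestry ends up with the lowest Fst to the admixed group, simply because it sits nearest the weighted average of all the sources. And when only the African and European references are offered, the fit has nowhere to put the Asian ancestry except the "European" column. Neither number is wrong; each answers a narrower question than the one we might think we asked.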

For the purposes of human population genetics and phylogenetics the main issue is the historical and cognitive bias toward Platonism and types. Instead of “European” being a convenient label for pragmatic purposes, we imbue European with the essence of an ideal type. Once we make this transition hilarity ensues. For example, using classic Platonic typology the “Caucasian race” was defined by taking as its measure the exemplar of that race, the Georgian people of the Caucasus. The classic meaning of Caucasian naturally included the people of Europe and West Asia, with some more expansive definitions inclusive of most South Asians. But in the American context Caucasian has transformed into “white European Westerner.” This means that there are debates over whether genuine Caucasians, such as Armenians, are actually Caucasian! What was once a convenient word used to illustrate a clear and distinct concept has transmuted itself so as to generate confusion and diminish clarity.

But I think the current wave of human population genetics and phylogenetics unmasks an even deeper problem. The extant races of modern humans may themselves be recent syntheses of branches from a human phylogenetic tree that looked very different as recently as ~15,000 years ago. For example, nearly every single indigenous resident of South Asia seems to exhibit some level of admixture between two very distinct branches of the human tree within the last 10,000 years. The “Indian race,” as we understand it, is definitely a feature of near prehistory at the earliest (the Neolithic), and perhaps as late as the Indo-Aryan migrations ~4,000 years before the present. And now there are suggestive clues that the same applies to Europe. The people of Europe have roots in the Ice Age inhabitants of the continent, but also in the Neolithic peoples of West Asia. And, due to the limitations of demography-blind, model-based clustering algorithms, they may even have more exotic affinities to East Asia which have long been masked! The last may even be an Ice Age era admixture (see the comment at the first link on the relationship to First Americans).

One of the realities of trying to reconstruct the past from what we have in the present is that the past becomes a jigsaw puzzle assembled from pieces of the present. This is informative, but there are limitations, because the reality is that the present is a jigsaw puzzle constructed out of the past. Obviously we can’t run an experiment from the past to the present. We have to go backwards, rather than forwards. These are the constraints which bound and shade our understanding. They should not lead us down the path of pure skepticism. Rather, they should instill in us the importance of constant critique, and evaluation of our premises. In fact, one of the things which seems clear from the latest wave of paleogenetic research is that empirical results themselves can overturn premises.

