Burning down the trees in historical population genetics

The phylogenetic tree is an essential tool in understanding the broad scope of natural history, placing particular lineages in specific evolutionary contexts of relatedness. These sorts of trees range from Ernst Haeckel’s classical attempt, depicting relationships which biologists derived from intuition within the framework of a grand evolutionary scheme, all the way down to modern methods implemented in software packages such as MrBayes, which many frankly utilize in a “turnkey” manner. These trees are abstractions, in that they reduce a wide range of phenomena to schematic representations which impart aspects of particular interest in a stylized form. This is important, because the actual nature of the phenomena may be more complex than the representation suggests. A simple illustration of what I’m getting at is clear when you look at the long history of phylogenetics and phylogeography utilizing mitochondrial DNA (mtDNA) lineages. Because mtDNA is copious in comparison to nuclear DNA, it is easy to obtain. And, as there is no recombination and it is inherited in a haploid fashion (mother to daughter), it makes the inference of gene trees much easier. The key problem is that the genealogy of this particular sequence is used to infer aspects of population history, when it may not represent the history of other regions of the genome very well. Different genes may have different histories.

These issues of conflating the history of genes with the history of populations move further into the foreground the less genetic distance separates the populations you are comparing. Phylogenetic analysis involving distinct species has its own problems, but they are dwarfed by what must confront those who attempt to parse out the relatedness of populations within species. Because of the ubiquity of gene flow across populations within species, any attempt to generate a tree of relationships among populations is bound to be a gross simplification. Instead of a sequence of bifurcations, the true relationship of putative populations is more accurately represented by a networked graph.
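To make the contrast between a tree and a networked graph concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the population names, the mixing weights, and the helper `source_ancestry` are invented for illustration and do not come from any paper or software package. In the tree every population has exactly one parent; in the network a population may have two parents, each contributing a fraction of its ancestry.

```python
def source_ancestry(graph, node):
    """Return {source: fraction} for a node by following weighted parent edges.

    `graph` maps each population to a list of (parent, weight) pairs.
    A population with no listed parents is treated as an ancestral source.
    """
    parents = graph.get(node, [])
    if not parents:
        return {node: 1.0}
    totals = {}
    for parent, weight in parents:
        for source, fraction in source_ancestry(graph, parent).items():
            totals[source] = totals.get(source, 0.0) + weight * fraction
    return totals


# Toy tree: every population descends from a single parent (weight 1.0).
tree = {
    "pop_X": [("source_1", 1.0)],
    "pop_Y": [("source_2", 1.0)],
}

# Toy network: pop_X and pop_Z each have two parents (admixture events).
# All names and weights are made up for illustration.
network = {
    "pop_X": [("source_1", 0.7), ("source_2", 0.3)],
    "pop_Y": [("source_2", 1.0)],
    "pop_Z": [("pop_X", 0.5), ("pop_Y", 0.5)],
}

tree_x = source_ancestry(tree, "pop_X")        # all ancestry from one source
network_z = source_ancestry(network, "pop_Z")  # ancestry split across sources
```

In the tree, a tip's ancestry always collapses onto a single source. In the network, the same question returns a mixture, which is the information a strictly bifurcating tree throws away.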

Jumping from the theoretical to the concrete, one of the major issues in constructing a sequence of events of the human past which can be used to inform the human present is that a graph relationship is very complex and difficult to tease apart when the tips of your tree are extant populations which are highly admixed. When you try to reconstruct the past from the present, a necessity in phylogenetic analysis which utilizes genetic data (obviously the issues are different if you are focusing on paleontological information), you necessarily gain a blinkered perspective.*

All this came to a head for me when I read the post The First of the Mohicans, which cited a preprint I’d skimmed over earlier in the year, Efficient moment-based inference of admixture parameters and sources of gene flow. It is by its nature a technical paper, but within it is lodged some genuine dynamite. Let me quote:

Our interpretation is that most if not all modern Europeans are descended from at least one large-scale ancient admixture event involving, in some combination, at least one population of Mesolithic European hunter-gatherers; Neolithic farmers, originally from the Near East; and/or other migrants from northern or Central Asia. Either the first or second of these could be related to the “ancient western Eurasian” branch in Figure 5, and either the first or third could be related to the “ancient northern Eurasian” branch. Present-day Europeans differ in the amount of drift they have experienced since the admixture and in the proportions of the ancestry components they have inherited, but their overall profiles are similar.

The result here is outlined graphically in the preprint:

[Figure from the preprint: a classical tree (left) and a graph with admixture events (right)]

What you see above are two varieties of abstraction which attempt to reconstruct phylogenetic relatedness, and implicitly historical change over evolutionary lineages. To the left is a classical tree, where all the terminal nodes (contemporary populations) are the outcomes of bifurcation events. To the right you have an attempt to produce a more informative representation of the relatedness by drawing out likely admixture events. Here’s the major result: modern Europeans seem to be the products of a major admixture event between a population with roots in northern Eurasia and another with roots in western Eurasia. At the current rate it seems likely that most major world populations are the result of mingling between very distinct populations (to varying extents). In fact, I’m rather certain these sorts of inferences underestimate the extent of admixture, rather than overestimate it. By their nature the methods elide complexity.

The ubiquity of this admixture leaves me a bit chagrined, because since the rise of genome-wide data in the mid aughts I’ve been reading papers which produce neat trees and elegant admixture bar plots, all the while unable to confront the reality that the abstractions before me were not reflecting what truly transpired over the past ~10,000 years. A world where modern human expansion resulted in the isolation of several major lineages from each other from the end of the last Ice Age down to the present never existed. A world where these major lineages were connected by continuous isolation-by-distance dynamics is very misleading. Here is what I think is more accurate: a world where the “tips” of the phylogenetic tree are pruned repeatedly, and populations which are the outcomes of admixture events expand rapidly to fill the emptying space. Neither the “ancient North Eurasians” nor the “ancient South Eurasians” seem to exist in unadmixed form, perhaps with the exception of Andaman Islanders and some populations in the far north of Siberia. This raises the question: do any populations exist in an “unadmixed form”? What does that even mean? The paper I mention above actually does answer the question in a somewhat precise manner. Populations such as the Japanese are useful in forming an unadmixed scaffold after populations identified as admixed are removed using f3 statistics (see Ancient Admixture in Human History). But this is not the last word on whether the Japanese are admixed or not, though it suffices for the purposes of the questions being asked in the paper.
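For readers who haven’t run into f3 statistics, here is a rough sketch of the idea in Python. The statistic f3(C; A, B) is the average over SNPs of (c - a)(c - b), where a, b and c are allele frequencies in the three populations; a clearly negative value is evidence that C is a mixture of populations related to A and B. The allele frequencies below are random toy numbers, not real data, and this simplified version leaves out the finite-sample bias correction and the significance testing that real analyses rely on.

```python
import random


def f3(c, a, b):
    """Simplified f3(C; A, B): mean over SNPs of (c - a) * (c - b).

    Inputs are lists of allele frequencies, one entry per SNP.
    No sample-size bias correction is applied (an assumption of this sketch).
    """
    return sum((ci - ai) * (ci - bi) for ci, ai, bi in zip(c, a, b)) / len(c)


random.seed(0)
n_snps = 5000

# Hypothetical allele frequencies for two diverged source populations.
pop_a = [random.random() for _ in range(n_snps)]
pop_b = [random.random() for _ in range(n_snps)]

# A toy population formed as a 50/50 mixture of A and B, with no later drift.
mixed = [0.5 * x + 0.5 * y for x, y in zip(pop_a, pop_b)]

f3_mixed = f3(mixed, pop_a, pop_b)   # target is the admixed population
f3_source = f3(pop_a, mixed, pop_b)  # target is one of the sources
```

With the admixed population as the target the statistic comes out negative, while with a source population as the target it comes out positive. The caveat matters, though: drift after the admixture event pushes f3 upward, so a non-negative value does not prove that a population is unadmixed.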

Where does this leave us? Let’s go back to Europeans. The authors of Efficient moment-based inference of admixture parameters and sources of gene flow assert that pretty much all Europeans exhibit evidence of massive admixture between very distinct lineages. To me this is highly suggestive of events which have roots prior to the Neolithic Revolution. In other words, admixture between west and north Eurasian lineages may have occurred in Europe at the end of the last Ice Age, as the continent was being resettled by hunters from the east and south. Later, Neolithic farmers from the Middle East, related to the west Eurasian population present in Europe during the Pleistocene, added a subsequent layer of west Eurasian ancestry, and to a great extent replaced or absorbed the admixed hunter-gatherers. Finally, it now seems entirely possible that a further wave of migrants from Central Asia, who were also an admixed population, erupted into Europe and replaced or absorbed many of the descendants of the Neolithic farmers.

What we’re confronted with is intellectual rubble; bombs have been dropped all over the landscape. The world is turned upside down. We’ll rebuild, but it’s going to take time. The past was a strange land, far stranger than we’d thought. In science you go for the boring answers as a null, but in this case the boring answers are turning out to be wrong.

* Ancient DNA analysis is changing this somewhat.

The post Burning down the trees in historical population genetics appeared first on Gene Expression.

Source: Discover Magazine – Gene Expression