Reconstructing genetic ripples in time and space

The inimitable Joe Pickrell has dropped his Khoisan-are-part-Italian preprint onto arXiv, Ancient west Eurasian ancestry in southern and eastern Africa. I’m being glib in my characterization of the paper’s core conclusion, but there’s a reason for such a flip response: the inferences that he seems to draw from the genetic data strike me as verging on crazy. But that’s OK, what genetics is telling us is that history was a whole lot crazier than we had imagined.

Let’s back up for a moment here. For several decades now geneticists have assumed that the Bushmen of the Kalahari, the Khoisan-qua-Khoisan, Africa’s last hunter-gatherers who retain their ancestral language along with the Hadza, are the ur-humans. The basal lineage that first diverged from the rest of mankind at the cusp of the Out of Africa event. This is evident in Y chromosomal and mtDNA phylogenies, where the Bushmen and their kin harbor variants which coalesce deeply in time with those of others. And, a few years ago another group revealed the likelihood that Bushmen also are products of an admixture event in the last ~50,000 years with a distinct hominin lineage which diverged ~1 million years before the present from the main line which led up to anatomically modern humanity. Now Pickrell et al. present us with a twist which is perhaps even more astringent than a lime: in their genomes the Bushmen and their Khoisan kin, the Khoe herders, reflect an ancient admixture event with East Africans, who themselves were the outcomes of hybridizations between West Eurasians and indigenous African populations. More relevantly for my concise summation of the conclusion, the West Eurasian component does not necessarily reflect modern Middle Eastern populations, so much as Southern Europeans!

How did they infer such bizarre results? Magic? No. Basically the authors looked at patterns of linkage disequilibrium. Got it? Probably not. If you are curious, confused, and intent upon understanding the thrust of their methods in your bones, you probably need to read Loh et al. Barring that, trust in the great hive-mind that is the Reich lab, or attempt to swallow my trite condensation.

If you consider a short to medium length sequence of the genome, there are genetic variants, alleles, segregating across that sequence. The frequencies of these alleles vary across populations. And, there are on occasion correlations of allelic combinations, seen together across a single sequence more often than would be likely if the alleles across the loci assorted at random. A concrete example would be a population which is the product of a recent admixture event between Africans and Europeans. Recombination would take many generations to break apart all the associations between alleles which are diagnostic and distinctive of African and European ancestry, so long blocks of ancestry tracts could be inferred simply by phasing the genome on the individual level (i.e., you know the sequence of each homolog inherited from each parent, instead of just genotype values). There would be linkage disequilibrium within the population because particular variants would be associated with others across loci due to recent distinct ancestry at the genomic level. If you noticed that SNP 1 had an African allele, then SNP 2 located nearby in the locus is also more likely to have an African allele than expected, until the point that linkage equilibrium is attained.
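To make the idea of "more often than would be likely at random" concrete, here is a minimal sketch of the textbook two-locus statistics: D, the excess of the joint haplotype frequency over the product of the allele frequencies, and r², its normalized square. The function name and the toy haplotypes are my own illustrative assumptions, not anything taken from the paper, and real analyses work on genome-wide phased data rather than a pair of SNPs.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r2) for two biallelic SNPs.

    haplotypes: list of (allele_at_snp1, allele_at_snp2) tuples from phased
    chromosomes, with alleles coded 0 or 1 (hypothetical toy encoding).
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n                # frequency of allele 1 at SNP 1
    p_b = sum(h[1] for h in haplotypes) / n                # frequency of allele 1 at SNP 2
    p_ab = sum(1 for h in haplotypes if h == (1, 1)) / n   # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                                   # excess over random assortment
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0               # undefined for monomorphic SNPs; report 0
    return d, r2


# Toy "just admixed" population: the two alleles always travel together.
admixed = [(1, 1)] * 50 + [(0, 0)] * 50
# Toy population at linkage equilibrium: all four haplotypes equally common.
equilibrium = [(1, 1)] * 25 + [(1, 0)] * 25 + [(0, 1)] * 25 + [(0, 0)] * 25

d_admixed, r2_admixed = linkage_disequilibrium(admixed)
d_equil, r2_equil = linkage_disequilibrium(equilibrium)
```

In the first toy population knowing the allele at SNP 1 tells you the allele at SNP 2 with certainty; in the second it tells you nothing, which is what linkage equilibrium means.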

As I noted above, these associations are broken apart over time in a regular fashion by genetic recombination. Therefore, the decay in linkage disequilibrium across the genome can allow you to infer time since a putative admixture event. This works at various time depths. African Americans have long range LD because the admixture was relatively recent. To date older admixture events one must be more cunning, as the LD decays and becomes exceedingly faint as recombination hacks apart previous distinctive associations as two genetic backgrounds merge. But what about multiple admixture events and the consequent linkage disequilibrium patterns? What the authors did in the above paper was to test the fit of the data to a composite of LD curves in scenarios where it seems likely that there were two possible admixture events. And, they found multiple populations which did fit this model.
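The dating logic can be sketched in a few lines. Under a simple single-pulse model, admixture LD between two sites separated by a genetic distance d (in Morgans) decays roughly as A·exp(−n·d), where n is the number of generations since admixture. The code below fits that curve by least squares on the log scale, using a synthetic, noise-free curve that I made up for illustration. It is a simplified stand-in for the general approach, not the authors' actual pipeline; the function name, the amplitude, and the 30-years-per-generation conversion are all assumptions.

```python
import math


def fit_admixture_time(distances_morgans, ld_values):
    """Fit log(LD) = log(A) - n * d by ordinary least squares.

    Returns (A, n): the amplitude and the estimated generations since a
    single admixture pulse. Assumes every LD value is positive.
    """
    xs = list(distances_morgans)
    ys = [math.log(v) for v in ld_values]
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic curve: 0.5 cM to 10 cM bins, 50 generations since admixture.
distances = [0.005 * i for i in range(1, 21)]
true_generations = 50
curve = [0.02 * math.exp(-true_generations * d) for d in distances]

amplitude, generations = fit_admixture_time(distances, curve)
years = generations * 30  # assumed generation time of 30 years
```

With real data the curve is noisy, so a log-linear fit like this one would be a poor choice; and a two-event history would show up as a sum of two exponentials with different decay rates, which this sketch does not attempt to fit.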

Dispensing with the technicalities, here are the results of admixture events as inferred from the LD decay curves:

The most parsimonious model that Pickrell et al. propose is as simple as it is crazy.

1) An ancient initial admixture event in the environs of the Horn of Africa between a proto-West Eurasian population and a proto-Sudanic population

2) A second admixture event which occurs when a population derived downstream from event 1 encounters the ancestors of the Khoisan

Pickrell et al. infer a ~3,000 year old admixture event between West Eurasians and Africans for the Semitic populations of the Ethiopian plateau in keeping with Pagani et al.’s only marginally less crazy results. Then you have step 2, with an admixture between proto-Bushmen/proto-Khoe and the hybrid East Africans ~1,500 years ago. Let us accept these genetic results on the face of it. What they bring home to me is the power of culture. Though vastly diminished today, groups such as the Khoe Nama managed to preserve their integrity and independence down to the period of European colonialism (only being truly decimated in Namibia in the early 20th century by the Germans). A wave of Bantu farmers overwhelmed most of southern Africa, but select groups of Khoisan managed to maintain zones of habitation where they persisted with their unique cultural traditions and perpetuated their language. Some of this surely was ecology, as the vast Karoo region is not particularly amenable to the Bantu cultural toolkit. But, I also suspect that institutional and economic (e.g. cattle culture) influences that the East Africans had upon the Khoe, and perhaps even indirectly the Bushmen, also made these populations more robust to the Bantu expansion than otherwise would have been the case.

Being a preprint on arXiv, the paper of which I speak here is free to you, and copiously explained in loving detail in the supplements in terms of method and madness. I am not particularly enthusiastic about having long discussions about how these results are crazy and cannot be right. They are crazy. But I know enough about the methodology here to understand the logic, and accept that the authors are grasping at something very strange and true, even if their particular interpretation and specific results may be disputable. Let me quote the paper at this point:

The hypothesis that west Eurasian ancestry entered eastern Africa through Arabia must be reconciled with the observation that the best modern proxies for this ancestry are often found in southern Europe rather than the Middle East (Supplementary Table 4). This observation can be interpreted in the context of ancient DNA work in Europe, which has shown that, approximately 5,000 years ago, people genetically closely related to modern southern Europeans were present as far north as Scandinavia [Keller et al., 2012; Skoglund et al., 2012]. We thus find it plausible that the people living in the Middle East today are not representative of the people who were living in the Middle East 3,000 years ago. Indeed, even in historical times, there have been extensive population movements from and to the Middle East [Davies, 1997; Kennedy, 2008].

Think on that. If Pickrell et al. are right, do you think that the Middle East is particularly special in this regard? I will say that it comes to mind that the high consanguinity may result in strange outcomes if one is not careful with the sampling strategy (I’m thinking of the Samaritans I see in their data), though I doubt that this is an incautious group. But I do think it is plausible that some European populations are better proxies for the ancient Levantines than the modern Levantines because the latter have been washed over by multiple demographic waves (though I want to see more comparisons with Christian Arab* samples).

A second bombshell dropped by Pickrell et al.:

We note that we have interpreted admixture signals in terms of large-scale movements of people. An alternative frame for interpreting these results might instead propose an isolation-by-distance model in which populations primarily remain in a single location but individuals choose mates from within some relatively small radius. In principle, this sort of model could introduce west Eurasian ancestry into southern Africa via a “diffusion-like” process. Two observations argue against this possibility. First, the gene flow we observe is asymmetric: while some eastern African populations have up to 50% west Eurasian ancestry, levels of sub-Saharan African ancestry in the Middle East and Europe are considerably lower than this (maximum of 15% [Moorjani et al., 2011]) and do not appear to consist of ancestry related to the Khoisan. Second, the signal of west Eurasian ancestry is present in southern Africa but absent from central Africa, despite the fact that central Africa is geographically closer to the putative source of the ancestry. These geographically-specific and asymmetric dispersal patterns are most parsimoniously explained by migration from west Eurasia into eastern Africa, and then from eastern to southern Africa.

Isolation-by-distance is alluded to implicitly when we speak of human genetic variation as clinal. And it’s not totally lacking in utility as a null model. But I think we need to add another layer of complexity upon this parsimonious elegance of human clans eternally exchanging mates in monotonous step-wise fashion. Multiple populations over the past 10,000 years (and likely earlier!) were rocked by massive demographic turmoil, as foreigners from afar amalgamated themselves upon the local substrate, and abolished the old to bring forth something new. The author of this post is himself a product of such an event. The genetic story of mankind is not just one of continuous and diffuse gene flow gradually over a landscape of small-scale societies. No, this placid background condition was periodically perturbed by an explosion of translocating peoples, likely triggered by a technological or cultural revolution of some sort. The genetic impact in many cases is too great to be anything but a folk wandering.

Unlike isolation-by-distance these patterns do not flow linearly across space, but exhibit discordant lashing patterns through ecologically fertile terrain. Rather than a mist gliding across the plains, imagine a flash flood scouring a ravine. A more gentle analogy would be that these are demographic ripples, which expand outward, temporarily distorting the calm surface of isolation-by-distance dynamics, and eventually fading back into the background and becoming the new normal. But once the ripple has faded how do we know that it once was? That is a difficult thing indeed, and these results indicate the problems inherent. It may be that the echoes of the ripple that Pickrell et al. detect issue from a source which no longer exists. Are the scions of the first farmers of the ancient Levant hidden away in the valleys of Tuscany and the plains of Tanzania? A crazy proposition also, but not necessarily a false one.

Citation: arXiv:1307.8014v1 [q-bio.PE]

* I know some Christian Arabs do not want to be called Arabs.

Source: Discover Magazine – Gene Expression