The Roma have multitudes

Credit: Dbachmann

There was a recent case in Ireland of a young Roma girl who was blonde-haired and blue-eyed being removed from her home, on the suspicion that she was not in fact the biological child of the presumed parents (who, like most Roma, are reportedly of dark complexion, hair, and eyes). I even saw a report that a hospital was consulted on the probability of such an outcome, and they said it would be “extremely unusual”. It turns out that DNA tests confirmed that this girl was the biological child of the putative parents. And of course all this has to be interpreted in light of the case of “Maria” in Greece: a little blonde girl who turned out not to be the biological child of the two Roma who claimed her as their daughter (it looks like there was welfare fraud in that case).

My initial response was that the consultant should be fired, because in an admixed population like the Roma it shouldn’t be that unusual to have offspring who deviate a great deal from the parental phenotype. This prompted some interesting reactions. First, there were those who seem blissfully ignorant of the fact that the Roma are an admixed population. That’s easy enough to resolve, as there have been scientific papers published on this issue using genome-wide data. Second, there are claims that pretty much no Roma have blonde hair and blue eyes (on the order of less than 1%). The latter may be a defensible claim, though not indisputably so.

Before we move on I have to clarify that there is a distinction between “Roma” and “Romani.” The latter refers broadly to the populations across Europe which were referred to as “Gypsy,” while the former denotes a set of populations with a locus of distribution in Southeast Europe. In much of Northern and Western Europe there are now two populations of Romani with very distinct histories (and likely genetics): the Roma who have recently arrived from Southeast Europe, and the various non-Roma groups who have a very long history.

We know a fair amount about the genetics of pigmentation in humans. Though models that predict an individual’s pigmentation are still coarse, most of the genes which have large effects on population-scale differences are now well characterized. This allows me to produce a reasonably plausible model to give you an intuition for why brown-skinned populations can produce a wide range of outcomes in realized phenotype.

Imagine five loci rank ordered in effect size: gene 1, gene 2, gene 3, gene 4, and gene 5. Each gene comes in two flavors, or alleles. One is a “dark” allele (produces dark pigmentation) and the other is a “light” allele. Complexion can be measured on a scale which is referred to as a “melanin index” (it’s dependent on reflectance). Assume that each allele at each gene contributes a melanin index value like so (dark allele first, then light allele):

Gene 1 = 30, 2
Gene 2 = 15, 1
Gene 3 = 10, 1
Gene 4 = 5, 2
Gene 5 = 5, 0


What you see above are the two alleles at each gene with their phenotypic values. One allele at gene 1 contributes 30 melanin units, and the other 2. And so on. If you take the “dark” alleles and assume they’re all homozygous (so doubling them), you get a maximal potential value of 130; doing the same with the “light” alleles gives a minimal one of 12. But of course in most cases you’ll get a combination. But what would the frequencies be? Since I’m lazy I ran a simulation. I set the frequencies of the dark allele for each gene like so:

Gene 1 = 60%
Gene 2 = 45%
Gene 3 = 35%
Gene 4 = 46%
Gene 5 = 50%

Then I generated 10,000 multilocus genotypes, and added a “noise” parameter so that the trait wasn’t totally determined by the genes, so the phenotypic value can be higher than what the genotype would predict. Here’s the distribution:


The mean value is 73. The 25th percentile is 55. 1 out of 26 individuals should have an exclusively “light” genotype across all five genes. The point is that in a polygenic character if you have polymorphism on the genotypic level you’re likely to have it on the phenotypic level.
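For anyone who wants to play with the numbers, here is a minimal sketch of this kind of simulation. It is not the script that produced the figures above. It assumes two independently drawn allele copies per gene, purely additive effects, and zero-mean Gaussian noise with a standard deviation of 5 (the post does not specify the noise model, so `NOISE_SD` and the function names are hypothetical). With both copies counted, an all-light genotype sums to 12 and an all-dark one to 130. The 1-in-26 figure matches the product of the light-allele frequencies taken one copy per gene (about 0.039); with both copies required to be light, the chance is the square of that, so the sketch prints both.

```python
import random

# (dark, light) melanin-index contribution per allele copy, from the post.
EFFECTS = [(30, 2), (15, 1), (10, 1), (5, 2), (5, 0)]
# Dark-allele frequency per gene, from the post.
DARK_FREQ = [0.60, 0.45, 0.35, 0.46, 0.50]
NOISE_SD = 5.0  # assumption: the post does not give its noise model


def genotype_value(rng):
    """Sum allele effects over two independently drawn copies per gene."""
    total = 0
    for (dark, light), p in zip(EFFECTS, DARK_FREQ):
        for _ in range(2):  # diploid: two allele copies per gene
            total += dark if rng.random() < p else light
    return total


def simulate(n=10_000, seed=0):
    """Return n phenotypic values: additive genotype value plus noise."""
    rng = random.Random(seed)
    return [genotype_value(rng) + rng.gauss(0, NOISE_SD) for _ in range(n)]


# Extremes and expectation of the genotype value (before noise).
max_value = 2 * sum(dark for dark, _ in EFFECTS)
min_value = 2 * sum(light for _, light in EFFECTS)
expected_mean = sum(
    2 * (p * d + (1 - p) * l) for (d, l), p in zip(EFFECTS, DARK_FREQ)
)

# Chance that one copy at every gene is light, and that both copies are.
p_light_one_copy = 1.0
for p in DARK_FREQ:
    p_light_one_copy *= 1 - p
p_light_both_copies = p_light_one_copy ** 2

if __name__ == "__main__":
    values = sorted(simulate())
    print("genotype value range:", min_value, "to", max_value)
    print("expected mean:", round(expected_mean, 2))
    print("simulated mean:", round(sum(values) / len(values), 1))
    print("simulated 25th percentile:", round(values[len(values) // 4], 1))
    print("all light, one copy per gene: 1 in", round(1 / p_light_one_copy))
    print("all light, both copies: 1 in", round(1 / p_light_both_copies))
```

Changing `DARK_FREQ` or `NOISE_SD` shows how quickly the spread of outcomes widens once a couple of large-effect genes are at intermediate frequency.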

The second major question is whether this is even plausible for the Roma. Yes. They’re very admixed. Two recent papers make the case definitively: Reconstructing Roma History from Genome-Wide Data and Reconstructing the Population History of European Romani from Genome-wide Data. You can see in the bar plot to the left that the Roma have much higher European-like ancestry proportions than other Indians. It is likely their parental population is Punjabi-like, so it seems that they’re ~50% non-Indian in admixture. The second paper offers up a wider population set for comparison, and it seems likely that the Roma did not experience much gene flow with Middle Eastern groups. Rather, their primary phase of admixture occurred ~1,000 years ago in the Balkans. Reconstructing the Population History of European Romani from Genome-wide Data has a wide range of Romani populations, and it seems evident that the Western and Northern Romani have more European admixture than the Balkan Roma. In fact, the Welsh Romani seem totally Europeanized in their genome. Because these Romani originally spoke an Indo-Aryan language it seems likely that they are genuine Romani; they have simply undergone enough gene flow with the surrounding population to lose their genetic distinctiveness.

Speaking of which, despite the Romani history of admixture in Europe, they are genetically very isolated now. There’s widespread evidence of inbreeding and founder effect across the Romani populations. I believe one of the problems with inferring phylogenetic relationships of the Romani from Y and mtDNA markers was simply that bottleneck effects are more powerful for uniparental lineages, so they were buffeted more by the small population size.
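To give a rough sense of why uniparental markers are hit harder, here is a small sketch using the textbook drift expectation for a constant-size population, where the fraction of diversity retained after t generations is (1 - 1/copies)^t. It assumes an equal sex ratio, so autosomes have 2N gene copies while mtDNA or the Y has N/2. The population size and number of generations are hypothetical values picked for illustration, not estimates for any Romani group.

```python
def heterozygosity_retained(copies, generations):
    """Expected fraction of diversity left after drift at constant size."""
    return (1 - 1 / copies) ** generations


N = 200           # hypothetical number of breeding individuals
GENERATIONS = 40  # hypothetical: roughly 1,000 years at 25 years each

# Autosomes: every individual carries two copies, so 2N gene copies.
autosomal = heterozygosity_retained(2 * N, GENERATIONS)
# mtDNA or Y: one sex only, one copy each, so N/2 gene copies.
uniparental = heterozygosity_retained(N / 2, GENERATIONS)

if __name__ == "__main__":
    print("autosomal diversity retained:", round(autosomal, 3))
    print("uniparental diversity retained:", round(uniparental, 3))
```

Under these assumptions the uniparental lineages lose diversity about four times as fast per generation, which is the sense in which they are buffeted more by small population size.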

The post The Roma have multitudes appeared first on Gene Expression.

Source: Discover Magazine – Gene Expression