Category Archives: mtDNA

Ancient DNA from Clovis culture is Native American (also Tianyuan affinity mystery)

Figure 4 | [c] (…) maximum likelihood tree. 
A recent study of the ancient DNA of human remains from Anzick (Montana, USA), dated to c. 12,500 calBP, confirms close ties to modern Native Americans, definitively discarding the far-fetched and outlandishly Eurocentric “Solutrean hypothesis” for the origins of Clovis culture (which pleases me greatly, I must admit).
While this fits well with expectations (at least mine), there is some hidden data that has surprised me quite a bit: it sits at the bottom of an undiscussed formal test graph in which modern populations are compared with both Anzick and Tianyuan (c. 40,000 BP, North China). See below.
Morten Rasmussen et al., The genome of a Late Pleistocene human from a Clovis burial site in western Montana. Nature 2014. Pay per view: LINK [doi:10.1038/nature13025]

Abstract


Clovis, with its distinctive biface, blade and osseous technologies, is the oldest widespread archaeological complex defined in North America, dating from 11,100 to 10,700 14C years before present (bp) (13,000 to 12,600 calendar years bp). Nearly 50 years of archaeological research point to the Clovis complex as having developed south of the North American ice sheets from an ancestral technology. However, both the origins and the genetic legacy of the people who manufactured Clovis tools remain under debate. It is generally believed that these people ultimately derived from Asia and were directly related to contemporary Native Americans. An alternative, Solutrean, hypothesis posits that the Clovis predecessors emigrated from southwestern Europe during the Last Glacial Maximum. Here we report the genome sequence of a male infant (Anzick-1) recovered from the Anzick burial site in western Montana. The human bones date to 10,705 ± 35 14C years bp (approximately 12,707–12,556 calendar years bp) and were directly associated with Clovis tools. We sequenced the genome to an average depth of 14.4× and show that the gene flow from the Siberian Upper Palaeolithic Mal’ta population into Native American ancestors is also shared by the Anzick-1 individual and thus happened before 12,600 years bp. We also show that the Anzick-1 individual is more closely related to all indigenous American populations than to any other group. Our data are compatible with the hypothesis that Anzick-1 belonged to a population directly ancestral to many contemporary Native Americans. Finally, we find evidence of a deep divergence in Native American populations that predates the Anzick-1 individual.

Haploid DNA
The Y-DNA lineage of Anzick is Q1a2a1* (L54) to the exclusion of the common Native American subhaplogroup Q1a2a1a1 (M3). Among the modern sequences compared, that of a Maya is the closest.

The mtDNA belongs to the common Native American lineage D4h3a at its underived stage (root). 
For starters I must explain that these underived haplotypes can only be found within mtDNA and never in modern Y-DNA (a common misconception), because the Y chromosome accumulates mutations every single generation, while the much shorter mtDNA does so only occasionally. Hypothetically we could find the exact ancestor of some modern Y-DNA haplogroup in ancient remains, but that would be like finding the proverbial needle in the haystack. On the other hand, finding the underived stage in mtDNA, be it ancient or modern, does not mean that we are looking at a direct ancestor, but just at a non-mutated relative of hers, who can in fact be very distant.
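To put rough numbers on this argument, here is a minimal sketch that treats new mutations on a lineage as a Poisson process, so that the chance of a lineage still being underived after g generations is exp(-rate × g). The two per-generation rates and the 25-year generation time are hypothetical placeholders chosen only to illustrate the contrast, not measured values, and the function name is my own.

```python
import math

def prob_underived(mutations_per_generation, generations):
    """Chance that a lineage picks up no new mutation at all,
    assuming mutations arrive as a Poisson process."""
    return math.exp(-mutations_per_generation * generations)

# Hypothetical illustration values (assumptions, not measurements):
Y_RATE = 0.5      # expected new Y-DNA mutations per generation
MT_RATE = 0.005   # expected new mtDNA mutations per generation
GENERATIONS = 500 # about 12,500 years at an assumed 25 years per generation

p_y = prob_underived(Y_RATE, GENERATIONS)
p_mt = prob_underived(MT_RATE, GENERATIONS)
print(f"Y-DNA still underived: {p_y:.3g}")
print(f"mtDNA still underived: {p_mt:.3g}")
```

Under these assumed rates an unchanged Y-DNA lineage is practically impossible, while an unchanged mtDNA lineage remains a real, if minority, outcome.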


Autosomal DNA

In this respect, the Anzick boy clearly shows the strongest affinities to Native Americans, followed at some distance by Siberian peoples, particularly those near the Bering Strait. 

Figure 2 | Genetic affinity of Anzick-1. a, Anzick-1 is most closely related to Native Americans. Heat map representing estimated outgroup f3-statistics for shared genetic history between the Anzick-1 individual and each of 143 contemporary human populations outside sub-Saharan Africa. (…)
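For readers unfamiliar with the outgroup f3-statistic behind this heat map, here is a minimal sketch of the idea. It averages (o − a)(o − b) over SNPs, where o, a and b are allele frequencies in the outgroup and in the two compared populations, so that a larger value means more drift shared by A and B after their split from the outgroup. The frequencies are made-up toy numbers, and the finite-sample bias correction used by real software is deliberately left out.

```python
def outgroup_f3(outgroup, pop_a, pop_b):
    """Plain (uncorrected) outgroup f3: mean of (o - a) * (o - b) over SNPs."""
    if not (len(outgroup) == len(pop_a) == len(pop_b)):
        raise ValueError("frequency lists must have the same length")
    if not outgroup:
        raise ValueError("at least one SNP is required")
    products = [(o - a) * (o - b) for o, a, b in zip(outgroup, pop_a, pop_b)]
    return sum(products) / len(products)

# Toy allele frequencies at four SNPs (invented for illustration only).
outgroup = [0.0, 1.0, 0.0, 1.0]
ancient = [0.5, 0.5, 1.0, 0.0]
close_pop = [0.5, 0.5, 1.0, 0.0]    # shares the ancient sample's drift
distant_pop = [0.5, 0.5, 0.0, 1.0]  # shares only the older part

f3_close = outgroup_f3(outgroup, ancient, close_pop)
f3_distant = outgroup_f3(outgroup, ancient, distant_pop)
print(f3_close, f3_distant)
```

Ranking many modern populations by this value against a single ancient sample is what produces a heat map of the kind shown in the figure.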
However Anzick-1 clearly shows closer affinity to the aboriginal peoples of Meso, Central and South America (collectively labeled as SA) and less so to those of Canada and the American Arctic (labeled as NA). No data was available from the USA. 
The authors weighed this in several competing models of Native American ancestry:
Figure 3 | Simplified schematic of genetic models. Alternative models of the population history behind the closer shared ancestry of the Anzick-1 individual to Central and Southern American (SA) populations than Northern Native American (NA) populations; see main text for further definition of populations. We find that the data are consistent with a simple tree-like model in which NA populations are historically basal to Anzick-1 and SA. We base this conclusion on two D-tests conducted on the Anzick-1 individual, NA and SA. We used Han Chinese as outgroup. a, We first tested the hypothesis that Anzick-1 is basal to both NA and SA populations using D(Han, Anzick-1; NA, SA). As in the results for each pairwise comparison between SA and NA populations (Extended Data Fig. 4), this hypothesis is rejected. b, Next, we tested D(Han, NA; Anzick-1, SA); if NA populations were a mixture of post-Anzick-1 and pre-Anzick-1 ancestry, we would expect to reject this topology. c, We found that a topology with NA populations basal to Anzick-1 and SA populations is consistent with the data. d, However, another alternative is that the Anzick-1 individual is from the time of the last common ancestral population of the Northern and Southern lineage, after which the Northern lineage received gene flow from a more basal lineage.
The model they consider most plausible is “c”, in which Anzick-1 is close to the origin of the SA population, while NA diverged before him. However model “d”, in which Anzick-1 is close to the overall Native American root but NA have received further inputs from a mystery population (presumably some Siberians, related to the Na-Dené and Inuit waves), is also consistent with the data. Choosing between the two “consistent” models (or something in between) clearly requires further investigation. 
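The D-tests mentioned in the caption can be sketched as follows. This is a minimal version of the frequency-based D-statistic, D(P1, P2; P3, P4), which is about zero when (P1, P2) and (P3, P4) form two clean clades and departs from zero when the proposed tree is wrong or there was gene flow across it. The sign convention used here is my assumption (conventions differ between papers), the frequencies are toy numbers, and a real test would also need a standard error, typically from a block jackknife, which is omitted.

```python
def d_statistic(p1, p2, p3, p4):
    """Frequency-based D(P1, P2; P3, P4), summed over SNPs.
    Sign convention (an assumption): positive when P1 drifts with P3
    and P2 with P4."""
    sites = list(zip(p1, p2, p3, p4))
    num = sum((a - b) * (c - d) for a, b, c, d in sites)
    den = sum((a + b - 2 * a * b) * (c + d - 2 * c * d) for a, b, c, d in sites)
    if den == 0:
        raise ValueError("no informative sites")
    return num / den

# Toy case 1: the two pairs are clean clades, so differences are uncorrelated.
tree_like = d_statistic(
    [0.1, 0.9, 0.5, 0.5],
    [0.3, 0.7, 0.5, 0.5],
    [0.5, 0.5, 0.2, 0.8],
    [0.5, 0.5, 0.4, 0.6],
)

# Toy case 2: P2 and P4 share extra drift, so the proposed tree is violated.
violated = d_statistic([0.2, 0.8], [0.6, 0.4], [0.2, 0.8], [0.6, 0.4])

print(tree_like, violated)
```

In the paper's terms, D(Han, Anzick-1; NA, SA) being significantly different from zero is what rejects model “a”.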

Tianyuan and East Asian origins
All the above is very much within expectations, although refreshingly clarifying. But there is something in the formal tests (extended data fig. 5) that is most unexpected (but not discussed in the paper). 
The formal f3 tests of ED-fig. 5 a to e all fall within reasonable expectations. Maybe the most notable finding is that, after all, the pre-Inuit Paleo-Eskimo people of the Saqqaq culture (represented by the Saqqaq remains) left some legacy in Greenland, but they also show some extra affinity with several Siberian populations (notably the Naukan, Chukchi, Koryak and Yukaghir, in this order) rather than with any other Native Americans, including Aleuts. 
But the really striking stuff is in figs. f and g, where it becomes obvious that the Tianyuan remains of Northern China show not a tad more affinity to East Asians (or to Native Americans) than to West Eurasians. Also, two East Asian populations (Tujia and Oroqen) are considerably more distant from Tianyuan than the bulk of East Asian peoples, and also from Anzick.
Extended Data Figure 5 | Outgroup f3-statistics contrasted for different combinations of populations. (…) f, g, Shared genetic history with Anzick-1 compared to shared genetic history with the 40,000-year-old Tianyuan individual from China.
This is very difficult to explain, more so as Tianyuan’s mtDNA haplogroup B4’5 is part of the East Asian and Native American genetic pool, and the authors make no attempt to do so. 
The previous study by Qiaomei Fu et al. (open access) placed Tianyuan’s autosomal DNA near the very root of Circum-Pacific populations (East Asians, Native Americans and Australasian Aborigines) but after divergence from West Eurasians:
From Qiaomei Fu 2013
They even had doubts about the position of Papuans (the only Australasian representation) in that tree, which they suspected to be an artifact of some sort.
Since I saw that graph (h/t to an anonymous commenter at Fennoscandian Ancestry) I have been racking my brain trying to figure out a reasonable explanation, considering that the formal f3 test almost certainly carries more weight than the ML tree made with an algorithm. 
My first tentative explanation would be to imagine a shared triple-branch origin for Tianyuan, East Asians and West Eurasians, maybe c. 60 Ka ago (it must have been before the colonization of West Eurasia), to the exclusion of other, maybe isolated, ancient populations, whose admixture with the ancestors of the Tujia, Oroqen and Melanesians (maybe via Austronesians?) causes those strikingly low affinity values for these.
This would be a similar mechanism to the one explaining lower Tianyuan (and generally all ancient Eurasian) affinity for Palestinians (incl. Negev Bedouins) and also the Makrani, who have some African admixture and (in the Palestinian case) also, most likely, residual inputs from the remains of the first Out-of-Africa episode in Arabia.
However to this day we have no idea what those hypothetical ancient isolated populations of East Asia could be. In normal comparisons such as ADMIXTURE analysis the Tujia and Oroqen appear totally normal within their geographic context, but this may be an artifact of not doing enough runs to reach the higher K values that, according to the cross-validation test, are much more likely to discern the actual realistic components. 
The matter certainly requires further research, which may well open new avenues for understanding the genesis of Eurasian populations, particularly those from the East.

Italian haploid genetics (second round)

More than a year ago I commented (as much as I could) on the study of Italian haploid genetics by Francesca Brisighelli et al. Sadly the study was published with several major errors in the figures, making it impossible to get anything straight. 
I know directly from the lead author that the team has been trying since then to get the paper corrected, but this correction was delayed time and again by the apparent inefficiency of PLoS ONE’s management, much to their frustration. Finally this week the correction has been published and the figures corrected.
So let’s give this study another chance:
Francesca Brisighelli et al., Uniparental Markers of Contemporary Italian Population Reveals Details on Its Pre-Roman Heritage. PLoS ONE 2012 (formally corrected in February 2014). Open access: LINK [doi:10.1371/journal.pone.0050794]
 
Please notice that you have to read the formal correction in order to access the new figures; the wrong ones are still in the paper itself. 
The corrected figures are central to the study:

Figure 1 (corrected). Map showing the location of the samples analyzed in the present study and those collected from the literature (see Table 1). Pie charts on the left display the distribution of mtDNA haplogroup frequencies, and those on the right the Y-chromosome haplogroup frequencies.

So now we know that the Northern mtDNA pie was duplicated in the original graph and that Central Italians stand out in R0(xH,V), which reaches 14% (probably mostly HV*), while they have some other peculiarities relative to their neighbors from North and South: somewhat less U and no detected V. 
Other variations are more clinal: H decreases from North to South while J and T do the opposite.

Figure 3 (corrected). Phylogeny of Y-chromosome SNPs and haplogroup frequencies in different Italian populations.

On the Y-DNA side, the most obvious transition is between the high frequencies of R1b1a2-M269 (R1b3 in the paper) in the North versus much lower frequencies in the South. But also:
  • J2 is notable in the Central region (and also the South) but rare in the North.
  • G frequencies in the South are double those of the Center and North.
  • The same happens, with lesser intensity, regarding E1b1b1-M35 (E3b in the study).
  • In contrast haplogroup I is most common in the North. However the Sardinian and sub-Pyrenean clade I2a1a-M26 (I1b2 in the paper), which is also the one documented in Chalcolithic Languedoc, is rare in all regions.

The study also deals with several isolated populations:

Figure 4. Haplogroup frequencies of Ladins, Grecani Salentini and Lucera compared to the rest of the Italian populations analyzed in the present study.

All of them show high frequencies of mtDNA H relative to their regions. The Grecani Salentini do have some extra Y-DNA E1b1b1 (E3b) and J2, which may indeed underline their partial Greek origins. The Ladini show unusually high frequencies of R1b*(xR1b1a2) and K*(xR1a,R1b,L,T,N3), while the Lucerans stand out in their percentage of G.
I want to end this entry with a much needed scolding to the staff of PLoS ONE for their totally unacceptable original sloppiness and delay in the correction. And my personal thanks and appreciation to Francesca Brisighelli for her indefatigable persistence and enthusiasm for her work, which is no doubt of great interest.
 

Mitochondrial lineages from Myanmar

Myanmar, also known as Burma, has been one of those blind spots in the mapping of human genetics. Finally now we get to know something about the peoples of this SE Asian multiethnic state, although there are limitations because the sampling was performed among refugees in Thailand.
Monica Summerer et al., Large-scale mitochondrial DNA analysis in Southeast Asia reveals evolutionary effects of cultural isolation in the multi-ethnic population of Myanmar. BMC Evolutionary Biology 2014. Open access: LINK [doi:10.1186/1471-2148-14-17]

Abstract


Background


Myanmar is the largest country in mainland Southeast Asia with a population of 55 million people subdivided into more than 100 ethnic groups. Ruled by changing kingdoms and dynasties and lying on the trade route between India and China, Myanmar was influenced by numerous cultures. Since its independence from British occupation, tensions between the ruling Bamar and ethnic minorities increased.


Results


Our aim was to search for genetic footprints of Myanmar’s geographic, historic and sociocultural characteristics and to contribute to the picture of human colonization by describing and dating of new mitochondrial DNA (mtDNA) haplogroups. Therefore, we sequenced the mtDNA control region of 327 unrelated donors and the complete mitochondrial genome of 44 selected individuals according to highest quality standards.


Conclusion


Phylogenetic analyses of the entire mtDNA genomes uncovered eight new haplogroups and three unclassified basal M-lineages. The multi-ethnic population and the complex history of Myanmar were reflected in its mtDNA heterogeneity. Population genetic analyses of Burmese control region sequences combined with population data from neighboring countries revealed that the Myanmar haplogroup distribution showed a typical Southeast Asian pattern, but also Northeast Asian and Indian influences. The population structure of the extraordinarily diverse Bamar differed from that of the Karen people who displayed signs of genetic isolation. Migration analyses indicated a considerable genetic exchange with an overall positive migration balance from Myanmar to neighboring countries. Age estimates of the newly described haplogroups point to the existence of evolutionary windows where climatic and cultural changes gave rise to mitochondrial haplogroup diversification in Asia.

The main sampled ethnic group are the Karen, who live at the border with Thailand, but the Bamar or Burmans, the largest ethnic group, were also sampled in big numbers. 
Fig. 2.- Origin of samples and mitochondrial haplogroup distribution of Southeast Asian populations. Although most of the study participants originated from Karen State (red), a broad sample spectrum from nearly all divisions and states of Myanmar (a) was included in this study. b shows the haplogroup distributions of populations from Myanmar and four other Southeast Asian regions. In the white insert box the haplogroup heterogeneity of two ethnic groups of Myanmar is illustrated. The hatched area in the map surrounding the border between Myanmar and Thailand shows the main population area of the Karen people. The Bamar represent the largest ethnic group (68%) in Myanmar. The size of the pie diagrams corresponds to sample size.
The smaller samples are only detailed in the supplementary data as far as I have seen, so I will not discuss them right now (maybe in an update?). 
Overall, all SE Asians, including the Southern Han from Hong-Kong, appear similar in broad terms. Except for Laos, this relative similarity is quite apparent in figure 3:
Fig. 3.- Multi-dimensional scaling plot of pairwise Fst-values and haplogroup distribution of populations from Myanmar and 12 other Asian regions. A distinct geographic pattern appeared in the multi-dimensional scaling plot (Stress = 0.086; R2 = 0.970) of pairwise Fst-values: The Myanmar sample fitted very well within the Southeast Asian cluster, the Central Asian populations formed a second cluster, the Korean sample represented East Asia, the Afghanistan population was representative for South Asia and Russia symbolized Western Eurasia. The main haplogroup distributions are displayed as pie charts. The size of the pie diagrams corresponds to sample size. The proportion of N-lineages (without A,B and R9’F) increases from very low percentages in Southeast and East Asia over 50% in Central Asia to more than 75% in Afghanistan and 100% in the sample of Russian origin. The proportion of the American founding haplogroups A,B,C and D displayed an interesting pattern: from inexistent in Russians it increased to more than 50% in East Asian Korea.
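As a reminder of what the pairwise Fst values behind such a plot measure, here is a minimal sketch that computes a simplified, frequency-based Fst (in the style of Nei's G_ST) between two populations from their haplogroup frequencies. I am not claiming this is the estimator the authors used, and the frequencies below are toy numbers rather than values from the paper.

```python
def gene_diversity(freqs):
    """Chance that two random draws carry different haplogroups."""
    return 1.0 - sum(p * p for p in freqs.values())

def pairwise_fst(freqs_a, freqs_b):
    """Simplified Fst = (H_total - H_within) / H_total, equal weights."""
    haplogroups = set(freqs_a) | set(freqs_b)
    pooled = {
        hg: (freqs_a.get(hg, 0.0) + freqs_b.get(hg, 0.0)) / 2
        for hg in haplogroups
    }
    h_total = gene_diversity(pooled)
    if h_total == 0:
        return 0.0  # both populations fixed for the same haplogroup
    h_within = (gene_diversity(freqs_a) + gene_diversity(freqs_b)) / 2
    return (h_total - h_within) / h_total

# Toy haplogroup frequencies (invented for illustration only).
pop_1 = {"M": 0.5, "B": 0.5}
pop_2 = {"M": 0.9, "B": 0.1}
pop_3 = {"H": 1.0}

fst_same = pairwise_fst(pop_1, pop_1)
fst_near = pairwise_fst(pop_1, pop_2)
fst_far = pairwise_fst(pop_3, {"D": 1.0})
print(fst_same, round(fst_near, 4), fst_far)
```

A matrix of such pairwise values for all populations is the input that multi-dimensional scaling then flattens into a two-dimensional map.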
Looking at the particular differences in haplogroup frequencies, I’d say that the Thai are quite unremarkable, while the other populations show some peculiarities:
  • Karen: higher frequencies of R9/F, A, C and G
  • Bamar: much higher M* (and extremely diverse)
  • Laotian: higher frequencies of B and M7
  • Vietnamese: more B and N*
  • South Han (Hong-Kong): more D
The high diversity of paragroup M* among the Bamar is very notable. The authors notice that no more than three individuals shared each different subhaplogroup, which points to a very high diversity within haplogroup M. I don’t have time right now to ponder the various lineages, some of which are newly described, but I probably will in the future because, together with the high diversity in NE India, they have the potential of shifting the paradigm of Asian colonization by H. sapiens a bit towards the East.
The various M* and other novel haplogroups described in Myanmar are shown in fig. 4. Haplogroups M90 and M91 are new basal M sublineages, along with three other unnamed private lineages, which also appear as basal. Also M20a, M49a and G2b1a are new sublineages further downstream. Within N/R, another newly described lineage is B6a1.
The Bamar are extremely diverse not just within M*:

… the haplogroup composition of Bamar was exceptionally diverse with 80 different haplogroups and a maximum of 6 samples in the same haplogroup (Figure 4).

The Karen, on the other hand, show signs of genetic isolation, with large concentrations in the same haplogroups.
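The contrast between a very diverse sample and an isolated one can be put into a single number with Nei's haplotype diversity, h = n/(n−1) · (1 − Σ pᵢ²). The sketch below uses tiny made-up samples, with haplogroup labels borrowed from the text purely as names, not the real Bamar or Karen data.

```python
from collections import Counter

def haplogroup_frequencies(assignments):
    """Map each haplogroup label to its share of the sample."""
    counts = Counter(assignments)
    n = len(assignments)
    return {hg: c / n for hg, c in counts.items()}

def nei_diversity(assignments):
    """Nei's unbiased haplotype diversity for a list of haplogroup labels."""
    n = len(assignments)
    if n < 2:
        raise ValueError("need at least two samples")
    freqs = haplogroup_frequencies(assignments)
    return n / (n - 1) * (1 - sum(p * p for p in freqs.values()))

# Toy samples (invented): every individual different vs. one dominant lineage.
diverse_sample = ["M90", "M91", "M20a", "M49a", "B6a1", "G2b1a"]
isolated_sample = ["A", "A", "A", "A", "A", "C"]

h_diverse = nei_diversity(diverse_sample)
h_isolated = nei_diversity(isolated_sample)
print(round(h_diverse, 3), round(h_isolated, 3))
```

Values near 1 mean that almost any two individuals carry different lineages, while low values are what drift in a small, isolated population tends to produce.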
Interestingly, the authors think that rather than being a receiver, Myanmar was a major source of population to its neighbors:

Migration analyses of Myanmar and four Southeast Asian regions displayed a vivid exchange of genetic material between the countries and demonstrated a strong outwards migration of Myanmar to all analyzed neighboring regions (for details see Additional file 4: Table S4).

This influence is most intense towards Laos, Thailand and South China, while things are more balanced with regard to Vietnam.
 

Andalusian mtDNA highlights W-E differences

The issue of W-E genetic differences in Iberia has been discussed before in this blog by guest author Argiedude on the grounds of Y-DNA, showing that roughly the Western third of Iberia is distinct from the rest, most notably because of its higher presence of North African lineage E1b-M81. However the differences also appear in the mtDNA, even if they may be a bit more subtle because they cross with a N-S gradient of sorts.
A new study focused on the matrilineages of two characteristic Andalusian provinces, Granada in the East and Huelva in the West, underscores that this difference is very real.
Candela L. Hernández et al., Human maternal heritage in Andalusia (Spain): its composition reveals high internal complexity and distinctive influences of mtDNA haplogroups U6 and L in the western and eastern side of region. BMC Genetics 2014. Open access: LINK [doi:10.1186/1471-2156-15-11]

Abstract (provisional)


Background


The archeology and history of the ancient Mediterranean have shown that this sea has been a permeable obstacle to human migration. Multiple cultural exchanges around the Mediterranean have taken place with presumably population admixtures. A gravitational territory of those migrations has been the Iberian Peninsula. Here we present a comprehensive analysis of the maternal gene pool, by means of control region sequencing and PCR-RFLP typing, of autochthonous Andalusians originating from the coastal provinces of Huelva and Granada, located respectively in the west and the east of the region.


Results


The mtDNA haplogroup composition of these two southern Spanish populations has revealed a wide spectrum of haplogroups from different geographical origins. The registered frequencies of Eurasian markers, together with the high incidence and diversification of African maternal lineages (15% of the total mitochondrial variability) among Huelva Andalusians when compared to its eastwards relatives of Granada and other Iberian populations, constitute relevant findings unknown up-to-date on the characteristics of mtDNA within Andalusia that testifies a female population substructure. Therefore, Andalusia must not be considered a single, unique population.


Conclusions


The maternal legacy among Andalusians reflects distinctive local histories, pointing out the role of the westernmost territory of Peninsular Spain as a noticeable recipient of multiple and diverse human migrations. The obtained results underline the necessity of further research on genetic relationships in both sides of the western Mediterranean, using carefully collected samples from autochthonous individuals. Many studies have focused on recent North African gene flow towards Iberia, yet scientific attention should be now directed to thoroughly study the introduction of European genes in northwest Africa across the sea, in order to determine its magnitude, timescale and methods, and to compare them to those terrestrial movements from eastern Africa and southwestern Asia. 

Naturally the most noteworthy data is the frequency of mtDNA haplogroups in the two sampled populations:

A map comparing this data with some other areas of Iberia (mostly the West) is also provided:

Figure 3 – mtDNA haplogroup profiles registered in some populations of the Iberian Peninsula. The two Andalusian subpopulations studied here are marked with a red arrow. Codes are as in Additional file 3.
What makes Onubenses (the inhabitants of Huelva, ancient Onuba, West Andalusia) peculiar in relation to their Granadino neighbors is their lower frequency of H (notably H*, H3 and H5), along with their higher frequencies of K1, U3a and several North African related haplogroups (U6, L1b and L2). The peculiarities of Granadinos (notably the high H5 frequency) are not so notable in comparison.
Some of these Onubense peculiarities are reproduced in other parts of West Iberia, notably the low frequencies of H (but not at all in the NW corner), the presence of U6 (notably in North Portugal and also, not shown here, among the Maragatos of the León-Galicia border area) and to some extent the elevated frequency of K and some L(xM,N) lineages (varied localized frequencies).
Much of this seems best explained by ancient flows from NW Africa (flows which may be Neolithic, Paleolithic or from the Metal Ages but hardly related to Phoenician or Muslim colonization, which had no W-E gradient whatsoever) but I have some qualms about the quick identification by the authors of the origins of Onubense high K1 in that area. At the very least it must be noticed that, unlike the other African markers, K1 is much more common towards the Eastern parts of NW Africa and is also found at similarly high frequencies in many parts of Europe, such as France and Central Europe.
K1 was first spotted in Iberian ancient DNA in the early Neolithic (Los Cascajos, Navarre), and was an important Neolithic lineage through Europe. While I can’t discard the North African suggested origin, I think that other possibilities are at least as likely.
On U6, the presence of the very rare U6c, one of two basal lineages of U6, previously found only in 5 Moroccans, 10 Canarians and one Italian, reinforces the idea of U6 expanding from West to East (against what most conclusions suggest on the most unclear grounds), with a most likely Moroccan origin. In turn this raises the question of how pre-U6 arrived in Morocco, especially as the Aurignacoid Dabban industries now seem never to have gone further West than Cyrenaica. This opens the possibility of this lineage having arrived in NW Africa via Europe, where we know that U in general was common in the Upper Paleolithic, and which clearly influenced North Africa at the Oranian (aka Iberomaurusian) cultural genesis (LGM) via Morocco (Taforalt and other sites).
On the L(xM,N) lineages, it is worth mentioning that Cerezo 2012 claimed that some of them may be pre-Neolithic in Europe, especially L1b1a variants, which are widespread at low frequencies through Europe with an Iberian centrality.
The overall picture is maybe best visualized by the Hierarchical Cluster Analysis:

Figure 5 – Hierarchical Cluster Analysis (HCA) of 53 populations based on their mtDNA diversity. The haplogroups used here are marked with arrows (vectors). Populations are indicated with numbers as in Additional file 3.
Cluster 1 is particularly noticeable for its high frequency of mtDNA H, being composed mostly of Iberian samples (along with South Germans and some Sicilians). Granada, as well as nearby Córdoba, sit here (even if with a tendency towards the more mainstream European Cluster 2). Huelva is not too remarkable when compared with other European samples in the mtDNA HCA, although it does show some deviation towards NW Africa, clustering closest to Canarians.
Despite the good number of North African samples, it must be noted that the horizontal axis, in essence contrasting Europe vs. West Asia, has a weight of almost 50%, while the vertical axis, contrasting these two vs. North Africans (and the Sámi), only weighs 13% (axes are almost never of the same relevance in PCA-like graphs, even if they are presented as such).
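The "weight" of an axis in a PCA-like graph is simply the share of the total variance that it captures, that is, its eigenvalue divided by the sum of all eigenvalues. Here is a minimal sketch for the two-variable case, where the eigenvalues of the covariance matrix have a closed form; the points are toy data, and with only two variables the two weights necessarily add up to 100%, unlike in the figure, where the remaining dimensions hold the rest.

```python
import math

def axis_weights(points):
    """Share of variance on the 1st and 2nd principal axes of 2-D data."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # Closed-form eigenvalues of the 2x2 covariance matrix.
    centre = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam1, lam2 = centre + spread, centre - spread
    total = lam1 + lam2
    if total == 0:
        raise ValueError("data have no variance")
    return lam1 / total, lam2 / total

# Toy data: twice as spread out horizontally as vertically.
toy_points = [(-2, 0), (2, 0), (0, -1), (0, 1)]
w1, w2 = axis_weights(toy_points)
print(round(w1, 2), round(w2, 2))
```

Drawn on equal-sized axes, these points would look like a cross, even though the horizontal direction carries four times as much variance as the vertical one.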
In summary: NW Africa had some minor but significant genetic influence in the Western third of Iberia, long before the Muslim period, and this mtDNA study confirms these distinctions in the particular case of Andalusia, adding some rich detail.
Also, as the authors underline in their discussion, the issue of European genetic influence in North Africa still requires some serious investigation.
 
 

Ancient European DNA and some debatable conclusions

There is a rather interesting paper still in preparation available online and causing some debate.
Iosif Lazaridis, Nick Patterson, Alissa Mittnik, et al., Ancient human genomes suggest three ancestral populations for present-day Europeans. bioRxiv 2013 (preprint). Freely accessible: LINK [doi:10.1101/001552]

Abstract

Analysis of ancient DNA can reveal historical events that are difficult to discern through study of present-day individuals. To investigate European population history around the time of the agricultural transition, we sequenced complete genomes from a ~7,500 year old early farmer from the Linearbandkeramik (LBK) culture from Stuttgart in Germany and an ~8,000 year old hunter-gatherer from the Loschbour rock shelter in Luxembourg. We also generated data from seven ~8,000 year old hunter-gatherers from Motala in Sweden. We compared these genomes and published ancient DNA to new data from 2,196 samples from 185 diverse populations to show that at least three ancestral groups contributed to present-day Europeans. The first are Ancient North Eurasians (ANE), who are more closely related to Upper Paleolithic Siberians than to any present-day population. The second are West European Hunter-Gatherers (WHG), related to the Loschbour individual, who contributed to all Europeans but not to Near Easterners. The third are Early European Farmers (EEF), related to the Stuttgart individual, who were mainly of Near Eastern origin but also harbored WHG-related ancestry. We model the deep relationships of these populations and show that about ~44% of the ancestry of EEF derived from a basal Eurasian lineage that split prior to the separation of other non-Africans.

Haploid DNA
The Loschbour skull. The prominent browridge is very unusual for Paleolithic Europeans.
The new European hunter-gatherer samples all carried Y-DNA I (where it could be determined) and mtDNA U5 (U5a or U5b) or U2e.
More specifically, the hunter-gatherer mtDNA lineages are:
  • Loschbour (Luxembourg): U5b1a
  • Motala (Sweden):
    • Motala 1 & 3: U5b1a
    • Motala 2 & 12: U2e1
    • Motala 4 & 6: U5a2d
    • Motala 9: U5a2
Additionally the Stuttgart Linear Pottery farmer (female) carried the mtDNA lineage T2c1d1.
The Y-DNA lineages are:
  • Loschbour: I2a1b*(xI2a1b1, I2a1b2, I2a1b3)
  • Motala 2: I*(xI1, I2a2,I2a1b3)
  • Motala 3: I2*(xI2a1a, I2a2, I2b)
  • Motala 6: uncertain (L55+ would make it Q1a2a but L232- forces it out of Q1)
  • Motala 9: I*(xI1)
  • Motala 12: I2a1b*(xI2a1b1, I2a1b3)
These are with certainty the oldest Y-DNA sequences of Europe so far, and the fact that all of them (the uncertain Motala 6 aside) fall within haplogroup I(xI1) supports the notion of this lineage being once common in the subcontinent, at least in some areas. Today I2 is most common in Sardinia, the NW Balkans (Croatia, Bosnia, Montenegro), North Germany and areas around Moldavia.
I2a1b (which may well be all of them) is currently found (often at high frequencies) in the Balkans and Eastern Europe, with some presence also in the eastern areas of Central Europe. Its relative I2a1a is most common in Sardinia, with some presence in SW Europe, especially around the Pyrenees. I2a1 (probably I2a1a but not tested for the relevant SNPs) was also found, together with G2a, in a Chalcolithic population of the Treilles group (Languedoc) and seems to be somehow associated with the Cardium Pottery Neolithic.
If you want my opinion, I’d think that I2a before Neolithic was dominant, like mtDNA U5 (and satellites U4 and U2e), in much of Central and Eastern Europe but probably not in SW Europe, where mtDNA U5 seems not so much hyper-dominant either, being instead quite secondary to haplogroup H (at least in Western Iberia). But we’ll have to wait until geneticists manage to sequence Y-DNA in several SW European Paleolithic remains to be sure.

Autosomal DNA and derived speculations
Most of the study (incl. the must-read supplemental materials) deals however with the autosomal DNA of these and other hunter-gatherers, as well as of some Neolithic farmers from Central Europe and Italy (Ötzi) and their comparison with modern Europeans. 
To begin with, they generated a PCA plot of West Eurasians (with way too many pointless Bedouins and Jews, it must be said) and projected the ancient Europeans, as well as a whole bunch of Circum-Pacific peoples on it:
The result is a bit weird because, as you can see, the East Asians, Native Americans and Melanesians appear to fall way too close to the peoples of the Caucasus and Anatolia. This seems to be a distorting effect of the “projection” method, which forces the projected samples to align relative to a set of already defined parameters, in this case the West Eurasian (modern) PCA. 
So the projection basically formulates the question: if East Asians, etc. must forcibly be defined in West Eurasian (WEA) terms, what would they be? And then answers it as follows: Caucasian/Anatolian/Iranian peoples more or less (whatever the hidden reasons, which are not too clear).
Similarly, it is possible (but uncertain) that the ancient European and Siberian sequences show some of this kind of distortion. However I have found experimentally that the PCA’s dimension 1 (but not the dimension 2, which corresponds largely to the Asian-specific distinctions) still correlates quite well with the results of other formal tests that the authors develop in the study and is therefore a valuable tool for visualization.
But more on this later. For the moment, the PCA is asking and answering three or four questions by projecting ancient European and Siberian samples onto the West Eurasian plot:
  • If ancient Siberians are forced to be defined in modern WEA terms, what would they be? Answer: roughly Mordvins (Afontova Gora 2) or intermediate between these and North Caucasus peoples (Mal’ta 1).
  • If ancient Scandinavian hunter-gatherers are forced in modern WEA terms, what would they be? Answer: extreme but closest (Skoglund) to Northern European peoples like Icelanders or Lithuanians.
  • If ancient Western European hunter-gatherers are forced in modern WEA terms, what would they be? Answer: extreme too but closest (La Braña 2) to SW European peoples like Basques and Southern French.
  • If ancient Neolithic/Chalcolithic farmers from around the Alps and Sweden are forced in modern WEA terms, what would they be? Answer: Canarians (next close: Sardinians, then Spaniards).
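The "forced to be defined in WEA terms" logic can be illustrated with a toy computation. This is only a minimal sketch, assuming a plain SVD-based PCA on a made-up matrix; it is not the study's pipeline, and every name and number in it is hypothetical. It shows one way a projection can distort: variation that the reference panel does not have is simply invisible on axes defined by that panel, so a very different sample can land in the middle of the plot.

```python
import numpy as np

def fit_pca(reference, n_components=2):
    """Define PCA axes from the reference panel only (rows = samples)."""
    mean = reference.mean(axis=0)
    # Rows of vt are the principal axes of the centered reference matrix.
    _, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(samples, mean, axes):
    """Place samples on axes that were defined without them."""
    return (samples - mean) @ axes.T

# Hypothetical reference panel: 20 samples that vary only in columns 0 and 1.
rng = np.random.default_rng(0)
reference = np.zeros((20, 5))
reference[:, 0] = rng.normal(0.0, 3.0, size=20)
reference[:, 1] = rng.normal(0.0, 1.0, size=20)

mean, axes = fit_pca(reference)

# Hypothetical outsider: differs strongly, but only in column 2,
# a direction in which the reference panel does not vary at all.
outsider = mean + np.array([0.0, 0.0, 10.0, 0.0, 0.0])
outsider_coords = project(outsider[np.newaxis, :], mean, axes)

# The outsider lands at (numerically) the origin of the reference plot.
print(outsider_coords)
```

Real genotype data are far messier than this, so the sketch should be read as an intuition for why projected samples need cautious interpretation, not as an account of what happened in the published plot.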
Whatever the case, there seems to be quite a bit of autosomal diversity among ancient Western hunter-gatherers, at the very least when compared with modern peoples. This makes good sense because Europe was already a big place in Paleolithic times and must have harbored some notable diversity, a diversity that we may well fail to grasp if we only sample people from the same areas again and again.
On the other hand, they seem to cluster in the same extreme periphery of the European cluster, opposite to the position of West Asians, which suggests that there has been some West Asian genetic flow into Europe since then (something we all assume, of course). 
Using Loschbour as proxy for the WHG (Western hunter-gatherer) component, Mal’ta 1 as proxy for the ANE (ancient north Eurasian) one and Stuttgart as proxy for the EEF (early European farmer) one, they produce the following graph (to which I added an important note in gray):
The note in gray is mine: it highlights the contradictory position into which the other Western hunter-gatherers may fall when Loschbour is assumed to be a valid proxy, although it is clearly very extreme. This was not tested in the study, so it is inferred from PC1, which seems to best approach the results of their formal tests on the WHG vs EEF axis, as well as those of the WHG vs Near East comparisons.
I tried to figure out how these formal tests are reflected, if at all, in the PCA, mostly because the PCA is a much easier tool to comprehend, being so visual. Eventually I found that dimension 1 (horizontal axis) is very close to the genetic distances measured by the formal tests (except those for the ANE component, obviously), allowing a visualization of some of the possible problems caused by their use of Loschbour as the only reference, without any control. Let’s see it:

The same PCA as above with a few annotations in magenta and green
While not exact, the dashed vertical magenta line (median in dimension 1 between Loschbour and Stuttgart) approximates quite well the WHG vs EEF values measured in the formal tests. Similarly, the dashed green axis (median in PC1 between Loschbour and a good-looking Bedouin) approximates to a great extent the less precise results of the formal tests the authors applied to guesstimate the West Asian and WHG ancestry of EEFs, which ranged between 60% and almost 100% West Asian (my line is much closer to the 60% value, which seems more reasonable). 
When I tried to find an alternative median WHG/West Asian line, using Braña 2 and the first non-Euro-drifted Turk I could spot (Anatolia is much more likely to be the direct source of West Asian ancestry in Europe than Bedouins), I got exactly the same result, so no need to plot any second option (two wrongs sometimes do make one right, it seems). But when I did the same with La Braña 2 and Stuttgart I got a genuine good-looking alternative median line, which is the slash-and-dot magenta axis.
This alternative line is probably a much more reasonable 50% WHG-EEF approximation in fact and goes right through Spain, which makes good sense for all I know.
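Reading a median line off PC1 amounts to linear interpolation between two reference samples. The following is a minimal sketch of that arithmetic, assuming PC1 position is a usable stand-in for the formal-test values (which is only an approximation); the coordinates are hypothetical and NOT read from the figure.

```python
def median_line(start, end):
    """PC1 coordinate halfway between two reference samples."""
    return (start + end) / 2.0

def fraction_along_axis(x, start, end):
    """Position of x between two references: 0.0 at `start`, 1.0 at `end`."""
    if start == end:
        raise ValueError("reference coordinates must differ")
    return (x - start) / (end - start)

# Hypothetical PC1 coordinates, chosen only to illustrate the effect.
extreme_hg_proxy = -1.0    # a very divergent hunter-gatherer reference
moderate_hg_proxy = -0.5   # a less divergent hunter-gatherer reference
farmer_proxy = 0.5
modern_sample = -0.25

# With the extreme proxy the sample sits exactly on the median line...
print(fraction_along_axis(modern_sample, extreme_hg_proxy, farmer_proxy))
# ...with the moderate proxy the same sample looks far less farmer-like.
print(fraction_along_axis(modern_sample, moderate_hg_proxy, farmer_proxy))
print(median_line(extreme_hg_proxy, farmer_proxy),
      median_line(moderate_hg_proxy, farmer_proxy))
```

The point of the toy numbers is only that the choice of hunter-gatherer reference moves the 50% line, which is why the choice of a single, very extreme proxy matters.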
Of course the ideal solution would be for someone to perform good formal tests, similar to those done in the study, with Braña 2 and/or Skoglund, which should be more similar to the actual WHG ancestry of modern Europeans than the extremely divergent Loschbour sequence. An obvious problem is that La Braña produced only very poor sequences but, well, use Skoglund instead or sample some other Franco-Cantabrian or Iberian Paleolithic remains.
Whatever the solution, I think that we do have a problem with the use of Loschbour as the only WHG proxy and that it demands some counter-testing. 
What about the ANE component? I do not dare to give any alternative opinion because I lack tools to counter-analyze it. What seems clear is that its influence on modern Europeans seems almost uniformly weak and that it can be ignored for the most part. As happens with the WHG, it’s quite possible that the ANE would be enhanced if the sequence from Afontova Gora were used instead of that of Mal’ta, but I can’t foresee by how much. 
Finally some speculative food-for-thought. Again using the visual tool of the PCA, I spotted some curiosities:

Speculative annotations on the PCA

Most notably it is apparent that the two WHG populations (Western and Scandinavian) are aligned in natural axes, which seem to act as clusters. Extending both (dotted lines), they converge at a point closest to some French, notably the only “French” that tends towards “Southern France” and Basques. So I wonder: is it possible that these two WHG cluster-lines represent derived ancient branches from an original population of SW France? We know that since the LGM, the area of Dordogne (Perigord) was like the megalopolis of Paleolithic Europe, with population densities that must have been several times those of other areas. We know that this region was at the origin of both Solutrean and Magdalenian cultures and probably still played an important role in the Epipaleolithic period. 
So I do wonder: is that “knot” a mere artifact of a mediocre representation or is it something much more real? Only with due research in the Franco-Cantabrian region will we find out. 
 

Neanderthals, Denisovans and everything else

A recent analysis of the nuclear DNA of a Neanderthal toe from Altai has caused widespread interest.
Kay Prüfer et al., The complete genome sequence of a Neanderthal from the Altai Mountains. Nature 2013. Pay per view [doi:10.1038/nature12886]
The story of a finger and a toe
Both the Denisovan and Neanderthal DNA sequences discussed in this paper come from small bones found at the same location: Denisova cave, Altai Republic. The Denisovan sequence that revolutionized human paleogenetics a few years ago corresponds to a finger phalanx bone of some 50,000 years ago. The less famous Neanderthal sequence discussed in this study corresponds to a proximal toe phalanx, which was found in a lower layer in the same gallery of the same cave, and hence should be older.
This is very interesting to underscore because it seems to imply that Neanderthals were in Altai and specifically in Denisova cave very early, at dates similar to those we find in West Asia (Tabun excepted) and they may even be older than Denisovans in the very cave that gave them their name.
The toe sequence was found in a previous study to have Neanderthal mtDNA, closely related to the lineages of European Neanderthals of various dates and sites. Instead the finger mtDNA (Denisovan) was derived from a more ancient branch of humankind than the very point of split between Neanderthals and modern humans (H. sapiens) and has been recently shown to be related to European H. heidelbergensis from Atapuerca.
Notes in red are mine.
This study focuses on the autosomal DNA of both Neanderthals and Denisovans. Unlike mtDNA, whose phylogenetic position is simple and quite straightforward, autosomal or nuclear DNA (nDNA) is much more complex to understand because of its recombining nature, requiring statistical approaches which may get extremely complex and are potentially subject to premise biases. When comparing two individuals this gets largely simplified, but it is a lot more complex when doing the same with larger samples.
And that is precisely what this study does: comparing one Denisovan, several Neanderthals and also several modern humans. Therefore it is a very complex paper and the authors necessarily assume some evaluation risks, which nevertheless are discussed in depth in the supplemental material, a methodology of the Pääbo team that we can’t but greatly appreciate.
Age estimates
The study makes two age estimates, one based on a very conservative and truly unbelievable Pan-Homo split date of 6.5 Ma BP and the other based on observed per generation mutation rates, which happens to be perfectly coincident with a Pan-Homo split of 13 Ma BP, the oldest extreme of Langergraber’s estimate. This coincidence alone is of enough relevance for all molecular clock approaches, because it effectively demands the doubling of all age estimates based on the ridiculously short 6.5 Ma Pan-Homo split supposition. 
Red outlines are mine.
It also produces a semi-reasonable San-West African age estimate of c. 86-130 Ka, although I would think it a bit older in fact, or at the very least at the top end. This highlights the severe difficulties of such molecular clock estimates, because a 4 Ma divergence for the alleged introgressing mystery archaic in the Denisovan genome seems out of the question according to the archaeological and paleontological record, which only documents Homo species since c. 2 Ma ago, half that time (within the estimate but clearly very far from the top end).
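The "doubling" argument above is simple proportionality. As a minimal sketch, assuming a strict molecular clock in which every age scales linearly with the calibration date (a simplification; the function name and the 100 Ka example are hypothetical, not values from the paper):

```python
def rescale_age(age, assumed_split_ma, revised_split_ma):
    """Rescale an age estimate when the Pan-Homo calibration date changes.

    Assumes strict linear scaling: the age keeps the same proportion
    of the calibration split that it had before.
    """
    if assumed_split_ma <= 0:
        raise ValueError("calibration date must be positive")
    return age * revised_split_ma / assumed_split_ma

# Moving the calibration from 6.5 Ma to 13 Ma doubles every estimate.
scaling_factor = rescale_age(1.0, 6.5, 13.0)
print(scaling_factor)

# Hypothetical example: an age quoted as 100 Ka under the 6.5 Ma calibration.
print(rescale_age(100.0, 6.5, 13.0))
```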
Altai Neanderthal inbreeding
An important finding of this study is that the studied individual was extremely inbred, with parents in effective relationship comparable to that of grandparent and grandchild or half siblings. This inbreeding tendency, even if extreme, is not so strange in populations that have experienced founder effect bottlenecks and small population sizes. The Denisovan and the modern human Karitiana people are not so extreme but range in the lower end of double first cousins level of genetic relationship between the parents. Other Native Americans like the Mixe are close to that range, while the other compared populations, Papuans and Sardinians, show much lower levels of inbreeding.
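Why such different-sounding parental relationships are treated as comparable can be seen from expected inbreeding coefficients. The sketch below uses Wright's path-counting formula, assuming the common ancestors are themselves not inbred; it is a textbook pedigree calculation, not the genomic method used in the paper, and the function and dictionary names are mine.

```python
from fractions import Fraction

def inbreeding_coefficient(paths):
    """Expected inbreeding coefficient of a child by path counting.

    `paths` holds one (n1, n2) pair per common ancestor of the parents:
    the number of generations from that ancestor down to each parent.
    Assumes the common ancestors are not inbred themselves.
    """
    return sum(Fraction(1, 2) ** (n1 + n2 + 1) for n1, n2 in paths)

# Relationship between the two parents -> paths through common ancestors.
parental_relationships = {
    "half siblings": [(1, 1)],                # one shared parent
    "grandparent and grandchild": [(0, 2)],   # the grandparent is the ancestor
    "double first cousins": [(2, 2)] * 4,     # four shared grandparents
    "first cousins": [(2, 2)] * 2,            # two shared grandparents
}

coefficients = {name: inbreeding_coefficient(paths)
                for name, paths in parental_relationships.items()}
for name, value in coefficients.items():
    print(f"{name}: {value}")
```

Half siblings, grandparent and grandchild, and double first cousins all give the same expected value (1/8), twice that of ordinary first cousins, so these pedigrees cannot be told apart from the expected coefficient alone.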
Whatever we may think of Altai Neanderthal inbreeding, their drift parameter is still very low when compared with European Neanderthals. This is not discussed in the paper but such extreme drift also seems to imply extreme inbreeding issues in European Neanderthals, even if these may have other causes such as an extremely strong founder effect or whatever.
Bonobo-specific segments were removed, so the bonobo position is not realistic.
Inferred population history
Both populations leading to the Altai Neanderthal and Denisovans, but not modern humans, appear to have gone through a strong decline in population size over hundreds of millennia. The Denisovan decline seems to begin c. 800 Ka ago while the Neanderthal one may have begun c. 500 Ka ago. This is coincident with a general expansion of the H. sapiens branch (still undifferentiated in Africa), peaking around c. 250 Ka ago before differentiation and relative decline. In their words:

All genomes analysed show evidence of a reduction in population size that occurred sometime before 1.0 million years ago. Subsequently, the population ancestral to present-day humans increased in size, whereas the Altai and Denisovan ancestral populations decreased further in size. It is thus clear that the demographic histories of both archaic populations differ substantially from that of present-day humans.

Neanderthal and Denisovan admixture in modern humans

The new tests confirm in essence the previous findings: there is significant Neanderthal introgression in modern humans descending from the migrants out of Africa and there is also significant Denisovan introgression among Australasian populations.

Additionally and with some caution, the authors think that much lesser Denisovan introgression (of around 0.2%) is found among East Asians and that these, as well as Native Americans, show slightly more Neanderthal admixture than West Eurasians. In my understanding this may be caused by minor African flow to West Eurasia after the admixture event (and/or residual “First Arabian” persistence) and I would think that measuring South Asians would help to clarify this issue (because African admixture is negligible in the subcontinent but they are also distinct from East Asians).

These measurements are so weak that the authors attach all kinds of caveats to them in any case.

In addition to all this, the supplemental material (section 13) also detects tiny, almost homeopathic, amounts of Neanderthal gene flow to Yorubas (~0.02%), obviously mediated by H. sapiens backflow from Asia and Europe into parts of Africa, which eventually influenced other African populations. An even more diluted amount may also be present among the Mbuti Pygmies.

Altai Neanderthal admixture in Denisovans

This issue is not really explained in the paper as such, and we have to reach out to the Supplemental Information chapter 15 in order to grasp it.

It is clear that the Altai Neanderthals are closer to Denisovans than other Neanderthals are by approx. the following fractions (directly deduced from the raw affinities listed in fig. S6a.2):

  • 2% more than Mezhmaiskaya
  • 7% more than Vindija (avg.)
  • 9% more than El Sidrón
Feldhofer appears closer instead, but this sequence was not used by the authors in most tests because its quality is too dubious.

In section 15 of the supplementary material, using a complex methodology and lamenting the lack of a second Denisovan sample, which would be most useful, they estimate a minimum of 0.5% (Altai) Neanderthal introgression in Denisovans, with strong warnings that this could well be considerably higher. I don’t know why they do not even consider a more direct approach, but I would dare to guesstimate the introgression to be close to 8% from the above raw data, assuming that there are no further complexities at play, such as other Heidelbergensis introgression in European Neanderthals, etc. The drift parameter (see above) does not seem to be one such complexity because Mezhmaiskaya is almost as drifted as Vindija yet it is consistently much closer, which seems to correspond to its specific relatedness to Altai Neanderthals in mtDNA (and possibly also in nDNA, if it is admixture that causes their pseudo-tree positioning closer to the root, which would be typical).

Note in blue is mine.

Mystery archaic genetic flow into Denisovans

The authors find that some 0.5-8% of the Denisovan genome appears to come from another hominin, which split from the human trunk even earlier.

We caution that these analyses make several simplifying assumptions. Despite these limitations, we show that the Denisova genome harbors a component that derives from a population that lived before the separation of Neanderthals, Denisovans and modern humans. This component may be present due to gene flow, or to a more complex population history such as ancient population structure maintaining a larger proportion of ancestral alleles in the ancestors of Denisovans over hundreds of thousands of years.


Later in the discussion section they ponder further the implications of this finding:

The evidence suggestive of gene flow into Denisovans from an unknown hominin is interesting. The estimated age of 0.9 to 4 million years for the population split of this unknown hominin from the modern human lineage is compatible with a model where this unknown hominin contributed its mtDNA to Denisovans since the Denisovan mtDNA diverged from the mtDNA of the other hominins about 0.7–1.3 million years ago41. The estimated population split time is also compatible with the possibility that this unknown hominin was what is known from the fossil record as Homo erectus. This group started to spread out of Africa around 1.8 million years ago42, but Asian and African H. erectus populations may have become finally separated only about one million years ago43. However, further work is necessary to establish if and how this gene flow event occurred.


Going into the detail of the matter (i.e. supplemental material sections 16a and 16b), one of the key details is that present-day Africans share more derived alleles with Neanderthals than with Denisovans. This can only be explained because Denisovans have other archaic ancestry prior to their apparent divergence from Neanderthals or (which is about the same) because Denisovans themselves diverged prior to the Neanderthal-Sapiens split, which is what the mtDNA (unlike the nDNA) suggests. However the difference, even if consistent across comparisons, is too small (a few percentage points) to be attributed to the latter scenario.

This means that Denisovans appear to be, at nDNA level, some sort of independent branch of proto-Neanderthals with some other but minor archaic admixture. At mtDNA level, instead, they appear to be unrelated to Neanderthals and related to H. heidelbergensis (a detail not discussed in this paper because it is too recent an independent discovery).

There are still many details to explore but, in principle, it would seem that the Denisovan branch appears to be a divergent proto-Neanderthal one (maybe related to the Hathnora hominin, which looks very much Neanderthal) with lesser other archaic (H. heidelbergensis?) admixture, which nevertheless remained prominent in their mtDNA for whatever accidental reason.

Whether the H. heidelbergensis population of Atapuerca responds to this same profile (i.e. they were Denisovans too) or belongs instead to the “other archaic” population which introgressed in the Denisovan genome remains to be solved. So far we only know the mitochondrial lineage and this one may be misleading, as seems to be the case with the Denisova hominin.

Note in red is mine

Modern human genetic evolution

Benefiting from the high quality of the archaic genomes of Altai, the authors cataloged a long list of simple mutations exclusive to our species: 31,389 single nucleotide substitutions and 4,113 short insertions and deletions (indels). Additionally they found another 105,757 substitutions and 3,900 indels shared by 90% of their modern human sample of 1094 individuals.

They suggest some lines for future research in this regard, maybe focusing on genes known to influence brain development or regions that could show signs of positive selection. These preliminary lines of research are explored in SI-20, noticing potential selection in genes that affect the ventricular zone of the brain and cell proliferation in fetal brain development.

 

Siberian haploid DNA

A new study is available with plenty of data on the haploid genetics of Siberian populations with focus on Tungusic peoples.
Anna T. Duggan et al., Investigating the Prehistory of Tungusic Peoples of Siberia and the Amur-Ussuri Region with Complete mtDNA Genome Sequences and Y-chromosomal Markers. PLoS ONE 2013. Open access [doi:10.1371/journal.pone.0081605]
Maybe the most informative graphic is fig. 1, which shows the scatter of mitochondrial DNA:
Figure 1. Map of Siberia showing approximate locations of sampled populations and their basic haplogroup composition.
For the meaning of abbreviations, check table 1.
Typical NE Asian haplogroups like C and D are quite widely distributed, up to the point that it becomes difficult to say much about them. Instead A is more concentrated (Nyukhza, Iengra, both of them Evenks, and Koryaks particularly), while Z does appear to show a similar pattern (but with presence among Kamchatka instead of Koryaks and a relevant distribution in NE Siberia: Berezovka and some Yakuts). 
Haplogroup B is rare instead, only showing up in Southern Yakuts. It must be mentioned in any case because of its relevance in the original peopling of America. 
G is not too common, with the partial exception of G1, which shows an Eastern Siberian concentration.
Y is concentrated among Nivkhs (no surprises here), while F seems most important in Yakutia (like B, it is not a typical Northern lineage; its bulk distribution lies further south).
West Eurasian lineages, marked in brown, are concentrated in the Evenks of Nyukhza, as well as among some Yakuts. Their presence among Yakuts is easy to understand considering their partial Turkic ancestry, but the even larger proportion in Nyukhza seems to me to derive from some other kind of contact with Altai and the steppe, although the authors seem to favor Yakut admixture instead.
Premonitory FAQ: 
What is the difference between “M_N” and “Other”? 
No idea: ask the authors. But I’m quite positive that “Other” cannot mean L(xM,N) but rather “other M and N”. Speculatively, it could indicate the difference between some M and N sublineages they have tested for and others which they did not. It’s sloppy nomenclature in any case.
Y-DNA

[Important post-script note: except for the basal SNP markers for C and N, which were tested for, all the haplogroups are defined based on STR markers, which may be wrong].

Table 4 lists the Y-DNA haplogroups for Evenks, Evens, Yakuts and Yukaghirs only. C3c1 is very dominant in the Tungusic populations: 87/127 among Evenks, 43/89 among Evens, but quite the opposite among Yakuts (1/184) and rather weak also among Yukaghirs (2/13).
Yakuts are dominated by N1c (173/184), a lineage that also has some presence among the other sampled populations: Evenks: 18/127 (Nyukhza and Iengra groups), Evens: 30/89 (particularly Sakkyryyr and Sebjan groups), Yukaghir: 4/13.
Q1 is found mostly among Yukaghirs (4/13), with a single other case among Yakuts.
N1b is also of some importance among Tungusic peoples: 18/127 among Evenks (Taimyr and Stony Tunguska) and 13/89 among Evens (essentially in Tompo).
C3* is found mostly among Nyukhza Evenks (13/78), who also harbor most of the Western lineage I detected in the area (4/78). 
The other meaningful Western lineage spotted is, of course, R1a, which is found in two variants: R1a(xR1a1) is concentrated among Taimyr Evenks (3/18) with only one other sample among Stony Tunguska Evenks (1/40). R1a1 instead is concentrated among Yakuts (4/184).
There are also erratics (isolated single-individual samples) of C*, J2, O and F*.
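For readers who prefer percentages, the raw counts quoted above for the two dominant lineages can be converted with a few lines. This is just a convenience sketch: the counts are the ones cited in the text, while the dictionary layout and function name are mine. Note that the Yukaghir sample (13 individuals) is tiny, so its percentages are very coarse.

```python
# Carrier counts and sample sizes as quoted in the text above.
counts = {
    "C3c1": {"Evenks": (87, 127), "Evens": (43, 89),
             "Yakuts": (1, 184), "Yukaghirs": (2, 13)},
    "N1c": {"Evenks": (18, 127), "Evens": (30, 89),
            "Yakuts": (173, 184), "Yukaghirs": (4, 13)},
}

def percentage(carriers, sample_size):
    """Frequency of a haplogroup in a sample, as a percentage."""
    return 100.0 * carriers / sample_size

for haplogroup, by_population in counts.items():
    for population, (carriers, sample_size) in by_population.items():
        freq = percentage(carriers, sample_size)
        print(f"{haplogroup:5s} {population:10s} "
              f"{carriers:3d}/{sample_size:<3d} = {freq:5.1f}%")
```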
There is also other interesting material in the study but I can only extend myself so much. I strongly recommend reading it for everyone with interest in Siberian and related populations, be these Uralics, Native Americans or generally East and Central Asians.
 

Posted on December 21, 2013 in East Asia, mtDNA, population genetics, Siberia, Y-DNA