Ancient DNA from Clovis culture is Native American (also Tianyuan affinity mystery)

16 Feb
Figure 4 | [c] (…) maximum likelihood tree. 
A recent study on the ancient DNA of human remains from Anzick (Montana, USA), dated to c. 12,500 calBP, confirms close ties to modern Native Americans, definitively discarding the far-fetched and outlandishly Eurocentric “Solutrean hypothesis” for the origins of Clovis culture (which pleases me greatly, I must admit).
While this fits well with expectations (at least mine), there is some hidden data that has surprised me quite a bit: it sits at the bottom of an undiscussed formal test graph in which modern populations are compared with both Anzick and Tianyuan (c. 40,000 BP, North China). See below.
Morten Rasmussen et al., The genome of a Late Pleistocene human from a Clovis burial site in western Montana. Nature 2014. Pay per view. [doi:10.1038/nature13025]


Clovis, with its distinctive biface, blade and osseous technologies, is the oldest widespread archaeological complex defined in North America, dating from 11,100 to 10,700 14C years before present (bp) (13,000 to 12,600 calendar years bp) [1, 2]. Nearly 50 years of archaeological research point to the Clovis complex as having developed south of the North American ice sheets from an ancestral technology [3]. However, both the origins and the genetic legacy of the people who manufactured Clovis tools remain under debate. It is generally believed that these people ultimately derived from Asia and were directly related to contemporary Native Americans [2]. An alternative, Solutrean, hypothesis posits that the Clovis predecessors emigrated from southwestern Europe during the Last Glacial Maximum [4]. Here we report the genome sequence of a male infant (Anzick-1) recovered from the Anzick burial site in western Montana. The human bones date to 10,705 ± 35 14C years bp (approximately 12,707–12,556 calendar years bp) and were directly associated with Clovis tools. We sequenced the genome to an average depth of 14.4× and show that the gene flow from the Siberian Upper Palaeolithic Mal’ta population [5] into Native American ancestors is also shared by the Anzick-1 individual and thus happened before 12,600 years bp. We also show that the Anzick-1 individual is more closely related to all indigenous American populations than to any other group. Our data are compatible with the hypothesis that Anzick-1 belonged to a population directly ancestral to many contemporary Native Americans. Finally, we find evidence of a deep divergence in Native American populations that predates the Anzick-1 individual.

Haploid DNA
The Y-DNA lineage of Anzick is Q1a2a1* (L54) to the exclusion of the common Native American subhaplogroup Q1a2a1a1 (M3). Among the modern sequences compared, that of a Maya individual is the closest.

The mtDNA belongs to the common Native American lineage D4h3a at its underived stage (root). 
For starters I must explain that these underived haplotypes can only be found within mtDNA and never in modern Y-DNA (a common misconception), because the Y chromosome accumulates mutations every single generation, while the much shorter mtDNA does so only occasionally. Hypothetically we could find the exact ancestor of some modern Y-DNA haplogroup in ancient remains, but that would be like finding the proverbial needle in a haystack. On the other hand, finding the underived stage in mtDNA, be it ancient or modern, does not mean that we are looking at a direct ancestor, just at a non-mutated relative of hers, who can in fact be very distant.
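
The argument above boils down to expected mutation counts: number of sites times per-site mutation rate. A minimal Python sketch of that arithmetic follows. The function name is hypothetical, and the lengths and rates are rough, order-of-magnitude placeholders that I am assuming for illustration; they are not taken from the Anzick paper.

```python
def expected_mutations_per_generation(sites, rate_per_site_per_generation):
    """Expected number of new mutations passed on in one generation."""
    return sites * rate_per_site_per_generation


# ASSUMED, illustrative values only (orders of magnitude, not from the paper).
Y_SITES = 20_000_000  # very roughly the usable male-specific Y sequence, in bp
Y_RATE = 3e-8         # assumed mutations per site per generation
MT_SITES = 16_569     # length of the human mitochondrial genome, in bp
MT_RATE = 4e-7        # assumed mutations per site per generation

y_expected = expected_mutations_per_generation(Y_SITES, Y_RATE)
mt_expected = expected_mutations_per_generation(MT_SITES, MT_RATE)

# How many generations pass, on average, between two mtDNA mutations.
generations_per_mt_mutation = 1 / mt_expected

print(f"Y-DNA: about {y_expected:.2f} new mutations per generation")
print(f"mtDNA: about one new mutation every {generations_per_mt_mutation:.0f} generations")
```

With inputs of this order, a Y-DNA lineage picks up a new mutation in most generations, while an mtDNA lineage can go well over a hundred generations unchanged, which is why an underived mtDNA haplotype can survive for so long.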

Autosomal DNA

In this aspect, the Anzick individual clearly shows the strongest affinities to Native Americans, followed at some distance by Siberian peoples, particularly those near the Bering Strait. 

Figure 2 | Genetic affinity of Anzick-1. a, Anzick-1 is most closely related to Native Americans. Heat map representing estimated outgroup f3-statistics for shared genetic history between the Anzick-1 individual and each of 143 contemporary human populations outside sub-Saharan Africa. (…)
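
For readers unfamiliar with the outgroup f3-statistic named in the caption: in its simplest form it is the average, over SNPs, of the product of the allele-frequency differences between an outgroup and each of two test populations. A larger value means more shared genetic drift, that is, closer affinity. The Python sketch below is a toy version under stated assumptions: the function name and every frequency are made up for illustration, and real estimates also correct for sample size and compute standard errors, which is omitted here.

```python
from statistics import mean


def outgroup_f3(outgroup, pop_a, pop_b):
    """Naive outgroup f3: mean over SNPs of (o - a) * (o - b).

    Each argument is a list of allele frequencies (0.0 to 1.0), one per SNP,
    in the same SNP order. No sample-size bias correction is applied.
    """
    if not (len(outgroup) == len(pop_a) == len(pop_b)):
        raise ValueError("all populations must cover the same SNPs")
    return mean((o - a) * (o - b) for o, a, b in zip(outgroup, pop_a, pop_b))


# Toy, invented allele frequencies at four SNPs (hypothetical populations).
outgroup = [0.0, 1.0, 0.0, 1.0]
ancient = [1.0, 0.0, 1.0, 1.0]   # a single ancient genome, coded as 0 or 1
pop_x = [0.9, 0.1, 0.8, 1.0]     # tracks the ancient genome closely
pop_y = [0.2, 0.9, 0.1, 1.0]     # stays closer to the outgroup

f3_x = outgroup_f3(outgroup, ancient, pop_x)
f3_y = outgroup_f3(outgroup, ancient, pop_y)

# pop_x shares more drift with the ancient genome, so its f3 is higher.
print(f3_x > f3_y)
```

A heat map like the one in the figure is, in essence, this number computed once per modern population and then coloured by value.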
However Anzick-1 shows clearly closer affinity to the aboriginal peoples of Meso, Central and South America (collectively labeled as SA) and less so to those of Canada and the American Arctic (labeled as NA). No data was available from the USA. 
This was pondered by the authors in several competing models of Native American ancestry:
Figure 3 | Simplified schematic of genetic models. Alternative models of the population history behind the closer shared ancestry of the Anzick-1 individual to Central and Southern American (SA) populations than Northern Native American (NA) populations; see main text for further definition of populations. We find that the data are consistent with a simple tree-like model in which NA populations are historically basal to Anzick-1 and SA. We base this conclusion on two D-tests conducted on the Anzick-1 individual, NA and SA. We used Han Chinese as outgroup. a, We first tested the hypothesis that Anzick-1 is basal to both NA and SA populations using D(Han, Anzick-1; NA, SA). As in the results for each pairwise comparison between SA and NA populations (Extended Data Fig. 4), this hypothesis is rejected. b, Next, we tested D(Han, NA; Anzick-1, SA); if NA populations were a mixture of post-Anzick-1 and pre-Anzick-1 ancestry, we would expect to reject this topology. c, We found that a topology with NA populations basal to Anzick-1 and SA populations is consistent with the data. d, However, another alternative is that the Anzick-1 individual is from the time of the last common ancestral population of the Northern and Southern lineage, after which the Northern lineage received gene flow from a more basal lineage.
The model they consider most plausible is “c”, in which Anzick-1 is close to the origin of the SA population, while NA diverged before him. However, model “d”, in which Anzick-1 is close to the overall Native American root but NA have received further inputs from a mystery population (presumably some Siberians, related to the Na-Dené and Inuit waves), is also consistent with the data. Choosing between the two “consistent” models (or something in between) clearly requires further investigation. 
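
To make D-tests such as D(Han, Anzick-1; NA, SA) more concrete, here is a minimal Python sketch of a D-statistic computed from allele frequencies, using the normalised (p1 - p2)(p3 - p4) form. It is an illustration under assumptions, not the authors' pipeline: the function name and the toy frequencies are invented, the sign convention varies between software packages, and the block-jackknife Z-score needed to actually reject a topology is left out.

```python
def d_statistic(p1, p2, p3, p4):
    """D(P1, P2; P3, P4) from per-SNP allele frequencies.

    Numerator:   sum of (p1 - p2) * (p3 - p4)
    Denominator: sum of (p1 + p2 - 2*p1*p2) * (p3 + p4 - 2*p3*p4)
    A value near zero is what the tree ((P1, P2), (P3, P4)) predicts.
    """
    if not (len(p1) == len(p2) == len(p3) == len(p4)):
        raise ValueError("all populations must cover the same SNPs")
    num = sum((a - b) * (c - d) for a, b, c, d in zip(p1, p2, p3, p4))
    den = sum((a + b - 2 * a * b) * (c + d - 2 * c * d)
              for a, b, c, d in zip(p1, p2, p3, p4))
    if den == 0:
        raise ValueError("no informative SNPs")
    return num / den


# Toy, invented frequencies at four SNPs (not real data).
han = [0.0, 0.0, 1.0, 0.5]
anzick = [1.0, 1.0, 0.0, 0.5]
na = [0.5, 0.4, 0.6, 0.5]
sa = [0.9, 0.8, 0.1, 0.5]

d_test = d_statistic(han, anzick, na, sa)

# In this convention a clearly positive value means P2 (Anzick) shares more
# alleles with P4 (SA) than with P3 (NA), so the tree ((Han, Anzick), (NA, SA))
# does not fit the toy data.
print(d_test > 0)
```

Swapping the last two populations only flips the sign, and two identical populations in those slots give exactly zero, which is the tree-like outcome the caption calls “consistent with the data”.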

Tianyuan and East Asian origins
All the above is very much within expectations, although refreshingly clarifying. But there is something in the formal tests (extended data fig. 5) that is most unexpected (but not discussed in the paper). 
The formal f3 tests of ED-fig. 5 a to e all fall within reasonable expectations. Maybe the most notable finding is that, after all, the pre-Inuit Paleo-Eskimo people of the Saqqaq culture (represented by the Saqqaq remains) left some legacy in Greenland, but they also show some extra affinity with several Siberian populations (notably the Naukan, Chukchi, Koryak and Yukaghir, in this order) ahead of any Native Americans, including Aleuts. 
But the really striking stuff is in figs. f and g, where it becomes obvious that the Tianyuan remains of Northern China show not a tad greater affinity to East Asians (nor to Native Americans) than to West Eurasians. Also, two East Asian populations (Tujia and Oroqen) are considerably more distant than the bulk of East Asian peoples from Tianyuan, but also from Anzick.
Extended Data Figure 5 | Outgroup f3-statistics contrasted for different combinations of populations. (…) f, g, Shared genetic history with Anzick-1 compared to shared genetic history with the 40,000-year-old Tianyuan individual from China.
This is very difficult to explain, more so as Tianyuan’s mtDNA haplogroup B4’5 is part of the East Asian and Native American genetic pool, and the authors make no attempt to do so. 
The previous study by Qiaomei Fu et al. (open access) placed Tianyuan’s autosomal DNA near the very root of Circum-Pacific populations (East Asians, Native Americans and Australasian Aborigines) but after divergence from West Eurasians:
From Qiaomei Fu 2013
They even had doubts about the position of Papuans (the only Australasian representation) in that tree, which they suspected to be an artifact of some sort.
Since I saw that graph (h/t to an anonymous commenter at Fennoscandian Ancestry) I have been squeezing my brain trying to figure out a reasonable explanation, considering that the formal f3 test almost certainly carries more weight than the ML tree made with an algorithm. 
My first tentative explanation would be to imagine a shared triple-branch origin for Tianyuan, East Asians and West Eurasians, maybe c. 60 Ka ago (it must have been before the colonization of West Eurasia), to the exclusion of other, maybe isolated, ancient populations, whose admixture with the ancestors of the Tujia, Oroqen and Melanesians (maybe via Austronesians?) causes those strikingly low affinity values for these.
This would be a similar mechanism to the one explaining lower Tianyuan (and generally all ancient Eurasian) affinity for Palestinians (incl. Negev Bedouins) and also the Makrani, who have some African admixture and (in the Palestinian case) also, most likely, residual inputs from the remains of the first Out-of-Africa episode in Arabia.
However to this day we have no idea of which could be those hypothetical ancient isolated populations of East Asia. In normal comparisons such as ADMIXTURE analysis the Tujia and Oroqen appear totally normal within their geographic context, but this may be an artifact of not doing enough runs to reach the higher K values which, according to the cross-validation test, are much more likely to discern the actual realistic components. 
The matter certainly requires further research, which may well open new avenues for understanding the genesis of Eurasian populations, particularly those from the East.

41 responses to “Ancient DNA from Clovis culture is Native American (also Tianyuan affinity mystery)”

  1. andrew

    February 18, 2014 at 1:30 am

    Would Tianyuan as a 49-51ish West-East hybrid between a West Eurasian man with thin East Asian NRY-DNA patriline ancestry, and a full blooded East Eurasian woman resolve the issue?

  2. Maju

    February 18, 2014 at 2:39 am

    I don't see any reason to think so because, for the little I know from Tianyuan archaeological context, as well as from the only likely interaction route which is via Altai, the West Eurasian influences are proto-Amerindian in nature and carried “mode 4” technology with them. Your suggested date for the hypothetical admixture event (“49-51ish”, Ka ago I understand) is barely before the UP and the West Eurasian formation as a distinct population, so I cannot fully discard it but seems so unlikely that I'm most reluctant to accept it without any specific evidence.

    I rather think of some sort of frontier population, maybe of Ainu-like characteristics (remember that Ainu phenotype has often been compared with West Eurasians and “Australoids”, although they are most likely just “something else”), which was later displaced by the mainstream East Asian population, without noticeable admixture.

    I would also imagine other such isolate populations, of even older divergence, possibly living in another frontier zone, such as the areas looking to the Tibetan plateau or the Gobi, whose admixture into the Tujia and Oroqen causes them to look even less Tianyuan-like than most Eurasians. This part is at least as much of a mystery as the issue of both main Eurasian branches being equally distant from Tianyuan but I think both are (always tentatively) best explained by the presence of some more isolated populations stemming from various nodes in the Early Asian tree of humankind, which have left differential legacies but mostly have vanished as such by now.

  3. John Rudmin

    February 18, 2014 at 3:33 am

    “However to this day we have no idea of which could be those hypothetical ancient isolated populations of East Asia.”

    Here's just one possible element: Google the words “Oroqen Mamanwa mtDNA N11”, click on the first site you see, and then of course Control-F (find) “Oroqen” to find the comment.

    By the way, I have a question, speaking of…that site (zeta…) is one I commonly see when I google a language/ethnicity/archaeological culture. For some time I was afraid to click on it because some impression (or mis-impression?) I had sometime ago made me fear associating with one of “those” sites (i.e. the kind of people who trolled your blog last summer). I also had the same fear about eupedia, but seem to have been mistaken.
    So my question is: Which sites should definitely be caveat emptor? (If you don't want their names to cause your blog to show up by association on a google search, you can put them in a kind of logical code and I will probably figure out which ones you are talking about. 😉

  4. Maju

    February 18, 2014 at 4:47 am

    That's pretty interesting information indeed: N11 in Yunnan (a likely corridor but also refuge in the early Asian human expansion context), with some spillover to Sichuan, Tibet, the Dongxiang (a Mongolic population living North of Qinghai) and the Oroqen (other Mongols from much further NE). They also mention possible presence in Indochina but no specific data.

    It's the first time I see that site but I can tell you that Black Man and Ibra are among the best possible anthropological “forumers” you can find around. I used to be in a small but very erudite older forum with them before I began blogging in 2008 and learned a lot. I did have problems with the owner of that forum, Ren, because of various reasons complex to explain in a comment but even Ren is a very knowledgeable person (just that arrogant, maniacal and sometimes absurdly sinocentric). That other forum closed one or two years ago for what I know but I'm glad that they have created another such space.

    Of course the presence of TerryT (a mere member for what I see) is a concern: he knows some stuff but he ALWAYS has to be “right”, doesn't know when to simply stop a discussion and, his worst sin, he often puts words in your mouth or manipulates your ideas beyond recognition. He used to be a very active (hyperactive) commenter in this blog and its predecessor but he eventually became too much of an annoyance and when he began (again) to manipulate my words and near-stalk me with his “I am right” pretensions, I decided to ban him for good.

    He still tries to post here occasionally with that kind of stuff but, on what regards me, he's off limits for this blog. Sadly he is the main reason why I still have to keep the comment moderation because he won't give up.

    But I can give you all kind of guarantees and commendations in regards to Black Man and Ibra: they are great guys and very knowledgeable people in matters of anthropology and population genetics, from many years now. In addition to that, the fact that both are Asian or have Asian origins adds a refreshing non-Eurocentrism to their approaches.

    I will try to hold watch of that forum actually. I just did not know it even existed.

    “Which sites should definitely be caveat emptor?”

    Can't tell you really. I used to dwell in anthropology forums in the past and many have “issues” (Eurocentrism, excessive racialism, Zionism and that guy who copy-pastes everything that Dienekes posts) but it's sometimes hard to discern until you have been in them some time. But I have been away from all them for years now so I am not up to date.

    Maybe you could throw some names around and, if I know something, I can give you an opinion (always from some distance). I have no caveats about discussing names for sure. I wouldn't have a mouth if I was meant to remain silent.

  5. Ebizur

    February 18, 2014 at 11:42 am

    Maju wrote,

    “Oroqen (other Mongols from much further NE)”

    The Oroqen are not Mongols, though about half of them do live in Hulunbuir, the northernmost section of Inner Mongolia. Nearly all the rest live in Heilongjiang (“Black Dragon=Amur”) Province immediately to the east.

    The Oroqen are linguistically Evenk (Northern Tungusic); according to some researchers, the dialects spoken by the Oroqens in China are more similar to the dialects spoken by some Evenks in Russia than the dialects spoken by most of the Evenks in China (Solons) are to the dialects spoken by Evenks in Russia.

    The fact that the Elunchun-zu (Oroqen people) have been categorized as a separate ethnic group, whereas the Solons have been categorized as (and actually form the core of) the PRC's Ewenke-zu (Evenk people), is a fluke of state-sponsored ethnologic history.

    As far as I know, the word “Oroqen” (“Elunchun” in Chinese) is not properly an ethnonym, but rather an occupational descriptor meaning something like “reindeer keepers” or “people who have reindeer” (< Evenk oro(n) "reindeer" + Evenk -chi, a suffix that is used in the same way as -chi is used in Mongolian and Turkish to form occupational nouns, e.g. Temüǰin "Genghis Khan's personal name" < temür "iron" + -chi, i.e. "person who has iron" or "blacksmith"). I suppose that now, since it has been dignified with the official stamp of the PRC, an "Oroqen" may not necessarily keep reindeer, but historically, the word had no significance other than indicating Tungusic speakers who were in the habit of keeping reindeer. A similarly-formed word is used to refer to several other groups of Tungusic speakers who do not necessarily have an especially close relationship with the Oroqen; these groups merely share the two characteristics of being Tungusic speakers and traditionally (at some time) having kept reindeer as a means of making a living.

  6. John Rudmin

    February 18, 2014 at 12:45 pm

    Thanks. Well of course the main caveat emptor is the one I naively stumbled onto a few years ago, and even made a comment on it before I realized what it was about, would you believe! (It's name is basically a meteorological phenomenon.) But then, after that, I think the other sites I left alone were ones that, when I would google image something like “Hazara people”, I would see tons of pics of blue eyed specimens that someone had clearly posted to sell the idea of certain traits, rather than a random sampling. But in retrospect, that was probably just some enthusiastic commenter, and not indicative of an agenda for the entire blog.

  7. Maju

    February 18, 2014 at 1:32 pm

    “The Oroqen are not Mongols”…

    Oops, thanks for the correction.

    What you say about the peculiarities of Oroqen makes them an even more intriguing population because, according to your description, they are more related to Siberian Tungusic peoples than to those from Manchuria. I wonder if there are some Siberian populations which would score also low for Tianyuan and (to lesser extent) Anzick affinity. I cannot find any such low-Anzick affinity population in the other graphs however, although there's a dot in the map, which probably represents the Nivkh (Sakhalin), which scores lower than the rest.

    In a previous mtDNA study (Duggan 2013), the Udege (Russian Manchuria) have very high scores of “M_N”, however it was not N11 but rather Japan-like clades such as N9b, M9, M7a and M8. However several other Siberian pops. have high frequencies of the “other” category, which must be M/N but untested for subhaplogroups.

    In the Mal'ta paper we can see how Siberians differ among them in their affinities to Native Americans and other populations, although the main divide seems to be between those who are mostly intermediate between East Asians and Native Americans and those who are rather intermediate between East Asians and West Eurasian (with tendency rather to MA-1 than to modern Westerners).

    In the Derenko 2012 paper, M11 was spotted rarely (n=3) and scattered in Siberia from Altai to Japan (according to my notes), in populations coded as FJ, QK and ZQ. This adds up to n=4 in fact so maybe the FJ is from Japan? Also they mention a new N11d sublineage among Teleuts. I don't have time right now to check all the data but maybe someone else can. In any case there is indeed some N11 (rare) in Siberia.

  8. Maju

    February 18, 2014 at 1:41 pm

    A meteorological phenomenon? You must be talking about that infamous Nazi hideout Stormfront. That's not an anthropological forum by any means but a political forum/cult with the worst intentions. For what I know, their only pseudoscientific interest on anthropology is to manipulate this respectable science in order to justify white racism.

    I guess you realized soon enough.

  9. John Rudmin

    February 18, 2014 at 8:58 pm

    Indeed. I had just googled some Indo-European culture, and a few words in the discussion piqued my curiosity, so I plunged in. After that blunder, I generally refrained from clicking on sites that produced google results consisting of pictures of racial types with captions ending in “-oid”. (As is sometimes seen on google results for s6.zetaboards.)

  10. Maju

    February 18, 2014 at 10:04 pm

    Hehehe! Pretty much understandable. I'm not fan of the “-oids” either, although there's some people who like that stuff and is perfectly fine anyhow.

  11. andrew

    February 19, 2014 at 2:11 am

    “Your suggested date for the hypothetical admixture event (“49-51ish”, Ka ago I understand) “

    I apologize for being unclear. I was referring to percentages of admixture in the individual with 49%ish West Eurasian admixture (with the sub-50% allowing for a father with very thin, but patrilineal East Eurasian ancestry) and 51%ish East Eurasian admixture from an East Eurasian mother. Thus, a first generation mixed ancestry individual with a predominantly West Eurasian father and a predominantly East Eurasian mother.

  12. Maju

    February 19, 2014 at 2:25 am

    OK, understood now. But I see no reason to think so: DNA of Western origin in the area is limited to yDNA Q in essence and it seems much more reasonable to understand this flow in the context of the proto-Amerindian expansion and the spread of mode 4 (blade tech, “Upper Paleolithic”) Eastwards what only happens since c. 30 Ka BP. If that would be the case Tianyuan would be closer to Native Americans than to West Eurasians and he is not. Also a similar pattern would be observed in relation to Mal'ta and it does not happen.

    Also in Altai, the obligate source of this process, mode 4 is from c. 47 Ka BP, i.e. before Tianyuan, and earlier was Mousterian, i.e. Neanderthal and/or “Denisovan”. Neanderthal admixture is also excluded because it was tested for in Tianyuan and is perfectly normal.

  13. Maju

    February 19, 2014 at 2:28 am

    PS- I wrote:

    “If that would be the case Tianyuan would be closer to Native Americans than to West Eurasians and he is not.”

    But I think it's better:

    “If that would be the case Tianyuan would be closer to Native Americans than to West Eurasians OR TO EAST ASIANS and he is not”.

  14. Matt

    February 19, 2014 at 1:13 pm

    In response to andrew, Uygurs and Hazara are not more closely related to Tianyuan than other populations, yet by conventional estimates are around 50:50 descent from recent East Asians and recent West Eurasians.

    If Tianyuan is descended in proportions of 50:50 West Eurasian and East Asian wouldn't we expect populations like Uygur and Hazara that also are to approach greater shared drift with it?

    Also @ Maju, this is very interesting spot! (if it holds up in future and isn't an artifact of sequencing or anything).

  15. Maju

    February 19, 2014 at 1:35 pm

    Good point, Matt.

    “if it holds up in future and isn't an artifact of sequencing or anything”…

    We'll see, of course, but the study is high quality and formal f3 tests seem the best way to detect affinity levels without artifacts caused by sampling strategies, algorithm's nature or endogamous hyper-drift in some populations. I can only imagine (hope?) that the team has actually spotted this issue and maybe they are already working on this anomaly through further tests and analysis.

    Being a tad optimistic, I would expect to see something more clarifying in a year or so, maybe even with further East Asian aDNA if goddess Fortuna smiles upon us. Cross your fingers.

  16. John Rudmin

    February 19, 2014 at 5:13 pm

    “…there's some people who like that stuff and is perfectly fine anyhow.” Sure, I myself have quite a weakness for the pastime of pondering ethnic appearances. I find it almost bewitching to wonder at. I was just saying, those who too assertively CATEGORIZE appearances had me mistaking them for a potentially icky site.

  17. Kristiina

    February 20, 2014 at 5:37 pm

    I refer to my earlier posts on your blog.
    Oroquen mtDNA: D4 67.2%, D5 1.6%, C1a 7%/11.5% (Derenko)
    Oroquen mtDNA: A 2/44, B4 1/44, CZ 14/44, D(xD5) 7/45, D5 5/44, F1 2/44, G(xG2) 5/44, N 1/44
    Modern Oroquen yDNA: C3c 13/31, C3d (?)(xC3e, C3c) 6/31, K* 1/31, N1b 2/31, O* 1/31, O2 2/31, O3(xO3a3c) 5/31, O3a3c 1/31

    Oroquen seem to have an extremely high frequency of D. Oroquen D haplotypes are the following:
    D4l (?) 145-223-362-368 x3; Yakuts
    D4e4a* 223-291-362 x7; Yakuts
    D4g2a1 223-274-362 Barghuts, Khamnigans
    D4b1a1 223-319-362 x2; Barghuts
    D2c/D4e1 223-270-362
    D4j4 148-223-263-362 Barguts
    D2c/D4e2 129-223-274-291-311-362 Even

    It seems that they share haplotypes in particular with Yakuts. Anzick mtDNA was also D, more specifically D4h3a, but this haplotype is not shared with Oroquen. Parallel branches to D4h3 are found in Buryats, Barguts and Ulchi.

    It is interesting that they carry C1 which is typical of Native Americans, and, however, they are pulled away from Anzick. Their clade is of course the Asian C1a that is not found in Amerinds.

    The frequency of C3c is highest in Oroquen. Here you have some C3c frequencies: (Chong paper)
    Ewenki, Inner Mongolia, 19.3%
    Manchu, Heilongjiang, 12%
    Mongolian, Inner Mongolia, 17.39%
    Kyrgyz, Xinjiang
    Hezhe, Heilongjiang, 11.11%
    Oroquen, Inner Mongolia, 42%
    Outer Mongolian, 18.69%

    According to the Siberian haploid DNA paper, C3c1 is found in very high frequencies in Evenks and Evens, in moderate frequency in Yukaghir and detected also in Yakuts. N1b which is present in Oroquen is also found in Stony Tunguska Evenks and TOM Evens.

    If Oroquen were reindeer herders, they probably came from the northwest, and that would also explain why they share mtDNA with Buryats, Barguts and Yakuts.

    Does this support the idea that yDNA Q went first to America, and C3b only later on, as is usually assumed? Anyway, Tujia have the highest frequency of yDNA C in China, 31%, of which 19% is C3 and the rest is C*. I wonder if this means that yDNA C was not present in North China at the time of Tianyuan. Is it possible that Tianyuan man carried yDNA Q? In the article ”The Dentition of American Indians”, the authors Christy G Turner II and G. Richard Scott (2007) argue that ”broadly speaking, the pre-Arctic ancestral homeland of Paleo-Indians must have been in north China, Mongolia and Southern Siberia. It is easy to envision the newly evolved Sinodonts quickly expanding into northeastern Siberia after they succeeded in domesticating the dog for hunting and hauling, perhaps drifting north out of China via the Vitim River system.”

    I think it is funny if in palaeolithic times yDNA Q was found in China and C in Europe.

  18. Maju

    February 20, 2014 at 11:15 pm

    Thanks Kristiina. Before my mind blows up, I thought that maybe the best is to try to compare Tujia and Oroqen in the haploid lineages aspect and try to figure out if they have something in common and what it is.

    According to Bo Wen et al. 2004, the Tujia (three different samples) have the following frequencies for Y-DNA:

    Tujia1 (n=68) 10 C, 2 F*, 7 K*, 20 O3, 18 O3e, 5 O1, 6 O2a
    Tujia2 (n=38) 2 C, 1 D1, 9 K*, 15 O3, 4 O3e, 6 O1, 1 P*
    Tujia3 (n=49) 12 C, 1 D1, 4 K*, 15 O3, 11 O3e, 4 O1, 2 P*

    (C-M130 D*-YAP D1-M15 F*-M89 K*-M9 O3*-M122 O3e-M134 O1-M119 O2a-M95 P*-M45)

    For mtDNA (pops. 20 and 21 pooled, hgs simplified, see table 3 for further details): 9 A, 9 B4, 7 B5, 9 C, 19 D, 20 F, 5 G, 4 M*, 4 M7, 2 M9a, 4 N*, 4 N9a and 2 R9a.

    Do they have anything in common with Oroqen at all?

    “Tujia have the highest frequency of yDNA C in China, 31%, of which 19% is C3 and the rest is C*”…

    It may depend on the particular sample but roughly so.

    “I wonder if this means that yDNA C was not present in North China at the time of Tianyuan.”

    Then what was there from the Native side? I would say it was C3 and maybe other lineages we find in the Neolithic or later, such as N and maybe D. Just that the proto-Amerindian population was dominated by “Western” Q1 and, because of patrilocality, that remained that way in essence, in spite of dramatic admixture on the mtDNA and autosomal sides, when they arrived to Beringia. Even if it was not the only lineage at Beringia, being the most common one was surely enough to have all the chances to “win” the further founder effect “lottery” in America.

    “I think it is funny if in palaeolithic times yDNA Q was found in China and C in Europe.”

    That is probably true, although the C6 of Europe is related to the East Asian/American C3 like Q is related to H, i.e. they are different branches of the very first Eurasian colonization almost certainly. On the other hand the relation between Oriental Q and Western Q is much more recent, from the beginnings of the Upper Paleolithic, so about half the time.

  19. Maju

    February 20, 2014 at 11:44 pm

    On first look the most similar item I could find is a very comparable rate of D subclades:

    Oroqen (second sample): 16% D*, 11% D5 (ratio: 1,45)
    Tujia (pooled): 12% D*, 7% D5 (ratio: 1,71)

    D is suspected to be the oldest mtDNA haplogroup to reach the NE Asian frontier (it's only one c.r. mutation away from M-root, so it may well be even older than N and most M sublineages in East Asia+, excepted M9, M12'G, M21'Q and maybe some other). Maybe for that reason is so diverse and ubiquitous. I wonder if this macro-lineage or maybe even just some subclades of it (it's old and diverse enough to consider such distinctions) are the trail of the hypothetical inland mystery population.

  20. Kristiina

    February 21, 2014 at 4:29 pm

    What you said about mtDNA D is very interesting! Northeast Eurasia and America is full of D, and it has spilled out on a very large area.

    The groups that are particularly pulled away from Anzick include also Melanesians. According to Ebizur’s post on Dienekes blog “Haplogroup C2-M38 is found in as much as 50% of the male population of some islands of central Indonesia, besides being the Y-chromosome haplogroup of the majority of Polynesian males”. It may be significant that all these three groups, Tujia, Oroquen and Melanesians, carry high amounts of yDNA C.
    In addition to Tianyuan cave, Zhoukoudian caves also contain human remains (c. 24–29 ka), but it seems that their DNA has not been tested. In this Zhoukoudian paper the authors argue that ”Results show a morphological resemblance of the Upper Cave material to Upper Paleolithic Europeans. It is proposed that the Upper Cave specimens retain important aspects of modern human ancestral morphology, and possibly share a recent common ancestral population with Upper Paleolithic Europeans, in accordance with the Single Origin model of modern human origins.” They also argue that ”Although the Upper Cave (UC) specimens are clearly modern in their cranial morphology, determining their regional affinities has proven difficult. Results vary among studies and for each of the three crania. In his first report on the specimens, Weidenreich (1938–39) saw similarities with the European Upper Paleolithic material in the cranial and facial morphology of the older, presumed male specimen UC101. He considered this individual, dubbed the ‘‘Old Man,’’ as more archaic than even some of the ‘‘Cro-Magnons’’ in its low cranial vault and heavy brow ridges. Nonetheless, he also saw subtle similarities to East Asians in some of its facial features. Weidenreich considered the UC102 and 103 crania, both presumed females, to be of ‘‘Melanesian’’ and ‘‘Eskimoid’’ affinities, respectively. He concluded that the Upper Cave group was a highly heterogeneous ‘‘proto-Mongoloid’’ population, and not particularly similar to later East Asian groups.”
    ”In contrast to earlier descriptions and univariate analyses, most morphometric analyses of the Upper Cave material have found no similarities between these crania and recent East Asians. Rather, they have variably classified the Upper Cave remains with Australo- Melanesians, Africans, Polynesians, and ancient American populations.”
    In the end they say that ”The present analysis supports the hypothesis that the Upper Cave specimens retain important aspects of morphology that were ancestral to all modern humans, and that they represent members of an as yet undifferentiated early modern human population that expanded across Eurasia in the Late Pleistocene (see Lahr, 1995).”

  21. Kristiina

    February 21, 2014 at 4:32 pm

    These Zhoukoudian people may be more recent settlers compared to Tianyuan people. Could they be carriers of yDNA C? Is it so that some Palaeolithic Europeans were taller and much more robust than more recent dwellers? Although, I do not really know if Zhoukoudian people were tall. Perhaps they were just robust.
    Genetiker divides Anzick’s ancestry in his globe13 analysis as follows:
    •81.86% Indianid (“Amerindian”)
    •6.70% Eskimid (“Arctic”)
    •2.63% Paleo-Negrid (“West_African”)
    •2.47% Nordic (“North_European”)
    •1.94% Mediterranean (“Mediterranean”)
    •1.58% Sinid (“East_Asian”)
    Apart from the north Eurasian ancestry, Mal’ta bears a considerable Indian ancestry. According to the Lazaridis et al. paper, Mal’ta shares a third of its ancestry with Kalash. Neither Anzick nor Mal’ta has West Asian or East Asian ancestry, or they have it only in negligible amounts. I think that it means that this North Eurasian population originated north of India and was intermediate between West Asians and East Asians. This East Asian “Han” ancestry component must be related to yDNA O and stem from somewhere in Indo-China. However, it is true that we really need more ancient genomes from Tibet, South China and Indo-China and, of course, from everywhere else. 🙂

    Who knows, if originally, Q were Karitiana like, O Negrito like and C robust hunks and when they all mixed we got the modern appearance. However, I find it difficult to figure out what D men looked like, as Ainu, Tibetans and Bantus do not look very similar.

  22. Maju

    February 21, 2014 at 7:29 pm

    The anthropometric affinities depend on opinions. It's obvious that Zhoukoudian man is too dolichocephalic to be a modern East Asian but some facial traits have also been argued to be precursors of the modern East Asian archetype.

    It's possible, I guess, that there were several East Asian populations with different looks and genetic heritage before the modern dominant phenotype became so overwhelming. But this process of replacement must have taken place anyhow before the arrival of proto-Amerindians to Beringia, because they are also carriers of the “Mongoloid” phenotype (even if variably so). So the window for the consolidation of this differential phenotype is not that wide, maybe 40-20 Ka BP only.

    As for yDNA C, I insist that this macro-lineage is comparable to F, D and E in its likely time depth and therefore each subclade should be considered separately by the time we are discussing here. What you say about Melanesians could only make sense if other high C populations of the area (Wallaceans, Australian Aborigines and Polynesians) would also perform like them, something we do not know yet. In any case it is much easier to explain anomalous distance between Tianyuan and Wallacea than with the Tujia and Oroqen.

    However it is true that “low C” Papuans behave more normally, so I do understand what you suggest, just that I don't feel it as solid ground yet grouping all C so “happily”. Who knows, maybe you are right after all but we'd need more evidence.

  23. Maju

    February 21, 2014 at 7:48 pm

    “I think that it means that this North Eurasian population originated north of India and was intermediate between West Asians and East Asians.”

    Mal'ta 1 IS closer to West Eurasians and South Asians than to East Asians. Not much but significantly so. In ED fig. 5d-e all WEA and SA populations (except African admixed Makrani and Palestinians) are more MA1-like than most East Asians. Even the low end ones like Sardinians or Balochi are just like the bulk of East Asians (but they probably have African-like admixture via Neolithic inputs, as discussed before). Basques for example are clearly more Ma1-like than any East Asian and even Tuscans are too.

    The explanation I espouse is that Ma1 derives from the very first West Eurasian population and had not yet accumulated any East Asian admixture. Those populations which have high Ma-1 affinities may be because of extra admixture with the Early Siberian component via Indoeuropeans or proto-Uralics, while those which have low Ma-1 affinity (very few) must be because of extra African-like admixture via Neolithic, which may be genuinely African via Egypt (which influenced Kebaran and later Levant Neolithic developments) or also via residual early OoA elements in Arabia and nearby areas.

    The relationship of other populations with Ma1 has no relationship with their affinity to Tianyuan: Tujia and Oroqen behave here like other East Asians while Papuans are much closer to Melanesians also (nothing abnormal).

    Instead in relation to Anzick, Tujia and Oroqen behave not so extremely but similarly as to Tianyuan. But this anomaly does not exist between Papuans and Melanesians, so there's something specifically Tujia-Oroqen going on here and it may bear no relationship with yDNA C and certainly not with Ma1.

    If you need the paper, please email me with a request, I believe it is very interesting, especially this ED fig. 5, which I have only reported in the relevant fragment for reasons of space and clarity.

  24. Ebizur

    February 22, 2014 at 7:26 am

    Oroqen (n=6)
    5/6 = 83% C
    1 x C3-“North” [MRCA Daur/approx. 500 YBP]
    1 x C3-“North” [MRCA Xibo/approx. 500 YBP]
    1 x C3-“North” [MRCA {Oroqen + Xibo}/approx. 1,000 YBP]
    1 x C3-“North” [MRCA {Oroqen + {Oroqen + Xibo}}/approx. 2,000 YBP]
    1 x C3-“South” [MRCA {Tujia + Japanese}/approx. 6,500 YBP]

    Xibo (n=8)
    3/8 = 37.5% C
    1 x C3-“North” [MRCA Oroqen/approx. 500 YBP]
    1 x C3-“North” [MRCA Uyghur/approx. 2,000 YBP]
    1 x C3-“North” [MRCA {Han + Hezhen}/approx. 2,500 YBP]

    Hazara (n=22)
    8/22 = 36% C
    1 x C3-“South” [MRCA Hezhen/approx. 2,500 YBP]
    1 x C3-“North” [MRCA {Mongol + Daur}/approx. 2,500 YBP]
    6 x C3-“North” [MRCA {Hazara + {Mongol + Daur}}/approx. 3,500 YBP]

    Hezhen (n=6)
    2/6 = 33% C
    1 x C3-“North” [MRCA Han/approx. 2,000 YBP]
    1 x C3-“South” [MRCA Hazara/approx. 2,500 YBP]

    Daur (n=7)
    2/7 = 29% C
    1 x C3-“North” [MRCA Oroqen/approx. 500 YBP]
    1 x C3-“North” [MRCA Mongol/approx. 2,000 YBP]

    Dai (n=7)
    2/7 = 29% C
    1 x C3 proto-“South” [MRCA all C3-“South”/approx. 22,500 YBP]
    1 x C* [MRCA {{Papuan + {Papuan + Papuan}} + {Japanese + Brahui}} (C2?/C1a1/C1b)/approx. 41,500 YBP]

    Papuan (n=13)
    3/13 = 23% C (coalesce at approx. 8,000 YBP)
    3 x C2? [MRCA {Japanese + Brahui} (C1a1/C1b)/approx. 40,000 YBP]

    Japanese (n=20)
    3/20 = 15% C
    1 x C3-“South” [MRCA Tujia/approx. 4,000 YBP]
    1 x C3-“North” [MRCA {Hazara x 7 + Daur x 2 + Mongol x 1 + Oroqen x 1 + Hezhen x 1 + Han x 1 + Xibo x 1}/approx. 13,000 YBP]
    1 x C1a1 [MRCA Brahui (C1b)/approx. 39,000 YBP]

    PRC Mongol (n=7)
    1/7 = 14% C
    1 x C3-“North” [MRCA Daur/approx. 2,000 YBP]

    Uyghur (n=8)
    1/8 = 12.5% C
    1 x C3-“North” [MRCA Xibo/approx. 2,000 YBP]

    Yakut (n=18)
    2/18 = 11% C
    1 x C3-“South” [MRCA Han/approx. 1,000 YBP]
    1 x C3-“South” [MRCA {Yakut + Han}/approx. 2,000 YBP]

    Tujia (n=9)
    1/9 = 11% C
    1 x C3-“South” [MRCA Japanese/approx. 4,000 YBP]

    Han (n=24)
    2/24 = 8% C
    1 x C3-“South” [MRCA Yakut/approx. 1,000 YBP]
    1 x C3-“North” [MRCA Hezhen/approx. 2,000 YBP]

    Burusho (n=20)
    1/20 = 5% C
    1 x C3-“South” [MRCA {{Hazara + Hezhen} + {Yakut + {Yakut + Han}}}/approx. 6,000 YBP]

    Brahui (n=25)
    1/25 = 4% C
    1 x C1b [MRCA Japanese (C1a1)/approx. 39,000 YBP]
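    The percentages in the list above can be rechecked with a minimal sketch like the following. The counts are transcribed from the list; the names `c_counts` and `c_frequency` are hypothetical and only serve the illustration.

```python
# Counts transcribed from the list above:
# population -> (haplogroup C carriers, sample size)
c_counts = {
    "Oroqen": (5, 6), "Xibo": (3, 8), "Hazara": (8, 22), "Hezhen": (2, 6),
    "Daur": (2, 7), "Dai": (2, 7), "Papuan": (3, 13), "Japanese": (3, 20),
    "PRC Mongol": (1, 7), "Uyghur": (1, 8), "Yakut": (2, 18), "Tujia": (1, 9),
    "Han": (2, 24), "Burusho": (1, 20), "Brahui": (1, 25),
}


def c_frequency(carriers, sample_size):
    """Return the haplogroup C frequency as a percentage."""
    return 100.0 * carriers / sample_size


# Rank populations from highest to lowest haplogroup C frequency
ranked = sorted(c_counts, key=lambda pop: c_frequency(*c_counts[pop]), reverse=True)

for pop in ranked:
    carriers, n = c_counts[pop]
    print(f"{pop:11s} {carriers}/{n} = {c_frequency(carriers, n):.1f}% C")
```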

    The HGDP Oroqen sample does exhibit the highest frequency of Y-DNA haplogroup C, but four of the five Oroqen C individuals belong to the “North” subclade of C3-M217 and are closely related to members of other populations who share their Transbaikalian/Manchurian origins (Daur, Xibo). These other populations of Transbaikalian/Manchurian origin do not exhibit any dearth of shared drift with Tianyuan relative to Yoruba and are actually above average for this statistic. By the way, the one non-C HGDP Oroqen belongs to an early branch of (former) haplogroup N1c, but coalesces to a common ancestor with an HGDP Daur approx. 1,000 YBP.

    The HGDP Tujia sample, on the other hand, includes only one individual who belongs to haplogroup C: a member of the “South” subclade of C3-M217 whose lineage coalesces with that of an HGDP Japanese individual approx. 4,000 YBP. One other Tujia individual belongs to proto-N, and coalesces to a common ancestor with a Cambodian individual approx. 2,500 YBP. The rest of the HGDP Tujia belong to subclades of O: 1 x O2a1-M95 (MRCA Miao/approx. 1,000 YBP), 1 x O1a-M119 (MRCA Han/approx. 4,000 YBP), 1 x proto-O3a2c1 (MRCA Daur/approx. 4,000 YBP), 1 x O3a2c1-M134 (MRCA Cambodian/approx. 500 YBP), 1 x O3a2c1-M134 (MRCA {Han + Han}/approx. 2,500 YBP), 1 x O3a2c1-M134 (MRCA {Mongol + {Xibo + Uyghur}}/approx. 4,000 YBP), 1 x O3 (MRCA {Han x 2 + Japanese x 2 + Hazara x 2 + She x 1 + Brahui x 1 + Lahu x 1 + Yi x 1 + Hezhen x 1 + Dai x 1}/approx. 12,000 YBP).

  25. Ebizur

    February 22, 2014 at 7:27 am

    None of the Y-DNA lineages in the HGDP samples of Oroqen and Tujia seems to be old enough to represent a private, peculiar lineage that could reflect a distinctive genetic inheritance from a population that would be an outgroup to Tianyuan. There is a particular C3-“South” lineage that is found in both the Oroqen and the Tujia HGDP samples, but it is also shared with an individual in the Japanese sample, and the Japanese sample does not exhibit any clear dearth of shared drift with Tianyuan. For the next step, we might consider the mtDNA of the Tujia and Oroqen in more detail.

    By the way, the HGDP Melanesian sample seems to consist of only four individuals, two of whom have been reported to belong to Y-DNA haplogroup K(xL, M, N, O, P) and two of whom have been reported to belong to Y-DNA haplogroup M. However, it seems likely that one of the supposedly K(xL, M, N, O, P) Melanesians actually belongs to haplogroup M1, and has been mistyped or misreported; in that case, the sample would contain 3/4 M1 and 1/4 K(xL, M, N, O, P). As for their mtDNA, three of the HGDP Melanesians belong to haplogroup B4a (probably a typical Austronesian subclade) with a coalescence age of about 7,000 YBP. The other Melanesian individual belongs to haplogroup Q1c, which coalesces with a pair of Papuan Q1 individuals approximately 21,500 YBP.

  26. Kristiina

    February 22, 2014 at 7:47 am

    Please do send me the paper. I would love to read it!

    It is true that what I said above are ideas. I know that yDNA C appears to be a very old lineage that is found everywhere, except Africa. The split between C3 and the rest is deep, and they might have taken different routes and evolved autosomally very different from each other. However, in East Asia we should fit the prehistoric gracile southern tropical type and more robust boreal dwellers with the genetic data available.

    In my opinion the problem with this West Asian and East Asian dichotomy is the fact that there appears to have been a third North Eurasian element or elements that expanded from East to West. They might have been carriers of yDNA C and P. In fact, it is true that what I proposed means that these ancient North Chinese were of western descent and influenced East Asian genetics and not the other way round, which, of course, happened later on.

    I understand West Asian mostly in terms of GIJ and E, and that’s why I cannot see an important connection between GIJE and Mal’ta/America. Lazaridis paper detected three components in Europe EEF, WHG and ANE. If analyzed, several interesting components would certainly be found also in East Asia. I am looking forward to that!

  27. Maju

    February 22, 2014 at 11:27 am

    With all respect, Ebizur, MRCA estimation is not rocket science – more like pseudoscience. What coalescence times do those methods give for CF? Just to know what I have to multiply them by in order to get a reasonable estimate: it usually is x2 to x4.

  28. Maju

    February 22, 2014 at 11:31 am

    Also I remember we have sometime discussed (with you being in that discussion and acknowledging that fact or probability) that apparently the SRYs that work reasonably well for some larger better studied clades, do not work so well for other haplogroups and specifically C is one of those.

  29. Maju

    February 22, 2014 at 12:08 pm

    Sent you the copy already, Kristiina. If anyone else wants it please just ask.

    “In my opinion the problem with this West Asian and East Asian dichotomy is the fact that there appears to have been a third North Eurasian element or elements that expanded from East to West”.

    Your usage of the term “North Eurasian” actually complicates things because it makes it impossible to discern a Western “North Eurasian” like MA1 from an Eastern “North Eurasian” element (such as the N1-related one) or a mix of both. By using that term you blend them all in a single mix and the reality of the Far North is more complex with clearly two distinct main sources in the genetic pool (Western and Eastern), plus the European-specific element in the case of Europe (related to Ma1 but also quite distinct).

    So in the end we have three basic Northern components: European, Central Asian (Ma1) and East Asian. But the former two can be considered as one in many analyses and are not always easy to discern (similar lineages: yDNA P/R and mtDNA U), so the West vs East simplification also makes some sense.

    If you look at the Ma-1 formal tests in this paper, Siberians form in essence a V-shape between East Asia and either Native Americans (one axis) or Ma1 (the other), so the European influence seems restricted to Europe (before historical times or at least before the Metal Ages). So in Siberia we can also think in terms of West (Ma1, ancient Central Asia) and East. Only in Europe a tripartite analysis seems important.

    “They might have been carriers of yDNA C and P.”

    These actually are East (C3) and West (P) again. And C has very weak penetration westwards, so initially the Eastern component remained stuck to the East (and only later with N1 moved westwards).

    The fact that there is some P* among the Tujia (and maybe other Tibeto-Burmans?) should not affect this discernment, if that's what troubles you, because it's more likely to be a remnant or offshoot of the very early time when P or proto-P back-migrated westwards to South Asia or MNOPS as a whole was “oscillating” between South and SE Asia. The fact that the Tujia overall are “anti-North” (particularly different from Tianyuan and Native Americans but neutral towards Ma1) should support this explanation.

  30. Maju

    February 22, 2014 at 12:09 pm

    “I understand West Asian mostly in terms of GIJ and E”…

    I don't think that's correct at all and this error is the source of your confusion.

    E probably only arrived in the Mesolithic (Kebaran), influencing from there Europe too, as well as directly via West Iberia (so it must be considered an African marker even if it has been with us for so long now).

    I on the other hand is specifically European (very clearly so), while R1a and R1b both appear to have their centers of expansion in West Asia, as it's probably the case of Q (alternatively South Asia, also at the origin of R/R1 as such).

    The fact that J (J1 and J2 with different distributions) is relatively dominant in West Asia should not conceal that other important Western (and NA) haplogroups also originated there: frequency does not tell us of origins, only basal diversity can give us that kind of information, regardless of whether the haplogroup is small in frequency. Sometimes small initial populations have large impacts in colonization processes (examples: the Irish or the recently discovered case of Guanches in Caribbean America).

    If instead of thinking of GIJE, you think in the more realistic terms of IJ+R1+Q+G as the real West Eurasian core (Y-DNA) and if you think also in mtDNA terms (U+R0+JT+N1+N2+X+M1 in our case), as I do, then you will understand everything better.

    “Lazaridis paper detected three components in Europe EEF, WHG and ANE. If analyzed, several interesting components would certainly be found also in East Asia. I am looking forward to that!”

    That would be interesting to find out but only the ANE component will be present in East Asia (mostly in Siberia, and that you can already see in this study) so the other components would be Eastern specific ones. ANE (Ma1) seems the only Western subcomponent to impact East Asia and even this one only mildly so, with greatest influence in Siberia/America. This I already knew because I understood that yDNA Q and mtDNA X, present among Native Americans and some NE Asians, originated in the West. With your misleading “GIJE” idea you will never see that.

  31. Maju

    February 22, 2014 at 12:14 pm

    PS- “GIJE” explains Neolithic in West Eurasia (however I must be originally European in any case and E ultimately African) but it does not help at all with Paleolithic, very especially the Early UP, which was the time of consolidation of the basic regional (“racial”) differences in Eurasia (independently of whatever complementary role played the Neolithic in each already distinct subcontinental region).

  32. Grey

    February 23, 2014 at 8:51 am


    “In my opinion the problem with this West Asian and East Asian dichotomy is the fact that there appears to have been a third North Eurasian element or elements that expanded from East to West.”

    I thought it was clear that the ANE went west to east (and into North America) first and then later on the direction reversed with NE Asians moving into Siberia and then east to west?

  33. Ebizur

    February 23, 2014 at 2:57 pm

    As for the mtDNA of the HGDP Oroqen and Tujia samples analyzed in the present study:

    Oroqen (n=6):
    1 x F1b: MRCA Uyghur (F1b)/approx. 4,000 YBP
    1 x J1c: MRCA Bedouin (J1c)/approx. 7,500 YBP
    1 x F1c: MRCA Yi (F1c)/approx. 8,000 YBP
    1 x M11: MRCA {Yi + Yi} (M11)/approx. 12,000 YBP
    2 x D3: MRCA Japanese (D4b)/approx. 20,000 YBP

    Tujia (n=9)
    1 x M7b: MRCA {Han + Miao} (M7b)/approx. 3,000 YBP
    1 x C4a: MRCA Yakut (C4a)/approx. 4,000 YBP
    1 x F1a: MRCA Japanese (F1a)/approx. 8,000 YBP
    1 x D4b: MRCA {Tu + Japanese} (D4b)/approx. 11,500 YBP
    1 x R11: MRCA Han (R11)/approx. 13,000 YBP
    1 x B4a: MRCA Naxi (B4a)/approx. 20,000 YBP
    1 x B4h: MRCA {Miao + {Uyghur + Han}} (B4)/approx. 21,500 YBP
    1 x B4a: MRCA {Naxi + Tujia} (B4a)/approx. 23,000 YBP
    1 x A5: MRCA {Miao + Han} (A5b)/approx. 23,500 YBP

    The only salient peculiarity is the sharing of a presumably Southwest Asian Neolithic mtDNA lineage that belongs to the J1c clade between the HGDP Oroqen and Bedouin samples.

    It also might be noted that 4/9 = 44% of the HGDP Tujia sample actually belong to the R11'B clade, the same clade to which the mtDNA of the Tianyuan individual belongs. This is a very high frequency of R11'B for an East Asian population.

    Of course, the extremely small sample sizes are unfortunate.
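    To give a rough sense of how much uncertainty a sample of nine carries, here is a minimal sketch using the Wilson score interval. This is a standard approximation brought in purely for illustration, not something used in the paper or in the comment above, and the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# 4 of the 9 HGDP Tujia carry R11'B mtDNA (count taken from the list above)
low, high = wilson_interval(4, 9)
print(f"R11'B in HGDP Tujia: 4/9 = {4 / 9:.0%}, interval {low:.0%} to {high:.0%}")
```

    With n=9 the interval is very wide, which is the practical meaning of "extremely small sample sizes" here.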

  34. Maju

    February 23, 2014 at 4:48 pm

    You can't calculate age estimates for mtDNA: mutation lapses are way too random. And in Y-DNA only when considering long chromosome stretches, not the usual SRYs.

    I have no idea why you have gone now all fanatic with MRCA, Ebizur, but it's a meaningless speculation, more so in this case when one population could well partly derivate from the other or both have shared admixture with a third “ghost” one.

  35. Maju

    February 23, 2014 at 5:08 pm

    “44% of the HGDP Tujia sample actually belong to the R11'B clade”

    That's not consistent with the source I mentioned above (→, in which the much larger Tujia sample (n=98) “only” has 16% B4'5 (and no R11).

  36. Ebizur

    February 27, 2014 at 10:55 am

    Maju wrote,

    “That's not consistent with the source I mentioned above (→, in which the much larger Tujia sample (n=98) “only” has 16% B4'5 (and no R11).”

    “Not consistent” is not a good way of describing this datum. It is a fact that the HGDP Tujia sample, which is the sample that actually has been tested in the present study and placed on that graph in “Extended Data Figure 5,” contains 4/9 = 44% B'R11 mtDNA according to Lippold et al. (2014). It is, of course, sensible to compare the HGDP Tujia sample with other published samples of Tujia as you have done, but that does not invalidate my observation: the actual sample of Tujia that has been tested and placed on that graph in Extended Data Figure 5 contains a high frequency of B'R11 mtDNA (same clade as Tianyuan) and only an average frequency of Y-DNA haplogroup C for an East Asian population. I already have mentioned my discontent regarding the extremely small sizes of the HGDP samples. If the apparent dearth of shared drift between the HGDP Tujia sample and the Tianyuan genome is not some artefactual illusion, then this is just another example of how weak an indicator of overall shared ancestry a haplogroup may be (especially when dealing with individuals or small samples).

  37. Ebizur

    February 27, 2014 at 10:55 am

    Maju wrote,

    “You can't calculate age estimates for mtDNA: mutation lapses are way too random. And in Y-DNA only when considering long chromosome stretches, not the usual SRYs.

    I have no idea why you have gone now all fanatic with MRCA, Ebizur, but it's a meaningless speculation, more so in this case when one population could well partly derivate from the other or both have shared admixture with a third “ghost” one.”

    The mutation rate should approach a certain average over time. I agree that no one knows this average rate with a great deal of precision at present, and it is true that the actual rate of mutation in any particular lineage will deviate from the ideal, average rate at random because genetic mutation is a stochastic process, so I do not take the TMRCA estimates as literal dates; I consider them as heuristics that suggest the relative lengths of branches on a phylogenetic tree. This has become important to me lately since dealing with some people who purposely misconstrue or misinterpret a phylogeny in order to force the phylogeny to fit a certain hypothesis. A series of TMRCA estimates of pairs of individual lineages is a much more precise and less biased method of population comparison than comparing haplogroup frequencies.

    To take an example from Western Eurasia: many people have used some populations' sharing high frequencies of haplogroup U or haplogroup H mtDNA to argue for a common origin of said populations. However, in reality, any set of populations' sharing of haplogroup U mtDNA is not necessarily as significant as any set of populations' sharing of haplogroup H mtDNA because haplogroup U is more than twice the age of haplogroup H, whatever that age might actually be (and I am not claiming to know that number). Populations dominated by subclades of haplogroup U still might share a common origin, but even if they do, that common origin potentially may be much more ancient than the common origin of populations dominated by subclades of haplogroup H. In other words, the definition of “haplogroup H” is relatively precise (high resolution), and the definition of “haplogroup U” is relatively imprecise (low resolution), so belonging to “haplogroup H” is more significant than belonging to “haplogroup U.” Saying that someone belongs to “haplogroup H” is rather akin to saying that someone belongs to “haplogroup K1” (a.k.a. “U8K1”) or “haplogroup U4b.” For a person in the haplogroup H section of the phylogeny, the equivalent to saying that one belongs to “haplogroup U” would be saying that one belongs to “haplogroup R0HV'R11B'P.” I am sick of seeing people take advantage of this irregularity in the phylogenetic nomenclature in order to present a distorted view of population relationships.

    In the case of the Tujia and the Oroqen, what could be the haploid traces of this hypothetical admixture with a third “ghost” population? Even ignoring the TMRCA estimates, the Y-DNA and mtDNA haplogroups of the Tujia are quite typical for an East Asian population, and the Oroqen only stand out for their having an extremely high frequency of Y-DNA haplogroup C3 alongside a sprinkling of some clades that appear with greater frequency in some populations further north or west (Y-DNA N1c, mtDNA J1c).

  38. Maju

    February 27, 2014 at 12:04 pm

    So do you have any reason to imagine that the small HGDP Tujia sample is somehow a distinct subpopulation of the Tujia? Are they from a different area? Do they have some other peculiarity than a random accumulation of haplogroup B'R11?

    ” If the apparent dearth of shared drift between the HGDP Tujia sample and the Tianyuan genome is not some artefactual illusion, then this is just another example of how weak an indicator of overall shared ancestry a haplogroup may be (especially when dealing with individuals or small samples).”

    Yes. Most DNA data, haploid or autosomal, makes much better sense in context: reverting the Nietzschean claim: it's for need of populations that there are so many individuals.

    Anyhow, my working hypothesis on the Tujia/Oroqen issue is that much of their ancestry is mainstream East Asian (EA1) but that they also seem to have some distinct ancestry (EA2) that other East Asians lack, inc. Tianyuan and Anzick.

    This could well be the product of an early differentiation of populations migrating north via Yunnan (EA 2), assuming that the mainstream EA1 population migrated via the coast (Vietnam), and this one EA1 also produced West Eurasians and the distinct Tianyuan population somehow.

    Trying to find a realistic scenario, the Toba catastrophe period may well be that one: it is some 25,000 years older than the West Eurasian colonization, 34,000 years older than Tianyuan, 25-50,000 years more recent than the Neanderthal admixture episode, and it was big enough to have wreaked havoc among ancient Asian populations, possibly leaving some of them isolated here and there.

    I tend to associate the Toba episode with the explosion of mtDNA N/R (it must have “cleared” enough space to allow for such a secondary expansion). But, even if Toba is not implicated, the main link between East Asians, West Eurasians and Tianyuan is precisely that N/R macro-haplogroup (as well as its most plausible yDNA counterpart MNOPS), so I would think that EA1 is more strongly related to that N/R ancient population, while EA2 should be related to some M sublineages such as the already mentioned D.

    In this sense your finding on the HGDP Tujia sample's mtDNA could hypothetically cast some doubt but only if those HGDP Tujia happen to actually be a distinctive subpopulation, something yet to be demonstrated (and IMO unlikely). Otherwise I'd say that it's just a fluke in mtDNA results with no implications.

  39. Maju

    February 27, 2014 at 1:01 pm

    “The mutation rate should approach a certain average over time”.

    This doesn't actually work on mtDNA, whose branches have very irregular lengths. Ironically the most expansive super-starlike nodes, such as M and H, have very significantly shorter average length in their downstream branches, suggesting that haplogroup expansion somehow slows down (or even almost stops) effective mutation accumulation in mtDNA, which is extremely slower than in yDNA. I have dedicated some time to simulate it at home with pen and paper and the help of a decent mathematician friend and it makes all sense that the dominant lineage (root variant typically in a population which expanded suddenly, leaving a starlike signature) almost systematically displaces every new mutation by mere drift (unless the effective population is very small, when chances get more even for the ancestral and mutated states).

    This is because, unlike in yDNA, in mtDNA mutations happen only rarely and get consolidated even less frequently, being therefore strongly subject to population dynamics such as drift.

    It is therefore very possible that mtDNA haplogroups have been more or less “frozen” after expansions, what totally alters the “molecular clock” hypothesis into an unpredictable mess.
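    The drift argument in this comment can be played with in a minimal sketch like the one below. It is not the pen-and-paper simulation mentioned above: it assumes a textbook neutral Wright-Fisher model with a constant number of females, and the function name `fixation_fraction` is hypothetical. It only estimates the chance that one new variant displaces the dominant lineage, not the overall rate at which mutations accumulate.

```python
import random


def fixation_fraction(pop_size, trials, seed=0):
    """Fraction of trials in which a single new neutral variant reaches fixation."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(trials):
        mutants = 1  # one new mutated mtDNA copy among pop_size females
        while 0 < mutants < pop_size:
            freq = mutants / pop_size
            # each female of the next generation draws her mother at random
            mutants = sum(1 for _ in range(pop_size) if rng.random() < freq)
        if mutants == pop_size:
            fixed += 1
    return fixed / trials


# Compare a very small, a small and a moderately larger population
for size in (5, 20, 50):
    print(size, fixation_fraction(size, trials=2000))
```

    Under these assumptions the theoretical fixation chance of a new variant is 1/pop_size, so it drops quickly as the population gets larger, while the ancestral variant usually persists.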

    yDNA, considered in its whole length (or at least a significantly large fraction of it), is different and indeed a molecular clock could be achieved. But the SRY approach is not good enough for this. Why? Because instead of considering the whole or a large significant fraction of sites, it only considers a small number of them, even much smaller than the mtDNA chain's whole length.

    With less information we cannot get better results, not in this case at least, and, even if in some haplogroups SRYs may actually give a glimpse of chronology (?) because of a decade of selection of such sites to detect the most informative ones, those cannot be just extrapolated to other lineages. I believe it was actually yourself who noticed this in relation to haplogroup C in an older discussion: that the usual SRYs, which may be diverse and somewhat informative for lineages like R1, have very low diversity in yDNA C, rendering them non-informative.

    So I would beg all geneticists willing to find a realistic molecular clock, to make SNP mass-comparisons of significant sections of the Y chromosome's full chain. Ironically this approach has only been done so far by amateurs, as far as I know, but it is the only way to reach the so much desired meaningful molecular clock.

  40. Maju

    February 27, 2014 at 1:01 pm

    “haplogroup U is more than twice the age of haplogroup H”

    I strongly question this. IMO H became “frozen” for the reasons explained above: drift in a large enough (but still sufficiently small) population. Precisely H and U have such extremely different branch lengths that they are almost the perfect example of why molecular clock approaches do not work well or even at all with mtDNA (see link above in this comment). IF the MCH would be correct, their branches would be approximately equal in length but nope, so the whole hypothesis fails precisely in this case.

    Additionally it is nearly certain that mtDNA H existed in Gravettian Russia (Sunghir: H17'27), at the very least. The pitiful HVS-1 methods of sequencing do not allow for detection of mtDNA H or even its precursors HV and R0 but a lot of aDNA sequences from the European Paleolithic fit with the HVS-1 signature of H, which is precisely nil. So far the only ones tested for enzymatic markers or coding region SNPs that have produced H are from the Magdalenian and Epipaleolithic periods but it is extremely likely that H was in Europe since very early in the Upper Paleolithic. Prove me wrong with effective testing methods if you can.

    Using the “molecular clock” method from the common root (R) and not from the much more dubious present results, which need of assumptions that I reject, we get (coding region mutations only):
    R→→→ U (also U5)
    R→→→→ H

    Therefore H should be only slightly more recent than U/U5 by the very logic of the “molecular clock”. Just that H became “frozen” within a massive starlike expansion, while U did not experience such a massive expansion and therefore mutated (or rather consolidated those mutations into surviving new sublineages) more freely in smaller populations in which Ne tended to 2.

    “For a person in the haplogroup H section of the phylogeny, the equivalent to saying that one belongs to “haplogroup U” would be saying that one belongs to “haplogroup R0HV'R11B'P.””

    Just R0. But H is fine because it is of almost the same age as U, if we play molecular chronologist from the shared root and not from the confusing present situation.

    Using the R0 node as calibration, and assuming it is 60 Ka old, it's possible that R as such is around 65 Ka old and:
    →R0 →HV (55Ka) →→H (45 Ka)
    →U/U5 (50 Ka)
    →JT (50 Ka)
    →R11'B6 (50 Ka)
    →B (aka B4'5) (50 Ka)
    →P (50 Ka)

    Stretch it down a bit if you feel that is more realistic, but remember that R is only 6 coding region mutations downstream of L3 and just one under N. And L3 should be around 125 Ka old, which actually stretches each c.r. mutation in that segment to the equivalent of c. 10 Ka. So again it's not really regular (it cannot be), but there must have been a lot of “pruning” of novel branches that could not prosper under the shadow of their non-mutated mother lineages, which were necessarily much more numerous (as mutations in mtDNA happen only every many dozen generations).
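
    As a rough check of the arithmetic in this comment, here is a minimal Python sketch. The helper name `ka_per_mutation` is made up for illustration. The inputs are the figures quoted above: L3 at c. 125 Ka, R at c. 65 Ka with 6 c.r. mutations between them, and H at c. 45 Ka drawn four steps below R.

```python
def ka_per_mutation(older_ka, younger_ka, mutations):
    """Average time (in Ka) per coding-region mutation along one tree segment."""
    return (older_ka - younger_ka) / mutations

# L3 (c. 125 Ka) down to R (c. 65 Ka): 6 c.r. mutations, as stated in the comment.
l3_to_r = ka_per_mutation(125, 65, 6)

# R (c. 65 Ka) down to H (c. 45 Ka): 4 steps, as drawn in the sketch tree.
r_to_h = ka_per_mutation(65, 45, 4)

print(l3_to_r)  # 10.0
print(r_to_h)   # 5.0
```

    The two segments imply different average intervals per mutation, which is the irregularity the comment points to.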

  41. Maju

    February 27, 2014 at 2:15 pm

    “haplogroup R0HV'R11B'P.”

    Now that I reconsider this phrase, there is no such haplogroup: R0 (incl. HV), R11'B and P are three distinct basal sublineages of R.

    I also made an error in the previous discussion of precisely these R sublineages. U is the only one among the major ones which is separated by more than one c.r. transition at the stem, actually three (and not just one, as I drew in the tree sketched above). I also got confused with the date estimates, so I will rewrite it here (assuming age(R) = 65 Ka and one c.r. transition = 5 Ka):

    →R0 (60Ka) →HV (55 Ka) →→ H (45 Ka)
    →→→U/U5 (50 Ka)
    →JT (60 Ka)
    →R11'B6 (60 Ka)
    →B (60 Ka)
    →P (60 Ka)

    Modify the ages as you think they fit best but ALWAYS proportionally to mutation count from the shared root (R).
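
    The proportional rule above can be sketched in a few lines of Python. This is only an illustration under the comment's own assumptions (age(R) = 65 Ka, one c.r. transition = 5 Ka). The step counts are read off the arrows and ages in the tree above, and the names `STEPS_FROM_R`, `node_age` and `all_ages` are invented for this example.

```python
# Assumptions taken from the comment, not established dates.
AGE_R_KA = 65.0           # assumed age of the R node, in Ka
KA_PER_TRANSITION = 5.0   # assumed time per coding-region transition, in Ka

# Number of c.r. transitions separating each node from R,
# as implied by the arrows and ages in the sketched tree.
STEPS_FROM_R = {
    "R0": 1,
    "HV": 2,
    "H": 4,
    "U/U5": 3,
    "JT": 1,
    "R11'B6": 1,
    "B": 1,
    "P": 1,
}

def node_age(steps, age_root=AGE_R_KA, ka_per_step=KA_PER_TRANSITION):
    """Age in Ka of a node that sits `steps` transitions below the root."""
    return age_root - steps * ka_per_step

def all_ages(age_root=AGE_R_KA, ka_per_step=KA_PER_TRANSITION):
    """Ages for every listed node, always proportional to mutation count from R."""
    return {name: node_age(n, age_root, ka_per_step)
            for name, n in STEPS_FROM_R.items()}

if __name__ == "__main__":
    for name, age in all_ages().items():
        print(f"{name}: {age:.0f} Ka")
```

    Calling `all_ages(age_root=60, ka_per_step=4)` rescales every node at once, which keeps the ages proportional to mutation count from R whatever calibration is preferred.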

