
A review of haplogroup N (Y-DNA)

Haplogroup N (Y-DNA) is spread from the Baltic to the South China Sea, making it one of those rare genetic links between East and West Eurasia (other than ultimate common ancestry) and one of the two Y-DNA lineages that expanded across northern Eurasia (the other being Q).
While it is apparent to me and many others that the lineage originated in East Asia and expanded first northwards to Siberia and later westwards to Europe, I have sometimes found reluctance to accept this fact or difficulty understanding why. Some of the data in this paper may be of help in this regard. It is also a good exercise for those learning how haploid genetics can be decoded into a meaningful pattern that reveals key parts of the untold history of peoples.
Hong Shi et al., Genetic Evidence of an East Asian Origin and Paleolithic Northward Migration of Y-chromosome Haplogroup N. PLoS ONE 2013. Open access [doi:10.1371/journal.pone.0066102]


The Y-chromosome haplogroup N-M231 (Hg N) is distributed widely in eastern and central Asia, Siberia, as well as in eastern and northern Europe. Previous studies suggested a counterclockwise prehistoric migration of Hg N from eastern Asia to eastern and northern Europe. However, the root of this Y chromosome lineage and its detailed dispersal pattern across eastern Asia are still unclear. We analyzed haplogroup profiles and phylogeographic patterns of 1,570 Hg N individuals from 20,826 males in 359 populations across Eurasia. We first genotyped 6,371 males from 169 populations in China and Cambodia, and generated data of 360 Hg N individuals, and then combined published data on 1,210 Hg N individuals from Japanese, Southeast Asian, Siberian, European and Central Asian populations. The results showed that the sub-haplogroups of Hg N have a distinct geographical distribution. The highest Y-STR diversity of the ancestral Hg N sub-haplogroups was observed in the southern part of mainland East Asia, and further phylogeographic analyses supports an origin of Hg N in southern China. Combined with previous data, we propose that the early northward dispersal of Hg N started from southern China about 21 thousand years ago (kya), expanding into northern China 12–18 kya, and reaching further north to Siberia about 12–14 kya before a population expansion and westward migration into Central Asia and eastern/northern Europe around 8.0–10.0 kya. This northward migration of Hg N likewise coincides with retreating ice sheets after the Last Glacial Maximum (22–18 kya) in mainland East Asia.

Hong Shi has previously produced very interesting materials and this is no exception. However, I find the presentation of chronological guesstimates as if they were objective findings, treated as part of the central discourse rather than as the mere side note where they belong, a bit nauseating and a cause of confusion.

Figure 4. Proposed prehistoric migration routes for Hg N lineage.
(the pattern is correct but the dates are mere hunches, not any sort of objective facts)

Above we can see the reconstructed pattern of expansion of Y-DNA N in three phases. In my understanding the dates are not way off, although I can only imagine that there is still room for improvement, especially regarding the “red” phase. After all NO may have split c. 60 Ka ago and the main branch, O, c. 50 Ka BP – and not the mere 25-30 Ka that Shi calculated (in a previous study but mentioned again here).
But the really interesting part is not molecular-clock-o-logy but this:

Figure 3. Median-joining networks for sub-haplogroups of Hg N lineage using Y-STR alleles.

Diagnostic mutations used to classify the sub-haplogroups are labeled on the tree branches. Each node represents a haplotype and its size is proportional to the haplotype frequency, and the length of a branch is proportional to the mutation steps. The colored areas indicate the geographic origins of the studied populations or language groups.

Here we can appreciate, with the labyrinthine limitations of the use of (too few?) STR markers, the apparent structure of the various haplogroups and paragroups under N. We can also see the STR diversity in numerical terms:

Table 3. Y-STRs diversity of Hg N sub-haplogroups.

Sadly the category “Han Chinese” is almost useless and one wonders why Shi et al. changed from the North/South polarity in the key paragroup N* to such a confusing terminology in N1.
In any case, it is quite evident that N arose in South China and spread, already as N1, to NE Asia and that, later, some of that N1 (N1c mostly but also some N1b) spread westwards, reaching Finland and other Eastern European populations. In the haplotype graph we can appreciate a distinct European-specific branch within N1b.
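To make the idea of "STR diversity in numerical terms" more concrete, here is a minimal sketch of one common measure, Nei's unbiased haplotype diversity. It is only an illustration: the exact statistic used in the paper's Table 3 is not reproduced here, and the haplotypes below are made-up repeat counts at three hypothetical loci, not data from the study.

```python
from collections import Counter

def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i^2))."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)

# Hypothetical Y-STR haplotypes (tuples of repeat counts at three made-up loci).
south = [(14, 12, 23), (15, 12, 23), (14, 13, 24), (16, 12, 22)]  # all different
north = [(14, 12, 23), (14, 12, 23), (14, 12, 23), (15, 12, 23)]  # one dominant type

print(haplotype_diversity(south))
print(haplotype_diversity(north))
```

The logic of the argument is that the region where a clade has existed longest has had more time to accumulate distinct haplotypes, so it should tend to score higher on a measure like this one.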

Update (Jul 28): some new findings (not considered in the study) and updated nomenclature.
See comments’ section for greater details. Special thanks to Palamede for his efforts in clarifying the matter.
Commercial testing company FTDNA has recently detected some new markers within haplogroup N1 that alter the phylogeny. A synthesis of these findings can be seen in this graph.
This new nomenclature was adopted by ISOGG but the study discussed here does not include it, using instead a 2011 nomenclature. Hence we must understand that:
  • N* and N1* remain as such
  • “N1a” (M128) is now known as N1c2a
  • “N1b” (P43) is now N1c2b
  • “N1c” (M46/Tat) is now N1c1
Therefore the N1 tree splits as:
  • N1a (new clade, P189)
  • N1b (new clade, L732)
  • N1c (new clade including all previously named subhaplogroups)
    • N1c1 (M46/Tat, former N1c)
    • N1c2 (new clade, L666)
      • N1c2a (M128, former N1a)
      • N1c2b (P43, former N1b)
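For anyone cross-reading the paper against the newer tree, the renaming listed above can be kept as a small lookup table. This is just a convenience sketch of the equivalences given in this update; the function name `translate` is my own and not part of any tool.

```python
# Old (2011) label -> (defining marker, newer ISOGG label), as listed above.
OLD_TO_NEW = {
    "N1a": ("M128", "N1c2a"),
    "N1b": ("P43", "N1c2b"),
    "N1c": ("M46/Tat", "N1c1"),
}

def translate(old_label):
    """Return the newer label for a 2011-style label; N* and N1* are unchanged."""
    if old_label in ("N*", "N1*"):
        return old_label
    _marker, new_label = OLD_TO_NEW[old_label]
    return new_label

for label in ("N*", "N1*", "N1a", "N1b", "N1c"):
    print(label, "->", translate(label))
```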
As far as I could gather, N1(xN1c) is so far only clearly represented by two FTDNA-tested singletons: a Slovakian (N1a) and someone of Polish surname (N1b1). However I may be missing some details. Whatever the case, it is possible that, unless more samples show up in these groupings, the tree may later be reverted to the original state (or something in between), because isolated individuals or families do not haplogroups make.
It is also important to understand that commercial DNA testing companies have very unbalanced samples, clearly dominated by people of NW European (and to a lesser extent other European) ancestry, which is not too useful when discerning what is where and sometimes produces the false impression of greater European diversity just because of the greater number of samples.
On the other hand, the Hong Shi data reported above clearly show a great number (and diversity) of East Asians within N1*, so the most likely conclusion is that the few Europeans within N1* are mere erratics within clades of East Asian origin, surely brought westward by the overall N1 tide.
So in essence the conclusions of the paper remain unchallenged.

Homo sapiens from Central China dated to 81-101 Ka BP

I just received notice (h/t David) of this most important finding and dating:
Guanjun Shen et al., Mass spectrometric U-series dating of Huanglong Cave in Hubei Province, central China: Evidence for early presence of modern humans in eastern Asia. Journal of Human Evolution 2013. Pay per view [doi:10.1016/j.jhevol.2013.05.002]


Most researchers believe that anatomically modern humans (AMH) first appeared in Africa 160-190 ka ago, and would not have reached eastern Asia until ∼50 ka ago. However, the credibility of these scenarios might have been compromised by a largely inaccurate and compressed chronological framework previously established for hominin fossils found in China. Recently there has been a growing body of evidence indicating the possible presence of AMH in eastern Asia ca. 100 ka ago or even earlier. Here we report high-precision mass spectrometric U-series dating of intercalated flowstone samples from Huanglong Cave, a recently discovered Late Pleistocene hominin site in northern Hubei Province, central China. Systematic excavations there have led to the in situ discovery of seven hominin teeth and dozens of stone and bone artifacts. The U-series dates on localized thin flowstone formations bracket the hominin specimens between 81 and 101 ka, currently the most narrow time span for all AMH beyond 45 ka in China, if the assignment of the hominin teeth to modern Homo sapiens holds. Alternatively this study provides further evidence for the early presence of an AMH morphology in China, through either independent evolution of local archaic populations or their assimilation with incoming AMH. Along with recent dating results for hominin samples from Homo erectus to AMH, a new extended and continuous timeline for Chinese hominin fossils is taking shape, which warrants a reconstruction of human evolution, especially the origins of modern humans in eastern Asia.

In other words: strong material evidence is quickly piling up in favor of a Homo sapiens “fast” colonization of Southern Asia (and as far NE as Hubei!) around 100 or at least 90 Ka BP. 

Korean petroglyphs at risk by reservoir

A group of very beautiful South Korean petroglyphs that seem to represent whale hunting and are dated to some 6000 years ago are being damaged by a reservoir that provides water for the city of Ulsan.

The Bangudae petroglyphs, discovered in 1971, are submerged under water seasonally, raising great controversy in the East Asian country. It seems that even President Park is greatly concerned about them, something not too usual in a politician, while the Cultural Heritage Administration is demanding measures to protect the ancient rock art, namely to keep water levels low enough. 
However, water utilities claim that it is impossible to meet such demands while providing water to the seventh largest South Korean city. The Ulsan city government is proposing to build a wall around the petroglyphs in order to protect them while keeping the water levels. This, however, would cause environmental damage to the area, disqualifying the site from UNESCO World Heritage protection schemes.
Source: cinabrio.over-blog[en/es] (incl. several pictures and press articles).

Synthesis of the early colonization of Asia and Australasia by Homo sapiens (haploid genetics)

Continuing with the joint series in Spanish language with David Sánchez at his blog Noticias de Prehistoria, I have just written an article on the early expansion of Homo sapiens in Asia and Australasia after the “out of Africa” migration. 
I have explored this matter in the past on this blog and its predecessor, but it has been some time since I last did so. Therefore it may be interesting to share a synthesis of this updated review with the readers of For what they were…
As usual the review is built upon geographic reconstructions and an overly simple “molecular clock”, in the case of mtDNA only (which is the base of the interpretation), that merely counts coding region mutations from the most recent common ancestor (the L3 node), using the latest version of PhyloTree (build 15).
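As a rough illustration of what "counting coding region mutations from the L3 node" means in practice, here is a minimal sketch. The tiny tree is an assumption for illustration only: its branch lengths are chosen simply to match the "molecular times" quoted in this post (M = L3+3, N = L3+5, R = L3+6), not looked up node by node in PhyloTree.

```python
# Each node maps to (parent, coding-region mutations on the branch leading to it).
# Illustrative values only, chosen to match the molecular times used in this post.
TREE = {
    "L3": (None, 0),
    "M": ("L3", 3),
    "N": ("L3", 5),
    "R": ("N", 1),
}

def molecular_time(node, tree=TREE):
    """Sum the mutations on the path from the reference node (L3) down to `node`."""
    steps = 0
    while node is not None:
        parent, mutations = tree[node]
        steps += mutations
        node = parent
    return steps

for name in ("M", "N", "R"):
    print(f"{name} = L3+{molecular_time(name)}")
```

Nodes that share the same count are then plotted together on the same map, which is all that the "molecular time" labels stand for.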
The results for mtDNA are the following five maps:
Map 1: the expansion of L3 sublineages from Africa to South Asia. Molecular time: L3+0 to L3+3. 
The big M star indicates the large M star-like explosion upon arrival to South Asia.
Map 2 represents the molecular time L3+4 (=M+1). There is an evident expansion in South Asia but also into SE Asia. 
The presence of M29’Q in Papua must be taken with some caution, as always when a single lineage is involved, which has low statistical significance.
Map 3 represents the molecular time L3+5, which corresponds to the coalescence of haplogroup N, as well as many M sublineages. There is a slowing down in the number of nodes sprouting at this “time”, so I would estimate it to correspond with the Toba supervolcano eruption (c. 74 Ka BP).
Map 4 represents the molecular time L3+6, which corresponds with the coalescence of R. The rhythm of expansion recovers and the colonization of Australasia seems by now quite statistically significant.
Map 5 represents the molecular time L3+7, which shows the first indications of expansion to NE Asia and Western Eurasia (the Neanderlands), while expansion in South Asia continues very strong (this dynamism of South Asian M lineages may explain why N and R had a limited impact in the subcontinent). 
I stopped here because I did not want to stretch the potential of my simplified molecular clock method too far, as it is surely more likely to err as we move away from the reference point (the L3 node). However, the tendencies outlined in map 5 clearly continue and even increase at later “moments”.
I also made a rough age estimate of the various maps, assuming map 2 to correspond to Jwalapuram (since c. 80 Ka BP) and map 5 to the earliest Aurignacoid cultures (Emirian, since c. 55 Ka BP or maybe a bit earlier). The result is:
  1. Arrival to South Asia: c. 93-83 Ka BP
  2. First expansion: c. 85-75 Ka BP
  3. Slowing down of the expansion (Toba) and N node: c. 77-67 Ka BP
  4. Reactivation of the expansion and clear arrival to Australasia: c. 69-59 Ka BP
  5. Expansion to less hospitable areas (NE Asia, the Neanderlands): c. 61-51 Ka BP
It is in any case a rough (yet quite coherent) estimate: there is no genetic equivalent of radiocarbon or other physical methods of age calculation.
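The arithmetic behind such an estimate can be sketched as a simple linear interpolation between the two anchors mentioned above (map 2 at c. 80 Ka BP, map 5 at c. 55 Ka BP). This is only my reconstruction under stated assumptions, not a record of the exact calculation: in particular, treating map 1 as a single point at L3+3 is an assumption, and the listed ranges are wider than any single interpolated value.

```python
# Anchors: molecular time (mutations from L3) -> assumed age in Ka BP.
ANCHORS = {4: 80.0, 7: 55.0}  # map 2 ~ Jwalapuram, map 5 ~ Emirian

def interpolate_age(step, anchors=ANCHORS):
    """Linearly interpolate (or extrapolate) an age for a given molecular time."""
    (s1, a1), (s2, a2) = sorted(anchors.items())
    rate = (a1 - a2) / (s2 - s1)  # Ka per mutation step
    return a1 - (step - s1) * rate

# Map number -> molecular time; map 1 is taken at its upper end (assumption).
MAP_STEPS = {1: 3, 2: 4, 3: 5, 4: 6, 5: 7}
# Ranges listed in the post, as (older bound, younger bound) in Ka BP.
LISTED_RANGES = {1: (93, 83), 2: (85, 75), 3: (77, 67), 4: (69, 59), 5: (61, 51)}

ages = {m: interpolate_age(s) for m, s in MAP_STEPS.items()}
for m, age in ages.items():
    older, younger = LISTED_RANGES[m]
    print(f"map {m}: ~{age:.1f} Ka BP (listed range {older}-{younger})")
```

Under these assumptions each interpolated value falls inside the corresponding listed range, at a pace of a bit more than 8 Ka per mutation step.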
I did not even try to make any time approximation for Y-DNA, whose expansion I just split in two phases. First what could well be the overall process of expansion from Africa into Tropical Asia (roughly comparable to mtDNA maps 1, 2 and 3):

And then the later expansions, divided in two maps for clarity (they should be roughly simultaneous processes):

General expansion of macro-haplogroups C and D (Y-DNA)
General expansion of macro-haplogroup F and its major descendant MNOPS (highlighted in a lighter, fuchsia shade).

The continuous arrows in these two maps should correspond in essence to mtDNA maps 4 and 5 and even later in time. The dotted arrows merely indicate some important but late processes since at least 30 Ka BP up to the Late Neolithic.
In all maps there is some uncertainty about the exact coalescing location of each clade or node but overall they should be at least approximate. Particularly uncertain are the original locations of mtDNA N and Y-DNA C and MNOPS, but they should all have coalesced somewhere between Varanasi and Guangzhou, so to say.

Chinese neolithic site of Tianluo Mt.

Hemudu culture pottery
(CC by Editor at Large)
A 10-year-long campaign of digs at the site of Tianluo Mountain (Zhejiang, China) has come to an end and will provide abundant information on the Hemudu culture, as it is considered the best preserved site of this Neolithic population.
The site, accidentally discovered in an attempted well drill, was once a village with walls, food stores, paddy fields and even piles of rice husks, as well as ladders made from a single piece of wood, big houses for ritual activities, wood-carved ritual wares with birds, and wooden swords.
The local government invested more than 10 million yuan in a shelter to protect the site, which has been open to visitors since 2007.
Source: China Daily.

Posted on May 29, 2013 in archaeology, China, East Asia, Neolithic


Synthesis of the Spanish-language series on the expansion of H. sapiens (2)

One of the reasons I have been a bit too saturated and maybe not writing as much as usual is that I am collaborating in a series in Spanish language for the blog Noticias de Prehistoria – Prehistoria al Día.
I already mentioned last month the initial article[es] of the series by David Sánchez, which dealt with the African Middle Paleolithic (MSA, Lupemban, Aterian, etc.). We have not been idle in the meantime and have written a number of other articles that may well be of interest to you:
There is still a lot to do for the series to be complete but the time for a synthetic review in this blog is quite overdue. I will skip the brief intro to population genetics, on the belief that most readers here have a decent idea of it, but the other three articles deserve due mention.

Expansion of H. sapiens in Africa (genetic viewpoint)

This is something that complements David’s analysis of the African MP and that to a great extent I dealt with already at my former blog Leherensuge. I like graphs and maps because they often tell more than just words:

Basic mtDNA tree of Humankind
Branch length is proportional to coding region mutations from root per PhyloTree v.15 (L0k excepted)

We can see in this graph two main “moments” of diversification or expansion:
  1. The L0 and L2-6 nodes, followed soon by the L1 and L0a’b’f’k nodes
  2. The L0a’b’f, L0d and L2’3’4’6 nodes
The latter may well be calibrated with the archaeological evidence for the arrival of H. sapiens (MSA) in Southern Africa (L0d), which may be as old as 165 Ka but shows a clear increase in density since c. 130 Ka. I lean toward the later date, which roughly coincides with the beginning of the Abbassia Pluvial, which must have provided good opportunities for expansion also in more northerly latitudes (the other nodes).
The first expansion is harder to estimate, but c. 160 Ka is a time at which we can see some of the first signs of expansion of our species within Africa (Jebel Irhoud and the already mentioned first Southern African MSA), so it is a tentative date.
The geography of both expansions should be as follows (based on the raw data of Behar 2008):

Approx. geography of the first expansion of H. sapiens
(Purple dotted area indicates the max. likelihood for ‘mtDNA Eve’ location)
Approx. geography of the second expansion of H. sapiens

I also mentioned the expansion of L3, which preludes the migration Out of Africa, but this was already discussed in this entry.

Arrival to Arabia and Palestine

While most of the entries I am doing for this series deal with the genetic aspects, in this case I worked mostly with the archaeology, recycling many materials that are readily available in this blog and achieving the following synthetic map (recycling one by Armitage 2011) as central element of the article:

In addition to reviewing the archaeological discoveries of the last few years (and a few older ones) I also discussed the issue of Neanderthal admixture, which most likely happened in this phase, and the possibility of some L(xM,N) lineages found in Arabia being from this period (see here).

Synthesis of Asian Prehistory

The last article so far in the series, authored by David Sánchez, has been published just today and is a very good visual review of the complex archaeological record of most of Asia in the period that interests us (mostly Middle Paleolithic, with marginal mention of the earliest UP of West Asia, Siberia and neighboring areas, which will be reviewed in more depth in later articles). Probably the maps say it all, although we must understand that they only consider the best known sites:

Prior to Toba event (120-74 Ka BP)
(open circles: human remains, dots: other archaeological sites)
(notice that the date of the Narmada hominin is most unclear, which is not reflected in the map)
Blue: 74-45 Ka BP
(stars: Neanderthal sites, open circles: other human remains, dots: archaeological sites, black: previous map)
Red: 45-35 Ka BP
(stars: neanderthal sites, open circles: other human remains, dots: archaeological sites, black & blue: previous maps)
Green: later expansion of H. sapiens in Northern Asia
(stars: Neanderthals, open circles: other human remains, dots: other archaeological sites, black, blue & red: previous maps)

I must say that the design of the maps is not quite the way I would have done them myself, but it is still interesting. In particular I miss lots of info on post-Toba South Asia. Also the Altai transition is not really well explained in my understanding. On the other hand East Asia is full of details and the overall picture of the archaeology of the Eurasian expansion is well described nonetheless.

PS- from the comments by David at his blog, it seems clear that he takes for granted the occupation of South Asia after Toba and therefore did not consider it important to mark any more recent sites in the subcontinent.


    Fu 2013: new ancient mtDNA sequences and "molecular clock" madness

    It took me quite a while to get time to look at this study in some depth and when I finally did I must say I was rather disappointed. In any case popular demand makes it necessary to discuss it.
    Qiaomei Fu et al., A Revised Timescale for Human Evolution Based on Ancient Mitochondrial Genomes. Current Biology 2013. Pay per view [doi:10.1016/j.cub.2013.02.044]
    The study has two aspects: one, of great interest, which is the sequencing of a number of ancient remains, the other a complex and quite poorly explained and rendered speculation on how these sequences could be used to produce a refined molecular clock. 
    Ancient mtDNA sequences
    Most of the sequences used by Fu et al. in their molecular clock speculations are new and that part is very interesting:

    I have highlighted in lime green the new sequences, otherwise also noted by the marker b. It is of note that the “Crô-Magnon 1” sequence produced a C14 age of just a few centuries, being therefore removed from the collection. Other Crô-Magnon 1 remains produced no useful data. 
    The authors also decided to discard as possibly contaminated the UP sequence from Paglicci Str. 4b. I have highlighted in red why they decided to do so: because the C→T misincorporation rate, characteristic of ancient remains, is too low, which makes contamination at least a serious probability.
    So we have as new data for the Upper Paleolithic landscape in Europe that the people of Dolni Vestonice carried lineages U* (found also in Swabian Magdalenian) and U8, in the line of haplogroups K, U8a (Basque) and U8b (Eastern Mediterranean). Also some late UP and Epipaleolithic sequences from Oberkassel (Low Rhineland, Germany), Loschbour (Luxemburg) and Continenza (Abruzzo, Italy) are U5b variants, consistent with other findings from various parts of Europe. In Paglicci (Apulia, Italy), another sequence yielded U2’3’4’7’8’9, surely an extinct variant of the ancestor of U8 and U2 (among other lineages). No radiocarbon date is available for any of the Italian remains.
    In East Asia, Boshan, with B4c1a, provides one of the first Epipaleolithic sequences for the region.

    Molecular clock madness

    The authors seem to intend, or so declare, to refine the molecular clock estimates by means of using these sequences as intermediate calibration references. Here I get the first big question: with all the literature on ancient DNA, why only these sequences? No idea.
    Then the contradictions arise. I believe that I have synthesized the most obvious ones in the following marginal annotations (in red) to their molecular clock estimates:

    Furthermore, the authors claim in the text that U5 is the oldest branch to diverge from U. However, their TMRCA figure is only 34.4 Ka BP (coding region), while Kostenki 14 has an age of 38 Ka BP and already carried U2, which makes this claim extremely unlikely: U2 and its ancestor U2’3’4’7’8’9 should be considered the oldest U sublineage.
    Nor do I understand why they force age estimates for many lineages for which they have no working aDNA references and yet refrain from estimating the age of lineages for which they have several calibration points, like U2’3’4’7’8’9 or B4’5 (aka B).
    In brief: the claims of this paper on molecular-clock-o-logy are ill-explained, confusing, incoherent… a total mess. The raw data on ancient mtDNA is however good looking and of doubtless interest.