Ancient East Asian Y-DNA maps

15 Dec

Here I am merging data from two different, complementary sources:

Hui Li et al. Y chromosomes of prehistoric people along the Yangtze River. Human Genetics 2007. → LINK (PDF) [doi:10.1007/s00439-007-0407-2]
A 2012 study written entirely in Chinese (so entirely that I don’t even know who the authors are → LINK) but whose content was discussed in English (after machine translation) at Eurogenes blog. It deals with a variety of ancient Y-DNA from the northern parts of P.R. China.

Update (Dec 25): much of the Northeastern aDNA is also discussed in an English language study (h/t Kristiina):

Yinqiu Cui et al. Y Chromosome analysis of prehistoric human populations in the West Liao River Valley, Northeast China. BMC Evolutionary Biology 2013. Open access → LINK [doi:10.1186/1471-2148-13-216]

Combining the data from both sources, I produced the following maps:

Neolithic (before ~4000 BP):

Metal Ages (after ~4000 BP):

Discussion

I find the first map particularly interesting because it outlines what seem to be three distinct ethnic (or at the very least genetic) regions in the Neolithic period:

A Central-South region dominated by O3
An Eastern area around modern Shanghai dominated by O1
A Northern region dominated by N

Later on, in the Metal Ages, a colonization of the North/NE by these O3 peoples seems apparent, followed, probably at a later time, by a colonization of the West (Taojiazhai).

We do not have data that old for the West, but we can still see a diversity of lineages, notably Q (largely Q1, if not all), C (most likely C3, also in the NE) and N (also in the NE). While O3 probably arrived in this area late, the arrival of R1a1a is quite old. Even so, it is almost certainly related to the first Indo-European migrations eastwards, which founded the Afanasevo culture in the Altai area.

I also find very interesting the presence, often with local dominance, of N (including an instance of N1c) and Q in the northern parts of P.R. China, because these lineages are now rather uncommon there but are still dominant in Northern Asia, Northeastern Europe and Native America. The fact that they were still so important on the northern Chinese frontier in the Neolithic and even in the Metal Ages should tell us something about their respective histories and, in the case of N, origins as well.

It is also notable that no D was detected anywhere. However, the regions with the greatest D frequencies, such as Tibet, Yunnan or Japan, were not studied.

See also: Ancient Jomon mtDNA from Japan.

Posted by Maju on December 15, 2013 in aDNA, Bronze Age, China, East Asia, Iron Age, Neolithic, Y-DNA

9 responses to “Ancient East Asian Y-DNA maps”

Ebizur

December 16, 2013 at 1:58 pm

Han/Jiangsu, Anhui, Zhejiang, and Shanghai (Yan et al. 2011)

2/167 O1a-M119(xP203, M110) [-0.2% from Han average]
37/167 O1a1-P203 [+9.1%]
1/167 O1a2-M110 [-0.2%]
40/167 = 0.240 O1a-M119 total [+8.7%]

8/167 O2-P31/M268(xO2a-PK4, O2b-M176) [-0.2%]
1/167 O2a-PK4(xO2a1-M95) [-0.8%]
3/167 O2a1-M95(xO2a1a-M88) [-0.7%]
12/167 = 0.072 O2-P31 total [-2.0%]

5/167 O3-M122(xO3a-M324) [+1.3%]

5/167 O3a1-L127/KL1/KL2(xM121, M164, 002611) [+0.8%]
2/167 O3a1a-M121 [+0.4%]
27/167 O3a1c-002611 [-0.7%]
34/167 = 0.204 O3a1-L127 total [+0.4%]

3/167 O3a2-P201(xM159, M7, P164) [+0.1%]
3/167 O3a2b-M7 [-0.1%]
7/167 O3a2c-P164(xM134) [+0.9%]
10/167 O3a2-P201(xM7, M134) total [+0.7%]
18/167 O3a2c1-M134(xM117) [-0.6%]
29/167 O3a2c1a-M117 [+1.0%]
60/167 = 0.359 O3a2-P201 total [+1.0%]

99/167 = 0.593 O3-M122 total [+2.8%]

151/167 = 0.904 O-M175 total [+9.5%]
16/167 = 0.096 Y(xO-M175) total [-9.5%]

Han total/East China=167 + North China=129 + South China=65 (Yan et al. 2011)
27/361 = 0.075 C-M130
18/361 = 0.050 N-M231
12/361 = 0.033 Q-M242
7/361 = 0.019 D-M174
4/361 = 0.011 R-M207
1/361 = 0.003 G-M201
69/361 = 0.191 Y(xO-M175) total

5/361 O1a-M119(xO1a1-P203, O1a2-M110)
47/361 O1a1-P203
3/361 O1a2-M110
55/361 = 0.152 O1a-M119 total

18/361 O2-P31/M268(xO2a-PK4, O2b-M176)
5/361 O2a-PK4(xO2a1-M95)
9/361 O2a1-M95(xO2a1a-M88)
1/361 O2b-M176
33/361 = 0.091 O2-P31/M268 total

6/361 O3-M122(xO3a-M324)

8/361 O3a1-L127/KL1/KL2(xM121, M164, 002611)
3/361 O3a1a-M121
61/361 O3a1c-002611
72/361 = 0.199 O3a1-L127/KL1/KL2 total

6/361 O3a2-P201(xM159, M7, P164)
1/361 O3a2a-M159
7/361 O3a2b-M7
12/361 O3a2c-P164(xM134)
41/361 O3a2c1-M134(xM117)
59/361 O3a2c1a-M117
126/361 = 0.349 O3a2-P201 total

204/361 = 0.565 O3-M122 total

D-M174 is less common among modern Han Chinese than e.g. Q-M242, so, under an assumption of neutrality, one should expect D-M174 to be at least as peripheral as Q-M242 among human remains from ancient China.

I note that the greatest deviation of modern Han from the lower Yangtze region (central part of the eastern coast of China) from the average for modern Han (excepting the +9.5% deviation for O-M175 total) is the deviation for O1a1-P203 [+9.1%]. I would say that the data seem to support the hypothesis of the assimilation of some descendants of the local Liangzhu Neolithic culture(s) into the incoming Han population, but the Liangzhu are almost certainly not the primary patrilineal ancestors of modern Han anywhere in China, even within the geographical limits of the Liangzhu culture(s). I think it is important for you to consider these data, Maju, since you always have vehemently opposed any hypothesis of post-Neolithic population replacement.
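
The bracketed deviations in the lists above can be re-derived from the quoted counts. What follows is only a minimal sketch: the counts are copied from this comment, the pooled O-M175 count of 292 is an assumption obtained by summing the O1a, O2 and O3 totals (55 + 33 + 204), and the function and variable names are hypothetical. Note that the pooled Han sample includes the 167 East China samples, so the East is being compared against an average it contributes to.

```python
# Counts transcribed from the comment above (Yan et al. 2011, as quoted there).
EAST_N = 167  # Han from Jiangsu, Anhui, Zhejiang and Shanghai
HAN_N = 361   # pooled Han: East (167) + North (129) + South (65)

counts = {
    # haplogroup: (East count, pooled Han count)
    "O1a1-P203": (37, 47),
    "O1a-M119 total": (40, 55),
    "O2-P31 total": (12, 33),
    "O3-M122 total": (99, 204),
    "O-M175 total": (151, 292),  # assumption: 292 = 55 + 33 + 204
}


def deviation_points(east, pooled):
    """East frequency minus pooled Han frequency, in percentage points."""
    return 100 * (east / EAST_N - pooled / HAN_N)


deviations = {hg: round(deviation_points(e, p), 1) for hg, (e, p) in counts.items()}

for hg, dev in deviations.items():
    print(f"{hg}: {dev:+.1f}")
```

Under these assumptions the sketch gives the same figures as the bracketed values quoted above for these five rows.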

Maju

December 16, 2013 at 3:19 pm

Thanks for your interest and complementary modern Y-DNA info, Ebizur.

“… you always have vehemently opposed any hypothesis of post-Neolithic population replacement”.

But I had to rethink my opinion already in the context of Europe, where some regions like Germany seem to have suffered repeated repopulation waves, while others like Portugal may have got less important but still significant partial post-Neolithic replacements.

However, one of my old battle horses was questioning the idea of N→S Neolithic (and post-Neolithic) replacement, and to some extent these data support a S→N replacement if anything. Related to this is the issue of the alleged expansion of the so-called Mongoloid phenotype in Neolithic or post-Neolithic times, which, with some possible exceptions, I still regard as unlikely in general.

It's true that in the Shanghai area there seems to have also been a W→E replacement, but we can't be sure of the exact extent of the various genetic clusters: was the O1 zone limited to that pocket, did it extend along the whole coast, or what? I'd say that the rest of the data suggest it was quite limited, but the certainty is slim. I bet some people will use (or have already used) this datum of ancient O1 in the Shanghai area as an argument against the apparent expansion of this lineage from the Malay archipelago, which is supported by the greatest basal diversity in that area. To that I have to say that the Shanghai O1 is much more likely to be an offshoot than the origin.

In general it seems quite clear in any case that the O3 expansion began in the Southern and Central Neolithic centers. It also seems clear that O1 and O2a did not experience a similar expansion, at least not in any obvious manner and certainly not within the boundaries of modern China (although it's plausible that O2a did something similar in Indochina and nearby areas: Austroasiatic Neolithic and such). Less clear is the issue of O2b, not yet reported as such in any ancient DNA.

What these data offer is a minimal dataset that constrains speculation, and that is in any case welcome and most useful.

Kristiina

December 18, 2013 at 10:48 am

Nice graphics!

It is a pity that in that Yangtze River paper they did not test for any haplogroups other than the O lines. It would have been interesting to know if there was any C3 or C on the coast, or Q, D or N lines in the Taosi or Daxi sites.

Maju

December 18, 2013 at 2:05 pm

So “undetermined” means that they did not test for other non-O lineages. That's interesting, thanks – I was not really aware of it. If that's correct they probably did not test for O2b (or O2) either, a lineage that I also miss in this synthesis.

Ebizur

December 18, 2013 at 2:50 pm

Maju,

One problem is that the mtDNA profile of Han from the Shanghai area is just about as typically “northern” as is the mtDNA profile of e.g. Koreans from Seoul & Daejeon, South Korea. If the high frequency of Y-DNA haplogroup O1a in the Liangzhu Neolithic (and, to a lesser extent, in modern populations of the lower Yangtze region) were due to a migration from Maritime Southeast Asia, the other region of concentration of Y-DNA haplogroup O1a, we should expect the lower Yangtze region and Maritime Southeast Asia to share some affinity in mtDNA as well, barring the possibility of a sex-biased migration.

The nearest known outgroup to Y-DNA haplogroup O1 is O2-P31. O2-P31 exhibits a distributional pattern that is quite reminiscent of a serial founder effect emanating from northern East Asia toward Southeast Asia and South Asia, with the oldest branch (O2b) found almost exclusively in the circum-Japan Sea area, O2a(xPK4) found mainly in northern China, O2a1-PK4(xM95) found with low frequency throughout China, O2a1a-M95(xM88) found mainly from southern China through Southeast Asia and South Asia, and O2a1a1-M88 found almost exclusively in Southeast Asia (with a little in some enclaves of Southwest China). If O1 is originally from Southeast Asia, one must explain why its nearest outgroup, O2, seems to be originally from northern East Asia.

On the other hand, since the difference between the Y-DNA of modern Han of the lower Yangtze region and modern Han from other regions seems to be much less clear-cut than the difference between the ancient Liangzhu and other contemporary Neolithic cultures within what is now China, I suppose the rather small difference of modern Shanghai Han from modern Liaoning (South Manchurian) Han in regard to Y-DNA might be compatible with the observed degree of mtDNA differentiation.

Granted, the mtDNA profile of modern Han does become much more divergent from the “northern” average as one goes further south from the Yangtze; Han populations from e.g. Guangdong or Guangxi exhibit mtDNA profiles that are much more similar to nearby Tai-Kadai or Hmong-Mien-speaking minority populations than to Han from the lower Yangtze region or further north. However, these Han populations of Guangdong/Guangxi that are divergent from northerly Han populations in regard to mtDNA are generally much less divergent from other Han populations in regard to Y-DNA (and they have even less O1a1-P203 than modern Han populations of the lower Yangtze region, by the way).

Ebizur

December 18, 2013 at 5:10 pm

I composed a detailed reply to Maju's previous comment, only to have it lost irrevocably in cyberspace when I tried to publish it. How exasperating!

Anyway, as for O2-P31, it is a very old lineage, with the split between O2a and O2b dating back to about the time of the split between Q1a-MEH2 and Q1b-L275 or the split between O3a-M324 and O3*-M122(xM324). In all these cases, the former of the pair is a widespread and frequently occurring haplogroup with many well-defined branches, whereas the latter of the pair is an overall rare haplogroup whose internal structure has not been well clarified. However, O2b has become quite frequent in populations around the southern shores of the Sea of Japan while being practically nonexistent elsewhere, whereas Q1b and O3*-M122(xM324) are quite widespread but with low frequency in Western Eurasia and East/Southeast Asia, respectively.

By the way, the splits of each of these three pairs are slightly older than the split between R1 and R2.

Not too long (maybe 4,000 years) after the extremely ancient split of O2b from O2a, early O2a left a relictual branch in (now mostly northern) China. The much later (about doubly later, in fact) branch of O2a-PK4(xO2a1-M95) also has been confirmed only in China (though with a slight tendency toward the south, and I have some evidence to suspect that it is also found in at least Vietnam and Laos). All the widespread O2a in the rest of Southeast Asia and South Asia belongs to the O2a1-M95 branch (including its O2a1a-M88 subclade in Southeast Asia). It looks to me like O2a has an ancient Paleolithic (at least 20,000 YBP) origin in northern or central China, and it has spread mainly southwestward from that point of origin, possibly to a large degree within the last 10,000 years in the form of O2a1-M95 (perhaps a harbinger of the Neolithic in southern East Asia). I do not see any evidence to suspect that O2b has moved much from its point of origin in the past 25,000 ~ 50,000 years since its split from O2a. O2b is old enough to possibly have been among the earliest AMH colonizers of the circum-Sea of Japan region, though its present internal variance definitely suggests some sort of bottleneck or very limited population size followed by a more recent expansion.

I am really tired of people trying to subsume both O2a and O2b into some singular Neolithic expansion, when all the empirical data available have demonstrated that the split between these two haplogroups is older than e.g. the split between R1 and R2 (which together contain all extant representatives of haplogroup R). People should be considering the era of Mal'ta or Tianyuan, not the era of Hemudu, Yangshao, or Hongshan.

Maju

December 18, 2013 at 5:31 pm

“… barring the possibility of a sex-biased migration”.

We don't have enough data but I would not bar this possibility in any case. We have seen that happening elsewhere, right?

“O2-P31 exhibits a distributional pattern that is quite reminiscent of a serial founder effect emanating from northern East Asia toward Southeast Asia and South Asia”…

As you describe it, it should be as you say… but the reality as I know it (Shi Yan 2011, also reflected in Wikipedia) is that O2a* has been found in 4.6% of South China (Han) and only very rarely in the North (0.8%) and East (0.6%). The same can be said of all the other clades mentioned, even O2b, which was only found in Southern Han (1.5%) in that study.

I know I'm walking on thin ice discussing Y-DNA frequencies with you, Ebizur, as it is your specialization. But that's what I know. Feel free to enrich me and Wikipedia with the details of your data, which I do not have (including Hainan and Indochina if possible).

Re. O1a, which is the center of your argumentation, O1a1 is still much much more common in the East (22%) than in the North (1.6%), although it seems also rather frequent in the South (12%). Instead O1a* is more evenly (and quite thinly) distributed between the three regions. O1a2 is most common in the South (3%) and absent in the North.

The high frequency of O1a (25% total of all O) in the East strongly suggests a partial continuity in that area, with a dilution (immigration) of the order of 60% or slightly more. Some 35% of the local yDNA O ancestry (151/167) seems Neolithic, with the rest probably corresponding to later arrivals (always considering only the male side of the equation).

“Granted, the mtDNA profile of modern Han does become much more divergent from the “northern” average as one goes further south from the Yangtze”…

Indeed.

“However, these Han populations of Guangdong/Guangxi that are divergent from northerly Han populations in regard to mtDNA are generally much less divergent from other Han populations in regard to Y-DNA (and they have even less O1a1-P203 than modern Han populations of the lower Yangtze region, by the way)”.

That would seem to be expected, because O3 was already dominant in the Yangtze (at least in Daxi) before the Han expansion, with frequencies similar to those of modern Northern Han. In fact many of the Northern Han areas of today were then completely different in Y-DNA, much closer to Finns, Buryats or Yakuts than to any local modern population, be it Han, Altaic or Koreanic-Japanic.

What these maps suggest to me is an expansion from the various Neolithic centers inland, rather than North→South (if anything some South→North expansion is also quite apparent).

Maju

December 18, 2013 at 5:52 pm

“I composed a detailed reply to Maju's previous comment, only to have it lost irrevocably in cyberspace when I tried to publish it. How exasperating!”

Are you sure it was lost and not just awaiting moderation? There's another comment above, to which I just replied (below). Comment moderation remains active until Terry effectively stops spamming this blog (a banishment is a banishment is a banishment is a banishment, paraphrasing the poet).

“I am really tired of people trying to subsume both O2a and O2b into some singular Neolithic expansion”…

To be clear, I do not think that, even though some secondary migrations may correspond to Neolithic dates. In fact I was wondering if the undefined “O” found in ancient Niuheliang and Hengbei could be it, but I could not find anything pointing me in that direction.

“O2b is old enough to possibly have been among the earliest AMH colonizers of the circum-Sea of Japan region”…

Possibly. But what about the presence of O2b towards the South? As I just mentioned (below), Shi Yan only found this lineage in Southern Chinese and, from memory, it's also found in Hainan, right?

Ebizur

January 11, 2014 at 6:58 am

For some reason, I did not receive the usual “your comment is pending moderation” notice after trying to post that previous comment here. Anyway…

While looking through the supplementary data of the study by Magoon et al. (2013), I noticed that their data support the finding of Yan et al. (2013) regarding former O2*-P31(xO2a-PK4, O2b-M176) Y-chromosomes: according to all cases that have been tested for the pertinent markers, O2* is not actually extant, and all cases previously labeled as O2* actually belong to either of two early branches of O2a, O2a*-F1462(xO2a1-PK4) or O2a1*-PK4(xO2a1a-M95).

Unfortunately, Magoon et al. 2013 have only two examples of O2a1*-PK4(xO2a1a-M95) in their data set: NA18638 (a Han Chinese in Beijing, China) and HG00457 (a Han Chinese in Hunan or Fujian, China). It may be said that this confirms only that O2a1*-PK4(xO2a1a-M95) is found in Han Chinese in both northern and southern China.

However, this study's data do expand the known distribution of O2a*-F1462(xO2a1-PK4) both toward the south (southern China and Vietnam) and toward the east (eastern Japan). They have eight cases of O2a*(xO2a1-PK4), including three Han from Beijing, two Han from Hunan or Fujian, a Kinh from Ho Chi Minh City, a Japanese from Tokyo, and a Dai from Xishuangbanna, Yunnan. If we group the individuals from Beijing and Tokyo as “northern” and the individuals from Hunan or Fujian, Ho Chi Minh City, and Xishuangbanna as “southern,” then it is an equal split (4 vs. 4), and the data are equivocal regarding the question of a northern or southern origin of haplogroup O2a.

In other words, I must revise my previous statement that O2a*(xO2a1-PK4) is a relictual branch now found mostly in northern China; it may be relatively frequent in Beijing and northern China in general, but it is distributed widely from at least southern Vietnam and Yunnan all the way to eastern Japan, and I suppose it would be prudent to say that the available data are currently insufficient to determine where this clade occurs with the greatest frequency.
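
The four-versus-four split described in this comment can be tallied as a quick check. This is only an illustrative sketch: the sampling locations and the northern/southern grouping are the ones given in the comment, while the variable names are hypothetical.

```python
from collections import Counter

# Sampling locations of the eight O2a*(xO2a1-PK4) cases listed in the comment.
cases = (
    ["Beijing"] * 3
    + ["Hunan or Fujian"] * 2
    + ["Ho Chi Minh City", "Tokyo", "Xishuangbanna"]
)

# North/south grouping as proposed in the comment itself.
region = {
    "Beijing": "northern",
    "Tokyo": "northern",
    "Hunan or Fujian": "southern",
    "Ho Chi Minh City": "southern",
    "Xishuangbanna": "southern",
}

tally = Counter(region[place] for place in cases)
print(dict(tally))
```

With only eight samples, moving a single individual from one group to the other would already turn the split into 5 vs. 3, which is why the comment treats the result as equivocal.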


For what they were… we are