A must read:
This paper is an almost total vindication of what I have been saying, especially in the last year, since the infamous Balaresque paper was published and got all that undeserved clout in the media and on blogs.
What I said back then can be found at my old blog Leherensuge:
Actually, Busby et al. refer to all three of these papers time and again; however, they seem to side almost totally with Morelli (whose research I applauded as well) and disagree profoundly with Balaresque. No wonder.
They used the Myres sample, enlarged especially for better coverage of Western Europe (fig. S2). Sadly, the demographic center of Paleolithic Europe is clearly undersampled (except Provence, well covered by Myres, and dots in what are probably Toulouse and Santander): not a single sample was taken in Perigord (Dordogne), Gascony or the Basque Country, which is probably a major shortcoming.
Molecular clock ‘not credible’
They also refer to what is usually known as the molecular clock, with a quite negative remark on the methods used at present:
… we conclude that at the present time it is not possible to make any credible estimate of divergence time based on the sets of Y-STRs used in recent studies. Furthermore, we show that it is the properties of Y-STRs, not the number used per se, that appear to control the accuracy of divergence time estimates, attributes which are rarely, if ever, considered in practise.
In the discussion section they mention again this issue:
Dating of Y chromosome lineages is notoriously controversial [25,41–44], the major issue being that the choice of STR mutation rate can lead to age estimates that differ by a factor of three (i.e. the evolutionary versus observed (genealogical) mutation rates [33,45]). Interestingly, despite the fact that Myres et al. and Balaresque used different STR mutation rates and dating approaches, their TMRCA estimates overlap: 8590–11 950 years using a mutation rate of 6.9 × 10⁻⁴ per generation, and 4577–9063 years using an average mutation rate of 2.3 × 10⁻³, respectively. Separately, Morelli calculated the TMRCA based only on Sardinian and Anatolian chromosomes, and estimated the R-M269 lineage to have originated 25 000–80 700 years ago, based on the same evolutionary mutation rate [25,41] as Myres et al.
This leaves any casual (and even knowledgeable) observer quite perplexed, right? The conclusion of the authors is clear: the molecular clock can’t be trusted.
Even Dienekes admits it, saying that this paper could well be titled An epitaph for Y-STR.
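To see why the choice of rate matters so much, here is a minimal sketch. It assumes the simplest variance-based estimator (TMRCA in generations ≈ average STR variance divided by the per-locus mutation rate), which is not necessarily the method any of the cited papers used. The variance value and the 25-year generation time are made-up illustrative assumptions; only the two mutation rates come from the quote above.

```python
# Toy illustration: one STR variance dated with two different mutation rates.
# Assumption: TMRCA (generations) = average variance / mutation rate.

EVOLUTIONARY_RATE = 6.9e-4   # per generation, as quoted for Myres et al.
GENEALOGICAL_RATE = 2.3e-3   # per generation (average), as quoted for Balaresque
YEARS_PER_GENERATION = 25    # hypothetical generation time


def tmrca_years(avg_variance, mutation_rate,
                years_per_generation=YEARS_PER_GENERATION):
    """Return a point estimate of the TMRCA in years."""
    if mutation_rate <= 0:
        raise ValueError("mutation rate must be positive")
    return avg_variance / mutation_rate * years_per_generation


if __name__ == "__main__":
    variance = 0.25  # hypothetical average variance across loci
    old = tmrca_years(variance, EVOLUTIONARY_RATE)
    young = tmrca_years(variance, GENEALOGICAL_RATE)
    print(f"evolutionary rate: {old:,.0f} years")
    print(f"genealogical rate: {young:,.0f} years")
    print(f"ratio between the two estimates: {old / young:.2f}")
```

Whatever variance is plugged in, the two estimates differ by the ratio of the rates (2.3 × 10⁻³ / 6.9 × 10⁻⁴, a bit over three), which is the "factor of three" the authors complain about.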
Busby and colleagues followed the same methods as Balaresque but instead of considering R1b1a2 as a single amorphous haplogroup, they consider the various clades downstream of it as distinct entities:
We next calculated STR diversity for each population for the whole R-M269 lineage, and for the R-S127 and R-M269(xS127) sub-haplogroups, and investigated the relationship between average STR variance and longitude and latitude in exactly the same fashion as Balaresque. (…) We normalized latitude and longitude, and performed a linear regression between these values and the median microsatellite variance for the three R-M269 sub-haplogroups. We found no correlation with latitude (data not shown) and, contrary to Balaresque, we did not find any significant correlation between longitude and variance for any haplogroup.
The results are apparent in fig. 2 (left: frequency; right: variance in relation to longitude):
If anything the result is the opposite, showing a mild tendency for greater variance towards the West.
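The kind of test described in the quote can be sketched as follows. This is a minimal illustration under my own assumptions, not the authors' pipeline: it uses the mean across loci of the sample variance in repeat counts (the paper speaks of median microsatellite variance), and the function names and toy haplotypes are hypothetical.

```python
from statistics import mean, variance


def average_str_variance(haplotypes):
    """Mean, across loci, of the sample variance in repeat counts.

    `haplotypes` is a list of equal-length tuples, one tuple of repeat
    counts per individual (needs at least two individuals).
    """
    loci = list(zip(*haplotypes))  # transpose: one tuple of counts per locus
    return mean(variance(counts) for counts in loci)


def least_squares(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept, plus R squared."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy * sxy) / (sxx * syy) if syy else 0.0
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Made-up populations: longitude in degrees and three-locus haplotypes.
    toy = {
        "west": (-8.0, [(12, 24, 14), (13, 24, 15), (12, 25, 14), (14, 23, 14)]),
        "centre": (5.0, [(12, 24, 14), (13, 24, 14), (12, 24, 15), (13, 25, 14)]),
        "east": (20.0, [(12, 24, 14), (12, 24, 14), (13, 24, 14), (12, 25, 14)]),
    }
    longitudes = [lon for lon, _ in toy.values()]
    variances = [average_str_variance(h) for _, h in toy.values()]
    slope, intercept, r2 = least_squares(longitudes, variances)
    print(f"slope = {slope:.5f} per degree, R^2 = {r2:.2f}")
```

A positive slope would mean variance growing eastwards (the Balaresque claim); a slope near zero or negative is what Busby et al. report. The sketch does not compute a p-value, which a real analysis would need.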
They explain the differences with Balaresque as follows:
The Balaresque dataset presents genotype data only to the resolution of SNP R-M269. Our results show that the vast majority of R-M269 samples in Anatolia, approximately 90 per cent, belong to the R-M269(xS127) sub-haplogroup. Removing these Turkish populations from the Balaresque data and repeating the regression removes the significant correlation (R² = 0.23, p = 0.09; details in the electronic supplementary material and figure S2). These populations are therefore intrinsic to the significant correlation.
This is something I already noticed back in the day: that Balaresque’s bias blinded her to the subtleties of the downstream structure of the haplogroup, treating the whole clade as a blank slate.
Probably the apparent greater diversity observed in Turks and Armenians is caused by the addition of (1) great diversity of R1b1a2(xR1b1a2a1) plus (2) an also diverse (but clearly derived, even in Balaresque’s own data) backflow of European R1b1a2a1.
This backflow must be pre-Neolithic as far as I can discern, because since the Neolithic the flow of people has been almost exclusively from East to West.
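How a handful of eastern populations can carry a whole longitude-variance correlation, as Busby et al. argue for the Turkish samples, can be illustrated with a toy example. All the numbers below are invented for the illustration and are not taken from either paper; the labels and helper names are hypothetical as well.

```python
from statistics import mean

# Entirely made-up (label, longitude, STR variance) points, for illustration only.
TOY_POPULATIONS = [
    ("A", -8.0, 0.30),
    ("B", -3.0, 0.28),
    ("C", 2.0, 0.31),
    ("D", 10.0, 0.29),
    ("E", 15.0, 0.30),
    ("Anatolia-1", 33.0, 0.45),
    ("Anatolia-2", 40.0, 0.48),
]


def r_squared(points):
    """Squared Pearson correlation between longitude and variance."""
    xs = [lon for _, lon, _ in points]
    ys = [var for _, _, var in points]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def without_prefix(points, prefix):
    """Drop every population whose label starts with `prefix`."""
    return [p for p in points if not p[0].startswith(prefix)]


if __name__ == "__main__":
    print(f"R^2, all populations:   {r_squared(TOY_POPULATIONS):.2f}")
    rest = without_prefix(TOY_POPULATIONS, "Anatolia")
    print(f"R^2, Anatolia removed:  {r_squared(rest):.2f}")
```

In this toy set the strong correlation exists only because of the two easternmost points; among the remaining populations, variance is essentially flat across longitude.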
Another serious criticism they make of Balaresque is the use of a Y-search dataset to represent Ireland (surprisingly amateurish!). When compared with actual samples (Y-search relies on the goodwill of online reporters), the low diversity that Balaresque found for Ireland vanished.
A ‘West Asian’ sublineage of R1b1a2?
This paper falls short of finding the defining SNP for such a speculated sub-haplogroup, but it does confirm the finding of Morelli 2010 that the Eastern or Anatolian bloc makes up a distinct STR-defined clade of its own. My annotations on Morelli’s work:
What is most intriguing in my opinion is that, if this second haplogroup is confirmed, then R1b1a2 may have ultimately expanded from the Balkans, where most carriers of the core node seem to live today.
This could be consistent with Busby's present finding that R1b1a2* is most frequent in Bulgaria and Romania (Morelli’s ‘Balkans’ are actually Serbia, where the lineage is rare).
Distribution of some sublineages
This paper also expands our knowledge a bit on the distribution of the most common (and best studied) sublineages under R1b1a2a1 (fig. 3):
(a) R1b1a2a1a1a (S21), (b) R1b1a2a1a1b4 (S145), (c) R1b1a2a1a1b3 (S28)
It must be said here that the major known sublineages of R1b1a2a1a1b (P312/S116) are as follows (update: corrected Mar 15 2012):
- R1b1a2a1a1b2 (Z196) ··> most basally diverse among Basques and Gascons but also common among Catalans and other East Iberians and found as well among “French”, Bavarians and (it seems now) some Scandinavians (see comments)
- R1b1a2a1a1b3 (S28/U152) ··> see map (c) above
- R1b1a2a1a1b4 (L21/M529/S145, L459) ··> see map (b) above
In addition, most R1b1a2a1a1b* (not yet classified as any sublineage) exists in SW Europe: in France and Iberia, where it often makes up the majority of the Y-DNA pool. I have therefore argued that this lineage probably coalesced within the Franco-Cantabrian region, around which all sublineages fan out. Admittedly, it is hard to explain the penetration into North Italy, but I cannot think of any better explanation, because neither Italy nor Central Europe seems to host enough basal diversity to be considered a potential homeland for R1b1a2a1a1b.
I have also argued that the “brother” haplogroup R1b1a2a1a1a (M405/S21/U106), shown in map (a) above, may be related to the somewhat distinct Hamburgian-Ahrensburgian-Maglemosian techno-cultural complex of Northern Europe. The people of this cultural group surely saw their expansion favored by the end of the Ice Age.