Mitochondrial DNA revision and the Reconstructed Sapiens Reference Sequence (RSRS)

06 Apr
The authors and maintainers of PhyloTree have produced a new build (no. 14) with a long list of changes. However, the most important part is probably in an associated paper:
The Copernican adjective is there because they hope to replace the rCRS (revised Cambridge Reference Sequence, which belongs to the highly derived haplogroup H2a2a1 and was an arbitrary yet convenient pick back in the day) with the Reconstructed Sapiens Reference Sequence (RSRS), which is also the haplotype of the matrilineal most recent common ancestor of extant humankind, alias ‘Mitochondrial Eve’.
The RSRS is also quite directly comparable to the RNRS (the ‘Neanderthal Eve’, see Fig. S1): the two are separated by 122 mutations in the coding region and a few more in the control region (HVR). However, while the ‘Neanderthal Eve’ and actual Neanderthal sequences are some 17 coding region mutations apart (14-21), in our species the distance to the most recent common ancestor is longer: 40-50 coding region mutations (most commonly, from memory), and even more when we count the HVR, as illustrated in Fig. S2:
Fig. S2 – Distances in Substitution Counts from the RSRS to Extant Haplotypes
This image is particularly interesting because it illustrates what the authors describe as indications of violations of the molecular clock: differences in the length of the various branches, which vary wildly from circa 40 to above 70 substitutions, a difference of almost 2:1 between the extremes.
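To make the point about uneven branches concrete, here is a minimal sketch of how one could measure the spread of root-to-tip distances. The tip names and counts are made up for illustration (they are not values read from Fig. S2), and `clock_spread` is just a hypothetical helper name.

```python
# Hypothetical root-to-tip substitution counts (RSRS to extant haplotypes).
# These numbers are illustrative only, NOT taken from the paper.
branch_lengths = {
    "tip_a": 41,
    "tip_b": 48,
    "tip_c": 55,
    "tip_d": 72,
}


def clock_spread(lengths):
    """Return (shortest, longest, longest/shortest) for root-to-tip distances."""
    shortest = min(lengths.values())
    longest = max(lengths.values())
    return shortest, longest, longest / shortest


shortest, longest, ratio = clock_spread(branch_lengths)
print(f"shortest={shortest}, longest={longest}, ratio={ratio:.2f}")
```

Under a strict molecular clock all tips sampled today would sit at roughly the same distance from the root, so the ratio would stay close to 1. The further it drifts towards 2, the harder it is to defend a single constant rate.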
Oddly enough, in spite of M being closer to the root than N and having on average more mutations to present-day sequences, the authors somehow manage to assign a younger age to this haplogroup than to its sister N. But in any case their estimates must be wrong overall, because the ages attributed to L3’4’6 (71 Ka BP), L3’4 (64 Ka) and L3 (67 Ka) are more recent than the known archaeological evidence for the migration out of Africa, which must have happened c. 90-80 Ka ago at the latest. The L3 node must therefore be older than these dates.
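The comparison behind this objection is simple enough to write down. The sketch below uses the three node ages quoted above and takes 80 Ka, the recent end of my 90-80 Ka range, as the bound. The constant and function names are my own, not anything from the paper.

```python
# Node ages as quoted in the text, in thousands of years before present (Ka BP).
node_ages_ka = {"L3'4'6": 71, "L3'4": 64, "L3": 67}

# Most recent date argued in the text for the out-of-Africa migration
# (the "at the latest" end of the c. 90-80 Ka range).
OOA_LATEST_KA = 80


def nodes_younger_than(ages, bound_ka):
    """Return the names of nodes whose estimated age is more recent than bound_ka."""
    return sorted(name for name, age in ages.items() if age < bound_ka)


print(nodes_younger_than(node_ages_ka, OOA_LATEST_KA))
```

All three nodes come out younger than the bound, which is the whole problem: L3 is ancestral to the lineages that left Africa, so it cannot postdate the migration itself.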
The authors also propose some changes to the nomenclature, notably the use of a superscript n for nodal haplotypes (as opposed to merely unclassified sequences, for which we will still use the asterisk). That way we can more easily tell a nodal or root H haplotype (Hn) from a random unclassified H haplotype (H*). Other nomenclature proposals may be more debatable, and in some cases the authors manage to violate their own rules in the supplemental materials.
Whatever my criticisms (everything is debatable and I love a good discussion), I must say that the PhyloTree team deserves my utmost appreciation and respect: before them, understanding the human mtDNA landscape was a total mess; now it is possible even for amateurs with a keen interest like myself. Thanks a lot.

See also: Mitochondrial DNA and molecular clock.

