Category Archives: global genetics

ENCODE: from mere protein-coding to true program-like understanding of the human genome

It is in the news these days: an intercontinental army of some 440 researchers has taken decisive steps to truly understand the human genome, our base program, helped by the much lower costs of genome sequencing achieved recently.
One of the most remarkable results is the rejection of the idea that most of our genome is “junk DNA”. Until recently many thought that only a small fraction of the genome, the protein-coding segments (under 2% of the total), was meaningful, while the rest was useless “junk” mysteriously accumulated through the millennia.
Nature is much more efficient than that, it seems: according to ENCODE, some 80% of the genome has a biochemical function, much of it as a maze of switches that regulate how cells, and the whole body, are built and maintained. A true biological program encoded in DNA.
The main product of this intercontinental effort is a threaded encyclopedia of the human genome, as well as three (freely accessible) articles in Nature:
In addition to all this you may want to visit the ENCODE site or read this one or this other article at Science Daily, for example.

Posted on September 6, 2012 in Genetics, global genetics, human genetics


Cis-regulatory variability has almost no geographic structure

That is what a new paper has found:
Cis-regulatory elements are key pieces of the overall codification of the genome. The main finding of this paper is that they show almost no geographic structure: virtually all variation in these key elements is between individuals, not between populations.

From Supplemental Figure 1 – PC analysis

As you can see in the PCA plot, the inter-population variation is very subtle, almost impossible to discern with so much overlap. The same lack of structure appears in the other components.


CNV genetic variation in Chinese and others

A rather technical (but surprisingly easy to read) new paper that may be of interest to those following global and East Asian population genetics:
Haiyi Lou, Shilin Li et al., A Map of Copy Number Variations in Chinese Populations. PLoS ONE 2011. Open access.
They begin by explaining what copy-number variations (CNVs) are:

Copy number variation (CNV) is a type of global genetic variations in human genome, defined as a segment of DNA larger than one kilobase presenting copy-number differences by comparison of two or more genomes. One single or co-effects of multiple genomic rearrangements such as deletion, insertion, duplication and unbalanced translocation are likely to cause CNVs. By changing gene dosage, interrupting coding sequences, and influencing neighboring gene regulation, CNVs can impact on gene expression and phenotypes.

There are several graphics of interest but I have selected this one:

Fig. 4 CNV sharing

Why? Because it highlights how minor the genetic markers that define races or stocks are. Even in the ‘three races’ graph (A), the number of specific non-singleton (red) shared elements is very small, with most of the variation being either shared across ‘races’ or restricted to individual (singleton) diversity. What I share with another European is not much more than what I share with an African or an Asian.
When you go to more specific ethnic differences (B), the ‘stock’ shared bloc becomes even smaller.
Interestingly, there is no particular intensity of sharing between Europeans and East Asians, each of them sharing more with Africans than with each other. This probably implies that the early Eurasian population that produced both was never truly differentiated from that of Africa.
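
The three categories discussed for Fig. 4 (shared across groups, specific to one group but not singleton, and singleton) can be made concrete with a small sketch. This is not the paper's pipeline: the CNV identifiers, individuals and group labels are invented toy data, and `classify_cnvs` is a hypothetical helper that assumes each CNV comes with the list of individuals carrying it.

```python
from collections import Counter

def classify_cnvs(carriers):
    """Label each CNV as 'singleton', 'group-specific' or 'shared'.

    carriers maps a CNV id to a list of (individual, group) pairs.
    CNVs with no carriers are skipped.
    """
    labels = {}
    for cnv, people in carriers.items():
        if not people:
            continue
        groups = {group for _, group in people}
        if len(people) == 1:
            labels[cnv] = "singleton"        # seen in one individual only
        elif len(groups) == 1:
            labels[cnv] = "group-specific"   # several carriers, all in one group
        else:
            labels[cnv] = "shared"           # carriers in two or more groups
    return labels

# Toy data (hypothetical, for illustration only).
toy = {
    "cnv1": [("a1", "African"), ("e1", "European"), ("s1", "Asian")],
    "cnv2": [("e1", "European"), ("e2", "European")],
    "cnv3": [("s2", "Asian")],
    "cnv4": [("a2", "African"), ("s1", "Asian")],
}

labels = classify_cnvs(toy)
counts = Counter(labels.values())
print(labels)
print(counts)
```

In the figure, it is the 'group-specific' class (red) that turns out to be very small compared with the other two.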

PhyloTree Build #12

Just a quick heads up to report that the hyper-useful online mtDNA phylogenetic reference PhyloTree has been updated for the 12th time.
What is new? From the site:
Defining mutations updated: C4e, Z1a2, M12, M12a, M12b, M46, M50, D4a3a, I1, N9a6, N9a6a, A2f, A2f1, R0a, HV1, HV1b1, T2a1a1, R9c, B4a1a1a4, B4g, U5b1c2.
Newly added: L2a5, L2b’c’d, C1d1a, C1d1a1, M3c2, M12a1, M12a1a, M12a1b, M12a2, M18a, M26, M51a1a, M71c, D4h1b, W5a1a, W5a1a1, W5a1a1a, N8, R0a’b, R0b, HV1a’b’c, HV1a1b, HV1a2a, HV1a3, HV1b1a, HV1b1b, HV1d, HV4b, HV12a, HV12b, HV13, H3h, H1j1, T2a1a3, R9c1, F1a1c1, B5a1c, B5a1d, U5b1g, U5b2c2.
Multiple rearrangements/additions within: M7a2.
Relabeled: L2a1’2 -> L2a1-4, P8 <-> P9, K1ö -> K1f.
Acknowledgments: Norman Angerhofer, the EMPOP team.

Posted on July 21, 2011 in global genetics, mtDNA


Selective sweeps lacking in human evolutionary history

There is a (quite mainstream) school in Genetics that just loves the idea of small genetic changes being so extremely adaptive that they quickly replace all or most other variants in frequency, reaching fixation not by founder effect or drift but because of sheer adaptive power.
The followers of this school are plainly wrong. It was common sense before (I really never liked the idea at all: it’s plain silly, overly simplistic, sensationalist, irrational) but now it has been reasonably demonstrated: selective sweeps are extremely rare in the human genome, while small, gradual, less important shifts, maybe in dynamic equilibrium… are the norm instead.
Full story at Science Daily.

Is X-DNA lineage Neanderthal?

Following the recent discoveries of Neanderthal and other archaic admixture in modern humans, the almost forgotten matter of an X-linked haplotype that looked very much non-African has occasionally come up in debates. However, Africa had only been poorly sampled back then, and we lacked any confirmation of whether this haplotype was present in the Neanderthal gene pool.
According to a new paper this X-DNA haplotype, known as B006, fits well with Neanderthal genetic data and is therefore a likely case of Neanderthal lineage among us.

Frequency of B006 worldwide

It was previously known that the lineage was most common among Europeans and Native Americans, though maybe most diverse in Central-NE Asia. Almost nothing was known then about its presence among South Asians or Australian Aborigines, and what little was known of its African scatter (mostly among Burkinabe peoples, with one Ethiopian individual as well) has been confirmed, it seems.

But most importantly, B006 shares both of the mutations that diverge from the ancestral (chimpanzee) form among the sites analyzed in the Neanderthal genome. One is also found in other modern human haplotypes, but the other is unique to B006 and the Neanderthal lineage:

In the available Neandertal sequence (Green et al. 2010), there is information on 20 out of 35 dys44 polymorphic sites. These represent eighteen ancestral and two derived alleles, fully matching the corresponding sites of B006 (Table 1). One of the derived alleles, C of  rs6631517, is also shared with other dys44 haplotypes, whereas the second one, G of  rs11795471, is unique to B006 (the information on two remaining B006-polymorphisms is not available).
While not yet fully demonstrated (further B006 haplotypes should be considered, alongside further Neanderthal ones), the idea that this lineage is a Neanderthal one, until now purely speculative, clearly gains strength with this paper’s work.

Selection in looks?

A new research paper finds that many genes involved in appearance have rather clear indications of positive selection.
From the Abstract:


Here, we study the level of population differentiation among different populations of human genes. Intriguingly, genes involved in osteoblast development were identified as being enriched with higher FST SNPs, a result consistent with the proposed role of the skeletal system in accounting for variation among human populations. Genes involved in the development of hair follicles, where hair is produced, were also found to have higher levels of population differentiation, consistent with hair morphology being a distinctive trait among human populations. Other genes that showed higher levels of population differentiation include those involved in pigmentation, spermatid, nervous system and organ development, and some metabolic pathways, but few involved with the immune system. Disease-related genes demonstrate excessive SNPs with lower levels of population differentiation, probably due to purifying selection. Surprisingly, we find that Mendelian-disease genes appear to have a significant excessive of SNPs with high levels of population differentiation, possibly because the incidence and susceptibility of these diseases show differences among populations. As expected, microRNA regulated genes show lower levels of population differentiation due to purifying selection. 
While the paper does not seem to suggest why these patterns of inter-population differentiation are strongest mostly in the genes of appearance, I’d say that the selection involved is directly social and sexual, and largely the same pattern by which races appeared and are maintained (to some extent). Intuitively, people tend to favor, statistically, those who look like themselves or like people they know and have a good opinion of. That way certain looks or a family air are persistently favored within each population (the son who looks more like the father, the person who looks sexier according to certain, largely socio-cultural, parameters, etc.), eventually producing what we call races, which are more apparent than the actual underlying differences between populations, invariably much smaller than they look.
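
For readers unfamiliar with the FST statistic the abstract relies on, here is a minimal sketch of the textbook definition for a single biallelic SNP, FST = (H_T - H_S) / H_T, where H_T is the expected heterozygosity of the pooled populations and H_S the mean within-population one. It assumes equally weighted populations and is not necessarily the estimator the paper used; the function name `fst` and the allele frequencies are illustrative only.

```python
def fst(freqs):
    """Wright's FST for one biallelic SNP.

    freqs: frequency of the same allele in each population.
    Assumes equally weighted populations (a simplification).
    """
    if not freqs:
        raise ValueError("need at least one population")
    p_bar = sum(freqs) / len(freqs)
    # Expected heterozygosity if all populations were pooled.
    h_total = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within each population.
    h_within = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    if h_total == 0:
        return 0.0  # allele fixed or absent everywhere: nothing to differentiate
    return (h_total - h_within) / h_total

# Invented frequencies in three (or two) populations.
print(fst([0.5, 0.5, 0.5]))  # identical frequencies: no differentiation
print(fst([0.1, 0.5, 0.9]))  # strongly different frequencies
print(fst([0.0, 1.0]))       # alternative alleles fixed in each population
```

A SNP with "high FST" in the abstract's sense is one whose frequencies resemble the second or third case rather than the first.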

Malaria and sickle cell anemia do not correlate well in Asia

There is an interesting new article that confirms by statistical means the long-held theory that sickle cell disease exists in a dynamic equilibrium with the deadly disease of malaria, at least in general.
What I find interesting is not so much the confirmation of a correlation in Africa and Europe but the lack of confirmation in Asia and Oceania, where malaria exists endemically too and, unlike the case of America, the discontinuity cannot be explained by loss of the sickle cell allele in the arctic latitudes (Beringia). In fact the peculiarities of the geographical distribution of sickle cell disease versus endemic malaria are very intriguing:

Fig. 1

I understand that, in the case of Africa, the correlation is very strong, though not totally linear. For instance the HbS (sickle cell) allele is lacking in wide areas of malaria endemism: The Horn, Southern Africa (including Bantu Mozambique but not Austronesian Madagascar), NW Africa…

So somehow these populations have managed to get rid of sickle cell anemia in spite of being exposed to malaria in hyperendemic form. This is pretty interesting in itself, especially as it involves Bantu populations that are supposed to have expanded recently.
In Europe we have two situations: Greece and Albania, where sickle cell exists at moderate frequencies, and the rest, where it is unheard of. While Greece is one of the malaria-endemic areas, there are others where the parasite was only eradicated in the 20th century, Italy and Iberia especially. However, no sickle cell exists there. Statistics apart, this case is very suggestive of a link with the spread of Y-DNA E1b1b1a2 (V13), a haplogroup that is most concentrated in the SW Balkans.
But most intriguing of all may be Asia. The presence of the allele in West Asia makes good sense because of the region's prehistoric (and maybe also historical) interactions with Africa, so the HbS allele had every opportunity to arrive.
But what about further East? We know of no particular “recent” gene flow into South Asia that could explain the relatively high frequencies found in India, mostly in areas of lesser West Eurasian influence. Connections with Africa are even less likely (some small populations of recent African origin do exist, but they cannot possibly have influenced the majority in such a dramatic way in such a short time).
It could of course be something that arrived with the migration out of Africa; yet, on the other hand, further East there is no trace of the HbS allele, even though malaria is found again in hyperendemic form in all of SE Asia and Near Melanesia. Why? No idea. While the sampling is low density, it does seem that most relevant areas have been tested (map a), including Melanesia, Australian Aborigines, Indonesia and even the Andaman Islands.
So why these irregularities? Which founder effects are implied? One can imagine that the northernmost historical malaria areas were free from the parasite in the Ice Age, but surely this was not the case in the Asian tropical belt. Did some sort of lifestyle, like choosing to dwell where sea breezes kept mosquitoes at bay (a usual common-sense practice on tropical coasts, as far as I know), help Eastern Eurasian founders avoid this disease, making the allele unnecessary? Or was the HbS allele instead “stored” (by founder effect) in some South Asian demographic reservoir (and only there), gradually expanding to some nearby malaria-affected areas but, by mere chance, never taking part in the colonization of the Far East?
What about the West, where the HbS allele is also mostly absent? Here at least one could argue that malaria was probably absent in the Pleistocene, so the HbS allele was logically selected against (even heterozygous types have some lesser disadvantages if malaria is not present to compensate for them).
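
The "dynamic equilibrium" idea, and the expectation that HbS is selected against where malaria is absent, can be illustrated with the standard one-locus selection model. This is only a sketch: it assumes random mating and a very large population (so no drift or founder effects, which are precisely what the questions above are about), and the fitness values are made up for illustration, not estimates from the article. The names `next_freq`, `equilibrium` and `simulate` are hypothetical.

```python
def next_freq(q, w_aa, w_as, w_ss):
    """One generation of selection on the HbS allele frequency q.

    w_aa, w_as, w_ss: relative fitness of the AA, AS and SS genotypes,
    assuming random mating and an effectively infinite population.
    """
    p = 1.0 - q
    mean_w = p * p * w_aa + 2 * p * q * w_as + q * q * w_ss
    return (p * q * w_as + q * q * w_ss) / mean_w

def equilibrium(w_aa, w_as, w_ss):
    """Stable HbS frequency when the heterozygote is the fittest genotype."""
    s = w_as - w_aa  # cost of AA relative to AS (malaria)
    t = w_as - w_ss  # cost of SS relative to AS (sickle cell disease)
    if s <= 0 or t <= 0:
        raise ValueError("no heterozygote advantage: no stable internal equilibrium")
    return s / (s + t)

def simulate(q0, generations, w_aa, w_as, w_ss):
    """Iterate next_freq for a number of generations, starting from q0."""
    q = q0
    for _ in range(generations):
        q = next_freq(q, w_aa, w_as, w_ss)
    return q

# Illustrative fitness values only.
# With malaria: carriers (AS) do best, so HbS settles at an intermediate frequency.
with_malaria = simulate(0.01, 1000, w_aa=0.9, w_as=1.0, w_ss=0.2)
# Without malaria: carriers pay a small cost, so HbS is gradually lost.
without_malaria = simulate(0.10, 1000, w_aa=1.0, w_as=0.99, w_ss=0.2)

print(with_malaria, equilibrium(0.9, 1.0, 0.2))
print(without_malaria)
```

Under these assumptions the allele is maintained only while malaria keeps the heterozygote at an advantage; what the model cannot say is whether the allele was present among a region's founders in the first place.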
It is a single SNP (rs334) which encodes all this matter, so it is not possible to trace it phylogenetically in any way.
Intriguing in any case.

Posted on November 3, 2010 in global genetics, health, population genetics