Wednesday, April 22, 2015

Ominous news for the neutral theory nearly every week now

Ominous news for the neutral theory arrives nearly every week now: a Nature paper published yesterday found an endogenous retrovirus (ERV) to be functional. Last week we published a paper in Genomics providing experimental evidence for essentially no neutral SNPs, "Collective effects of SNPs on transgenerational inheritance in Caenorhabditis elegans and budding yeast", which adds to the evidence for the conclusion we published last year in "Scoring the collective effects of SNPs: associations of minor alleles with complex traits in model organisms".

Human endogenous retrovirus (HERV) proviruses comprise a significant part of the human genome, with approximately 98,000 ERV elements and fragments making up nearly 8%. One family, termed HERV-K (HML2), makes up less than 1% of HERV elements but is one of the most studied. 

The paper found HERV-K to be fully functional. By common-sense inference, the whole ERV class should also be functional; finding those functions just needs time and effort. This inference about ERV sequences is exactly like our assumption that all proteins are functional: the functions of probably ~80% of human proteins remain unknown, yet no one doubts that they have a function, because we do know that some proteins have functions. So, if one type of ERV, which happens to be the most studied, has functions, should it not be the null hypothesis that all ERVs have functions?

The popgen and molecular evolution field today, mostly made up of people who rarely do any bench work on DNA functions, still considers ~90% of the human genome to be neutral junk. But how interesting and dramatic: a big chunk of this junk was turned into gold overnight by one paper!! More interesting and dramatic findings of the same kind are sure to come over and over again within the next two years, until all popgen researchers abandon their neutral bandwagon and join their bench colleagues, nearly all of whom have long been on the functional train.

Abstract of the paper:

Intrinsic retroviral reactivation in human preimplantation embryos and pluripotent cells
• Edward J. Grow, et al
Endogenous retroviruses (ERVs) are remnants of ancient retroviral infections, and comprise nearly 8% of the human genome. The most recently acquired human ERV is HERVK(HML-2), which repeatedly infected the primate lineage both before and after the divergence of the human and chimpanzee common ancestor. Unlike most other human ERVs, HERVK retained multiple copies of intact open reading frames encoding retroviral proteins. However, HERVK is transcriptionally silenced by the host, with the exception of in certain pathological contexts such as germ-cell tumours, melanoma or human immunodeficiency virus (HIV) infection. Here we demonstrate that DNA hypomethylation at long terminal repeat elements representing the most recent genomic integrations, together with transactivation by OCT4 (also known as POU5F1), synergistically facilitate HERVK expression. Consequently, HERVK is transcribed during normal human embryogenesis, beginning with embryonic genome activation at the eight-cell stage, continuing through the emergence of epiblast cells in preimplantation blastocysts, and ceasing during human embryonic stem cell derivation from blastocyst outgrowths. Remarkably, we detected HERVK viral-like particles and Gag proteins in human blastocysts, indicating that early human development proceeds in the presence of retroviral products. We further show that overexpression of one such product, the HERVK accessory protein Rec, in a pluripotent cell line is sufficient to increase IFITM1 levels on the cell surface and inhibit viral infection, suggesting at least one mechanism through which HERVK can induce viral restriction pathways in early embryonic cells. Moreover, Rec directly binds a subset of cellular RNAs and modulates their ribosome occupancy, indicating that complex interactions between retroviral proteins and host factors can fine-tune pathways of early human development.

Thursday, March 12, 2015

DNA mutation clock proves tough to set, of course fully expected by us

As reported in the latest issue of Nature (DNA mutation clock proves tough to set), the dates calculated so far for the Out of Africa model are really a joke. A key player in the field, David Reich, is quoted: “The fact that the clock is so uncertain is very problematic for us,” he says. “It means that the dates we get out of genetics are really quite embarrassingly bad and uncertain.”

The author says: "A slower molecular clock worked well to harmonize genetic and archaeological estimates for dates of key events in human evolution, such as migrations out of Africa and around the rest of the world. But calculations using the slow clock gave nonsensical results when extended further back in time — positing, for example, that the most recent common ancestor of apes and monkeys could have encountered dinosaurs."

Of course, we have said repeatedly in numerous papers since 2008 that the mutation rate should not be calculated from genetic distances that are really maximum distances.

Again, without a real understanding, or with a mistaken understanding, of the first result in molecular evolution, the genetic equidistance result, the field really has no clue about what it is doing.

Monday, October 27, 2014

Why the surprising pattern of no genetic continuity between people living in the same area but from different periods of time? Think the flu virus!

I used the three slides shown below to illustrate the idea of informative DNAs in my talk at last month’s workshop on genome and evolution in Naples, Italy.

The antigenic sites in human influenza A virus mutate and turn over quickly, which is critical for the virus's survival or escape from human neutralizing antibodies and hence responsible for flu epidemics. As shown in Figure 1, two amino acid positions in hemagglutinin (156 and 145, panels a and b) turned over several times within a 30-year period, while two others (138 and 194, panels c and d) stayed largely unchanged (figure from Shih et al., 2007).

The flu results illustrate two important points with regard to evolutionary dynamics of a genome that have so far been grossly overlooked by the evolution and popgen field. First, fast evolving or less conserved DNAs are also functional rather than neutral as they are essential for quick adaptive needs in response to fast changing environments. Second, fast evolving DNAs turn over quickly and can be shown to violate the infinite sites model.  Hence, they cannot be used for phylogenetic inference. If one uses the fast changing sites in a flu virus to infer the phylogenetic relationship of the virus isolates responsible for different epidemics in a past period of say 10 years, one would reach the absurd conclusion that each epidemic was caused by a distinct type of flu virus with no genetic continuity among them rather than just minor variations of the same type.
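The second point can be illustrated with a minimal sketch. It assumes the standard Jukes-Cantor substitution model and two purely hypothetical per-site rates; neither the model nor the numbers come from the analyses described here, and the function and constant names are invented for illustration. Under the infinite sites model, distance keeps growing in proportion to time, whereas with repeated hits at the same site the observable distance levels off near a maximum of 0.75, so fast sites carry almost no information about how long ago two lineages split.

```python
import math

# Hypothetical per-site substitution rates (per year), chosen only for illustration.
FAST_RATE = 1 / 5_000   # a site that turns over every few thousand years
SLOW_RATE = 1e-9        # a slowly changing site


def infinite_sites_distance(rate, years):
    """Distance if every mutation hit a new site: grows linearly with time."""
    return 2.0 * rate * years  # both lineages accumulate changes


def observed_distance(rate, years):
    """Expected fraction of differing sites under Jukes-Cantor (assumed model).

    Repeated and back mutations at the same site hide earlier changes,
    so the value approaches a ceiling of 0.75 instead of growing forever.
    """
    d = 2.0 * rate * years
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


if __name__ == "__main__":
    for label, rate in (("slow", SLOW_RATE), ("fast", FAST_RATE)):
        for years in (10_000, 45_000):
            print(
                f"{label} sites, {years} years: "
                f"infinite-sites={infinite_sites_distance(rate, years):.6g}, "
                f"observed={observed_distance(rate, years):.6g}"
            )
```

With these assumed rates, the slow sites still show a distance roughly proportional to elapsed time, while the fast sites give nearly the same value at 10,000 and 45,000 years.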

Mutation rates in humans are of course much lower than those in a flu virus. But just as in a flu virus, there are both fast and slow changing sites (Figure 2). The time scales are different but the principle is the same. The fast changing sites may turn over every few thousand years and in fact make up the majority of the observed variant sites in humans when properly examined by us (Figure 3). This is why the field of ancient DNA keeps producing the absurd pattern of no genetic continuity between people living in the same area but in different periods of time. All of the published analyses have simply used the wrong sites, ones equivalent to the fast changing antigenic sites in a flu virus. What one should be using are sites with very slow mutation rates, like 1 mutation every 50,000 years. We have been busy reinterpreting the published DNAs for several years now and hope to submit our work soon.

Figure 1. (a and b) Frequency changes at residue sites 156 (a) and 145 (b) were highly dynamic. (c and d) Sites 138 (c) and 194 (d) did not undergo major frequency change over time.

Figure 2. A priori model of evolutionary dynamics of human genomic DNAs.

Figure 3. Difference between slow and fast evolving sites. Shown is a piece of homologous DNA in three different individuals or species. In the fast evolving DNAs making up the vast majority of the human genome, there is obvious and verifiable violation of the infinite sites model. These DNAs have abundant overlapping mutant sites, where independent mutations have occurred at the same site in different individuals or species.


Shih, C-C., Hsiao, T-C., Ho, M-S., and Li, W-H. (2007) Simultaneous amino acid substitutions at antigenic sites drive influenza A hemagglutinin evolution. Proc Natl Acad Sci U S A. 104:6283-6288.

Thursday, October 23, 2014

Surprises from the 45,000 year old Siberian Ust'-Ishim: why is he not closer to Africans than East Asians are?

The genome of the 45,000 year old Siberian Ust'-Ishim published yesterday in Nature (see John Hawks's blog) again repeated the same absurd pattern of no genetic continuity between local people living in different periods of time. The Ust'-Ishim genome is no more related to the 24,000 year old Siberian MA1 than to living East Asians. But this kind of surprise is getting boring for me to mention in this blog. (John Hawks said this in his blog: "This is not an isolated case, it is another example of what we see throughout the world: Ancient people represented by DNA that seem to have very little to do with the people who live in the same areas today. We're not finding the ancestors of living populations so much as we are finding branches of populations we did not know existed.")

A new kind of surprise is the failure to do all necessary studies or to present all relevant studies. One would expect the Ust'-Ishim genome to be almost 2-fold less distant to living Africans than East Asians are, because he had 45,000 years less time to accumulate distance, as shown in Figure 1A. But the paper made no mention of this key expectation from the Out of Africa model.
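The arithmetic behind this expectation can be sketched as follows. This is only an illustration under a strict molecular clock; the split time between the African and non-African lineages is not stated in the text, so the value used below is a hypothetical placeholder, and the function name is invented. The resulting ratio depends strongly on that assumed split time.

```python
def expected_distance_ratio(split_years, sample_age_years):
    """Ratio of (ancient sample vs. living Africans) distance to
    (living East Asians vs. living Africans) distance under a strict clock.

    Two living lineages have each accumulated changes for `split_years`.
    An ancient sample stopped accumulating changes `sample_age_years` ago,
    so its branch is shorter by exactly that amount.
    """
    if not 0 <= sample_age_years <= split_years:
        raise ValueError("sample age must lie between 0 and the split time")
    modern_pair = 2 * split_years
    ancient_pair = 2 * split_years - sample_age_years
    return ancient_pair / modern_pair


if __name__ == "__main__":
    # Hypothetical split time of 50,000 years; 45,000 is the sample age from the text.
    print(expected_distance_ratio(50_000, 45_000))
```

A split time close to the age of the sample gives a ratio approaching one half, which is the "almost 2-fold" case; an older assumed split time gives a ratio closer to 1.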

It also makes no sense for Ust'-Ishim to be an outlier to living East Asians on a PCA plot (Figure 2), since the distance between Ust'-Ishim and East Asians should be almost 2-fold smaller than that between certain pairs of East Asians, again because Ust'-Ishim had 45,000 years less time to accumulate mutations/distance (Figure 1A).

Our results with the 1000 genomes data showed that East Asians CHS and Europeans GBR are equidistant to Africans LWK or YRI in fast evolving SNPs representing genome average (Figure 1B). This of course has nothing to do with mutation rate and time but represents maximum genetic distance and natural selection. We are going to soon analyse the Ust'-Ishim genome in the same way and we fully expect Ust'-Ishim to be equidistant or more distant to Africans than East Asians are, which would be the same pattern as our first blog post here in 2007 had shown for the Neanderthals. Now such a result would be truly inconvenient for the Out of Africa model, which is probably why it was left out in the paper. 

Figure 1

Figure 2

Thursday, September 11, 2014

More ancient DNA surprises from ASHG 2014 abstracts

Two interesting ancient DNA abstracts from the ASHG 2014 meeting.  Just like my last post here, the surprise is again (and again and again.....again....) that there is no genetic continuity between local people living today and those locals in the past, or between local people living in different periods in the past. 

Capture of 390,000 SNPs in dozens of ancient central Europeans reveals a population turnover in Europe thousands of years after the advent of farming. I. Lazaridis, W. Haak, N. Patterson, N. Rohland, S. Mallick, B. Llamas, S. Nordenfelt, E. Harney, A. Cooper, K. W. Alt, D. Reich.
   To understand the population transformations that took place in Europe since the early Neolithic, we used a DNA capture technique to obtain reads covering ~390 thousand single nucleotide polymorphisms (SNPs) from a number of different archaeological cultures of central Europe (Germany and Hungary). The samples spanned the time period from 7,500 BP to 3,500 BP (Early Neolithic to Early Bronze Age periods) and most of them were previously studied using mtDNA (Brandt, Haak et al., Science, 2013). The captured SNPs include about 360,000 SNPs from the Affymetrix Human Origins Array that were discovered in African individuals, as well as about 30,000 SNPs chosen for other reasons (that are thought to have been affected by natural selection, or to have phenotypic effects, or are useful in determining Y-chromosome haplogroups). By analyzing this data together with a dataset of 2,345 present-day humans and other published ancient genomes, we show that late Neolithic inhabitants of central Europe belonging to the Corded Ware culture were not a continuation of the earlier occupants of the region. Our results highlight the importance of migration and major population turnover in Europe long after the arrival of farming. * Contributed equally to this work.

Insights into British and European population history from ancient DNA sequencing of Iron Age and Anglo-Saxon samples from Hinxton, England. S. Schiffels, W. Haak, B. Llamas, E. Popescu, L. Loe, R. Clarke, A. Lyons, P. Paajanen, D. Sayer, R. Mortimer, C. Tyler-Smith, A. Cooper, R. Durbin.
   British population history is shaped by a complex series of repeated immigration periods and associated changes in population structure. It is an open question however, to what extent each of these changes is reflected in the genetic ancestry of the current British population. Here we use ancient DNA sequencing to help address that question. We present whole genome sequences generated from five individuals that were found in archaeological excavations at the Wellcome Trust Genome Campus near Cambridge (UK), two of which are dated to around 2,000 years before present (Iron Age), and three to around 1,300 years before present (Anglo-Saxon period). Good preservation status allowed us to generate one high coverage sequence (12x) from an Iron Age individual, and four low coverage sequences (1x-4x) from the other samples.   By providing the first ancient whole genome sequences from Britain, we get a unique picture of the ancestral populations in Britain before and after the Anglo-Saxon immigrations. We use modern genetic reference panels such as the 1000 Genomes Project to examine the relationship of these ancient samples with present day population genetic data. Results from principal component analysis suggest that all samples fall consistently within the broader Northern European context, which is also consistent with mtDNA haplogroups. In addition, we obtain a finer structural genetic classification from rare genetic variants and haplotype based methods such as FineStructure. Reflecting more recent genetic ancestry, results from these methods suggest significant differences between the Iron Age and the Anglo-Saxon period samples when compared to other European samples. We find in particular that while the Anglo-Saxon samples resemble more closely the modern British population than the earlier samples, the Iron Age samples share more low frequency variation than the later ones with present day samples from southern Europe, in particular Spain (1000GP IBS). 
In addition the Anglo-Saxon period samples appear to share a stronger older component with Finnish (1000GP FIN) individuals. Our findings help characterize the ancestral European populations involved in major European migration movements into Britain in the last 2,000 years and thus provide more insights into the genetic history of people in northern Europe.

Friday, August 29, 2014

Another ancient DNA surprise: history of the New World Arctic people

It has been widely and repeatedly noticed that every ancient DNA research result has been a great surprise, starting from our 2008 paper or the first post on this blog back in 2007. The latest is a Science paper published today, "The genetic prehistory of the New World Arctic". The surprises here are (1) again (and again and again.....again....) that there is no genetic continuity between local people living today and those locals in the past (>2000 years old); again and again ... replacement rather than regional continuity, following exactly the footsteps of the Out of Africa model superseding the Multiregional model; (2) no sex between people who lived side by side: "Elsewhere, as soon as people meet each other, they have sex," says Willerslev. "Even potentially different species like Neanderthals [and modern humans] had sex, so this finding is extremely surprising."; (3) extremely low genetic diversity in mtDNA in ancient Paleo-Eskimos. "I can't remember any other group having such low diversity," says Willerslev. For the quotes by Willerslev, see this news piece.

Well, just as we said in our post on the 400K year old Heidelbergensis DNA, it would be a complete surprise if the field of ancient DNA as it is presently practiced could produce any sensible and non-surprising result consistent with common sense and with fossil and cultural records. When you use noninformative DNAs to do your analytic work, what can you expect other than meaningless trash?

Of course we are working hard to reinterpret these newly published DNA sequences, and we should soon publish our results (constantly delayed by newly released DNAs needing reinterpretation), which should be a very pleasant and intellectually satisfying surprise to all. For example, as our new analysis shows, the iceman Otzi was indeed most closely related to the local living Italians, as common sense would expect, rather than to the remote island people, the Sardinians, as is now mistakenly concluded in the literature.

Monday, August 18, 2014

Secrets of the creative brain

There was a good recent article on creativity, Secrets of the creative brain. There is also a related blog post, The Psychopathology of Genius.

We are actively working on the genetic basis of complex traits, and the most complex are obviously creativity and intelligence. According to the threshold theory, creativity is not IQ alone, and an IQ score of 120 is the threshold; a score lower or much higher than that may hurt creativity.

Some quotes from Secrets of the creative brain:

One possible contributory factor is a personality style shared by many of my creative subjects. These subjects are adventuresome and exploratory. They take risks. Particularly in science, the best work tends to occur in new frontiers. (As a popular saying among scientists goes: “When you work at the cutting edge, you are likely to bleed.”) 

I’ve been struck by how many of these people refer to their most creative ideas as “obvious.” Since these ideas are almost always the opposite of obvious to other people, creative luminaries can face doubt and resistance when advocating for them. As one artist told me, “The funny thing about [one’s own] talent is that you are blind to it. You just can’t see what it is when you have it … When you have talent and see things in a particular way, you are amazed that other people can’t see it.” Persisting in the face of doubt or rejection, for artists or for scientists, can be a lonely path—one that may also partially explain why some of these people experience mental illness.

One interesting paradox that has emerged during conversations with subjects about their creative processes is that, though many of them suffer from mood and anxiety disorders, they associate their gifts with strong feelings of joy and excitement. “Doing good science is simply the most pleasurable thing anyone can do,” one scientist told me. “It is like having good sex. It excites you all over and makes you feel as if you are all-powerful and complete.” This is reminiscent of what creative geniuses throughout history have said. 

Many creative people are autodidacts. 

Many creative people are polymaths, as historic geniuses including Michelangelo and Leonardo da Vinci were.

Creative people tend to be very persistent, even when confronted with skepticism or rejection. 

Some people see things others cannot, and they are right, and we call them creative geniuses. Some people see things others cannot, and they are wrong, and we call them mentally ill. And some people, like John Nash, are both.