1)
The neutral theory never predicts that sequences under purifying selection are
equally suitable for building phylogenetic trees, even if just for the topology
part of it. The key concept of the neutral theory is that most observed natural
variants are not under purifying selection.
Most are neutral and some (very few) are beneficial. The new data on the Y
chromosome invalidate the neutral theory. And
if the neutral theory is invalid, all molecular trees today would have no sound
theoretical basis. In fact, in our view,
the neutral theory was mistaken right from the start, when it misinterpreted
the genetic equidistance result that got the field started. That result was the best evidence for
purifying selection and the absence of junk/neutral DNA.
2)
Common purifying selection would lead to common shared sequences, which would
dramatically affect topology. (It may be easier to imagine shared
sequences arising from positive selection, but the same reasoning applies just
as readily to purifying selection.) Thus, the
close similarity between human and chimpanzee in fast evolving sequences,
including the Y, which are all under purifying selection, merely indicates
common purifying selection rather than common ancestry. Our recent paper in
Science China shows that when using slow evolving sequences not under
selection, chimp and human can be shown to belong to separate clades, with all three
great apes in the pongid clade.
3)
If population A has high genetic diversity
while B has low diversity in most genome sequences, the typical interpretation today is that
A has evolved for longer than B and gave rise to B.
But this topology could be completely reversed if most sequences are
under purifying selection, with A under more relaxed selection than B. Here the true topology, as revealed by the
slow evolving sequences, may show that B evolved first and has higher genetic diversity
in the informative sequences. We will
soon have a paper to this effect.
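The reversal described in point 3 can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that diversity grows linearly with a population's age until it hits a ceiling set by how relaxed selection is. The function name, ages, rates, and ceilings are all invented for the example and are not data from any study.

```python
# Toy illustration only: every number below is a hypothetical assumption.

def diversity(age, rate, ceiling):
    """Diversity accumulates as age * rate but cannot exceed the ceiling
    that selection allows for this population."""
    return min(age * rate, ceiling)

# Hypothetical populations: B is older, A is under more relaxed selection
# (modelled here as a higher ceiling on tolerated diversity).
populations = {
    "A": {"age": 50, "ceiling": 200},
    "B": {"age": 100, "ceiling": 100},
}

FAST_RATE = 20.0  # assumed rate for fast evolving sequences
SLOW_RATE = 0.5   # assumed rate for slow evolving sequences

def more_diverse(rate):
    """Return the name of the population with the higher diversity at this rate."""
    return max(
        populations,
        key=lambda name: diversity(
            populations[name]["age"], rate, populations[name]["ceiling"]
        ),
    )

print("fast evolving sequences:", more_diverse(FAST_RATE))
print("slow evolving sequences:", more_diverse(SLOW_RATE))
```

Under these assumed numbers, both populations have hit their ceilings in the fast sequences, so the ranking there reflects only how relaxed selection is, while the ranking in the slow sequences still reflects age.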
4)
Nothing is truly neutral. All variants,
being random and disorderly in origin, have a deleterious aspect. A major effect variant causing great harm never
has a chance to behave as neutral and is negatively selected immediately within
one generation after it emerges. In
contrast, most minor effect variants would exist as neutral for a long time, or
many generations, before being negatively selected once the accumulation of
such variants exceeds the maximum level that an organism can tolerate. Simple organisms can tolerate more. Therefore, only slow evolving sequences that
have variant numbers still below the maximum tolerable level are informative to
tree topology as well as timing. That
the more slowly evolving, and hence more conserved, sequences are the ones whose
observed variants appear neutral may seem counter-intuitive, but it actually makes sense. Since
changes in the slow evolving sequences take a long time, they may be too slow to
meet adaptive needs and so are unlikely to be under positive selection. Given the slow rate and the absence of
positive selection, they are also unlikely to reach excess levels that cause harm
or to come under negative selection.
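The saturation argument in point 4 can be sketched with a toy model. It assumes, as a simplification that is not stated in this form above, that variants accumulate linearly with time until they reach a fixed maximum tolerable level. The rates, the ceiling, and the function name are hypothetical values chosen only to make the behaviour visible.

```python
# Toy model only: linear accumulation with a hard ceiling is an assumption
# made for illustration, and all parameter values are invented.

MAX_TOLERABLE = 100  # hypothetical maximum tolerable number of variants
FAST_RATE = 20.0     # hypothetical variants per unit time, fast evolving sequences
SLOW_RATE = 0.5      # hypothetical variants per unit time, slow evolving sequences

def observed_variants(rate, time, ceiling=MAX_TOLERABLE):
    """Variants accumulate as rate * time until the ceiling is reached;
    beyond that point the excess is removed by negative selection."""
    return min(rate * time, ceiling)

divergence_times = [10, 50, 100]

fast = [observed_variants(FAST_RATE, t) for t in divergence_times]
slow = [observed_variants(SLOW_RATE, t) for t in divergence_times]

for t, f, s in zip(divergence_times, fast, slow):
    print(f"time={t:>3}  fast={f:>6}  slow={s:>6}")
```

With these assumed values the fast evolving sequences sit at the ceiling for every divergence time, so their variant counts no longer distinguish one time from another, whereas the slow evolving sequences are still below the ceiling and still scale with time.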
5)
Nearly all the ‘surprise’ results reported at the ASHG2012 meeting can be
easily explained by purifying selection and the MGD. The Iceman Ötzi, from ~5000 years ago, was found
to resemble Sardinians rather than present-day Central Europeans in most fast evolving
sequences, which was considered surprising. Also, the Iceman is related to other Central
European farmers (but not hunter-gatherers) from 5000 years ago. Even more surprising is that the mtDNA of
the Iceman does not resemble that of any humans today.
Well, all these are evidence for the MGD. The sequences under purifying selection 5000
years ago are of course expected to be very different from those of today. My graduate students are right now busy testing
the prediction that the Iceman will be inseparable from present-day Central Europeans in slow
evolving sequences. Another surprise mentioned at the meeting was that modern humans leaving Africa are estimated to have arrived in Spain (very close to Africa) and in Australia (very far away) at about the same time, ~45,000 years ago. In truth, these timings are based on genetic diversity levels from fast evolving sequences, which by the argument in point 4 are not informative for timing.
6)
The fact that the Y chr trees are in general agreement with the mtDNA
trees and the genome average trees merely indicates that all these sequences share
something in common in terms of relevance to phylogeny. They may be equally informative or equally
non-informative to phylogeny. If any one
of these sequences is shown to be non-informative, it would mean the same for
all of them. Thus the fact that the Y chr is under strong purifying
selection means the same for mtDNA and the genome average.