Showing posts with label chimpanzee. Show all posts

Wednesday, May 6, 2009

The last straw for Molecular clock

I sent the following email yesterday to a small group of colleagues.

Dear colleagues with an interest in human evolution (students, postdocs, and professors):

I thought you might find the following two short comments informative and interesting (both are posted on the Internet).

1. Molecular clock at best explains half the story on ‘genetic equidistance’ and at worst explains none.

http://thegoldengnomon.blogspot.com/2009_04_01_archive.html

http://precedings.nature.com/documents/1751/version/2

Summary:

The molecular clock is widely known to be problematic but has yet to be put to rest by a knockout punch.  Here, a newly appreciated feature of the original result that provoked the clock, and that remains the only ‘evidence’ for it, is used to ring the death knell for the clock.  The clock is a completely mistaken explanation for the equidistance result (sister species are equidistant to a simpler outgroup) first found by Margoliash in 1963, and it would never have been invented in the first place if people had fully understood the equidistance result.  The result has two features.  One is equidistance in terms of percent identity, which originally provoked the clock and remains the only ‘evidence’ for it.  The other is the overlap feature, where most of the mutant positions relative to the outgroup are shared between the two sister lineages.  This feature has been completely overlooked for the past 46 years and fully contradicts the clock/neutral theory.  The correct and complete explanation for the equidistance result is the MGD hypothesis that I recently proposed.  Since the clock is totally invalid, results based on the clock are automatically invalid, including the human-chimp split of 5 million years and other important molecular dates reported so far that contradict the fossil record.

 

2.  Convergent evolution, rather than common ancestry, explains the sequence similarity between human and chimpanzees.

http://precedings.nature.com/documents/2123/version/1#comments

http://thegoldengnomon.blogspot.com/

Summary:

It discusses one of the best molecular facts (newly reported in Nature this year) that simply cannot be reconciled in any way with the sister grouping of humans and chimpanzees but fully supports the sister grouping of humans and pongids.  My new molecular dating based on the MGD hypothesis gave a human-pongid split time of 19.2 million years, in full agreement with the fossil record.

I would appreciate your feedback,

Cheers,

Shi Huang

cc

John Hawks, Milford Wolpoff, Jeffrey Schwartz, Elwyn Simons, David Pilbeam, Michel Brunet, Gen Suwa, Morris Goodman, Christian Schwabe, Laura Katz, Gunter Wagner, Eugene Koonin, Phil Skell, Jerry Harris, Blair Hedges, David Baum, Walter Fitch, Joe Daniel, Sudhir Kumar, Leigh van Valen, James Cai, Laurence Hurst, Tobias Warnecke, David Lambert, Jason Stajich 

Tuesday, May 5, 2009

Evidence for an ancient adaptive episode of convergent molecular evolution

Castoe et al., “Evidence for an ancient adaptive episode of convergent molecular evolution”  PNAS, April 29, 2009, published online, doi: 10.1073/pnas.0900233106

 

Abstract: … These results indicate that nonneutral convergent molecular evolution in mitochondria can occur at a scale and intensity far beyond what has been documented previously, and they highlight the vulnerability of standard phylogenetic methods to the presence of nonneutral convergent sequence evolution.

 

I left a comment on the preprint version of this paper at Nature precedings. http://precedings.nature.com/documents/2123/version/1#comments

 

Indeed, convergent evolution is extremely common. The best illustration of this is a phenomenon I termed ‘genetic nonequidistance to a more complex outgroup’. Thus, relative to a complex outgroup such as human, some sister species from a simple clade are not equidistant to human. The more complex sister species is always closer to human than the simpler sister species. In all five cases (except plants) examined where a difference in complexity between the sister species can be inferred (octopus vs. cockle, Terebratulina vs. Lingula, bird vs. snake, dragonfly vs. louse, and smut vs. yeast), the more complex species always shows greater sequence similarity to humans.

Because these sister species are separated from humans for the same amount of time, their different sequence similarity to humans must be due to convergent evolution. Thus, sequence similarity to complex species or humans cannot be used to infer closer genealogy with humans.

The sister grouping of chimpanzees and humans really has no unambiguous support other than sequence similarity as measured by percent identity. The premise for this approach has now been nullified by the phenomenon of genetic nonequidistance to a more complex outgroup despite equidistance in time or genealogy. The same premise for grouping an ape (chimpanzee) with human to the exclusion of another ape (orangutan) would equally justify the obviously absurd grouping of human with a mollusk (octopus) to the exclusion of another mollusk (cockle), or with a brachiopod (Terebratulina) to the exclusion of another brachiopod (Lingula), or with a reptile (bird) to the exclusion of another reptile (snake).

The molecular clock hypothesis, i.e., that vastly different species have very similar mutation rates, is a tautological interpretation of the ‘genetic equidistance’ result. It is falsified by the ‘genetic nonequidistance’ phenomenon as discussed above. I have recently come up with the ‘maximum genetic diversity’ (MGD) hypothesis to explain equally well both the equidistance and the nonequidistance phenomena. See my paper posted here, “Inverse relationship between genetic diversity and epigenetic complexity”.

Below is a paragraph from one of my recent manuscripts discussing one of the best facts (newly reported in Nature this year) that simply cannot be reconciled in any way with the sister grouping of humans and chimpanzees but fully supports the MGD hypothesis and the sister grouping of humans and pongids.

Consistent with low genetic diversity in humans, human-specific segmental duplications show lower copy number polymorphism in humans than chimpanzee-specific segmental duplications do in chimpanzees [54]. Similarly, those duplications shared among humans, chimpanzees, and orangutans, or those shared among humans, chimpanzees, orangutans, and monkeys, are also less polymorphic in humans than in chimpanzees, indicating clearly that duplications that are shared because of common ancestry are less polymorphic in humans than in chimpanzees. In contrast, the duplications shared between humans and chimpanzees are equally polymorphic in humans and chimpanzees. This unusual result contradicts the sister grouping of humans and chimpanzees, because both the MGD and the bottleneck hypothesis would predict lower polymorphism in humans if these duplications were shared because of common ancestry. However, it is fully consistent with the interpretation that the shared duplications between humans and chimpanzees are not due to common ancestry but are due to common selection of independent duplications. Common selection leading to shared sequences is well established [55]. The MGD hypothesis interprets many of the shared sequences between humans and chimpanzees as a result of common selection rather than common ancestry. The similar selection pressure leads to similar levels of polymorphism. This result is thus one of the best that simply cannot be reconciled in any way with the sister grouping of humans and chimpanzees but fully supports the MGD hypothesis and the sister grouping of humans and pongids.

Ref:
54. Marques-Bonet T, Kidd JM, Ventura M, Graves TA, Cheng Z, et al. (2009) A burst of segmental duplications in the genome of the African great ape ancestor. Nature 457: 877-881.
http://www.publicacions.ub.es/refs/micoshumans.pdf

55. Bull JJ, Badgett MR, Wichman HA, Huelsenbeck JP, Hillis DM, et al. (1997) Exceptional convergent evolution in a virus. Genetics 147: 1497-1507.

Thursday, April 30, 2009

Molecular clock at best explains half the story on ‘genetic equidistance’ and at worst explains none

The genetic equidistance result (sister species are equidistant to a simpler outgroup) has been interpreted by a tautology, the molecular clock hypothesis, which says that vastly different lineages have very similar mutation rates.  The neutral theory was invented to explain the molecular clock by postulating that the vast majority of residue differences between species are neutral mutations. 

 

On the surface, the similar mutation rate idea seems to explain the equidistance result in terms of percent identity.  But one fatal weakness of this idea, pointed out in my previous paper, is that there is no independent evidence for it.  In contrast, there is ample evidence against it.  That observation alone in part led me to invent the MGD hypothesis as the correct interpretation of the equidistance result.  The MGD interpretation came to me from logical reasoning based on basic biological principles.  Thus, I had deduced an important feature of the equidistance result from an axiom before I had a full grasp of the complete story of equidistance.  That feature is: most of the residue positions differing between one sister lineage and the outgroup are also different between the other sister lineage and the outgroup.  In other words, suppose that sister species A and B are equidistant to the simpler outgroup C, where A and B have been separated for a much longer time than the time of separation between C and the common ancestor of A and B.  We would observe that most of the residue positions that differ between A and C are also different between B and C.  Below, I illustrate this fundamental feature of the equidistance result using the example of cytochrome c, which was used originally in 1963 to discover the equidistance result, with baker’s yeast as the outgroup to the sister species drosophila and human.

 


Yeast blastp against drosophila:

Identities = 67/104 (64%), Positives = 78/104 (75%), Gaps = 0/104 (0%)

 

Yeast  5    AGSAKKGATLFKTRCLQCHTVEKGGPHKVGPNLHGIFGRHSGQAEGYSYTDANIKKNVLW  64

            AG  +KG  LF  RC QCHTVE GG HKVGPNLHG+ GR +GQA G++YTDAN  K + W

Droso  5    AGDVEKGKKLFVQRCAQCHTVEAGGKHKVGPNLHGLIGRKTGQAAGFAYTDANKAKGITW  64

 

Yeast  65   DENNMSEYLTNPKKYIPGTKMAFGGLKKEKDRNDLITYLKKATE  108

            +E+ + EYL NPKKYIPGTKM F GLKK  +R DLI YLK AT+

Droso  65   NEDTLFEYLENPKKYIPGTKMIFAGLKKPNERGDLIAYLKSATK  108

 

 

Yeast blastp against human:

Identities = 66/102 (64%), Positives = 79/102 (77%), Gaps = 0/102 (0%)

 

Yeast  6    GSAKKGATLFKTRCLQCHTVEKGGPHKVGPNLHGIFGRHSGQAEGYSYTDANIKKNVLWD  65

            G  +KG  +F  +C QCHTVEKGG HK GPNLHG+FGR +GQA GYSYT AN  K ++W

Human  2    GDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYTAANKNKGIIWG  61

 

Yeast  66   ENNMSEYLTNPKKYIPGTKMAFGGLKKEKDRNDLITYLKKAT  107

            E+ + EYL NPKKYIPGTKM F G+KK+++R DLI YLKKAT

human  62   EDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKKAT  103

 

 

Drosophila blastp against human:

Identities = 80/102 (78%), Positives = 87/102 (85%), Gaps = 0/102 (0%)

 

Droso  6    GDVEKGKKLFVQRCAQCHTVEAGGKHKVGPNLHGLIGRKTGQAAGFAYTDANKAKGITWN  65

            GDVEKGKK+F+ +C+QCHTVE GGKHK GPNLHGL GRKTGQA G++YT ANK KGI W

human  2    GDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYTAANKNKGIIWG  61

 

Droso  66   EDTLFEYLENPKKYIPGTKMIFAGLKKPNERGDLIAYLKSAT  107

            EDTL EYLENPKKYIPGTKMIF G+KK  ER DLIAYLK AT

human  62   EDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKKAT  103

 

 

Overlap variant positions among Sc, Dm, and Hs:

 

Sc     6     GSAKKGATLFKTRCLQCHTVEKGGPHKVGPNLHGIFGRHSGQAEGYSYTDANIKKNVLW  64

Sc=Dm        G  +KG  LF  RC QCHTVE GG HKVGPNLHG+ GR +GQA G++YTDAN  K + W

Dm     6     GDVEKGKKLFVQRCAQCHTVEAGGKHKVGPNLHGLIGRKTGQAAGFAYTDANKAKGITW  64

Sc=Hs        G  +KG  +F  +C QCHTVEKGG HK GPNLHG+FGR +GQA GYSYT AN  K ++W

Hs     2     GDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYTAANKNKGIIW  60

Overlap       xxx  xx  xx  x         x         x   xx   x        xx xxx

 

Sc     65   DENNMSEYLTNPKKYIPGTKMAFGGLKKEKDRNDLITYLKKAT  107

Sc=Dm       +E+ + EYL NPKKYIPGTKM F GLKK  +R DLI YLK AT

Dm     65   NEDTLFEYLENPKKYIPGTKMIFAGLKKPNERGDLIAYLKSAT  107

Sc=Hs        E+ + EYL NPKKYIPGTKM F G+KK+++R DLI YLKKAT

Hs     61   GEDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKKAT  103

Overlap     x xxxx   x           x x    xxx x   x

 

 

 

The above alignment data show that yeast is approximately equidistant to drosophila (67/104 identity) and to human (66/102 identity).  If one carefully compares the alignments, one would find that among those 36 residue differences between yeast and human, 31 are also different between yeast and drosophila. 
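The counts quoted above can be checked mechanically. What follows is a minimal Python sketch, not the procedure used to prepare this post. It takes the three 102-residue segments as they appear in the overlap alignment above (yeast 6-107, drosophila 6-107, human 2-103) and relies on the alignments being gap-free, so that position i in one string corresponds to position i in the others. The function and variable names are illustrative assumptions.

```python
# Aligned cytochrome c segments copied from the overlap alignment above
# (gap-free, 102 residues each).
SC = ("GSAKKGATLFKTRCLQCHTVEKGGPHKVGPNLHGIFGRHSGQAEGYSYTDANIKKNVLW"
      "DENNMSEYLTNPKKYIPGTKMAFGGLKKEKDRNDLITYLKKAT")
DM = ("GDVEKGKKLFVQRCAQCHTVEAGGKHKVGPNLHGLIGRKTGQAAGFAYTDANKAKGITW"
      "NEDTLFEYLENPKKYIPGTKMIFAGLKKPNERGDLIAYLKSAT")
HS = ("GDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYTAANKNKGIIW"
      "GEDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKKAT")


def diff_positions(a, b):
    """Return the set of 0-based alignment positions where a and b differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return {i for i, (x, y) in enumerate(zip(a, b)) if x != y}


sc_dm = diff_positions(SC, DM)   # yeast vs drosophila
sc_hs = diff_positions(SC, HS)   # yeast vs human
dm_hs = diff_positions(DM, HS)   # drosophila vs human
overlap = sc_dm & sc_hs          # positions changed in both comparisons

print("yeast-drosophila differences:", len(sc_dm))
print("yeast-human differences:", len(sc_hs))
print("drosophila-human differences:", len(dm_hs))
print("positions differing from yeast in both:", len(overlap))
```

On these strings the sketch reports 36 yeast-human differences, 31 of which are also yeast-drosophila differences, and 22 drosophila-human differences, in line with the figures used in the text. The yeast-drosophila count over these 102 positions is 36; the BLAST figure of 67/104 covers a slightly longer span.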

 

This nearly complete overlap in mutated residue positions in two separate sister lineages is one of the two fundamental features of the genetic equidistance phenomenon (the other is of course the equidistance in terms of percent identity).  However, this second feature, which I call the overlap feature, has been completely ignored or overlooked in the past 46 years.  The molecular clock interpretation and the neutral theory were invented in complete ignorance of this feature.  They would not have been invented in the first place if people had paid attention to the overlap feature, because they are clearly contradicted by it.  It is astonishing that this obvious contradiction has never been recognized in the past 46 years in a large field of study that has produced several dozen members of the National Academy of Sciences but is nonetheless completely misled by a false paradigm.

 

The molecular clock and the neutral theory cannot predict a majority of all mutant residue positions between yeast and human to be also mutant positions between yeast and drosophila.  The predicted number is at best 20 residue positions, far short of the observed 31.  This is easily calculated as follows:

 

As the above alignment shows, drosophila and human differ at 22 of 102 positions.  Among these, 17 drosophila or human positions are also different from yeast.  So, of the total 31 mutant/changed positions between yeast and human that are also altered between yeast and drosophila, 14 could be assigned to changes that occurred during the time period when the common ancestor of human and drosophila had separated from the yeast lineage but had yet to split into human and drosophila.  After the split of human and drosophila, the chance for a residue to be different between yeast and the human lineage, or between yeast and drosophila, or between drosophila and human, is approximately 22/81 = 0.27.  (There are only 7 residues that are absolutely conserved among all life forms that contain cytochrome c.  So the positions that are changeable are 88 - 7 = 81.)  The chance for the same residue position to be altered in both the yeast-human comparison and the yeast-drosophila comparison is 0.27 x 0.27 = 0.073, which over the 81 changeable positions amounts to about 6 positions (0.073 x 81 ≈ 6).  Together with the 14 shared mutant positions accumulated in the common ancestor lineage of drosophila and human, this means that there should only be 14 + 6 = 20 residue positions that are altered in both the yeast-human comparison and the yeast-drosophila comparison.
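The arithmetic of this estimate can be written out as a short Python sketch. It only restates the calculation in the paragraph above, using the figures given there (22 differences, 81 changeable positions, 14 ancestral shared changes), and assumes, as the paragraph does, that after the split each changeable position is altered independently in each lineage with the same probability. The names are illustrative.

```python
# Figures quoted in the text; the model itself is a simplifying assumption.
DM_HS_DIFFS = 22       # positions differing between drosophila and human
CHANGEABLE = 81        # changeable positions, 88 - 7 as given in the text
ANCESTRAL_SHARED = 14  # shared changes assigned to the common ancestor


def expected_coincident_changes(diffs, changeable):
    """Expected number of positions changed in both lineages by chance.

    Assumes each changeable position is altered independently in each
    lineage with probability diffs / changeable.
    """
    p = diffs / changeable
    return p * p * changeable


by_chance = expected_coincident_changes(DM_HS_DIFFS, CHANGEABLE)
predicted_overlap = ANCESTRAL_SHARED + round(by_chance)

print(f"per-position probability: {DM_HS_DIFFS / CHANGEABLE:.2f}")
print(f"coincident changes expected by chance: {by_chance:.1f}")
print(f"predicted overlap: {predicted_overlap}")
```

With these inputs the chance term comes to about 6 positions and the predicted overlap to 20, against the 31 observed.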

 

To get to 31, we must assume that only 28 residues are neutral (22/28 x 22/28 x 28 = 17, and 14 + 17 = 31).  This means that the observed distance between yeast and human, or between yeast and drosophila, or between human and drosophila, represents nearly the maximum possible.  But the concept of a maximum cap on genetic distance is entirely missing from the practical application of the molecular clock and the neutral theory.  That concept did not exist until the recent MGD hypothesis.
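The 28-residue figure can be recovered by turning the same simple independence model around and solving for the number of changeable positions that would make the chance term large enough. The Python sketch below is only an illustration of that step, under the same assumptions and with the same input figures as the text (22 differences, 14 ancestral shared changes, 31 observed overlap); the names are hypothetical.

```python
DM_HS_DIFFS = 22       # positions differing between drosophila and human
OBSERVED_OVERLAP = 31  # positions changed in both yeast comparisons
ANCESTRAL_SHARED = 14  # shared changes assigned to the common ancestor


def changeable_positions_required(diffs, coincident_target):
    """Solve (diffs / n) * (diffs / n) * n = coincident_target for n."""
    return diffs * diffs / coincident_target


# Coincident changes that chance would have to supply after the split.
coincident_target = OBSERVED_OVERLAP - ANCESTRAL_SHARED
required = changeable_positions_required(DM_HS_DIFFS, coincident_target)

# Cross-check: chance term when exactly 28 positions are changeable.
check_at_28 = DM_HS_DIFFS ** 2 / 28

print(f"coincident changes needed: {coincident_target}")
print(f"changeable positions required: {required:.1f}")
print(f"chance term with 28 changeable positions: {check_at_28:.1f}")
```

The required number comes out a little above 28, so 22 differences would already occupy most of the changeable positions, which is the sense in which the observed distance is close to the maximum.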

 

In short, while the molecular clock and the neutral theory may predict half of the equidistance result (equidistance in terms of percent identity), they cannot predict, and are in fact contradicted by, the other half of the result, where most of the mutant positions relative to the outgroup are shared between the two sister lineages.  Therefore, the molecular clock and the neutral theory are not at all a valid explanation for the equidistance result.  This way of invalidating the existing theory did not occur to me until recently, which is why I did not include it in my previous paper that refutes the molecular clock interpretation of the equidistance result.  Ref. Huang, S.  “The genetic equidistance result of molecular evolution is independent of mutation rates.”

 

The MGD hypothesis is the only viable and complete explanation so far for the equidistance result.  It has proven to be the correct one as it easily passes the highest standard for a scientific theory, i.e. to explain all relevant facts and to have not a single factual and logical contradiction.  The example here with cytochrome c should provide the actual data for the simplistic illustration of the MGD explanation as shown in Table 1 of my MGD paper (Inverse relationship between genetic diversity and epigenetic complexity).   

 

BTW, here is an example of how the molecular clock interpretation was used in practice by the field to produce the famous 5 million year divergence time between human and chimpanzees, as first reported in 1969: Wilson AC, Sarich VM (1969) A molecular time scale for human evolution. Proc Natl Acad Sci U S A 63: 1088-1093.

 

Wilson and Sarich wrote in their 1969 paper:

“Table 1 shows that the four primate hemoglobins are about equally distinct in sequence from that of the horse. Therefore, the hemoglobins of monkeys on the one hand, and those of the apes and man on the other, have changed to about the same extent since these species last shared a common ancestor.  These results are neither unique nor surprising. Others have already recognized that protein molecules often appear to have evolved in a regular fashion with respect to time. The bulk of the available sequence information is consistent with the hypothesis that for any given protein, such as hemoglobin, the probability of an amino acid substitution occurring in a given interval of time is the same in every lineage.”

 

The above shows that Wilson and Sarich interpreted the equidistance result (horse is equidistant to the primates in hemoglobins) by assuming the same mutation rate in every primate lineage.  From there, they went on to calculate a 5 million year split for humans and chimpanzees.  Now, given that I have proven that the molecular clock interpretation of the equidistance result is completely false, it follows that any result based on such interpretation is automatically false.  Indeed, my calculation based on the MGD hypothesis gave a human-pongid split time of 19.2 million years (manuscript in preparation).   

 

Acknowledgements:


I thank my college classmate Dr. Wei Shen for discussion and for correcting a mistake in the number of overlap residues in the first draft of this essay.

 

Saturday, April 11, 2009

Exceptional convergent evolution in a virus

This paper showed nicely that common selection can lead to extensive identity in DNA sequences. Thus, sequence comparison cannot always be used for inferring time of separation. This is exactly the point made by my MGD hypothesis. When we see a human and chimp identity of ~98%, we must first ask how much of that is due to common selection. The MGD says that there is a lot. The data are completely consistent with a pongid clade with human as the outgroup. Common selection for evolution of high intelligence could lead to more identity between human and chimp than between human and orangutan.

Genetics. 1997 Dec;147(4):1497-507.
Exceptional convergent evolution in a virus.

Bull JJ, Badgett MR, Wichman HA, Huelsenbeck JP, Hillis DM, Gulati A, Ho C, Molineux IJ.
Department of Zoology, Institute of Cellular and Molecular Biology, University of Texas, Austin 78712, USA. bull@bull.zo.utexas.edu
Replicate lineages of the bacteriophage phiX 174 adapted to growth at high temperature on either of two hosts exhibited high rates of identical, independent substitutions. Typically, a dozen or more substitutions accumulated in the 5.4-kilobase genome during propagation. Across the entire data set of nine lineages, 119 independent substitutions occurred at 68 nucleotide sites. Over half of these substitutions, accounting for one third of the sites, were identical with substitutions in other lineages. Some convergent substitutions were specific to the host used for phage propagation, but others occurred across both hosts. Continued adaptation of an evolved phage at high temperature, but on the other host, led to additional changes that included reversions of previous substitutions. Phylogenetic reconstruction using the complete genome sequence not only failed to recover the correct evolutionary history because of these convergent changes, but the true history was rejected as being a significantly inferior fit to the data. Replicate lineages subjected to similar environmental challenges showed similar rates of substitution and similar rates of fitness improvement across corresponding times of adaptation. Substitution rates and fitness improvements were higher during the initial period of adaptation than during a later period, except when the host was changed.



Also on the same topic
Genetics, Vol. 181, 225-234, January 2009, Copyright © 2009
doi:10.1534/genetics.107.085225

Parallel Genetic Evolution Within and Between Bacteriophage Species of Varying Degrees of Divergence
Jonathan P. Bollback*,1 and John P. Huelsenbeck

* Department of Biology, Evolutionary Biology, University of Copenhagen, 2100 Copenhagen Ø, Denmark and Department of Integrative Biology, University of California, Berkeley, California 94720

1 Corresponding author: Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, King's Bldgs., W. Mains Rd., Edinburgh, EH9 3JT, United Kingdom.
E-mail: j.p.bollback@ed.ac.uk
Parallel evolution is the acquisition of identical adaptive traits in independently evolving populations. Understanding whether the genetic changes underlying adaptation to a common selective environment are parallel within and between species is interesting because it sheds light on the degree of evolutionary constraints. If parallel evolution is perfect, then the implication is that forces such as functional constraints, epistasis, and pleiotropy play an important role in shaping the outcomes of adaptive evolution. In addition, population genetic theory predicts that the probability of parallel evolution will decline with an increase in the number of adaptive solutions—if a single adaptive solution exists, then parallel evolution will be observed among highly divergent species. For this reason, it is predicted that close relatives—which likely overlap more in the details of their adaptive solutions—will show more parallel evolution. By adapting three related bacteriophage species to a novel environment we find (1) a high rate of parallel genetic evolution at orthologous nucleotide and amino acid residues within species, (2) parallel beneficial mutations do not occur in a common order in which they fix or appear in an evolving population, (3) low rates of parallel evolution and convergent evolution between species, and (4) the probability of parallel and convergent evolution between species is strongly effected by divergence.