Friday, December 31, 2010

Nature news today: Human remains spark spat

Below is my comment at the Nature website on today's Nature news report "Human remains spark spat" about the latest 400,000-year-old modern human fossils from Israel.

The authors' conclusion is obviously the same as that being reported by the press: "We offer the most reasonable conclusion based on the statistical evidence: that they represent the same population as the Skhul and Qafzeh finds, thus pushing the date for that type of early man back to a much earlier time." And the Skhul and Qafzeh finds are by all accounts modern Homo sapiens.

The Out of Africa model is based on the neutral theory interpretation of genetic diversity in fast evolving sequences such as mitochondrial DNA and whole genome DNA, which is mostly non-coding. However, according to the newly proposed Maximum Genetic Diversity hypothesis (1), fast evolving sequences have in most cases reached maximum genetic diversity over evolutionary time scales; they are therefore no longer linearly related to time of divergence and cannot be informative about phylogenetic relationships. The correct approach is to use the slowest evolving sequences, such as non-synonymous changes in the most conserved proteins. Try it and see which model, "Out of Africa" or "Multiregional", will get the last laugh.

1. Huang, S. (2010) The Overlap Feature of the Genetic Equidistance Result—A Fundamental Biological Phenomenon Overlooked for Nearly Half of a Century. Biological Theory, 5: 40-52.

Thursday, December 30, 2010

Others have also noticed the difference between fast and slow evolving genes in phylogeny inference

The anthropology blogger Dienekes had a post on Dec 30, 2010 noting the major difference between slow and fast evolving genes in calculating the time of origin for humans.

It is reassuring to see that others have come to the same conclusion as I have regarding the difference between slow and fast genes. The mere existence of the difference between slow and fast evolving genes is not predicted by the modern evolution theory, which is at best incomplete since it does not cover the situation when maximum genetic distance has been reached. But the maximum genetic diversity hypothesis covers it and is therefore a more complete theory. The slow clock method based on this hypothesis has produced phylogeny results dramatically different from the popular ones. The results show that humans separated from the pongid clade 17.3 million years ago. For details, read a preprint here,

Conclusion: all existing molecular interpretations of life trees are based on fast evolving genes and are therefore incorrect. We will not have a correct interpretation until we have reanalyzed the data using our slow clock method based on the MGD hypothesis.

Tuesday, December 28, 2010

Latest fossil evidence in support of our molecular dating of modern human origin

As described in my blog post on May 25, 2010, our molecular results based on the MGD hypothesis suggest that modern humans have been a single species since ~2 million years ago (manuscript in preparation). The multiregional hypothesis is correct and the out of Africa hypothesis is mistaken. This week's report of 400,000-year-old remains of modern humans fully supports our work.

Middle pleistocene dental remains from Qesem Cave (Israel)
Israel Hershkovitz, Patricia Smith, Rachel Sarig, Rolf Quam, Laura Rodríguez, Rebeca García, Juan Luis Arsuaga, Ran Barkai, Avi Gopher
Article first published online: 23 DEC 2010
DOI: 10.1002/ajpa.21446

See yahoo news report on this find here:

Researchers: Ancient human remains found in Israel

By DANIEL ESTRIN, Associated Press – Mon Dec 27, 6:13 pm ET
Israeli archaeologists said Monday they may have found the earliest evidence yet
for the existence of modern man, and if so, it could upset theories of the
origin of humans.

A Tel Aviv University team excavating a cave in central Israel said teeth found
in the cave are about 400,000 years old and resemble those of other remains of
modern man, known scientifically as Homo sapiens, found in Israel. The earliest
Homo sapiens remains found until now are half as old.

"It's very exciting to come to this conclusion," said
archaeologist Avi Gopher, whose team examined the teeth with X-rays and CT scans
and dated them according to the layers of earth where they were found.
He stressed that further research is needed to solidify the claim. If it does,
he says, "this changes the whole picture of evolution."

The accepted scientific theory is that Homo sapiens originated in Africa and
migrated out of the continent. Gopher said if the remains are definitively
linked to modern human's ancestors, it could mean that modern man in fact
originated in what is now Israel.

Sir Paul Mellars, a prehistory expert at Cambridge University, said the study is
reputable, and the find is "important" because remains from that critical time
period are scarce, but it is premature to say the remains are human.
"Based on the evidence they've cited, it's a very tenuous and frankly rather
remote possibility," Mellars said. He said the remains are more likely related
to modern man's ancient relatives, the Neanderthals.

According to today's accepted scientific theories, modern humans and
Neanderthals stemmed from a common ancestor who lived in Africa about 700,000
years ago. One group of descendants migrated to Europe and developed into
Neanderthals, later becoming extinct. Another group stayed in Africa and evolved
into Homo sapiens — modern humans.

Teeth are often unreliable indicators of origin, and analyses of skull remains
would more definitively identify the species found in the Israeli cave, Mellars said.

Gopher, the Israeli archaeologist, said he is confident his team will find
skulls and bones as they continue their dig.

The prehistoric Qesem cave was discovered in 2000, and excavations began in
2004. Researchers Gopher, Ran Barkai and Israel Hershkowitz published their
study in the American Journal of Physical Anthropology.

Friday, November 26, 2010

Very good points by Nigel Goldenfeld and Carl Woese

I just finished reading a very good recent paper by Nigel Goldenfeld and Carl Woese, Life is physics: evolution as a collective phenomenon far from equilibrium.

Carl Woese of course is famous for defining the Archaea as a group distinct from bacteria in 1977.

Below I comment on some of their writings which I found remarkable.

Goldenfeld and Woese: "In short, a unified view prevents the unnecessary multiplication of hypotheses that is the sure sign of a lack of fundamental understanding (think epicycles!)."

Yes. The field of evolution is full of ad hoc hypotheses that are good for only one or a few phenomena. The molecular clock hypothesis is one.

Goldenfeld and Woese: "Thus, in this picture, evolution is essentially synonymous with population genetics. Genes are assumed to be the only dynamical variables that are tracked, and are associated with a fitness benefit that is difficult to define or measure precisely, but is quantified by a fitness landscape that describes how the population fitness depends on the genotype[56–59]. Traits are simply associated with genes, and gene interactions are often ignored, or at best handled through the fitness landscape[59, 60]."

Indeed, evolution is widely treated as the same as population genetics. Michael Lynch has this at his website "Nothing in evolution makes sense except in the light of population genetics". But this is not true at all. The overlap feature of the genetic equidistance result is the best evidence for a clear distinction between population genetics and macroevolution.

Goldenfeld and Woese: "Not only is the Modern Synthesis afflicted by strong interactions, but its very foundation is questionable. The evident tautology embodied by “survival of the fittest” serves to highlight the backwards-looking character of the fitness landscape: not only is it unmeasurable a priori, but it carries with it no means of expressing the growth of open-ended complexity[90] and the generation of genetic novelty. Thus, the Modern Synthesis is, at best, a partial representation of population genetics, but this on its own is a limited subset of the evolutionary process itself, and arguably the least interesting one."

Indeed, the modern evolution theory is only a theory of population genetics, which is the least important and interesting aspect of evolution. The more important one is the growth of open-ended complexity, which has now been described by my MGD hypothesis.

Goldenfeld and Woese: "Thus, although complexity is hard to define precisely and usefully, we regard the defining characteristic of complexity as the breakdown of causality[138]. Simply put, complex systems are ones for which observed effects do not have uniquely definable causes, due to the huge nature of the phase space and the multiplicity of paths."

I could not agree more. No unique genes define complexity. All genes contribute to complexity by reducing their level of random mutations, as described by the MGD hypothesis. It is futile to try to find a few genes responsible for disorders of the complex brain, as has been demonstrated by the failure of the GWAS effort. We are currently studying the genetic causes of common disorders of the brain as a way of further proving the MGD hypothesis.

Wednesday, November 24, 2010

Kimura said that the strongest evidence for the neutral theory is the molecular clock

A recent comment from a reviewer of my primate phylogeny manuscript again said the same thing that Scott Page has said over at Nature Precedings, where my manuscript is posted: “the concept of the molecular clock has little bearing on molecular phylogenetic studies that do not enforce a molecular clock criterion in their analyses.” Thus, there is a strong consensus in the field that even if the molecular clock is implausible, which nearly everyone admits, one can still use other methods to infer molecular phylogeny within the overall paradigm originally started by the molecular clock concept. This is a self-deceiving illusion in my opinion, and I have just inserted the following into my revised manuscript (under review) to dispel it in the clearest way possible.

All traditional molecular phylogeny methods are based on the neutral theory. Early methods made explicit use of the molecular clock idea. But since the molecular clock is widely known today to be implausible, other methods have been developed that are supposed not to depend on the molecular clock. However, these methods are still based on the neutral theory, and the neutral theory is in turn based on the molecular clock, as admitted by Kimura and Ohta: “Probably the strongest evidence for the theory is the remarkable uniformity for each protein molecule in the rate of mutant substitutions in the course of evolution.” (1). Therefore, we can conclude that all traditional molecular phylogeny methods are either explicitly or implicitly based on the molecular clock. The non-existence of the molecular clock in macroevolution as demonstrated by the overlap feature of the genetic equidistance result is sufficient to deem all traditional molecular phylogeny methods invalid for macroevolution.

1. Kimura M, Ohta T (1971) Protein polymorphism as a phase of molecular evolution. Nature 229: 467-479.

Saturday, November 20, 2010

Laws of Biology

As is well known, there are no laws in biology as in physics, which is the reason I called an idea of mine the First Axiom of Biology. Recently, however, I noted that there are in the recent literature four laws of biology. One paper is titled The three laws of biology by Trevors and Saier.

One book is titled Biology’s First Law by McShea and Brandon, which got a book review in this week’s Science magazine.

Here are the three laws of biology according to Trevors and Saier:
The First Law of Biology: All living organisms obey the laws of physics and chemistry.
The Second Law of Biology: All living organisms consist of membrane encased cells.
The Third Law of Biology: All living organisms arose in an evolutionary process.

Here is Biology’s First Law, also called “Zero-Force Evolutionary Law” (ZFEL) according to McShea and Brandon:
ZFEL (special formulation): In any evolutionary system in which there is variation and heredity, in the absence of natural selection, other forces, and constraints acting on diversity or complexity, diversity and complexity will increase on average.

Here is what ZFEL is supposed to mean:
Imagine a yard containing a number of trees, and imagine that the wind blows from each point of the compass with equal probability. Come autumn, the result will be an increase in the dispersal of the leaves over time. This, they suggest, is a zero-force state because there are no directional forces acting on the leaves. Yet there is a change over time (unlike the phenomenon described by the law of inertia in physics)—the leaves that were originally clustered about the trees become more dispersed. And if an evolutionary system is similarly in a zero-force state, it too will experience an increase in divergence over time.

If there are weaknesses to these laws, the foremost in my opinion is that they are all empirically based and not axioms or self evident. No one could have come up with these laws from a priori reasoning without knowing a lot of biological detail. But it seems that the most fundamental laws should be simple and self evident. Newton's laws of mechanics are all self evident and Newton called them Axioms.

Second, a major flaw of ZFEL is that it equates random diversity with complexity. It fails to recognize my First Axiom of Construction, or First Axiom of Biology, that random diversity must be suppressed in order for complexity/order to advance. Complexity means order, which is intuitively obvious. The human brain is the most complex. It is also the most ordered as it is capable of tasks such as mathematics. Fallen leaves become more diverse with time but do not become more ordered or reach a higher level of complexity with a higher degree of order. A junk yard becomes more diverse with time but does not become more ordered and complex. Evolution towards higher complexity is a process from disorder to order or a process of decreasing entropy. I have found that the entropy loss is reflected by the loss of randomness/mutations in the genetic building blocks.

Third, none of the Three Laws of Biology proposed by Trevors and Saier is necessarily true. It could be easily argued that it is realistically possible for exceptions to be found in the future. There may be already exceptions if we just take a look at the existing data from a different perspective and the popular perspective is based on assumptions rather than proven truth. We have not solved the mystery of evolution and the origin of life. What we know about life may look impressive in terms of amount of data but we really know very little about what they really mean.

Finally, these four laws fail as a scientific theory by the only standard that counts, which is to explain facts and predict new experiments. For example, they have nothing to say about molecular evolution phenomena and cannot compete with or replace the neutral theory. And yet the neutral theory is inadequate for macroevolution and fails to explain most key molecular macroevolution phenomena. Without understanding evolution, one cannot understand biology. Without understanding molecular evolution, one cannot understand evolution. For a law of biology to have little to say about molecular evolution, it can only be a trivial law.

Thus, it is safe for me to conclude that the first real fundamental law in biology is the First Axiom of Biology.

Thursday, October 14, 2010

Models of micro- and macro-evolution and the overlap feature

I hope the following figure serves as an easy to grasp illustration of the evolution process according to the MGD hypothesis.

[Figure 1 image not reproduced here. Panel A, Microevolution: species B and A. Panel B, Macroevolution: species C (more complex) and A.]

Figure 1. Evolutionary events according to the MGD hypothesis. A 10 amino acid peptide with each position represented by a number is used to illustrate the mutation events during evolution. X represents any amino acid. If two species have the same number at the same position, they share the same amino acid. If they each have X at the same position, they have different amino acids at that position. Positions 0-3 represent non-changeable residues due to functional or epigenetic restriction in all species. Positions 0-5 represent non-changeable positions only in the more complex species C in the panel B model of macroevolution.

Panel A. Microevolution. The ancestor species A’ undergoes microevolution and produces a pair of sister individuals A and B at some point during its evolution. Sister species A and B start by sharing the same sequence for an orthologous gene at the beginning of speciation. The genetic distance between A and B gradually increases until reaching a plateau or maximum genetic distance (6 differences in this case).

Panel B. Macroevolution. The ancestor species A’ undergoes microevolution and gradually reaches some level of genetic diversity, during which time nearly every genome variation allowed within the MGD or compatible with the survival of A’ may have a chance to exist for a while. When one of these genome variations also happens to be compatible with a higher level of epigenetic complexity such as the genome as shown for sister individuals A and C at beginning of speciation, a punctuational increase in epigenetic complexity would take place in one of these sisters such as C. At the beginning of speciation sisters A and C have the same sequence for an orthologous gene but only C has undergone an increase in epigenetic complexity, which in turn reduced the number of changeable positions from 6 to 4. After the epigenetic speciation phase at the beginning of speciation, the genetic microevolutionary phase immediately follows that would gradually create greater genetic distance between A and C. C can only undergo substitutions in a maximum of 4 positions while A can change in a maximum of 6. And the changeable positions of C largely overlap with those of A but not vice versa.
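The two panels can be turned into a toy simulation. The sketch below is only an illustration of the caption's numbers (10 positions, positions 0-3 fixed in all species, positions 0-5 fixed in species C); the uniform random substitution rule, the 20-letter alphabet, and all function names are assumptions made for the example and are not taken from any published analysis.

```python
import random

PEPTIDE_LENGTH = 10

# Changeable positions as described in the Figure 1 caption:
# positions 0-3 are fixed in every species, positions 0-5 are fixed in species C.
CHANGEABLE = {
    "A": set(range(4, 10)),  # 6 changeable positions
    "B": set(range(4, 10)),  # sister species of the same complexity as A
    "C": set(range(6, 10)),  # more complex species: only 4 changeable positions
}


def max_distance(species_1, species_2):
    """Upper bound on differing positions: any position changeable in either species."""
    return len(CHANGEABLE[species_1] | CHANGEABLE[species_2])


def simulate_distance(species_1, species_2, steps, seed=0):
    """Both lineages substitute at random changeable positions (assumed rule).

    Returns the number of differing positions after each step.
    """
    rng = random.Random(seed)
    alphabet = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
    ancestor = [rng.choice(alphabet) for _ in range(PEPTIDE_LENGTH)]
    seqs = {species_1: ancestor[:], species_2: ancestor[:]}
    history = []
    for _ in range(steps):
        for name, seq in seqs.items():
            pos = rng.choice(sorted(CHANGEABLE[name]))
            seq[pos] = rng.choice(alphabet)  # may by chance restore the old residue
        s1, s2 = seqs[species_1], seqs[species_2]
        history.append(sum(a != b for a, b in zip(s1, s2)))
    return history


if __name__ == "__main__":
    print("max A-B distance:", max_distance("A", "B"))
    print("max A-C distance:", max_distance("A", "C"))
    print("A-C distance over time:", simulate_distance("A", "C", steps=30))
```

In this toy model the A-C distance can never exceed 6, the number of changeable positions in the simpler species A, no matter how many steps are run, which is the plateau the panels describe.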

Based on this Figure, I can now better describe below the overlap feature and the distinction between the MGD hypothesis and the modern evolution theory, in a way that does not require readers to read my previous papers. This is part of an introduction section for a new manuscript that I am working on.

The genetic equidistance result, the most important and remarkable phenomenon of molecular evolution, shows that different species are approximately equidistant to a simpler outgroup in protein sequence similarity, as first reported by Margoliash in 1963 [1]. This result, together with those of Zuckerkandl and Pauling in 1962, inspired the molecular clock and in turn the Kimura neutral theory of macroevolution, which has been the foundation for the field of molecular evolution ever since its inception [2,3]. However, the molecular clock or constant substitution rate interpretation of the genetic equidistance result is in fact a tautology since it has not been verified by any independent observation and has on the contrary been contradicted by a large number of factual observations [4,5,6,7,8,9,10,11,12,13,14,15].

Recent work shows that the genetic equidistance result has in fact another characteristic, the overlap feature, which has been completely overlooked for nearly half-a-century [16]. A position where two or more species have each had a substitution event is termed an overlap position (Figure 1A, species A and B have 6 overlap positions). The genetic equidistance phenomenon minimally requires three species for sequence alignment, including two sister species and an outgroup that is not more complex. The overlap feature shows a large number of overlapped mutant positions where any pair of these three species is different in sequence. If after speciation, two species randomly accumulate substitutions with similar rate as assumed by the neutral theory, then the chance for a substitution in one species to occur at the same overlap position where the other species also has a substitution should largely follow chance or probability theory. Indeed, for microevolution of similar species such as among different strains of yeasts, the number of overlap positions relative to the total number of substitutions is small and consistent with probability calculation based on the neutral theory. However, for macroevolution of distinct species of different biological complexity such as yeast versus drosophila or orangutan versus human, the number of overlap positions is much greater than expected by chance. Thus, the overlap feature is one of the best pieces of evidence for a clear distinction between macroevolution and microevolution, where macroevolution is mostly about major changes in organismal complexity whereas microevolution is not.

The modern evolution theory consists of the Neo-Darwinian theory of natural selection and the neutral theory. The Neo-Darwinian theory is largely inadequate for understanding molecular evolution. As a result, the neutral theory, which trivializes natural selection and disconnects genotypes and phenotypes, was invented in order to at least have an ad hoc understanding of molecular evolution. While the neutral theory was originally only a population genetics theory, it was turned into a macroevolution theory by Kimura when he used it to explain the molecular clock, which treats macroevolution the same as population genetics [17]. However, as the overlap feature convincingly shows, the molecular clock and the neutral theory, while largely complete as a theory of microevolution and population genetics, are not so for macroevolution, and would never have been applied to macroevolution in the same way as to microevolution had people not overlooked the overlap feature.

Unlike the modern evolution theory, the Maximum Genetic Diversity (MGD) hypothesis tightly unites genotypes and phenotypes and explains all the major facts of evolution in a coherent fashion via a single universal theme [7,18]. It is a necessary deduction of the First Axiom of Biology, which states that there exists an inverse relationship between genetic diversity and epigenetic complexity. Genetic diversity here is defined as percent position difference in aligned sequence in a homologous protein or DNA, which is largely contributed by point mutations. Epigenetic complexity is defined by the total number of cell types and epigenetic molecules, which is largely equivalent to our naïve notion of organismal complexity and consistent with an independent calculation of organismal complexity based on information theory [19]. It is common sense that genetic diversity cannot increase indefinitely with time and has a maximum limit being restricted by function or epigenetic complexity. A gene may function in many different cell types or epigenetic states. The more cell types in which a gene functions, the more functions it performs and the more functional constraints on the genetic diversity/mutation of the gene. The maximum genetic diversity of simple organisms is greater than that of complex organisms. The idea of functional constraints on sequence variation is also a well-accepted concept within the traditional theoretical framework. What is missing there however, which the MGD hypothesis now provides, is the intuitive idea supported by numerous facts and yet to be contradicted by any that different species have different functional/epigenetic constraints with simple species having less.

The maximum genetic distance concept of the MGD hypothesis appears superficially similar to the saturation idea within the traditional theoretical framework. However, there is a key difference. With the saturation idea, people can do corrections and change a 10% non-identity into 20% distance while still believing that such distance can go on to increase without a maximum cap, just that the increase is not linear and needs to be corrected for multiple hits. But the maximum cap concept says that the 10% non-identity is the maximum possible and will stay unchanged once reached. In other words, the saturation idea assumes that two sequences will continue to diverge, but at some point substitutions will overwrite previous substitutions, making the ability to discern continuing divergence difficult and requiring corrections. The MGD indicates that there is indeed a maximum divergence that can occur independent of saturation because of functional/epigenetic constraints. Saturation does not take into account the well known non-independence of basepairs in a sequence and the functional/epigenetic constraints on sequence, but MGD does take this into account and thus provides a stronger theoretical framework.
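To make the contrast concrete, here is a sketch of the kind of multiple-hit correction the saturation idea relies on. The paragraph above does not name a specific correction, so the two used here (the Poisson correction for proteins and the Jukes-Cantor formula for DNA) are standard textbook choices picked for illustration, and the 10% and 20% figures in the text should be read as illustrative rather than as the output of either formula.

```python
import math


def poisson_corrected_distance(p):
    """Poisson correction for protein sequences: d = -ln(1 - p).

    p is the observed fraction of differing positions.
    """
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")
    return -math.log(1 - p)


def jukes_cantor_distance(p):
    """Jukes-Cantor correction for DNA: d = -(3/4) * ln(1 - 4p/3)."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


if __name__ == "__main__":
    # The corrected distance always exceeds the observed non-identity,
    # and the gap widens as the observed value grows.
    for p in (0.10, 0.30, 0.50):
        print(p,
              round(poisson_corrected_distance(p), 3),
              round(jukes_cantor_distance(p), 3))
```

Under the saturation view, the corrected value is read as a distance that keeps growing with time. Under the maximum cap view, an observed non-identity that has reached the cap carries no further information about elapsed time, so no such correction would recover it.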

The MGD hypothesis defines macroevolution and microevolution differently from the standard definition and considers them distinctly different. Macroevolution involves major changes in epigenetic complexity or organismal complexity while microevolution does not although it may involve minor epigenetic changes without a major effect on complexity. Macroevolution involves a fast and punctuational epigenetic event whereas microevolution is largely a slow process of random point mutations. Macroevolution, as shown for two splitting species A and C in Figure 1B, automatically includes microevolutionary mechanism as part of the speciation process since the events following the punctuational epigenetic change are largely microevolutionary. Thus macroevolution consists of two different phases with distinctly different mechanisms, the epigenetic punctuational phase and the subsequent genetic microevolutionary phase (Figure 1B). Thus the MGD hypothesis but not the modern evolution theory predicts both punctuation and stasis at the level of epigenotypes and in turn at morphological levels, which is well supported by the major patterns of the fossil record [20].

While molecular changes in the epigenetic phase of macroevolution certainly involve some DNA changes such as chromatin reorganizations and gain/loss of genes encoding epigenetic molecules, such changes are obviously also about epigenetic reorganizations. The MGD hypothesis includes the proven virtues of the modern evolution theory as a component specific for microevolution (Figure 1A) as well as for the microevolutionary phase of macroevolution over time scales that are not yet long enough for a slow evolving gene to reach maximum distance (Figure 1B). For the epigenetic phase of macroevolution towards greater complexity, however, the MGD hypothesis suggests that the genetic diversity of the more complex species would be reduced by the epigenetic change (Figure 1B). In contrast, the modern evolution theory assumes that the same mechanism applies to both macroevolution and microevolution and that there is no suppression of mutations accompanying an increase in organismal complexity. It also implicitly assumes no limit on genetic distance/diversity no matter how long evolution has been going on or how fast mutation rate has been for certain fast evolving genes. These unproven assumptions implicitly negate the First Axiom of Biology and are intuitively implausible and have met with only contradicting factual observations.

The overlap feature of the genetic equidistance phenomenon is one of the best pieces of evidence for the MGD hypothesis and against the modern macroevolution theory [16]. It is a well-established observation as well as intuitively sensible that most of the changeable or non-constrained positions in a gene in a complex species are also changeable in a less complex species (Figure 1B). When MGD has been reached during macroevolution, most of the changeable positions in any species would have undergone substitutions, thus leading to a large number of overlap positions close to the total number of actual substitutions and much greater than expected by chance (Figure 1B, showing 4 overlap positions out of 4 total actual substitutions in species C). Here, some of the non-changeable positions in a complex species would overlap with the changeable positions in a simple species (Figure 1B, positions 4-5). These non-changeable positions do not undergo substitution during evolution but sequence alignments may not reveal this. In fact, the modern evolution theory treats sequence difference between two species as equally contributed by substitutions in each species, which is true only for microevolution (Figure 1A) or for the microevolutionary phase of macroevolution when the MGD of the more complex species has not yet been reached (Figure 1B). For macroevolution over long time when fast evolving genes have reached MGD, the MGD hypothesis suggests that sequence difference in these fast genes between two species of different complexity is a reflection of the MGD of the simple species and is largely caused by substitutions in the simple species since the complex species would undergo fewer substitutions (Figure 1B, the maximum distance between A and C is 6 and is equal to the MGD of the simple species A).

For microevolution of similar species over long evolutionary time so that MGD has been reached, one also observes a large number of overlap positions close to the total number of actual substitutions and much greater than expected by chance (Figure 1A showing 6 overlap positions out of 6 total actual substitutions). In contrast, for microevolution of short evolutionary time scale or for slow evolving genes prior to reaching MGD, the number of overlap positions relative to the total number of actual substitutions would be small and consistent with calculation from probability theory within the traditional theoretical framework. For example in Figure 1A, if only 3 substitutions have occurred in each species A and B out of 6 changeable positions, the number of overlap positions can be calculated as 3/6 x 3/6 x 6 = 1.5, which is much smaller than the total number of actual substitutions.
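The chance expectation in this example can be checked numerically. The sketch below assumes that each species' substitutions fall on distinct positions chosen uniformly at random among the changeable ones, which is my reading of the probability calculation described here; the function names are hypothetical. For 3 substitutions in each species out of 6 changeable positions, the product 3/6 x 3/6 x 6 evaluates to 1.5.

```python
import random


def expected_overlap(n_changeable, subs_a, subs_b):
    """Expected number of positions substituted in both species by chance.

    Each position is hit in species A with probability subs_a / n and in
    species B with probability subs_b / n, independently.
    """
    return n_changeable * (subs_a / n_changeable) * (subs_b / n_changeable)


def simulated_overlap(n_changeable, subs_a, subs_b, trials=20000, seed=0):
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    positions = range(n_changeable)
    total = 0
    for _ in range(trials):
        hit_a = set(rng.sample(positions, subs_a))  # distinct positions hit in A
        hit_b = set(rng.sample(positions, subs_b))  # distinct positions hit in B
        total += len(hit_a & hit_b)
    return total / trials


if __name__ == "__main__":
    print(expected_overlap(6, 3, 3))   # before MGD is reached
    print(simulated_overlap(6, 3, 3))  # should be close to the value above
    print(expected_overlap(6, 6, 6))   # at MGD every changeable position overlaps
```

When both species have substituted at all 6 changeable positions, the same formula gives 6 overlap positions, matching the Figure 1A case at maximum genetic distance.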

The MGD hypothesis is unusual because it actually consists of two different components, one is a genetic mechanism for microevolution and for the microevolutionary phase of macroevolution and the other is an epigenetic mechanism for the epigenetic punctuational phase of macroevolution. But these two are not disconnected and are in fact tightly linked by an inverse relationship as described by the First Axiom of Biology. This relationship dictates that one must be suppressed in order for the other to advance. Thus, point mutations are good for microevolution and for leading to a genotype suitable for a higher level of epigenetic complexity, which is necessary for the epigenetic phase of macroevolution to take place, but must be suppressed if increase in epigenetic complexity is to be maintained during the microevolutionary phase of macroevolution (Figure 1B). Conversely, epigenetic complexity must not increase in order for point mutations to take care of adaptive changes.


1. Margoliash E (1963) Primary structure and evolution of cytochrome c. Proc Natl Acad Sci 50: 672-679.
2. Zuckerkandl E, Pauling L (1962) Molecular disease, evolution, and genetic heterogeneity, Horizons in Biochemistry; Kasha M, Pullman B, editors. New York: Academic Press.
3. Kumar S (2005) Molecular clocks: four decades of evolution. Nat Rev Genet 6: 654-662.
4. Huang S (2009) Molecular evidence for the hadrosaur B. canadensis as an outgroup to a clade containing the dinosaur T. rex and birds. Riv Biol 102: 20-22.
5. Huang S (2008) Ancient fossil specimens are genetically more distant to an outgroup than extant sister species are. Riv Biol 101: 93-108.
6. Huang S (2008) The genetic equidistance result of molecular evolution is independent of mutation rates. J Comp Sci Syst Biol 1: 092-102.
7. Huang S (2009) Inverse relationship between genetic diversity and epigenetic complexity. Preprint available at Nature Precedings
8. Pulquerio MJ, Nichols RA (2007) Dates from the molecular clock: how wrong can we be? Trends Ecol Evol 22: 180-184.
9. Laird CD, McConaughy BL, McCarthy BJ (1969) Rate of fixation of nucleotide substitutions in evolution. Nature 224: 149-154.
10. Jukes TH, Holmquist R (1972) Evolutionary clock: nonconstancy of rate in different species. Science 177: 530-532.
11. Goodman M, Moore GW, Barnabas J, Matsuda G (1974) The phylogeny of human globin genes investigated by the maximum parsimony method. J Mol Evol 3: 1-48.
12. Langley CH, Fitch WM (1974) An examination of the constancy of the rate of molecular evolution. J Mol Evol 3: 161-177.
13. Li W-H (1997) Molecular evolution. Sunderland, MA: Sinauer Associates.
14. Nei M, Kumar S (2000) Molecular evolution and phylogenetics. New York: Oxford University Press.
15. Avise JC (1994) Molecular markers, natural history and evolution. New York, NY: Springer.
16. Huang S (2010) The overlap feature of the genetic equidistance result, a fundamental biological phenomenon overlooked for nearly half of a century. Biological Theory 5: 40-52.
17. Kimura M (1968) Evolutionary rate at the molecular level. Nature 217: 624-626.
18. Huang S (2008) Histone methylation and the initiation of cancer, Cancer Epigenetics; Tollefsbol T, editor. New York: CRC Press.
19. Jiang Y, Xu C (2010) The calculation of information and organismal complexity. Biology Direct 5: 59. doi:10.1186/1745-6150-5-59.
20. Gould SJ, Eldredge N (1993) Punctuated equilibrium comes of age. Nature 366: 223-227.

Saturday, August 14, 2010

A paper concludes that complex organisms have more functional bases

The conclusion of a paper published last week is captured by the last sentence of its abstract: "This suggests that, rather than genome size or protein-coding gene complement, it is the number of functional bases that might best mirror our naïve preconceptions of organismal complexity." (1)

The paper is still based in part on the neutral theory, which in my opinion does not hold for macroevolution. Thus, it concludes that about 90% of human bases are neutral and only about 10% are constrained functional bases. But my study based on the First Axiom of Biology suggests that humans have only about 0.1% neutral bases, equivalent to the number of SNPs found in humans (manuscript in preparation).

So, if the paper’s methodology and interpretation are based in part on the neutral theory, it would not be appropriate to consider its conclusions valid or meaningful. But it is still interesting to see that someone could somehow come to a conclusion that is at least partly in line with the First Axiom of Biology, which says that genetic diversity is inversely related to epigenetic complexity (2). The more complex the organism, the less random variation in building blocks such as DNA. Or, the more complex the organism, the more functional bases and the fewer neutral bases it has.

1. Meader, S., Ponting, C.P., and Lunter, G. (2010) Massive turnover of functional sequence in human and other mammalian genomes. Genome Research. Published online August 6, 2010.

2. Huang, S. (2009) Inverse relationship between genetic diversity and epigenetic complexity. Preprint available at Nature Precedings.

Here is the abstract of the paper:
Despite the availability of dozens of animal genome sequences, two key questions remain unanswered: first, what fraction of any species’ genome confers biological function, and second, are apparent differences in organismal complexity reflected in an objective measure of genomic complexity? Here, we address both questions by applying, across the mammalian phylogeny, an evolutionary model that estimates the amount of functional DNA that is shared between two species’ genomes. Our main findings are, first, that as the divergence between mammalian species increases, the predicted amount of pairwise shared functional sequence drops off dramatically. We show by simulations that this is not an artefact of the method, but rather indicates that functional (and mostly non-coding) sequence is turning over at a very high rate. We estimate that between 200 and 300 Mb (~6.5 – 10%) of the human genome is under functional constraint which includes 5-8 times as many constrained non-coding bases than bases that code for protein. By contrast, in D. melanogaster we estimate only 56-66 Mb to be constrained, implying a ratio of non-coding to coding constrained bases of about 2. This suggests that, rather than genome size or protein-coding gene complement, it is the number of functional bases that might best mirror our naïve preconceptions of organismal complexity.

Thursday, July 15, 2010

New fossil find on the ape-monkey common ancestor supports my molecular dating and contradicts all others'

New Oligocene primate from Saudi Arabia and the divergence of apes and Old World monkeys. Iyad S. Zalmout, William J. Sanders, Laura M. MacLatchy, Gregg F. Gunnell, Yahya A. Al-Mufarreh, Mohammad A. Ali, Abdul-Azziz H. Nasser, Abdu M. Al-Masari, Salih A. Al-Sobhi, Ayman O. Nadhra, Adel H. Matari, Jeffrey A. Wilson & Philip D. Gingerich. Nature 466, 360–364 (15 July 2010)

This new paper in Nature this week, on an ape-monkey common ancestor from 29-28 million years ago, fully supports the molecular dating in my preprint and contradicts all previous dating results for the ape-monkey divergence time. The dating in my paper is 29.7 million years (MY). Dating results by others, obtained with completely different and in my view flawed methodology, conflict with one another and are either too late (23 MY) or too early (30-35 MY). Any methodology that can turn solid factual data like DNA into conflicting interpretations of reality has of course proven itself false.

My paper has been through a number of submissions, and I have not seen a valid criticism of its key points. The paper is now in the hands of a journal editor. Below is part of my cover letter, which should help people understand the difference between my method and results and those of others, and why the others are incorrect both theoretically and practically, in that they contradict the new fossil finds.

Molecular phylogeny methods are based on the neutral theory. But the neutral theory should never have been invented in the first place for macroevolution if people had not overlooked the overlap feature of the genetic equidistance result that originally inspired the molecular clock and in turn the neutral theory. For more on this exceptional new finding, see my newly published paper: Huang S (2010) The overlap feature of the genetic equidistance result, a fundamental biological phenomenon overlooked for nearly half of a century. Biological Theory 5: 40-52.

Since the neutral theory has no concept of a maximum distance, all these methods include a large amount of sequence alignment information that is non-informative and hence contributes to a high noise level that can sometimes overwhelm the signal. Also, these methods require false or uncertain assumptions that treat macroevolution the same as microevolution. It is thus fully expected that these methods cannot possibly produce a true molecular phylogeny of macroevolution. In fact, they have all proven themselves incorrect by repeatedly turning solid factual data into conflicting interpretations of reality, at least one of which must be false. The data in molecular phylogeny are just sequence facts and cannot possibly be wrong so long as one is not making sequencing errors. Thus the only way to produce a false result or conflicting results in molecular phylogeny is through an incorrect method, including any method that lacks correct means and principles for identifying only the informative data. One good example of the endless conflicting results produced by the existing methods is the position of tarsiers, which some studies group with prosimians and others with simians, even though all these studies used the same kind of method with just a different set of genes. A correct method should have ways of selecting the informative data and should either produce only correct results or produce no results if informative data are not available.

The manuscript here used a new method ‘the slow clock’ to resolve key questions of primate phylogeny. Since the method has no uncertain assumptions and uses only informative genes, the slow evolving ones not yet reaching maximum genetic distance, it is immune to turning factual data into false or conflicting interpretation of reality. Indeed, the new primate molecular phylogeny here produced by the new method matches closely with the original interpretation of the fossil records by paleontologists.

You don’t really need to be reminded of this, of course, but I still suggest that you use the highest standard of science to compare my story with the existing theory/methodology. One must judge a story only by how contradiction-free it is. It is really the minimum scientific standard and the only practical way to distinguish a scientific truth from a religious belief. Thus, you only need to give me one single contradiction to my theory/methodology for me to withdraw my manuscript. That, by the way, should also be the only scientific way to reject the manuscript. I am eager to have the reviewers help me either improve the manuscript or trash it.

The following novel results in my manuscript contradict the existing theory/methodology but not mine. You and the reviewers must express your views on these results or offer an explanation if it happens to differ from mine. 1) Chimpanzee is closer to orangutan or gorilla than human is in DNA. 2) Slow and fast evolving genes produce different phylogenies. 3) The clock/neutral theory is a mistaken interpretation of the equidistance result of Margoliash. 4) Given 3), the existing interpretation of the ape-human relationship is based on a false theory and cannot possibly be true. One simply cannot imagine that the field has been on the right track all along when a major mistake was committed right from the start. Neither can one imagine that the field can continue business as usual now that the mistake has finally been caught after nearly half a century. Essentially no conclusion or interpretation on macroevolution from the molecular evolution field in the past half century can be regarded as correct or conclusive. 5) The fact that octopus is closer to human than cockle is, or that bird is closer to human than snake is, in DNA contradicts the existing theory/methodology. Therefore, the closer DNA distance of chimpanzee to human, compared with that of orangutan or gorilla, cannot be used to infer a closer genealogy to human, just as one cannot infer a closer genealogy between bird and human than between snake and human.

Finally, the newest version of my paper has a lay abstract as follows:

Primate phylogeny: molecular evidence for a pongid clade excluding humans and a prosimian clade containing tarsiers

Author Summary:

Molecular phylogeny methods are based on the neutral theory of evolution, which was originally inspired in a large part by the genetic equidistance result of Margoliash in 1963. But the neutral theory should never have been invented in the first place for macroevolution if people had not overlooked the overlap feature of the genetic equidistance result. Therefore, the field of molecular phylogeny has been on the wrong track all along, and a complete reevaluation of all molecular phylogeny results is in order. The maximum genetic diversity hypothesis is a more coherent and complete account of evolution, and was here used to resolve key questions of primate phylogeny. The analysis shows that humans are genetically more distant to orangutans than African apes are and separated from the pongid clade 17.3 million years ago. Also, tarsiers are genetically closer to lorises than simian primates are, suggesting a tarsier-loris clade to the exclusion of simian primates. The validity and internal coherence of the primate phylogeny here were independently verified. The results as a whole show a remarkable and unprecedented concordance between molecules and fossils.

Monday, July 5, 2010

Paper on the overlap feature is now published

The Overlap Feature of the Genetic Equidistance Result—A Fundamental Biological Phenomenon Overlooked for Nearly Half of a Century
Shi Huang
Biological Theory Winter 2010, Vol. 5, No. 1: 40–52.

Tuesday, May 25, 2010

A sister relationship between European and an East Asian-African clade

I gave a talk on May 14, 2010 at the Institute of Paleontology and Paleoanthropology, Chinese Academy of Sciences, titled "Molecular evidence for the multiregional hypothesis of modern human origin and a sister relationship between European and an East Asian-African clade". Professor Xinzhi Wu, one of the major proponents of the multiregional hypothesis, was my host. See this website for the talk announcement:

I am working on the manuscript. The key findings are as follows. 1) For non-synonymous mutations or SNPs in 190 slow evolving genes, the greatest genetic diversity or distance was found within Europeans. In contrast, for synonymous mutations in these same genes or for mutations in non-coding regions, the greatest diversity was found in Africans, a well known result reproduced in our study. But as shown by the MGD hypothesis, only slow evolving genes that are in the linear range of accumulating mutations and have not yet reached maximum distance are informative to phylogeny. 2) The distance in slow evolving genes between Europeans and non-Europeans translates into a divergence time of ~2.4 million years, in good agreement with the first fossil of the genus Homo. Thus, Homo has been a single species since 2.4 million years ago, supporting the multiregional hypothesis. The largest distance between Asians and Africans is similar to the deepest distance within Europeans and translates to 2.16 million years of separation. The deepest distance within Asians is similar to that within Africans and corresponds to 1.86 million years of separation, consistent with the first migration out of Africa ~1.9 million years ago. 3) Europeans show a distinct structure in non-synonymous SNPs compared with non-Europeans.

The recent Neandertal genome paper in Science, contrary to its own claim, fully falsifies the out of Africa plus interbreeding model. First, the paper found Neandertal to be 4-6 times more distant to chimpanzee than extant humans are, fully supporting my earlier paper on all known informative fossil sequences (1). These authors are masters of cherry picking, like most in the field, and downplay this finding by attributing it to sequencing errors. If an unexpected result is due to such errors, can one have any confidence in their other results? Second, Figure 5a of that paper shows that for some fast evolving sequences the distance between a modern European (the reference genome) and Neandertal is, say, 0.22, while the distance between two modern Europeans (the reference genome and the Venter genome) is 0.157. Since the separation between European and Neandertal according to the present paradigm is 360k years while the separation between two Europeans is at most ~60k years, given a distance of 0.22 between European and Neandertal, we would expect a distance of less than 0.037 between two Europeans. This anomaly was simply ignored by the paper. Finally, the paper notes the striking finding that equal intermixing of Neandertals with Europeans and Asians did not translate into equal physical resemblance of Europeans and Asians to the Neandertals. Why such a disconnect between genotype and phenotype? All of modern molecular biology, as well as the MGD hypothesis and common sense, has repeatedly shown that there is an inseparable unity between genotypes and phenotypes.
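The expected distance between two Europeans quoted above follows from scaling the European-Neandertal distance in proportion to separation time. The sketch below is a minimal illustration, assuming that distance accumulates linearly with time as the argument presupposes; the function name `scaled_distance` is hypothetical.

```python
def scaled_distance(ref_distance, ref_time_years, target_time_years):
    """Scale a reference distance to a different separation time.

    Assumption (for illustration only): genetic distance grows
    linearly with separation time, so distances are proportional
    to the times involved.
    """
    return ref_distance * target_time_years / ref_time_years


# European-Neandertal: distance 0.22 at 360k years of separation.
# Two Europeans: at most ~60k years of separation.
expected = scaled_distance(0.22, 360_000, 60_000)
observed = 0.157  # reference genome vs. Venter genome, as read from Figure 5a

print(round(expected, 3))             # 0.037
print(round(observed / expected, 1))  # observed is roughly 4 times the expectation
```

Under this linear assumption the observed distance of 0.157 is several times larger than the expected value of about 0.037.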

So the bottom line is that the Neandertals were the ancestors of Europeans and have mostly nothing to do with Asians. We are analyzing the Neandertal genome to find direct evidence for this. The small 0.1-0.4% similarity of Asians with Neandertals is due to convergent evolution, just as the greater similarity between Europeans and Asians in fast evolving genes is due to the same process. Likewise, the greater similarity between humans and chimpanzees than between humans and orangutans is due to the same process, where chimpanzees and orangutans belong to the pongid clade to the exclusion of humans.

1. Huang, S. (2008) Ancient fossil specimens are genetically more distant to an outgroup than extant sister species are. Rivista di Biologia / Biology Forum 101: 93-108.

Tuesday, March 16, 2010

Genetic equidistance as shown by a list of most conserved proteins

I am reading a popular textbook on bioinformatics, “Bioinformatics and Functional Genomics” (2009) by Jonathan Pevsner, in preparation for a course I am going to teach. On the topic of evolution and phylogeny, the book shows a table of the 20 most conserved proteins found in yeasts, worms, and humans. The table was taken from an old paper by the Bork group (1). The original table was meant by those authors to demonstrate protein conservation. They had no intention of showing the genetic equidistance result and made no comment on it. But note how strikingly uniform the pattern is: yeast is approximately equidistant to worms and humans in all 20 proteins listed. Those proteins happen to be among the most conserved, although the equidistance phenomenon is not related to how conserved a protein is. I cite this table as a piece of independent evidence to support my claim that the genetic equidistance result is a nearly universal feature of all proteins (2). People simply cannot avoid encountering it.

1. Copley RR, Schultz J, Ponting CP, Bork P. (1999) Protein families in multicellular organisms. Current Opinion in Structural Biology 9: 408-415.

2. Huang S. (2008) The genetic equidistance result of molecular evolution is independent of mutation rates. J Comp Sci Syst Biol; 1:092-102.

Friday, February 19, 2010

Africans and East Asians are strikingly similar

The maximum genetic diversity (MGD) hypothesis considers the molecular clock and the neutral theory incorrect for macroevolution. Thus the genetic relationship or SNP diversity data among human races has yet to be correctly interpreted. We are presently working on a correct one based on the MGD hypothesis.

This week Nature published genome sequences of a few South Africans. "Complete Khoisan and Bantu genomes from southern Africa" By Stephan C. Schuster et al. Nature Feb 18, 2010, 463: 944

The paper included a few pictures of Africans, shown above (a, KB1/Tuu speaker; b, NB1/Hoansi; c, TK1/hoansi). It strikes me that they look just like Chinese if one disregards the curly hair and skin color. I post two Chinese pictures for comparison (D, famous 1980 oil painting "Father" by Luo Zhongli; E, a random Chinese photo from the internet).

Based on facial features, it looks extremely reasonable that Africans and East Asians belong to a clade to the exclusion of Europeans. Below is a comparison of morphological features among the three major human populations or races:



East Asians

Eye color brown
Skin color black-brown
Hair color black
Full lip
Nose wide/low
Cheekbones large
Chin less protruding
Broad face
Mandible angle
Teeth larger
Less hairy
Brow ridge: Large (primitive); In between
Skull length/shape: Long and narrow; In between; Short and wide
Shovel teeth: In between (10%)


It is now up to us and the MGD hypothesis to come up with a story that would show complete harmony/unity between morphological features and molecular genetic features. Stay tuned!

Monday, February 8, 2010

The universe evolves from simple to complex, concludes a physicist

I recently read the book "Wrinkles in Time" by the Nobel laureate physicist George Smoot. I found his view of the evolution of the universe from simple to complex to be in complete harmony with the facts of biological evolution. It seems to me there is a universal law of evolution that underlies all changes with time, regardless of life or non-life. There has to be one if nature is a coherent whole, which it obviously is. A few quotes from the book follow:

"[Steven] Weinberg muses... 'The more the universe seems comprehensible, the more it also seems pointless.' I must disagree with my old teacher. To me the universe seems quite the opposite of pointless... The more we learn, the more we see ... there is an underlying unity to the sea of matter and stars and galaxies ... we are learning that nature is as it is not because it is the chance consequence of a random series of meaningless events; quite the opposite. More and more, the universe appears to be as it is because it must be that way; its evolution was written in its beginnings-in its cosmic DNA, if you will.”

“There is a clear order to the evolution of the universe, moving from simplicity and symmetry to greater complexity and structure.”

“Accidents and chance, in fact, are essential in developing the overall richness of the universe. In that sense (although not in the sense of quantum physics), Einstein had the right idea: God does not play dice with the universe. Though individual events happen as a matter of chance, there is an overall inevitability to the development of sophisticated complex systems. The development of beings capable of questioning and understanding the universe seems quite natural. I would be quite surprised if such intelligence has not arisen many places in our very large universe.”

“My speculation, however, is that because things become simpler as we near the moment of creation, there was only a limited range of possibilities; indeed, perhaps only one, with everything so perfect that it could have been no other way.”