Friday, December 11, 2009

What do CNEs mean for the neutral theory?

A new paper of interest this week:
McEwen GK, Goode DK, Parker HJ, Woolfe A, Callaway H, et al. (2009) Early Evolution of Conserved Regulatory Sequences Associated with Development in Vertebrates. PLoS Genet 5(12): e1000762. doi:10.1371/journal.pgen.1000762

This paper summarizes some general features of CNEs (conserved non-coding elements) well. It says: “These elements (CNE) appear to be largely absent in invertebrates.” Another paper said: “(CNEs of invertebrates) are less frequent and are smaller in size than in vertebrates.”

Kimura, a founder of the neutral theory, said: “The neutral theory also asserts that most of the intraspecific variability at the molecular level (including protein and DNA polymorphism) is essentially neutral.” (1) It is interesting to note that this new paper manages to avoid the term ‘neutral theory’ completely, even though it is the theory most relevant to interpreting sequence similarity. This recent paper (2) is another one that manages to do the same.

Most CNEs are sequences that differ between vertebrates and invertebrates and, if the neutral theory is correct, should therefore be neutral rather than functional. But since CNEs are functional, we have no choice here but to conclude that the neutral theory is incorrect. In fact, CNEs are just the latest of many pieces of evidence showing that the neutral theory is correct only in the domain of microevolution or population genetics, which deals with identical or very similar species.

Here is how the neutral theory interprets genetic distance in a typical sequence such as cytochrome c. Human is closer to mouse than to chicken. But the closer distance between mouse and human is merely due to their shorter time of divergence. Given ~245 million more years, human and mouse would have a distance similar to what is now observed between human and chicken. This interpretation by the neutral theory is of course the basis for all molecular trees inferred from genetic distances.
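
The arithmetic behind this interpretation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a strict molecular clock in which distance grows linearly with divergence time, with no correction for saturation or repeated hits at the same site. The function name `projected_distance` and the input numbers are hypothetical placeholders, not measured cytochrome c values; only the ~245 million years comes from the paragraph above.

```python
def projected_distance(d_now: float, t_now: float, extra_time: float) -> float:
    """Project a pairwise distance forward under a strict linear clock.

    d_now      -- distance observed today (e.g. number of aa differences)
    t_now      -- time since the two lineages diverged (million years)
    extra_time -- additional time to wait (million years)

    Assumption: distance is proportional to divergence time, so the
    per-million-year rate is d_now / t_now and stays constant.
    """
    if t_now <= 0:
        raise ValueError("t_now must be positive")
    rate = d_now / t_now                  # differences per million years
    return rate * (t_now + extra_time)    # linear extrapolation


# Hypothetical inputs: 10 differences after 100 million years of divergence,
# projected a further 245 million years into the future.
example = projected_distance(d_now=10, t_now=100, extra_time=245)
print(example)
```

Under this reading, time is the only variable: any two lineages that split at the same moment are expected to end up at the same distance, which is the assumption the rest of this post questions.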

But this perspective cannot explain, among many other things, the case of vertebrates and invertebrates. Both groups appeared at about the same time during the Cambrian explosion. So if time is the only variable for sequence diversity, conserved sequences found in one group should be similarly conserved in the other group. But this is clearly not the case. As your work and many others (2) show, sequences in vertebrates are more conserved than in invertebrates. For example, the largest observed distance in cytochrome c among vertebrates is ~20 aa differences, between lamprey and fish. But within Drosophila or insects it is ~30 aa differences.

To say that invertebrates evolve faster will not work because there are just too many pieces of evidence against that. For example, yeast is equidistant to Drosophila and human in cytochrome c (36/102 aa differences). Such approximate equidistance holds for nearly all homologous genes among yeast, Drosophila and humans (3).
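
For readers who want to check equidistance on their own alignments, the comparison amounts to counting mismatched positions between aligned sequences. The following is a minimal sketch, assuming the sequences are already aligned to the same length with no gaps. The helper names `aa_differences` and `pairwise_differences` and the three toy sequences are made up for illustration; they are not real cytochrome c data.

```python
from itertools import combinations


def aa_differences(seq_a: str, seq_b: str) -> int:
    """Count positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def pairwise_differences(aligned: dict) -> dict:
    """Return the difference count for every pair of named sequences."""
    return {
        (a, b): aa_differences(aligned[a], aligned[b])
        for a, b in combinations(aligned, 2)
    }


# Toy, hypothetical sequences (NOT real data): one outgroup and two ingroup species.
toy = {
    "outgroup": "MKTAYIAKQR",
    "species_a": "MKSAYLAKQR",
    "species_b": "MRTAYIAKHR",
}

for pair, diff in pairwise_differences(toy).items():
    print(pair, diff)
```

In this toy case the outgroup differs from each of the two ingroup species at the same number of positions, which is the pattern described above for yeast versus Drosophila and human.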

The only explanation that seems sensible to me and has no contradictions is that vertebrates are more complex than invertebrates and can tolerate far fewer random mutations. Many of the sequences that are neutral in invertebrates become non-neutral in vertebrates, and hence we have far more CNEs in vertebrates. The maximum number of neutral positions in invertebrates is higher than in vertebrates according to the Maximum Genetic Diversity hypothesis (4, 5). What is causing the conservation of CNEs in vertebrates? It is those extra functional constraints found only in vertebrates. More complexity demands more functions from a sequence. Complexity is well known to be linked not with the absolute amount of sequence but with how sequences are used (epigenetics). CNEs are epigenetic elements. There are more ways of using a given sequence in complex organisms, which puts extra constraints on the variability of the sequence.


1. Kimura M. DNA and the neutral theory. Phil Trans R Soc Lond B 1986; 312:343-54.

2. Halabi N, Rivoire O, Leibler S, Ranganathan R. Protein sectors: evolutionary units of three-dimensional structure. Cell 2009; 138:774-86.

3. Huang S. The genetic equidistance result of molecular evolution is independent of mutation rates. J Comp Sci Syst Biol 2008; 1:092-102.

4. Huang S. Inverse relationship between genetic diversity and epigenetic complexity. Preprint available at Nature Precedings 2009;

5. Huang S. Histone methylation and the initiation of cancer, Cancer Epigenetics. New York: CRC Press, 2008.
