Tuesday, November 24, 2009

On the standard for truth and the role of communication in disseminating scientific ideas

Someone wrote an article on the talent of Darwin in amassing support for and in communicating the idea of Darwinism.

Sessions, Stanley and Macgregor, Herbert. (2009) The Necessity of Darwin. Available from Nature Precedings, http://hdl.handle.net/10101/npre.2009.2887.1

I responded by saying that ideas are more important than amassing facts and communication. And I got a counter response that ideas are cheap and communication is key. I responded with the following:

You have some good points, but you overlooked the fact that the opposites of your points are also true. As with all fundamental things in nature, seldom is a thing true without its opposite also being true. In my earlier message I was referring to one kind of idea, the greatest ones, since we are talking about Darwin and Mendel, whereas you are referring to another kind, the good or average ones. I argue here that we are both correct.

You are right in the sense that an idea must explain detailed facts for it to be a scientific rather than a philosophical one. I am thinking of the countless non-mainstream theories of evolution that one can easily discard for one simple reason: they explain few details and are just general, empty claims. No question that Darwin is singly important in making the idea of natural selection a scientific theory. But I would like to point out here, as it is commonly overlooked by biologists, that amassing a lot of supporting data is not the same as proving the idea right. While it is very important to amass data to support an idea, a quantitative difference in the amount of data is less critical if no one has explained all relevant facts without contradiction. So long as the idea is not completely true, to amass 99% of all relevant facts to support it is only slightly or quantitatively better than to amass 1%. A true idea only cares about 100%. Darwin clearly did not explain all relevant facts, as he had a chapter in his book on the difficulties with his theory. Faced with such difficulties, another person would easily have either buried his idea or narrowed its domain of relevance.

An idea that explains 99% of all relevant facts is only slightly or quantitatively better than another idea that explains 1%. But they are qualitatively the same: both are false or at least incomplete. So long as an idea does not explain 100% of all relevant facts, or has one single exception within its domain of relevance, it is not a completely true idea. The difference between truth and falsehood is not quantitative but qualitative. The standard that a true idea should be without a single contradiction comes from a priori reasoning and is universally adopted by mathematicians. But it is presently unpopular among evolutionary biologists, or biologists in general. So it is worthwhile here to explain why we have no other choice but to use that standard. (A single contradiction is really equivalent to an infinite number of contradictions. Once you allow one, you can of course allow 2, and in turn 3, 4, … all the way to infinity. Nature’s phenomena are unlimited, and a theory with a single contradiction today could easily meet a few hundred in a few generations.)

Let’s grant that nature is a coherent whole governed by laws, which we must, since that is what makes science possible. Let’s say there are 100 independent phenomena in nature. Law 1 explains 99 of them but leaves one out because it contradicts the law. So we have law 2 to explain that single contradiction. If law 2 happens to contradict or negate law 1, which could happen since the phenomenon explained by law 2 contradicts law 1 (in the real world, we have the neutral theory negating Darwinism), which one then is the correct law? There are two possible solutions. One is to say that neither is correct, and we can then back that up by finding a true law that takes care of all 100 phenomena. This is why I said 99% is not much better than 1%. You may ask how a wrong law can explain 99%. Easy: by chance and mistaken explanation. E.g., the molecular clock or neutral theory has been used to explain close to 99% of all macro-evo phenomena, and no one would consider it possible for those explanations to be all false, although everyone knows the theory has contradictions. But at least to me at this point in time, the new maximum genetic diversity (MGD) hypothesis has made it blatantly clear that all the data can be totally reinterpreted in a different and much better, contradiction-free way. The molecular clock should never have been invented in the first place. And if people had used the standard of no single contradiction, we could have easily prevented the contradiction-laden clock/neutral theory from lasting for nearly half a century. Since I have offered the MGD as an easy target to shoot down (just one contradicting detail is enough), I am not the kind of person who applies a high standard only to others. And the MGD should prevent people from taking the easy excuse that biology or evolution is special or should be made an exception to the standard of no single contradiction.

The second way is to say that both laws are correct but apply in different domains of relevance. So here each law has in fact explained everything in its domain of relevance in a contradiction-free manner. Now this does not contradict the idea that there should be only one universal law. All one needs to do is to combine the two opposite laws as two different aspects of a single universal law. Thus, Darwinism explained a lot of things but failed at a lot too. One can either say it is completely wrong as an idea for the whole domain of evolution (here a single contradiction is sufficient to justify saying so) or one can say it is completely true within its domain of relevance (microevolution). Thus, the MGD includes Darwinism as a largely true account of microevolution and can in turn claim to be a 100% correct theory for the whole domain of evolution, at least until a single contradiction shows up to ruin it.

In this day and age, an idea is seldom completely wrong, but a common mistake is to apply the idea outside its proper domain of relevance. The standard of 100% contradiction-free is absolutely necessary, as it is the only way to know whether a theory is completely true within its domain of relevance. It is the only way we can have multiple 100% correct laws of nature, each taking care of a small domain. We should never value a law that explains 99% as better than another one that explains the remaining 1%. Those two laws are either both false (as a general law for the whole domain) or both correct (as a narrow law for a smaller domain of relevance). A law is either 100% correct or false. There is no such thing as a 99% correct law!!

When we do find ourselves faced with a 99% correct law, we should first see if we can narrow the domain of relevance and hence make the law 100% correct in a smaller domain. If we cannot do that, then we have no choice but to search for a replacement that is 100% correct. The coherence of nature guarantees that we will find one if we try. Nature is such that it would take a miracle to find a contradiction to its law if man has accurately stated that law. In contrast, it would take no effort to find a contradiction to a falsely formulated man-made law. The fact that no biological laws or patterns have been found to be 100% correct simply means we are searching for patterns in the wrong place, where no pattern is supposed to exist in the first place. The molecular clock, neutral theory, and Darwin’s theory all have great virtues within their respective domains of relevance, where they are 100% correct. It is only when we force these domain-specific theories upon areas of nature where they don’t belong that we observe contradictions.
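The narrowing step can be sketched in a few lines of Python. This is only a toy illustration under invented assumptions: the 100 numbered phenomena, the outcomes "A" and "B", and the helper names `contradictions`, `narrow_domain`, and `holds_on` are all hypothetical and stand in for no real biological data or theory.

```python
# Toy model: a "law" is a function that predicts an outcome for each phenomenon.
# All phenomena, outcomes, and names here are hypothetical illustrations.

def contradictions(law, observations):
    """Phenomena whose observed outcome differs from the law's prediction."""
    return {p for p, observed in observations.items() if law(p) != observed}

def narrow_domain(law, observations):
    """Largest set of observed phenomena on which the law has no contradiction."""
    return set(observations) - contradictions(law, observations)

def holds_on(law, observations, domain):
    """True only if the law is contradiction-free on every phenomenon in the domain."""
    return all(law(p) == observations[p] for p in domain)

# 100 independent phenomena: 99 show outcome "A", one (number 100) shows "B".
observations = {p: "A" for p in range(1, 100)}
observations[100] = "B"

def law_1(phenomenon):
    return "A"  # explains 99 of the 100 phenomena

def law_2(phenomenon):
    return "B"  # explains only the single exception

whole_domain = set(observations)
domain_1 = narrow_domain(law_1, observations)  # phenomena 1..99
domain_2 = narrow_domain(law_2, observations)  # phenomenon 100 only

# As general laws for the whole domain, both fail the same qualitative test.
both_false_in_general = (not holds_on(law_1, observations, whole_domain)
                         and not holds_on(law_2, observations, whole_domain))

# As narrow laws, each is contradiction-free within its own domain.
both_true_when_narrowed = (holds_on(law_1, observations, domain_1)
                           and holds_on(law_2, observations, domain_2))
```

In this sketch the test is all-or-nothing by construction: `holds_on` returns the same `False` for a law with one contradiction as for a law with ninety-nine, which is the qualitative standard argued for above.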

So, an important job for any theorist is to know where the theory does not belong. The only way to know this is to stick to the standard of 100% contradiction-free. The worst theory possible is a tautological interpretation of a single phenomenon, one that has only that single phenomenon in its domain of relevance. The greatest theory is of course one that includes all phenomena of nature within its domain of relevance. Between the two, we have countless domain-specific theories that are all 100% contradiction-free within their respective domains but will of course meet contradictions, and hence be false, once they step outside their bounds.

Now back to ideas and communication. A great idea speaks for itself, and its best friend is time rather than communication. Good communication of course will not hurt. But in reality it can help little. A great idea is often great because it is destructive to an existing paradigm. The paradigm will resist change to protect its self-interest, often for personal and practical reasons like jobs and grants, which is totally understandable and should not be criticized too much, as no one is superhuman and a paradigm is almost never completely wrong in the modern science era. Planck was very sharp in famously saying that a revolutionary idea never converts the old but just waits for them to die and for the new generation to take it up. Thus, communication is often irrelevant to great ideas. People can always choose, and have always chosen, to refuse to be talked into it if they know that accepting the new could only hurt them personally.

Regardless of what you say about Darwin’s talent in communication, he will be judged by time on how true his idea is. If his idea is true and great, the past 150 years of history would only serve to invalidate your point about his talent in communication. But I fully share the notion that Darwin’s contribution is singly critical in making evolution a branch of science and a household concept. Science has generally benefited greatly from his effort in advancing the idea like no one else in his time.

Poor communication can only ruin a cheap or average idea, but it can do little real damage because there will always be 5 or more people who hit upon the same idea in a short span of time. The idea of natural selection was independently invented 5 times in a short span of 40 years, by Wells, Blyth, Matthew, Darwin, and Wallace. But I do see that Darwin was unique among the five in seeing and advancing the unlimited potential of natural selection, which Blyth clearly did not see or refused to see. But I think that the jury is still out on whose version of natural selection will prove to be correct. All the data of experimental evolution in the past have proven neither wrong for microevolution, but they are in any case only capable of resolving issues of microevolution.

This brings up a novel perspective on just how impossible it is for the idea of these 5 inventors of natural selection to turn out to be completely true. With a total ignorance of the molecular mechanisms of heredity, i.e., genetics and epigenetics, it would easily qualify as a miracle for any single human being to invent an idea of evolution that could explain every key detail of evolution from phenotypes down to molecules. (It would be a miracle for even a geneticist to invent a true idea of evolution if he were totally ignorant of epigenetics.) Now imagine that such a miracle has to happen not only once but 5 times in a short span of 40 years. Well, miracles never happen in nature as far as I am concerned or science is concerned. The natural way of advancing science is for ideas to come in accordance with the collection of knowledge and data. If epigenetics is a part of evolution (there is no doubt it should be, even on a priori reasoning alone), the natural and non-miraculous timing for the birth of a complete theory is of course today, when epigenetics is finally at the cutting edge of advancing biology. Unless of course there is a third way of heredity to be discovered in the future (highly unlikely), history will no doubt remember today as the best and only time in human history that humans could realistically come to a true understanding of evolution.

Friday, November 13, 2009

Real reason for the endless production of conflicting results on tarsiers

A couple of weeks ago, a new paper by Chatterjee et al. appeared on primate phylogeny that groups tarsiers with prosimians.

“Estimating the phylogeny and divergence times of primates using a supermatrix approach” Helen J Chatterjee, Simon YW Ho, Ian Barnes, and Colin Groves
BMC Evolutionary Biology 2009, 9:259 doi:10.1186/1471-2148-9-259

I sent the following comment titled “Real reason for the endless production of conflicting results on tarsiers” to the Journal’s website:

On the position of tarsiers, Chatterjee et al. wrote: “The majority of molecular evidence supports the latter grouping [4,10-13] (grouping tarsiers with higher primates), although a large number of molecular studies still provide support for the Prosimii concept [14-18].”

When a method or technique can lead to two opposite results repeatedly and seemingly endlessly, while only one of the two can be true, it is time to ask whether something is fundamentally missing from our method (all existing popular methods differ slightly from one another but are fundamentally of the same kind). Let us start from the very beginning and examine the assumptions behind our method. The key assumption for all sequence similarity based methods is that sequence dissimilarity always correlates with time of divergence. Well, is this true? We don’t have to be specialists to know that this is sometimes true and sometimes not. Thus, for our method to be able to produce accurate and uncertainty-free results, it must take into account the reality that sequence dissimilarity sometimes does not correlate with time of divergence. Many of the sequence comparisons are not informative and must be excluded from our method. When they are not excluded, as is the case with all existing methods, they contribute to a high noise level that can sometimes overwhelm the signal. It is by accident that these methods sometimes give correct results and sometimes wrong results, and no one knows why they differ or when to regard such a result as correct and final. Therefore, we have a peculiar non-scientific situation: no one is taking anyone else’s results as the final say. Never mind that we only have one true phylogeny of life on Earth. Once you know it, it is done and no more work is needed. The existing methods are perfect for keeping some of us employed forever but will never give us truth. Truth is not judged by a quantitative difference in the number of studies that support it versus those against it. The correct method should produce zero studies that go against the truth; that is, it should be immune to the production of conflicting results.

Data + method = result. The data here in molecular phylogeny are just sequence facts and cannot possibly be wrong so long as one is not making sequencing errors. Thus the only way to produce a false result or conflicting results in phylogeny is through an incorrect method. Since all existing methods are perfectly capable of producing false results, and have all in practice produced false or conflicting results, this is another simple proof that the existing popular methods are simply incorrect.

By the leading experts’ own admission, the existing popular methods are flawed in the sense that they can easily produce incorrect results that are totally out of the hands of the scientists:
“Unlike the case in physics, the predictive power of a model in biology is quite low. It seems to us that if the prediction (e.g., a phylogenetic tree reconstructed) of a model is correct in 80% of the cases, it is a good model at least at the present time.” (From Masatoshi Nei and Sudhir Kumar, 2000, Molecular Evolution and Phylogenetics, p. 85.)

When a result is only 80% certain, it can be completely wrong. We either know or we don’t know. Knowing with 80% certainty, or anything less than 100%, means we don’t know. We are much better off without such a result because it often leads non-specialists into the wrong idea that we know with 100% certainty. Does not everyone in academia think that we are 100% certain that the chimp is closest to humans, when in fact we are only 80% certain and can therefore be completely wrong? When they then act and work based on that knowledge (they have been doing just that for years now), should we feel perfectly comfortable about misleading them into that?

Of course, nothing we know says that biology has to be different from physics. The present situation merely means that we have much to learn. When we know better, we should be able to have a model or method that is correct in 100% of the cases. Until then, some of what we are doing is just kidding ourselves. I have now offered the slow clock method as the best candidate for a method that takes all of reality into account and is capable of 100% certainty (1). While the result of Chatterjee et al., like many others, does support my result on tarsiers using the slow clock method (1), I do not view their result as confirming mine, because their method is flawed. By using the same kind of method, another group could easily produce a result opposite to theirs by just picking a new set of genes (this of course has been done many times already). I of course do not view such a result as a valid contradiction to mine, just as I do not view the result of Chatterjee et al. as valid support.
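How picking a new set of genes can flip a result is easy to show with a toy example in Python. Everything in it is an assumption made for illustration: the gene names and dissimilarity values are invented, and the summed-distance rule in the hypothetical function `closest_group` is neither the slow clock method nor any published phylogenetic method.

```python
# Invented per-gene sequence dissimilarities (fraction of differing sites)
# between tarsiers and each candidate sister group. Not real data.
DISTANCES = {
    "gene_a": {"anthropoids": 0.10, "prosimians": 0.14},
    "gene_b": {"anthropoids": 0.12, "prosimians": 0.15},
    "gene_c": {"anthropoids": 0.20, "prosimians": 0.13},
    "gene_d": {"anthropoids": 0.18, "prosimians": 0.11},
}

def closest_group(genes):
    """Group tarsiers with whichever candidate has the smallest summed dissimilarity.

    A deliberately naive rule that treats every gene comparison as equally
    informative, which is the assumption being questioned in the text.
    """
    totals = {}
    for gene in genes:
        for group, distance in DISTANCES[gene].items():
            totals[group] = totals.get(group, 0.0) + distance
    return min(totals, key=totals.get)

# Same data table, same rule, two different gene selections.
study_1 = closest_group(["gene_a", "gene_b"])
study_2 = closest_group(["gene_c", "gene_d"])
results_conflict = study_1 != study_2
```

With these invented numbers the first selection groups tarsiers with anthropoids and the second groups them with prosimians, although the data and the rule never changed. Only the choice of genes did.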

A flawed method automatically qualifies its result as meaningless, regardless of whether the result happens to be consistent with reality or not. The definition of a flawed method is simply one that can turn a perfectly solid set of factual data into a false interpretation of reality. Any method that has produced conflicting interpretations has of course automatically proven itself false. The present situation we have with tarsiers is just one of many that say flatly it is time for a fundamental change in our method of interpreting sequence data.

Huang, S. (2009) Primate phylogeny: molecular evidence for a pongid clade excluding humans and a prosimian clade containing tarsiers. Available from Nature Precedings, http://hdl.handle.net/10101/npre.2009.3794.1