[cellml-discussion] from Wired: "The end of theory: the data deluge makes the scientific method obsolete"

Nicolas Le novère lenov at ebi.ac.uk
Wed Jun 25 10:04:08 NZST 2008


James,

During the 80s, it became clear that the best predictive tools in the
life sciences were based on algorithms using implicit rules, namely formal
neural networks and hidden Markov models (there are more now). You feed
them as much data as possible, and they get better and better. The problem
is that using a cascading neural network to predict protein secondary
structure does not tell us anything about the folding of proteins. It is
tremendously useful, but it did not stop structural biologists from
looking for the rules governing protein folding, and therefore life. (For
readers unaware of it, protein folding is still the big mystery of
biochemistry. If the cell had to test all the possible conformations to
pick the most stable one, it would take longer than the age of the universe.)
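
A back-of-the-envelope sketch of that estimate, for readers who want to see
the orders of magnitude. All the numbers below are illustrative assumptions
of the kind usually used for this argument, not measured values: a chain of
100 residues, 3 conformations per residue, and 1e13 conformations sampled
per second. The function name is made up for the example.

```python
# Assumed constants for a rough order-of-magnitude comparison.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_S = 13.8e9 * SECONDS_PER_YEAR  # about 13.8 billion years


def exhaustive_search_seconds(n_residues=100, states_per_residue=3,
                              samples_per_second=1e13):
    """Time needed to visit every conformation once (all inputs are assumptions)."""
    # Each residue is treated as independent, so the counts multiply.
    n_conformations = states_per_residue ** n_residues
    return n_conformations / samples_per_second


if __name__ == "__main__":
    search_time = exhaustive_search_seconds()
    ratio = search_time / AGE_OF_UNIVERSE_S
    print(f"exhaustive search: {search_time:.2e} s")
    print(f"that is {ratio:.2e} times the age of the universe")
```

Changing the assumed numbers by a few orders of magnitude does not change
the conclusion, which is the point of the argument: the cell cannot be
searching exhaustively.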

Regarding models, one could argue that the only valid model of any living
entity is at the quantum level. Only then can we claim not to have swept
some complexity under the carpet. Is such a model useful? Heuristic (does
it allow us to ask more questions)? Or is it a mere description? And that
is absolutely not specific to the life sciences. Do we need to design a
molecular dynamics model of water molecules to understand the influence of
the moon on the tides? The good model is the one that can discriminate
between two explanations for a phenomenon, and tag one of them as having a
higher degree of verisimilitude. In our age of Systems Biology, it is often
a model of a process at level n based on mechanisms at level n-1.

That said, we already spit out amounts of data similar to Google's. After
its first month of existence, the 1000 Genomes Project had produced more
data than the complete GenBank. It is anticipated that by the end of the
year it will produce the same amount each week. Although this will be
tremendously useful for human health (by providing a reference variability
map), it does very little to help me understand that modulating the noise
in a neural network changes the frequency of firing. For that, a simple
model of an "integrate and fire" neuron, simulated with a variable noise
function, is much better (and this IS useful: from epilepsy to Parkinson's
disease, many disorder symptoms are related to noise).
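
To make the point concrete, here is a minimal sketch of such a model. It
is not taken from any particular paper or tool: it assumes the leaky variant
of the integrate-and-fire neuron, dimensionless units, a constant drive kept
just below threshold, Gaussian white noise, and a plain Euler-Maruyama
integration. The function name and every parameter value are illustrative
choices.

```python
import numpy as np


def lif_firing_rate(noise_sigma, mean_input=0.9, tau=0.02, v_thresh=1.0,
                    v_reset=0.0, dt=1e-4, duration=5.0, seed=0):
    """Mean firing rate (spikes/s) of a noisy leaky integrate-and-fire neuron.

    All defaults are assumed, illustrative values in dimensionless units.
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(duration / dt))
    # Pre-draw the noise. With this scaling, noise_sigma is approximately the
    # stationary standard deviation of the membrane variable when there is
    # no threshold.
    kicks = noise_sigma * np.sqrt(2.0 * dt / tau) * rng.standard_normal(n_steps)
    v = v_reset
    spikes = 0
    for kick in kicks:
        # Leaky integration towards mean_input, plus the noise term.
        v += (mean_input - v) * dt / tau + kick
        if v >= v_thresh:
            # Threshold crossed: count a spike and reset.
            spikes += 1
            v = v_reset
    return spikes / duration


if __name__ == "__main__":
    for sigma in (0.0, 0.1, 0.3):
        rate = lif_firing_rate(sigma)
        print(f"noise sigma = {sigma:.1f} -> {rate:.1f} spikes/s")
```

With no noise the drive stays below threshold and the neuron never fires;
turning the noise up is what makes it fire, and fire more often. A few
dozen lines are enough to ask that question, which no amount of sequence
data would answer.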

The article quotes "All models are wrong, but some are useful." I absolutely
disagree with the first part, because it is absolute. Some models are better
than others at answering a particular question. That is all we need. Those
models can be simple or complex, small or big.

James Lawson wrote:
> Hi all,
> 
> Thought I'd see what you guys and girls think of this article, and its 
> relevance to CellML, systems biology etc.
> 
> http://www.wired.com/science/discoveries/magazine/16-07/pb_theory
> 
> "This is a world where massive amounts of data and applied mathematics 
> replace every other tool that might be brought to bear. Out with every 
> theory of human behavior, from linguistics to sociology. Forget 
> taxonomy, ontology, and psychology. Who knows why people do what they 
> do? The point is they do it, and we can track and measure it with 
> unprecedented fidelity. With enough data, the numbers speak for 
> themselves."
> 
> Some of my thoughts: when we have bioinformatics servers processing 
> similar amounts of information to the Google servers, then we'll need to 
> rethink how we do things. The question is, how long will that be? And is 
> information that encodes a model more useful than information that just 
> codes data, considering that the model can produce more information?
> 
> Kind regards,
> James
> 
> _______________________________________________
> cellml-discussion mailing list
> cellml-discussion at cellml.org
> http://www.cellml.org/mailman/listinfo/cellml-discussion


-- 
Nicolas LE NOVERE,  Computational Neurobiology,
EMBL-EBI, Wellcome-Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
Tel: +44(0)1223494521, Fax: 468, Mob: +44(0)7833147074  Skype:n.lenovere
http://www.ebi.ac.uk/~lenov, AIM: nlenovere, MSN: nlenovere at hotmail.com

