CellML Discussion List



[cellml-discussion] Using CellML to represent huge CellML models:Has anyone worked on this already?


  • From: david.nickerson at nus.edu.sg (David Nickerson)
  • Subject: [cellml-discussion] Using CellML to represent huge CellML models:Has anyone worked on this already?
  • Date: Tue, 24 Apr 2007 10:27:50 +0800

> I am working on developing a CellML model (using external code) of
> transcriptional control in yeast which is 23 MB in size. I hope to
> eventually do a similar thing for organisms which have much more
> complicated sets of interactions, in which case this size may grow
> substantially.

So you have 23MB of XML? Cool! Even combining all my models I have less
than 7MB, and I'm sure that figure includes some simulation results.

I guess an interesting test would be uploading it to the model
repository to see how that handles such a large model (presuming you
have a CellML 1.0 model).

> If anyone on this list is interested in similar problems (I presume
> similar issues come up in a range of systems biology problems, whether
> you are working with CellML or SBML), I would welcome your feedback and
> suggestions, and perhaps we could collaborate.

I really have no idea what a model of transcriptional control in yeast
looks like, but my initial thought would be to abstract out any similar
math and import common declarations. I'm guessing you have already done
this if it's possible.

> This creates some unique issues for CellML processing tools:
> 1) Just parsing the CellML model (especially with a DOM-type parser
> which stores all the nodes into a tree, but probably with any type of
> parser) is very slow.

It might be interesting to run some simple task to compare the
performance of DOM- and SAX-based tools. I have found in the past that,
with 500MB "fieldML" files, the SAX parser used in CMGUI was quite fast
at parsing the file, especially when reading from a gzip-compressed file.
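
As a rough idea of what such a comparison could look like, here is a
minimal sketch in Python using only the standard library. It is not
CMGUI or CellML API code: the element names, the synthetic document
built by make_model, and the "count the elements" task are all
assumptions made up for illustration, and real numbers would depend
heavily on the parser and the model.

```python
import gzip
import io
import time
import xml.sax
from xml.dom import minidom


class ElementCounter(xml.sax.ContentHandler):
    """SAX handler that only counts elements and keeps no tree in memory."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


def make_model(n_components):
    """Build a gzip-compressed, made-up XML document (hypothetical test data)."""
    parts = ["<model name='synthetic'>"]
    for i in range(n_components):
        parts.append(
            f"<component name='c{i}'>"
            f"<variable name='v{i}' units='dimensionless'/>"
            "</component>"
        )
    parts.append("</model>")
    return gzip.compress("".join(parts).encode("utf-8"))


def count_with_sax(gz_bytes):
    """Stream the decompressed XML through a SAX handler."""
    handler = ElementCounter()
    with gzip.open(io.BytesIO(gz_bytes), "rb") as stream:
        xml.sax.parse(stream, handler)
    return handler.count


def count_with_dom(gz_bytes):
    """Build the whole DOM tree first, then count its elements."""
    with gzip.open(io.BytesIO(gz_bytes), "rb") as stream:
        document = minidom.parse(stream)
    return len(document.getElementsByTagName("*"))


def timed(task, gz_bytes):
    """Return (result, elapsed seconds) for one run of task."""
    start = time.perf_counter()
    result = task(gz_bytes)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    data = make_model(50_000)
    for label, task in (("SAX", count_with_sax), ("DOM", count_with_dom)):
        count, seconds = timed(task, data)
        print(f"{label}: {count} elements in {seconds:.2f}s")
```

Both functions read straight from the gzip stream, so the same script
could be pointed at a real compressed model by replacing make_model
with the contents of that file.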

> 2) The CellML model might not all fit in memory at the same time,
> especially if the model gets to be multi-gigabyte. It might be possible
> to make use of swap to deal with this, but if the algorithms don't have
> explicit control over when things are swapped in and out, it will be
> hard to work with such a model.

I think if you have a model getting that large then there needs to be
some serious thinking about how to handle such models...but generally
can't you just let the OS worry about swapping in and out as required?
Or would you expect a customised scheme for a particular application to
be more efficient?

> C) Another leaner API, read-only CellML API (perhaps based off the same
> IDLs, but with certain functionality, like the ability to modify the
> model, or set mutation event listeners, unavailable). We could add a
> SAX-style event dispatcher instead, to allow users to save any
> information they do want from extension elements, which will also not be
> kept in the model. Comments, white-space, and so on would all be
> stripped unlike in the current CellML API implementation. Tools which
> are currently using the full CellML API but only require read-only
> access (e.g. the CCGS) might be able to just 'flick the switch' and
> benefit from the leaner API.

This would probably be beneficial even for those of us without such
large models - especially if it is as easy as flicking a switch to swap
between the complete and restricted implementations.
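
To make the idea a bit more concrete, here is a minimal sketch in
Python of what a read-only, streaming reader with an extension-element
callback might look like. This is not the CellML API and not a proposal
for its interface: read_lean and on_extension are hypothetical names,
only component and variable names are kept, and anything outside the
CellML 1.0 namespace (including MathML) is treated as an extension
purely to keep the sketch short.

```python
import io
import xml.etree.ElementTree as ET

CELLML_NS = "http://www.cellml.org/cellml/1.0#"
COMPONENT = f"{{{CELLML_NS}}}component"
VARIABLE = f"{{{CELLML_NS}}}variable"


def read_lean(stream, on_extension=None):
    """Return {component name: [variable names]} without keeping the full tree.

    on_extension, if given, is called once with each outermost element that
    is not in the CellML namespace, before that element is discarded.
    """
    components = {}
    ext_depth = 0  # how deep we are inside a non-CellML subtree
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        is_cellml = elem.tag.startswith(f"{{{CELLML_NS}}}")
        if event == "start":
            if not is_cellml:
                ext_depth += 1
            continue
        # event == "end" from here on
        if not is_cellml:
            ext_depth -= 1
            if ext_depth == 0:
                if on_extension is not None:
                    on_extension(elem)
                elem.clear()  # drop the extension subtree's contents
        elif elem.tag == COMPONENT:
            names = [v.get("name") for v in elem.findall(VARIABLE)]
            components[elem.get("name")] = names
            # Drop children, text and whitespace; an empty shell of the
            # element still stays attached to its parent.
            elem.clear()
    return components


if __name__ == "__main__":
    sample = b"""<model xmlns="http://www.cellml.org/cellml/1.0#"
                        xmlns:x="urn:example:ext" name="m">
      <!-- comments are dropped by the default parser -->
      <component name="a">
        <variable name="v1" units="dimensionless"/>
        <x:note>kept only if the caller asks for it</x:note>
      </component>
    </model>"""
    print(read_lean(io.BytesIO(sample), on_extension=lambda e: print(e.tag)))
```

A tool that only needs to read the model could work from the returned
summary, while a tool that cares about particular extension elements
would save what it needs inside the callback.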


Andre.



