CellML Discussion List



[cellml-discussion] Auto-generate HDF5 from CellML?


  • From: david.nickerson at gmail.com (David Nickerson)
  • Subject: [cellml-discussion] Auto-generate HDF5 from CellML?
  • Date: Wed, 12 Nov 2008 18:16:54 +0800

> I'm considering HDF5 for my storage needs in simulating a CellML model under
> multiple parameter scenarios. HDF5 is designed for efficient storage,
> retrieval, navigation and subsetting of huge data sets [1], with annotation
> [2]. I plan on storing both raw and post-processed data, so that if I detect
> problems at a higher level, I can go back and look at details and possibly
> re-run those simulations. David Nickerson described a similar approach in an
> earlier post [3].
>
> However, setting up the data structure with annotations for physical units
> and such is quite time-consuming. On the other hand, the CellML
> representation holds all the required information. It would be very helpful
> to auto-generate an HDF5 data structure to hold output from simulations of
> CellML models.
>
> Such a tool should be fairly easy to write for someone familiar with both
> HDF5 and CellML, and would apply to all possible CellML models. I guess it
> would be overly restrictive to make an output format part of the CellML
> metadata specification. However, offering a standard output format would
> save duplication of effort and make it easier to share simulation results
> for further visualization and analysis.
>
> I'd like the opinions of the CellML regulars, in particular whether anything
> similar has been discussed previously.

I'm not aware of this coming up for discussion in the past.

I certainly agree that there is little point duplicating data from the
CellML model, although when using unversioned model documents the link
between simulation outputs and input CellML models can become quite
tenuous. If you are building up a large collection of simulation data
for which you need to maintain a strong link to the input models
(which I think you do) you'll probably want to look into such issues
quite a bit. This is something PMR2 will address (I hope), although,
for use now, revision numbers in a Subversion repository would
probably be sufficient.
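
To make the link between outputs and input model concrete, here is a
minimal sketch of a provenance record that could be stored alongside each
set of results. The field names, the example URL and the revision number
are all hypothetical; the content hash is only one possible way to detect
that an unversioned model document has changed.

```python
import hashlib


def provenance(model_bytes, model_url, revision=None):
    """Return a small record tying results to one exact model document."""
    return {
        "model_url": model_url,  # where the model was fetched from
        "revision": revision,    # e.g. a Subversion revision number, if known
        "sha256": hashlib.sha256(model_bytes).hexdigest(),  # content fingerprint
    }


# Hypothetical model content and location, for illustration only.
record = provenance(
    b"<model name='demo'/>",
    "http://example.org/models/demo.cellml",
    revision=42,
)
```

Such a record could be attached as attributes on whatever container holds
the results, so the check "is this still the same model?" becomes a hash
comparison.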

As a side note, I am envisioning that in the long term such simulation
data would ideally be stored using FieldML (http://www.fieldml.org),
which is likely to offer several options for the underlying
high-performance persistent storage (with HDF5 being one of the options
that crops up quite frequently). Unfortunately, I'm unsure on what sort
of time frame a FieldML-based solution might become available...

As for generic generation of HDF5 data structures from CellML models,
I think this would need some thinking :) Is there a generic way to
define a useful HDF5 data structure for any given CellML model? I'm
not sure...

Do you imagine a tool which, for a given CellML model (or perhaps more
realistically for a given chunk of CellML simulation metadata), will
produce essentially a template HDF5 data group with a standardized
structure? Then your simulation tool would grab that data group and
populate it with the simulation results.
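
As a rough sketch of what such a template generator might look like, the
following reads a CellML 1.0 document and builds a plain description of a
layout: one group per component, one dataset per variable, with the units
carried along as an annotation. The layout itself (paths of the form
/model/component/variable) is purely an assumption, not an agreed
standard, and the tiny example model is made up. Writing the layout out
with an HDF5 library is deliberately left out.

```python
import xml.etree.ElementTree as ET

# Namespace of CellML 1.0 documents.
CELLML_NS = "{http://www.cellml.org/cellml/1.0#}"

# Made-up model, for illustration only.
EXAMPLE = """<model name="demo" xmlns="http://www.cellml.org/cellml/1.0#">
  <component name="membrane">
    <variable name="V" units="millivolt" initial_value="-85"/>
    <variable name="time" units="millisecond"/>
  </component>
</model>"""


def hdf5_template(cellml_text):
    """Map every variable to a dataset path plus its units annotation."""
    model = ET.fromstring(cellml_text)
    layout = {}
    for component in model.findall(CELLML_NS + "component"):
        for variable in component.findall(CELLML_NS + "variable"):
            # Assumed layout: /<model>/<component>/<variable>
            path = "/{}/{}/{}".format(
                model.get("name"), component.get("name"), variable.get("name")
            )
            layout[path] = {"units": variable.get("units")}
    return layout


template = hdf5_template(EXAMPLE)
```

Note that this ignores connections between components, imports and units
definitions, which is part of why I think a truly generic structure needs
more thought.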

Or would some kind of simulation data storage and retrieval service
sitting on top of the CellML API be more what you are after? Then I
guess that would allow for different underlying persistent data stores
to be utilised...
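
To illustrate the second option, here is a minimal sketch of a storage
interface with a swappable backend. The class and method names are
hypothetical and nothing here is part of the CellML API; the in-memory
backend simply stands in for a persistent store such as HDF5.

```python
from abc import ABC, abstractmethod


class ResultStore(ABC):
    """Hypothetical interface a simulation tool would write results through."""

    @abstractmethod
    def put(self, run_id, variable, values):
        """Store the values of one variable for one simulation run."""

    @abstractmethod
    def get(self, run_id, variable):
        """Return the stored values of one variable for one simulation run."""


class MemoryStore(ResultStore):
    """Stand-in backend; a persistent one would expose the same two methods."""

    def __init__(self):
        self._data = {}

    def put(self, run_id, variable, values):
        self._data[(run_id, variable)] = list(values)

    def get(self, run_id, variable):
        return self._data[(run_id, variable)]


store = MemoryStore()
store.put("run-1", "membrane/V", [-85.0, -84.2])
```

The simulation tool would only ever see ResultStore, so the choice of
underlying storage could change without touching the tool.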


David.



