CellML Discussion List

Text archives Help


[cellml-discussion] Representing stochastic models in CellML


Chronological Thread 
  • From: ak.miller at auckland.ac.nz (Andrew Miller)
  • Subject: [cellml-discussion] Representing stochastic models in CellML
  • Date: Wed, 23 Apr 2008 17:31:26 +1200

Hi all,

I have spent some time looking into how we might be able to represent
stochastic models in CellML. This is something that would be useful to
ensure we have properly contemplated for CellML 1.2. I have pasted in
the notes I wrote on this below.

Please let me know if you have any suggestions, comments, or criticisms
of the below document. At some point, this will obviously need to be
transformed into a more robust proposal, but for now, I just want to
make sure we keep the option open to use the CellML 1.2 core to
represent stochastic models.

Best regards,
Andrew

-----

Overall goal:
Describe a framework which can be used in CellML 1.2 to represent a
range of
different systems which require stochastic differential equations to
describe.

Constraints:
Do not want to describe the procedure for solving the model in core
CellML,
only the underlying mathematical / statistical model.
Want to express the model in such a way that the procedure is computable
from the model.
Want there to be only one interpretation of the model.
Want the representation to be abstract enough that it is meaningful for a
number of different fields, and not just chemical equations in a well
stirred vessel.
Want the representation to work naturally when mixed with systems of
ordinary
differential equations.

Use cases:
Chemical reactions under the Chemical Master Equation model of Gillespie:
We need to split these into separate species. This is a Poisson process,
so there are simple ways to represent it.

It is more efficient to represent models using Weiner processes when
this
there are large enough numbers of molecules to justify this but ODEs are
not being used.

However, Poisson and Weiner processes are both Levy Processes, that is,
they have stationary independent increments, are zero at time zero, and
are cadlag. This is not necessarily a good thing because some things we
want to model might have memory of past events or a time dependence.

How we can represent this in CellML:
For the continuous case, integral equations for the increment in terms of
built in processes like Weiner and Poisson processes (I don't believe
there
is a clean way to represent increments in MathML).

Implementations will need to identify these and work out the
distribution of
the time until the next event (good implementations might be able to
perform
symbolic algebra to work this out, but most implementations would probably
just recognise common cases like Poisson distributions with arbitrary
parameters, and deal with expressions involving a Weiner process by
sampling
from the increment distribution in each time step), which could be put
into a
slightly generalised Gibson-Bruck type of framework where we store the
time of
the next event.

None of this helps for non stationary independent processes, however.

How could this be related to the standards:
It has been proposed that CellML 1.2 have a core specification which
describes
the basic way of representing the mathematical structure in very general
terms, and secondary specifications be used to narrow CellML down to
specific
subsets which can be implemented in their entirety by a actual software
packages.

Core CellML 1.2 should be general enough to represent concepts like
probability distributions (MathML allows new operators to be defined,
so we
could create ones for our base types of stochastic process). The ODE IV
secondary specification would not allow stochastic models, while we would
have another alternative secondary specification which extended the ODE IV
one to allow certain limited types of stochastic model (limited to
types we
know how to solve).

In terms of interaction with the typing system, in a stochastic system we
have both reals and real-valued random variables. Once we have one random
variable in our system, this will propagate to everything else which
is affected by it, so most of the model will technically be a random
variable.
However, we want to be able to easily mix stochastic models with
existing ODE
IV models to create hybrid models, so we don't really want the required
datatype to change just to connect up a random variable to a non-random
variable. For this reason, I don't think it is worthwhile to consider a
random variable of a certain type a different datatype, and instead we
would
require tools to deduce this information if they require it.




  • [cellml-discussion] Representing stochastic models in CellML, Andrew Miller, 04/23/2008

Archive powered by MHonArc 2.6.18.

Top of page