CellML Discussion List



[cellml-discussion] pcenv development priorities


  • From: ak.miller at auckland.ac.nz (Andrew Miller)
  • Subject: [cellml-discussion] pcenv development priorities
  • Date: Wed, 01 Nov 2006 13:43:54 +1300

Alan Garny wrote:
> OK, here is some more information about the test I ran:
>
> - I used 10^-7 and 10^-9 for the relative and absolute tolerances,
> respectively.
> - The CVODE library was compiled using -O2. Note, though, that the Borland
> C++ compiler is probably not the best choice (unfortunately).
> - I asked for the trans-membrane potential to be plotted for every
> millisecond of cardiac activity. That involves some graphical work (the
> trans-membrane value is kept in memory, so that the user can 'play' with
> the graph later on, export it to CSV, etc.), as well as giving the rest of
> the system a chance to do other critical work (i.e. preventing COR from
> taking over the whole system), which is somewhat time consuming (see the
> difference between simulation and computation time).
>
> Now, as for the test CellML file I used, see the attachment.
>
> Finally, I have just re-run the test using 10^-6 for both the relative and
> absolute tolerances. Here are the results:
>
> Simulation time: 1022.007 s (i.e. ~17 min 2 sec)
> Computation time: 872.539 s (i.e. ~14 min 32 sec)
>
> And then 10^-6 for the relative tolerance and 10^-8 for the absolute one:
>
> Simulation time: 1076.629 s (i.e. ~17 min 56 sec)
> Computation time: 915.625 s (i.e. ~15 min 15 sec)
>
> Alan.
>
Hi Alan,

I have been looking into why my code runs slower than yours (I have run
a cache simulation on code compiled with -O2).

About 30% of the total time is spent in gesl, part of the dense solver,
partly because of numerous L1 cache misses when accessing col_k around
lines 174 and 185. I could use the gcc prefetch builtins, but on x86
architectures that will only make a difference if I set up my compiler
to use SSE instructions (which will break support for Intel chips
earlier than the Pentium 3, and for AMD chips prior to AMD XP6).
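
To make the prefetch idea concrete, here is a minimal sketch of what I have in mind. It is not the SUNDIALS source: upper_solve_prefetch is a hypothetical, simplified back-substitution over a matrix stored as an array of column pointers, and the placement and arguments of the prefetch are assumptions that would need to be measured. The only compiler feature it relies on is __builtin_prefetch, which gcc accepts on every target and which emits no instruction when the target has no prefetch support.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of SUNDIALS): solve U x = b in place,
 * where U is upper triangular and cols[k] points at column k.
 * While column k is being used, the start of column k-1 is prefetched.
 */
void upper_solve_prefetch(double **cols, size_t n, double *b)
{
    for (size_t k = n; k-- > 0; ) {
        double *col_k = cols[k];

        if (k > 0) {
            /* rw = 0 (read), locality = 3 (keep in all cache levels) */
            __builtin_prefetch(cols[k - 1], 0, 3);
        }

        b[k] /= col_k[k];
        double mult = -b[k];

        /* Eliminate x[k] from the rows above it. */
        for (size_t i = 0; i < k; i++) {
            b[i] += mult * col_k[i];
        }
    }
}
```

A single prefetch only pulls in one cache line, so for large columns the hint would probably have to be issued inside the inner loop, a few lines ahead of the current index.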

1) Are you using single or double precision? If the latter, have you
got 80-bit spills (_FPU_EXTENDED mode) turned on, or is the FPU in full
64-bit mode?
2) Are you using an unmodified CVODE integrator, or have you optimised it?
3) Are you using the dense linear solver, or some other solver?
4) Are you using the serial NVECTOR implementation that ships with
SUNDIALS, or have you written your own wrapper around your internal
data structures?
5) What processor are you targeting? Have you allowed your compiler to
use MMX / SSE / SSE2 / 3DNow! / SSE3 operations?
6) Could you send me the disassembled generated code (or the machine
code if it is easier, as long as you tell me where the entry points are
supposed to be), so I can compare it against the result of my code
generation => compile process?
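
On question 1, one rough way to see which evaluation format a compiler reports is the C99 macro FLT_EVAL_METHOD from <float.h>. The sketch below assumes a C99 compiler; float_eval_method and describe_eval_method are hypothetical helper names. It is only a compile-time hint: it does not read the x87 control word, so it cannot tell whether the precision mode has been changed at run time.

```c
#include <float.h>

/* Hypothetical helper: the compiler's reported evaluation method,
 * or -1 if the macro is not defined. */
int float_eval_method(void)
{
#ifdef FLT_EVAL_METHOD
    return FLT_EVAL_METHOD;
#else
    return -1;
#endif
}

/* Hypothetical helper: human-readable meaning of the C99 values. */
const char *describe_eval_method(int method)
{
    switch (method) {
    case 0:
        return "operations evaluated in the precision of their type";
    case 1:
        return "float and double evaluated as double";
    case 2:
        return "all operations evaluated as long double";
    default:
        return "indeterminable or implementation-defined";
    }
}
```

A value of 2 from an x87 build would suggest that intermediates are carried in extended precision, whereas 0 is what I would expect from a build that uses SSE2 for floating point.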

Best regards,
Andrew




