- From: ak.miller at auckland.ac.nz (Andrew Miller)
- Subject: [cellml-discussion] pcenv development priorities
- Date: Wed, 01 Nov 2006 09:44:01 +1300
> This is with the main program / SUNDIALS compiled with -O2, and the
> generated code compiled with -O3 -ffast-math. I have disassembled the
> computation functions in the -O3 -ffast-math code, and it looks
> reasonable, there are no CALL instructions anymore (the built-in exp and
> log from gcc get inlined). I therefore doubt that differences in the
> quality of the generated code is the cause of the problem. It is
> possible that Alan has managed to get the better benchmark by compiling
> CVODE with -O3 -ffast-math or other optimisations.

Turning on -O3 when compiling SUNDIALS actually makes it worse, presumably
because it increases the code size and therefore the number of cache
misses. Compiling everything with -O3 -fomit-frame-pointer -ffast-math gave:
real 26m40.323s
user 26m35.056s
sys 0m2.660s
Recompiling everything with -O2 -fomit-frame-pointer -ffast-math:
real 25m4.259s
user 24m58.534s
sys 0m3.040s
> Another possibility
> would be that his CellML 1.0 Ten Tuscher model behaves differently. Yet
> another possibility would be that the differences could be arising from
> the structure of the CVODE stepping loop, or differences in some
> parameters given to the solver. My stepping loop looks like this:
> https://svn.physiomeproject.org/svn/physiome/CellML_DOM_API/trunk/CIS/sources/CISSolve.cxx,
> see function SolveODEProblemCVODE.

I have run the version of the model from Alan with the integrator code
compiled using -O2 -fomit-frame-pointer -ffast-math (generated code
compiled with -O3 -ffast-math), and I got the following:
real 22m25.521s
user 22m21.168s
sys 0m2.400s
I will look into where the time is being spent, to see if this can be
improved.
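
To make the three timings above easier to compare, here is a minimal
sketch in Python (not part of pcenv or CIS) that converts the `real`
values reported by `time` into seconds and prints each run relative to
the first. The function name `parse_time` and the run labels are
illustrative assumptions; only the three timing strings come from the
measurements above. Note that the third run uses Alan's version of the
model and a different mix of flags, so it is not a like-for-like
comparison with the first two.

```python
import re


def parse_time(value):
    """Convert a shell `time` value such as '26m40.323s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognised time value: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Labels are illustrative; the 'real' values are copied from the runs above.
runs = [
    ("everything -O3", "26m40.323s"),
    ("everything -O2", "25m4.259s"),
    ("Alan's model, integrator -O2", "22m25.521s"),
]

baseline = parse_time(runs[0][1])
for label, real in runs:
    seconds = parse_time(real)
    # Negative change means the run was faster than the first one.
    change = 100.0 * (seconds - baseline) / baseline
    print(f"{label:30s} {seconds:9.3f} s  {change:+6.1f}%")
```

On these numbers, building everything with -O2 rather than -O3 comes out
roughly 6% faster in wall-clock time.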
Best regards,
Andrew