CellML Discussion List



[cellml-discussion] Describing rules for translating expressions into arbitrary languages


  • From: ak.miller at auckland.ac.nz (Andrew Miller)
  • Subject: [cellml-discussion] Describing rules for translating expressions into arbitrary languages
  • Date: Mon, 30 Apr 2007 10:50:43 +1200

Hi,

I am looking at refactoring the CCGS into several components, as
discussed in an earlier e-mail. As part of this, I am looking at how I
can separate out the language specific parts of code generation. At this
stage, I am focusing on how expressions get generated, rather than
entire assignments. This will then be combined with code to generate the
procedural steps required to evaluate a model. Writing a program to
generate code for a new language will then be as simple as iterating
through the procedural steps, writing out assignments of expressions
into variables, in addition to supplying all the language specific glue
to the integrator.

I have defined a file format specification, called MAL (or
MathML-language mapping) designed to contain all the information needed
to generate expressions for a specific programming language. I would
welcome any feedback anyone may have on the specification. I would be
particularly interested in hearing if you can think of some extension to
the language which is needed to support generation for a certain
language. The specification follows...

MAL Format is intended as a succinct but complete description of how to
translate expressions from MathML into the syntax of another programming
language. It is intended to be both simpler and more powerful (within the
problem domain it is trying to address) than more generic approaches such as
XSLT.

Format:
The format consists of a series of tags. Each tag has a series of
alphanumeric characters (the tag name), followed by a colon and a space
(": "), followed by a series of characters (the tag value). The tag is
terminated by a carriage return or line-feed character, and the next tag
starts at the first character which isn't a carriage return or line feed.
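
To make the tag syntax concrete, here is a minimal Python sketch of a reader
for it. The names parse_mal and TAG_LINE are mine and not part of any existing
code, and I am assuming that underscores are allowed in tag names (as
unary_minus needs) even though the text above only says alphanumeric.

```python
import re

# Hypothetical helper: a tag name, then ": ", then the tag value.
# Assumption: "_" is accepted in names so that unary_minus parses.
TAG_LINE = re.compile(r"^([A-Za-z0-9_]+): (.*)$")

def parse_mal(text):
    """Return a dict mapping each tag name to its tag value."""
    tags = {}
    # A tag ends at a carriage return or line feed; any run of them is skipped.
    for line in re.split(r"[\r\n]+", text):
        if not line:
            continue
        match = TAG_LINE.match(line)
        if match is None:
            raise ValueError("not a valid MAL tag: %r" % line)
        name, value = match.groups()
        tags[name] = value
    return tags

sample = "opengroup: (\r\nclosegroup: )\n\nabs: #prec[H]fabs(#expr1)\n"
print(parse_mal(sample))
```

A later tag with the same name simply overwrites the earlier one here; the
specification does not say what should happen in that case.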

Where line-length formatting transforms are required (such as for
FORTRAN 77), a post-processing stage must be used to apply them. The reason
for this design decision is that expressions alone do not determine line
length.
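
As an illustration of what such a post-processing stage might look like, the
Python sketch below wraps one long statement into FORTRAN 77 fixed-form lines
(statement text in columns 7 to 72, a non-blank character in column 6 on
continuation lines). The function name wrap_fortran77 and the choice of "&" as
the continuation character are assumptions of mine, and a real post-processor
would probably prefer to break at operators rather than at a fixed column.

```python
def wrap_fortran77(statement, width=72):
    """Split one long statement into fixed-form lines of at most `width` columns."""
    body_width = width - 6  # columns 7..72 hold the statement text
    chunks = [statement[i:i + body_width]
              for i in range(0, len(statement), body_width)]
    lines = []
    for index, chunk in enumerate(chunks):
        # Column 6 is blank on the first line and "&" on continuation lines.
        prefix = "      " if index == 0 else "     &"
        lines.append(prefix + chunk)
    return lines

for line in wrap_fortran77("X = " + "A + " * 40 + "B"):
    print(line)
```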

The following tags are defined:

Name: opengroup
Value: A string which can be prepended to another string to force that
string to have the highest precedence.
Examples:
opengroup: (
Sets the open group string to be (, which is the open group character in
languages like C.

Name: closegroup
Value: A string which can be appended after another string to force that
string to have the highest precedence.
Examples:
closegroup: )
Sets the close group string to be ), which is the close group character in
languages like C.

Name: The name of any MathML operator.
Value: A string describing the format. This string shall start with a
description of operator precedence in the target language, and then
describe a pattern for generating the target language expression.

A precedence description is specified between #prec[ and ]. The following
precedence descriptions can be used:

#prec[n(m)] where n and m are integers between 0 and 1000. Sets the outer
precedence to n (this is a precedence score for the resulting expression),
and the inner precedence to m (an operand whose outer precedence falls below
m is wrapped in the opengroup / closegroup strings).

#prec[n] where n is an integer is a shorthand for #prec[n(n)]

#prec[H] is a shorthand for #prec[1000(0)].
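
The three forms can be normalised to an (outer, inner) pair before anything
else is done with the rule. The Python sketch below shows one way to do that;
split_precedence and PREC are names I made up for illustration. It raises an
error when the precedence description is missing, so an implementation that
wants to accept a rule without one (such as the diff rule in the C example
further down) would need to choose a default instead.

```python
import re

# Matches #prec[H], #prec[n] or #prec[n(m)] at the start of a tag value.
PREC = re.compile(r"^#prec\[(?:(H)|(\d+)(?:\((\d+)\))?)\]")

def split_precedence(value):
    """Return (outer, inner, pattern) for an operator tag value."""
    match = PREC.match(value)
    if match is None:
        raise ValueError("missing precedence description: %r" % value)
    if match.group(1):
        # #prec[H] is shorthand for #prec[1000(0)]
        outer, inner = 1000, 0
    else:
        outer = int(match.group(2))
        # #prec[n] is shorthand for #prec[n(n)]
        inner = int(match.group(3)) if match.group(3) is not None else outer
    if not (0 <= outer <= 1000 and 0 <= inner <= 1000):
        raise ValueError("precedence out of range: %r" % value)
    return outer, inner, value[match.end():]

print(split_precedence("#prec[1000(900)]atan(1.0/#expr1)"))
```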

In an operator description, character sequences which are not matched below
are written directly out to the output mathematics.

#expri references the recursive expansion (according to the rules
in the MAL file) of the ith operand, where i is a positive integer. The
highest i value present also acts as the number of operands which must be
present in the MathML to avoid an error.

#exprs[text] expands to the concatenation of each consecutive operand after
expansion according to the rules. The string text intervenes between
operands, but is not added before the first operand or after the last.
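
The following Python sketch shows how #expri, #exprs[text] and the two
precedence scores could work together. It is only an illustration: expressions
are nested tuples rather than MathML, the RULES table holds three rules from
the C example with whitespace trimmed, leaves are given precedence 1000, and I
am assuming that an operand is grouped when its outer precedence is strictly
below the inner precedence of the enclosing rule.

```python
import re

# (outer, inner, pattern) for a few operators, adapted from the C example.
RULES = {
    "plus": (500, 500, "#exprs[+]"),
    "times": (900, 900, "#exprs[*]"),
    "power": (1000, 0, "pow(#expr1, #expr2)"),
}
OPEN, CLOSE = "(", ")"  # the opengroup / closegroup strings
TOKEN = re.compile(r"#exprs\[([^\]]*)\]|#expr(\d+)")

def expand(node):
    """Return (text, outer precedence) for a nested-tuple expression."""
    if isinstance(node, str):  # a leaf such as a variable name
        return node, 1000
    op, operands = node[0], node[1:]
    outer, inner, pattern = RULES[op]

    def operand(index):
        if index >= len(operands):
            raise ValueError("%s needs at least %d operands" % (op, index + 1))
        text, prec = expand(operands[index])
        # Assumption: group when the operand binds less tightly than `inner`.
        return OPEN + text + CLOSE if prec < inner else text

    def substitute(match):
        if match.group(2) is not None:  # #expri, with i counted from 1
            return operand(int(match.group(2)) - 1)
        # #exprs[text]: join every operand with the separator text
        return match.group(1).join(operand(i) for i in range(len(operands)))

    return TOKEN.sub(substitute, pattern), outer

print(expand(("times", ("plus", "a", "b"), "c"))[0])  # (a+b)*c
```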

#logbase expands to the expansion of the logbase element contents. This is
only valid for log. If no logbase element is found, the string 10 will be
inserted.

#degree expands to the expansion of the degree element contents. It is only
valid for root. If no degree element is found, the string 2 will be inserted.
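
The defaulting behaviour of #logbase and #degree can be sketched in Python as
below. fill_qualifiers is a hypothetical helper; it takes the already expanded
text of the logbase or degree element (or None if the element is absent) and
rejects the markers on any other operator.

```python
def fill_qualifiers(pattern, operator, logbase=None, degree=None):
    """Substitute #logbase and #degree, using the defaults 10 and 2."""
    if "#logbase" in pattern:
        if operator != "log":
            raise ValueError("#logbase is only valid for log")
        pattern = pattern.replace("#logbase", "10" if logbase is None else logbase)
    if "#degree" in pattern:
        if operator != "root":
            raise ValueError("#degree is only valid for root")
        pattern = pattern.replace("#degree", "2" if degree is None else degree)
    return pattern

print(fill_qualifiers("arbitrary_log(#expr1, #logbase)", "log"))
print(fill_qualifiers("pow(#expr1, 1.0 / #degree)", "root", degree="3"))
```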

#bvarIndex expands to the text of the bvarIndex annotation (as retrieved by
the AnnotationSet supplied to MaLaES) on the source of the bound variable
referenced.

#uniquen (where n is an integer) expands to a globally unique integer.
If #uniquen (for the same n) is used more than once in the same line, it
refers to the same number. However, a different number is generated each
time a rule is processed.
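
One possible way to get this behaviour is a shared counter plus a per-rule
table, as in the Python sketch below. The names expand_unique and _counter are
mine; nothing here is taken from existing code.

```python
import itertools
import re

_counter = itertools.count(1)  # shared source of globally unique integers

def expand_unique(pattern):
    """Replace each #uniquen for one application of a rule."""
    assigned = {}  # n -> integer chosen for this application only

    def substitute(match):
        n = match.group(1)
        if n not in assigned:
            assigned[n] = next(_counter)
        return str(assigned[n])

    return re.sub(r"#unique(\d+)", substitute, pattern)

# Both uses of #unique1 get the same number within one call ...
print(expand_unique("defint(func#unique1) double func#unique1()"))
# ... and a second application of the rule gets a fresh one.
print(expand_unique("defint(func#unique1) double func#unique1()"))
```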

#lookupDiffVariable (only valid on diff) finds the ci associated with the
diff (differentiation of something other than a variable is not supported by
this form, and will result in an error), and then finds the source variable
associated with that ci. It then asks the supplied AnnotationSet for the
degreeiname, where i is the degree of the diff.

#supplement causes all subsequent output to be put into the supplementary
stream, instead of the main output stream.
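
A small Python sketch of the two streams follows. I am assuming here that the
switch lasts until the end of the rule being expanded, and that surrounding
whitespace in the supplementary text can be trimmed; the Output class is
hypothetical, and the values func7 and 0 in the usage line stand in for
whatever #unique1 and #bvarIndex would have produced.

```python
class Output:
    """Collects main and supplementary text as rules are expanded."""

    def __init__(self):
        self.main = []
        self.supplementary = []

    def emit_rule(self, expanded):
        # Everything after the first #supplement goes to the second stream.
        main, _, extra = expanded.partition("#supplement")
        self.main.append(main)
        if extra:
            self.supplementary.append(extra.strip())
        return main

out = Output()
out.emit_rule("defint(func7, BOUND, CONSTANTS, RATES, VARIABLES, 0)"
              "#supplement double func7(double* BOUND, double* CONSTANTS, "
              "double* RATES, double* VARIABLES) { return x; }")
print(out.main)
print(out.supplementary)
```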

Name: unary_minus
Value: unary_minus works just like the MathML operator elements described
above. However, the MathML operator minus is only processed according to the
minus rule if it has two children. If it has one child, it is processed
according to the unary_minus rule. If it has any other number of children, an
error is raised.
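
In code, the choice between the two rules comes down to the number of
children. A Python sketch, with rule_name_for as a made-up name:

```python
def rule_name_for(operator, child_count):
    """Pick the MAL tag name used to translate a MathML operator."""
    if operator != "minus":
        return operator
    if child_count == 2:
        return "minus"
    if child_count == 1:
        return "unary_minus"
    raise ValueError("minus must have one or two children, got %d" % child_count)

print(rule_name_for("minus", 1))
```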

I have also created a complete example, describing how to generate C
expressions:

opengroup: (
closegroup: )
abs: #prec[H]fabs(#expr1)
and: #prec[20]#exprs[&&]
arccos: #prec[H]acos(#expr1)
arccosh: #prec[H]acosh(#expr1)
arccot: #prec[1000(900)]atan(1.0/#expr1)
arccoth: #prec[1000(900)]atanh(1.0/#expr1)
arccsc: #prec[1000(900)]asin(1/#expr1)
arccsch: #prec[1000(900)]asinh(1/#expr1)
arcsec: #prec[1000(900)]acos(1/#expr1)
arcsech: #prec[1000(900)]acosh(1/#expr1)
arcsin: #prec[H]asin(#expr1)
arcsinh: #prec[H]asinh(#expr1)
arctan: #prec[H]atan(#expr1)
arctanh: #prec[H]atanh(#expr1)
ceiling: #prec[H]ceil(#expr1)
cos: #prec[H]cos(#expr1)
cosh: #prec[H]cosh(#expr1)
cot: #prec[900(0)]1.0/tan(#expr1)
coth: #prec[900(0)]1.0/tanh(#expr1)
csc: #prec[900(0)]1.0/sin(#expr1)
csch: #prec[900(0)]1.0/sinh(#expr1)
diff: #lookupDiffVariable
divide: #prec[900]#expr1/#expr2
eq: #prec[30]#exprs[==]
exp: #prec[H]exp(#expr1)
factorial: #prec[H]factorial(#expr1)
factorof: #prec[30(900)]#expr2 % #expr1 == 0
floor: #prec[H]floor(#expr1)
gcd: #prec[H]gcd_multi(#count, #exprs[, ])
geq: #prec[30]#exprs[>=]
gt: #prec[30]#exprs[>]
implies: #prec[10(950)] !#expr1 || #expr2
int: #prec[H]defint(func#unique1, BOUND, CONSTANTS, RATES, VARIABLES, #bvarIndex)#supplement double func#unique1(double* BOUND, double* CONSTANTS, double* RATES, double* VARIABLES) { return #expr1; }
lcm: #prec[H]lcm_multi(#count, #exprs[, ])
leq: #prec[30]#exprs[<=]
ln: #prec[H]log(#expr1)
log: #prec[H]arbitrary_log(#expr1, #logbase)
lt: #prec[30]#exprs[<]
max: #prec[H]multi_max(#count, #exprs[, ])
min: #prec[H]multi_min(#count, #exprs[, ])
minus: #prec[500]#expr1 - #expr2
neq: #prec[30]#expr1 != #expr2
not: #prec[950]!#expr1
or: #prec[10]#exprs[||]
plus: #prec[500]#exprs[+]
power: #prec[H]pow(#expr1, #expr2)
quotient: #prec[900(0)] (int)(#expr1) / (int)(#expr2)
rem: #prec[900(0)] (int)(#expr1) % (int)(#expr2)
root: #prec[1000(900)] pow(#expr1, 1.0 / #degree)
sec: #prec[900(0)]1.0 / cos(#expr1)
sech: #prec[900(0)]1.0 / cosh(#expr1)
sin: #prec[H] sin(#expr1)
sinh: #prec[H] sinh(#expr1)
tan: #prec[H] tan(#expr1)
tanh: #prec[H] tanh(#expr1)
times: #prec[900] #exprs[*]
unary_minus: #prec[950]-#expr1
xor: #prec[25(30)] (#expr1 != 0) ^ (#expr2 != 0)

Best regards,
Andrew



