- From: matt.halstead at auckland.ac.nz (Matt )
- Subject: [cellml-discussion] Biological and other non-model citations in CellML metadata?
- Date: Fri, 30 Mar 2007 08:05:11 +1200
On 3/29/07, Nicolas Le Novere <lenov at ebi.ac.uk> wrote:
>
On Thu, 29 Mar 2007, Matt wrote:
>
>
> Can you explain in more detail or point to explanations of
>
> bqmodel:isDescribedBy?
>
>
You can find some explanations at:
>
>
http://www.ebi.ac.uk/compneur-srv/miriam-main/mdb?section=qualifiers
So there is no simple way to determine if this is a reference to a
journal article except through interpreting the URI?
>
>
Note that qualifiers are optional to be MIRIAM-compliant. I personally
>
think we should always use some qualification, otherwise an annotation
>
becomes very difficult to use except for jumping from webpage to
>
webpage.
>
>
> Specifically:
>
> - what is its intended meaning?
>
>
Cf. above. Note that the list of qualifiers is by no means frozen. We
>
are already aware of several gaps (e.g. how do we qualify the relation
>
between a peptide and the gene that encodes it?)
>
>
> - when more than one of these is defined on a resource, how is this
>
> interpreted? For example: is there some precedence implied somehow?
>
>
This is up to the "tool" using the qualifiers. SBML does not allow
>
nested qualifications. There is only an implicit "hasVersion" if several
>
identical qualifiers are present:
>
>
bqmodel:isDescribedBy toto
>
bqmodel:isDescribedBy tata
>
>
means is described by toto and is described by tata. In other words
>
toto or tata describe the component.
>
>
NOT toto and tata are necessary to describe the component.
>
>
On top of that, BioModels DB adds some precedence:
>
http://www.ebi.ac.uk/compneur-srv/biomodels/doc/annotation.html
>
>
But all that is not part of MIRIAM rules.
>
>
> - how do you determine the kind of reference it is - for example a
>
> pubmed uri? You have a datatype for vocab/database IDs in the
>
> annotation scheme you described, but I don't see this in the
>
> bqmodel:isDescribedBy examples.
>
>
<rdf:li rdf:resource="http://www.pubmed.gov/#8983160"/>
>
>
http://www.pubmed.gov/ means "the following identifier has to be
>
interpreted as pointing to a PubMed record".
>
>
http://www.pubmed.gov/ is unique and should not normally
>
change. However, sometimes it may nevertheless change for various
>
reasons: URI too confusing, badly chosen, fusion of two resources
>
etc. For instance, the old PubMed URI was
>
http://www.ncbi.nlm.nih.gov/PubMed/
>
It was misleading because it was tied to a particular physical resource at
>
the NCBI.
>
>
We have a deprecation system in place that resolves the
>
old URIs and provides the new ones.
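
A minimal Python sketch of how a tool might read such a URI, assuming the "data-type#identifier" layout shown in the examples above. The names `DEPRECATED`, `Annotation` and `parse_annotation` are hypothetical and are not part of MIRIAM or its web services. The deprecation table holds only the PubMed example mentioned above.

```python
from typing import NamedTuple

# Hypothetical lookup of deprecated data-type URIs -> current ones.
# The only entry is the PubMed example quoted in this message.
DEPRECATED = {
    "http://www.ncbi.nlm.nih.gov/PubMed/": "http://www.pubmed.gov/",
}


class Annotation(NamedTuple):
    data_type: str   # URI naming the kind of resource (e.g. PubMed)
    identifier: str  # identifier of the record within that resource


def parse_annotation(uri: str) -> Annotation:
    """Split 'data-type#identifier' and upgrade a deprecated data-type URI."""
    data_type, sep, identifier = uri.partition("#")
    if not sep or not identifier:
        raise ValueError(f"no '#identifier' part in {uri!r}")
    return Annotation(DEPRECATED.get(data_type, data_type), identifier)


if __name__ == "__main__":
    print(parse_annotation("http://www.pubmed.gov/#8983160"))
    print(parse_annotation("http://www.ncbi.nlm.nih.gov/PubMed/#8983160"))
```

Both calls yield the same data type, which is the point of the deprecation step: a model carrying the old URI can still be compared with one carrying the new URI.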
>
>
>
> - how would you address auxiliary references as opposed to primary
>
> references so that a machine interpreting it can make the distinction?
>
>
I am not sure I understand that. Like primary and secondary accessions of
>
UniProt?
For journal articles and other publications, being able to
identify the primary reference(s) is useful. For database records, it
would also be useful to label one group as the most important (or
defining) set, and the others as 'helpful'. That is why I suggested that
CellML bibliographic referencing separate these two, and that the
latter be bound to a reason (a natural language comment
would be fine) that describes why the reference was made.
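
To make the distinction concrete, here is a rough Python sketch of one possible shape for it. This is an illustration only, not an existing CellML metadata structure: `Reference`, `ModelCitations` and `add_auxiliary` are hypothetical names, and the reason text in the usage lines is invented for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Reference:
    uri: str
    reason: str = ""  # free-text comment; mandatory for auxiliary references


@dataclass
class ModelCitations:
    primary: List[Reference] = field(default_factory=list)
    auxiliary: List[Reference] = field(default_factory=list)

    def add_primary(self, uri: str) -> None:
        self.primary.append(Reference(uri))

    def add_auxiliary(self, uri: str, reason: str) -> None:
        # An auxiliary ('helpful') reference must say why it was attached.
        if not reason.strip():
            raise ValueError("an auxiliary reference needs a stated reason")
        self.auxiliary.append(Reference(uri, reason))


if __name__ == "__main__":
    cites = ModelCitations()
    cites.add_primary("http://www.pubmed.gov/#8983160")
    # The reason below is made up purely for illustration.
    cites.add_auxiliary("http://www.doi.org/#10.1063/1.1681288",
                        "background for one parameter choice")
    print(len(cites.primary), len(cites.auxiliary))
```

A machine can then tell the two groups apart by which list a reference sits in, while the reason stays a human-readable comment.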
>
>
>
>
> <snip>
>
>>
>
>> I entirely agree with Melanie, people should be able to pick the
>
>> resource they want, as far as they uniquely identify it. This is
>
>> clearly described in the MIRIAM paper.
>
>
>
> I'm not sure what benefits one gains from letting people arbitrarily
>
> choose what they want to use to identify something with. For example,
>
> how do you work out if particular entities in one SBML model match
>
> entities in another SBML model?
>
>
>
> Also, given that most of these resources are controlled vocabularies,
>
> there is a lot of room for misunderstanding someone's intention when
>
> using their choices of identifiers.
>
>
>
>
>
>
>
>> An annotation is formed of
>
>> three parts:
>
>>
>
>> The data-type, e.g. PubMed entry, DOI, GO term, Cell-type ontology term
>
>> ...
>
>>
>
>> The identifier of the particular information, e.g. 123456789, GO:0001234
>
>> ...
>
>>
>
>> An optional qualifier that describes the relationship between the concept
>
>> represented by the model component and the concept represented by the
>
>> particular information.
>
>>
>
>> To help people implement that, we developed MIRIAM resources
>
>> (http://www.ebi.ac.uk/compneur-srv/miriam/).
>
>>
>
>> If you download a model from BioModels DB in SBML (not in CellML at
>
>> the moment, for obvious reasons highlighted by the current
>
>> discussion), you will see something like:
>
>>
>
>> <bqmodel:isDescribedBy>
>
>> <rdf:Bag>
>
>> <rdf:li rdf:resource="http://www.pubmed.gov/#8983160"/>
>
>> </rdf:Bag>
>
>> </bqmodel:isDescribedBy>
>
>>
>
>> But on the webpage, there is:
>
>>
>
>> <b>Publication ID:</b> <a
>
>> href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=8983160"
>
>> target="_blank">8983160</a>
>
>>
>
>> The URL is dynamically generated by MIRIAM webservices. In fact, in the
>
>> new version of BioModels DB, to be released in the fall, the URL does
>
>> not point to PubMed anymore, but to the EBI extended Medline, more
>
>> comprehensive. BUT the URI stored in the model is still the SAME.
>
>>
>
>> Similarly for a DOI:
>
>>
>
>> <bqmodel:isDescribedBy>
>
>> <rdf:Bag>
>
>> <rdf:li rdf:resource="http://www.doi.org/#10.1063/1.1681288"/>
>
>> </rdf:Bag>
>
>> </bqmodel:isDescribedBy>
>
>>
>
>> is transformed into:
>
>>
>
>> <b>Publication ID:</b> <a href="http://dx.doi.org/10.1063/1.1681288"
>
>> target="_blank">10.1063/1.1681288...</a>
>
>>
>
>> That system is very flexible. You can use any resource listed in
>
>> MIRIAM resources, and this resource can be extended at will (note that
>
>> we distribute an XML version of the resource for local use). But it is
>
>> still robust and expressive.
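
The URI-to-URL step described above can be sketched in Python as follows. This is not the MIRIAM web service itself. `URL_TEMPLATES` and `resolve` are hypothetical names, and the two templates are simply copied from the PubMed and DOI links quoted above. The stored URI never changes; only the table would be edited if a resource moved.

```python
# Hypothetical table: data-type URI -> URL template for a physical location.
# Both templates are taken from the links quoted in this message.
URL_TEMPLATES = {
    "http://www.pubmed.gov/": (
        "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi"
        "?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids={id}"
    ),
    "http://www.doi.org/": "http://dx.doi.org/{id}",
}


def resolve(uri: str) -> str:
    """Turn a stored 'data-type#identifier' URI into a browsable URL."""
    data_type, _, identifier = uri.partition("#")
    if data_type not in URL_TEMPLATES:
        raise LookupError(f"unknown data type: {data_type!r}")
    return URL_TEMPLATES[data_type].format(id=identifier)


if __name__ == "__main__":
    print(resolve("http://www.pubmed.gov/#8983160"))
    print(resolve("http://www.doi.org/#10.1063/1.1681288"))
```

Pointing PubMed identifiers at a different mirror would mean changing one template, with no edit to any model file.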
>
>>
>
>> Cheers,
>
>>
>
>> On Wed, 28 Mar 2007, Melanie Nelson wrote:
>
>>
>
>>> Wow, I haven't posted to this list in a long time...
>
>>> But I feel compelled to give a little advice as
>
>>> someone who's spent a lot of time integrating
>
>>> biological information and therefore has made a lot of
>
>>> mistakes!
>
>>>
>
>>> By all means, have a best practice encouraging people
>
>>> to use the GO cellular_component ontology to describe
>
>>> organelles and cells. You could probably also use the
>
>>> molecular_function ontology for proteins (although
>
>>> this will be messier). However, neither is likely to
>
>>> be complete, i.e., there will be models that
>
>>> reference a biological entity not in the GO
>
>>> ontologies. Also, there will be cases where the entity
>
>>> the model references is most properly thought of as
>
>>> related in some way (e.g., a subset, a superset, or a
>
>>> "sibling") to the GO entity. You can spend ages
>
>>> sorting this sort of thing out and coming up with
>
>>> consistent rules for handling all the relationships.
>
>>>
>
>>>
>
>>> Since you aren't really interested in sorting out this
>
>>> biological mess, you may want to consider letting
>
>>> people choose their own ontology and just reference
>
>>> it.
>
>>> An example of this practice is in the MIAME project:
>
>>> http://www.mged.org/Workgroups/MIAME/miame_1.1.html
>
>>>
>
>>> About the citations- my memory of this is fuzzy, but I
>
>>> think the original intent was that people should
>
>>> provide the PubMed ID where possible. However, not all
>
>>> journals are indexed in PubMed (for instance, there is
>
>>> a CellML paper published in one that is not), so the
>
>>> model needs to handle full citation info, too. The BQS
>
>>> model handles both, and then some, which is why we
>
>>> chose it.
>
>>>
>
>>> Hope this is helpful,
>
>>> Melanie
>
>>>
>
>>>
>
>>> --- Andrew Miller <ak.miller at auckland.ac.nz> wrote:
>
>>>
>
>>>> Matt wrote:
>
>>>>> I don't think this is a good idea.
>
>>>>>
>
>>>>> - I think bioentity should be deprecated, it has
>
>>>> no intrinsic semantic value.
>
>>>>>
>
>>>> It does, unfortunately, seem to usually target a
>
>>>> literal node at the
>
>>>> moment. It would be nice for this to at least be a
>
>>>> resource, which could
>
>>>> provide further information about the biological
>
>>>> entity (or if we decide
>
>>>> not to do that, at least a resource, with a
>
>>>> dictionary and a process for
>
>>>> adding new words to the dictionary to avoid
>
>>>> duplication).
>
>>>>
>
>>>> It seems that GO(Gene Ontology) has terms for cell
>
>>>> types, biological
>
>>>> compartments, and so on, which would offer a better
>
>>>> way to provide this
>
>>>> information.
>
>>>>
>
>>>> I still think that this metadata is useful, even if
>
>>>> the automated
>
>>>> interpretation of it is currently difficult.
>
>>>>> - If it is used currently, it should be left as
>
>>>> its current minimum
>
>>>>> specification which is to label and point to other
>
>>>> bioinformatics
>
>>>>> database IDs.
>
>>>>>
>
>>>> There are three layers of information here:
>
>>>> Layer 1: What biological entity are we describing?
>
>>>> (could be answered
>
>>>> with a GO term).
>
>>>> Layer 2: What information about that biological
>
>>>> entity are we using?
>
>>>> (could be answered with a reference to a paper, and
>
>>>> perhaps even a
>
>>>> reference to raw experimental data).
>
>>>> Layer 3: How was that information translated into a
>
>>>> model (could be
>
>>>> answered with a reference to a paper on the model).
>
>>>>
>
>>>> Layer 3 is clearly information about the model, and
>
>>>> should be described
>
>>>> as an arc of the model resource.
>
>>>> Layer 1 is described by a literal at the moment.
>
>>>>
>
>>>> Layer 2 is therefore a gap, which we don't have any
>
>>>> proper way to
>
>>>> represent now.
>
>>>>> - The problem is not 'biologically related
>
>>>> papers' per se, but one of
>
>>>>> identifying what was the primary publication or
>
>>>> publications that
>
>>>>> motivated a model.
>
>>>>>
>
>>>> The publication which motivated the expression of a
>
>>>> model in CellML, or
>
>>>> the publication which motivated the creation of the
>
>>>> model? Most of the
>
>>>> models in the repository were motivated by a paper
>
>>>> about a model which
>
>>>> was not initially expressed in CellML. However, the
>
>>>> way that the
>
>>>> metadata specification works now is that the paper
>
>>>> which describes the
>
>>>> model (not the paper which motivated it) is
>
>>>> referenced from the
>
>>>> information about the model (not information about
>
>>>> the CellML file).
>
>>>>> - There is also the case where a single
>
>>>> publication that contains a
>
>>>>> mathematical model is the one and only primary
>
>>>> source for the model
>
>>>>> itself - a rather common case at the moment.
>
>>>>>
>
>>>> This is what most models in CellML should aim to
>
>>>> attain. Models can be
>
>>>> submitted prior to publication as a model, but the
>
>>>> step of going from
>
>>>> the biology to a model is something which does need
>
>>>> peer review.
>
>>>>> I would prefer that the primary publication(s) be
>
>>>> identified as such,
>
>>>>> which covers the case where there are some
>
>>>> models in the repository
>
>>>>> built from general review papers of biology with
>
>>>> no math.
>
>>>>>
>
>>>> If a model is built in that way, it should reference
>
>>>> the review papers
>
>>>> as information about the biology, and the author
>
>>>> should ideally submit
>
>>>> it for publication, at which point the reference to
>
>>>> the paper could be
>
>>>> filled in.
>
>>>>> I would prefer references to other related
>
>>>> publications to be bound
>
>>>>> explicitly to a comment in the model metadata -
>
>>>> there should be a
>
>>>>> reason identified by the author/editor/reviewer as
>
>>>> to why there has
>
>>>>> been such an association made.
>
>>>>>
>
>>>> The problem with this is that the comment is not
>
>>>> machine readable, so
>
>>>> there is then no way to get aggregate statistics on
>
>>>> why models are
>
>>>> linked. There is also a potential for significant
>
>>>> duplication of
>
>>>> information, as opposed to a set of standardised
>
>>>> predicate terms for
>
>>>> linking to a set of models.
>
>>>>> As an aside, we also need to determine whether the
>
>>>> bqs schema provides
>
>>>>> enough detail to match publications across
>
>>>> metadata instances for
>
>>>>> different models, and whether we should be
>
>>>> complementing bibliographic
>
>>>>> data with pubmed Ids and the like.
>
>>>>>
>
>>>> I think that the PUBMED ID is always useful, because
>
>>>> it allows CellML
>
>>>> processing software (e.g. the repository) to link
>
>>>> directly to the Entrez
>
>>>> / PUBMED page. We could build links based on
>
>>>> searches for authors and
>
>>>> titles, but a unique ID is much cleaner. It seems
>
>>>> that many repository
>
>>>> models do have PUBMED IDs on them.
>
>>>>
>
>>>> Best regards,
>
>>>> Andrew
>
>>>>
>
>>>> _______________________________________________
>
>>>> cellml-discussion mailing list
>
>>>> cellml-discussion at cellml.org
>
>>>>
>
>>> http://www.cellml.org/mailman/listinfo/cellml-discussion
>
>>>>
>
>>>
>
>>>
>
>>>
>
>>>
>
>
>>>
>
>>
>
>> --
>
>> Nicolas LE NOVERE, Computational Neurobiology,
>
>> EMBL-EBI, Wellcome-Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
>
>> Tel: +44(0)1223494521, Fax: +44(0)1223494468, Mob: +44(0)7833147074
>
>> http://www.ebi.ac.uk/~lenov, AIM: nlenovere, MSN: nlenovere at
>
>> hotmail.com
>
>
>>
>
>
>
>
>
--
>
Nicolas LE NOVERE, Computational Neurobiology,
>
EMBL-EBI, Wellcome-Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
>
Tel: +44(0)1223494521, Fax: +44(0)1223494468, Mob: +44(0)7833147074
>
http://www.ebi.ac.uk/~lenov, AIM: nlenovere, MSN: nlenovere at hotmail.com
>