What are the entities that BioPAX-OBO should represent?
- Review BioPAX: whatever it already covers probably ought to be covered here too.
- Existing BioPAX forms a kind of requirement.
- Which BioPAX level? All, eventually? No - the target should be subcellular biological phenomena, or some principled domain.
- Identify gaps
BioPAX uses entities that are ambiguous between individual molecules and pools of molecules.
Sample ontology questions:
- Can you ascribe a temperature to a molecule?
- So what is temperature? (As an example of a thermodynamic parameter.)
- An ensemble has a probability distribution, and temperature describes that distribution.
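The temperature question can be made concrete with a small sketch. This is only an illustration, not a proposed model: the class names `Molecule` and `MoleculePool` are hypothetical, and the formula assumes an ideal gas, where mean translational kinetic energy equals (3/2) k_B T. The point is that temperature is computed from the pool and has no counterpart on the individual molecule.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Tuple

K_B = 1.380649e-23  # Boltzmann constant in J/K


@dataclass(frozen=True)
class Molecule:
    """One individual molecule (hypothetical class): it has an energy, not a temperature."""
    kinetic_energy_j: float  # translational kinetic energy in joules


@dataclass(frozen=True)
class MoleculePool:
    """A pool (ensemble) of molecules (hypothetical class): temperature lives here."""
    members: Tuple[Molecule, ...]

    def temperature_k(self) -> float:
        # Ideal-gas assumption: <E_kin> = (3/2) k_B T, so T = 2 <E_kin> / (3 k_B).
        if not self.members:
            raise ValueError("temperature is undefined for an empty pool")
        mean_energy = mean(m.kinetic_energy_j for m in self.members)
        return 2.0 * mean_energy / (3.0 * K_B)


# Toy pool whose member energies average to the 300 K value.
pool = MoleculePool(tuple(Molecule(1.5 * K_B * t) for t in (290.0, 300.0, 310.0)))
```

Calling `pool.temperature_k()` gives a value for the pool as a whole, while `Molecule` deliberately offers no such method.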
Need to identify processes and relationships, in addition to things (or classes thereof)
Enable clean distinctions between things that usually get lumped together.
Clear mereological relations
One recurring problem: we want to say things about "proteins" (and other molecular entities), but it's not always clear what entity (in a principled sense) we are talking about: an individual molecule, a pool of molecules, or maybe something else. So it's not clear which "protein classes" we need and what their members will be. Having multiple classes for a "single" protein has unfortunate practical consequences.
The conceptual work of identifying relevant entities is the first step in a process that leads to publishable ontology document(s).
It's a requirement that the ontology let you track provenance. Biologists don't like statements that are not supported somehow (PMID, etc.).
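One way to picture the provenance requirement is a statement record that carries its supporting evidence, plus a check that flags unsupported statements. This is a minimal sketch under assumptions: the names `Evidence`, `Statement`, and `unsupported` are hypothetical, the identifier value is a placeholder and not a real PMID, and nothing here reflects how BioPAX or OBO actually encode evidence.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Evidence:
    """A pointer to a supporting source (hypothetical structure)."""
    source: str      # kind of reference, e.g. "PMID"
    identifier: str  # identifier within that source


@dataclass(frozen=True)
class Statement:
    """A subject-predicate-object assertion with its provenance attached."""
    subject: str
    predicate: str
    obj: str
    evidence: Tuple[Evidence, ...] = ()

    @property
    def is_supported(self) -> bool:
        # A statement counts as supported if it cites at least one source.
        return len(self.evidence) > 0


def unsupported(statements: Iterable[Statement]) -> List[Statement]:
    """Return the statements that carry no evidence at all."""
    return [s for s in statements if not s.is_supported]


# Placeholder data; "PLACEHOLDER" stands in for a real identifier.
cited = Statement("protein A", "participates_in", "reaction R",
                  (Evidence("PMID", "PLACEHOLDER"),))
bare = Statement("protein B", "participates_in", "reaction R")
flagged = unsupported([cited, bare])
```

Here `flagged` contains only the statement without a citation, which is the kind of assertion a curator would want to review.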
We need to give guidance on how to obtain canonical identifiers for important classes (such as those for genes or proteins) and how mappings are to be used.
Sample knotty situation that we need to account for: A UniProt ID identifies an extension of proteins in the world. Reactome uses a UniProt ID to identify a protein in a reaction. In theory, Reactome is talking about a subset of all the proteins referred to by that UniProt ID.
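The subset relation in this situation can be sketched with explicit sets. This is a toy illustration only: real class extensions cannot be enumerated, the `ProteinClass` name and the member labels `m1` to `m4` are hypothetical stand-ins for individual molecules, and `UniProt:EXAMPLE` is a placeholder rather than a real accession.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ProteinClass:
    """A class of protein molecules, modelled by a toy, enumerable extension."""
    label: str
    extension: FrozenSet[str]  # stand-in identifiers for individual molecules

    def is_subclass_of(self, other: "ProteinClass") -> bool:
        # Subclass here means: every member of self is also a member of other.
        return self.extension <= other.extension


# Everything the (placeholder) UniProt ID refers to.
uniprot_class = ProteinClass("UniProt:EXAMPLE", frozenset({"m1", "m2", "m3", "m4"}))

# Only the molecules that take part in one particular reaction.
reaction_participant = ProteinClass(
    "UniProt:EXAMPLE as participant in reaction R", frozenset({"m2", "m3"})
)
```

In this sketch `reaction_participant.is_subclass_of(uniprot_class)` holds while the reverse does not, which is why reusing the bare UniProt ID for the reaction participant blurs two different classes.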