Building Reactome
- Assumes that you have MySQL installed and a local installation of D2RQ
- Download the SQL for reactome from http://reactome.org/download/current/sql.gz
- The top of the Reactome SQL dump states the version number, e.g.
-- MySQL dump 10.9
--
-- Host: localhost    Database: test_reactome_25
-- ------------------------------------------------------
- Let's say that the version number is 25 (as it is now). In mysql:
create database reactome_25;
- Now, in the shell, rename the sql.gz file to something with the version in it, say sql_reactome_25 (uncompress it first if it is still gzipped), then load it into the new database:
cat sql_reactome_25 | mysql -u root reactome_25
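The version can also be read from the dump header instead of by eye. A minimal sketch, assuming the dump is uncompressed and that the Database: line in its header ends in reactome_<version> as in the example above. The function name reactome_version is made up for illustration.

```shell
# Hypothetical helper: print the version number from the header of an
# uncompressed Reactome SQL dump (assumes a "Database: ...reactome_NN" line).
reactome_version() {
  head -n 10 "$1" |
    sed -n 's/.*Database: [A-Za-z_]*reactome_\([0-9][0-9]*\).*/\1/p' |
    head -n 1
}
```

For example, v=$(reactome_version sql_reactome_25) yields a value that can be used to build the database name reactome_$v.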
- Change into the D2RQ installation directory, then run the following to generate the default mapping to RDF:
./generate-mapping -u root -d com.mysql.jdbc.Driver \
  -o reactome_25_auto.n3 jdbc:mysql://127.0.0.1/reactome_25
- Create a new version of reactomemapping.pl (called reactomemapping25.pl below) in which the __DATA__ section is replaced with the contents of reactome_25_auto.n3
- Edit the jdbc line:
d2rq:jdbcDSN "jdbc:mysql://127.0.0.1/reactome_25";
to
d2rq:jdbcDSN "jdbc:mysql://127.0.0.1/reactome_25?zeroDateTimeBehavior=convertToNull";
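If this edit has to be repeated for each release, it can be scripted. A minimal sketch, assuming the generated mapping contains a d2rq:jdbcDSN line of the form shown above. The function name add_zero_date_option is illustrative only.

```shell
# Hypothetical filter: append the zeroDateTimeBehavior option to a MySQL
# jdbcDSN that has no query string yet. Reads stdin, writes stdout.
add_zero_date_option() {
  sed 's|\(d2rq:jdbcDSN "jdbc:mysql://[^"?]*\)";|\1?zeroDateTimeBehavior=convertToNull";|'
}
```

Usage would be along the lines of add_zero_date_option < reactome_25_auto.n3 > reactome_25_fixed.n3 (the output file name is made up). A DSN that already carries a query string is left unchanged, so running the filter twice is harmless.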
- Some classes/tables may have changed from a previous version. I used diff on the generated D2RQ maps to find out which ones did. From 24 to 25 we lost map:LiteratureReference__1stAuthorSurname (a slot) and gained map:Person_project (another slot).
- Table ReactionlikeEvent_2_entityOnOtherCell (a class/slot mapping table?)
- Table Complex_2_entityOnOtherCell (a class/slot mapping table?)
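A plain diff of two generated maps can be noisy. One way to narrow it down is to compare only the map: identifiers each file uses. This is a sketch, not the procedure used above: the function names are hypothetical, and it assumes identifiers consist of letters, digits and underscores.

```shell
# Hypothetical helpers for comparing two generated D2RQ mapping files.
# map_ids prints the distinct map:... identifiers used in one file.
map_ids() {
  grep -o 'map:[A-Za-z0-9_][A-Za-z0-9_]*' "$1" | LC_ALL=C sort -u
}

# diff_maps OLD NEW prints identifiers found only in OLD ("lost")
# or only in NEW ("gained").
diff_maps() {
  old_ids=$(mktemp) && new_ids=$(mktemp) || return 1
  map_ids "$1" > "$old_ids"
  map_ids "$2" > "$new_ids"
  LC_ALL=C comm -23 "$old_ids" "$new_ids" | sed 's/^/lost /'
  LC_ALL=C comm -13 "$old_ids" "$new_ids" | sed 's/^/gained /'
  rm -f "$old_ids" "$new_ids"
}
```

Called as diff_maps reactome_24_auto.n3 reactome_25_auto.n3 (assuming the previous release's map was kept under that name), it lists one identifier per line.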
- The frames version of the ontology is stored as a blob in the ontology column of the Ontology table. Running select ontology from Ontology returns it, crud and all. Only the defclass Lisp forms are of interest. A modified version of those defclass statements becomes reactome-25-frames.lisp. Using reactome-frames.lisp as a model, edit reactome-25-frames.lisp into shape: replace %3A with ":", remove the ";+", and copy a couple of definitions from the top of reactome-frames.lisp into the new file.
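The two textual substitutions in this step are mechanical and can be done with sed. A minimal sketch covering only those substitutions; picking out the defclass forms and copying the definitions from reactome-frames.lisp remain manual. The function name clean_frames is made up.

```shell
# Hypothetical filter: replace every %3A with ":" and delete every ";+".
# Reads stdin, writes stdout.
clean_frames() {
  sed -e 's/%3A/:/g' -e 's/;+//g'
}
```

For example, clean_frames < defclasses-raw.lisp > reactome-25-frames.lisp, where defclasses-raw.lisp is an assumed name for a file holding the extracted defclass forms.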
- Having determined that there are no major changes in kind, we should be able to run reactomemapping25.pl to generate the *real* D2RQ mapping, then run D2RQ to generate the RDF. First we adjust the namespaces:
@prefix map: <http://purl.org/science/d2rq/reactome25/> .
@prefix vocab: <http://purl.org/science/ontology/reactome/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .
@prefix record: <http://purl.org/commons/record/reactome/> .
- Run reactomemapping25.pl
perl reactomemapping25.pl > reactomemapping25.n3
- There is one more manual fix, having to do with the GO URIs. The database stores accessions as strings. The properties are: GO_CellularComponent.accession, GO_MolecularFunction.accession, GO_BiologicalProcess.accession. We define a new property that has the GO URI as its value. Here is one of them; the others follow by analogy (or by copying from a previous version). These are added to reactomemapping25.n3:
map:GO_BiologicalProcess_property_uri a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:GO_BiologicalProcess;
    d2rq:property vocab:uri;
    d2rq:uriPattern "http://purl.org/obo/owl/GO#GO_@@GO_BiologicalProcess.accession@@";
    .
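Since the three bridges differ only in the class name, they can be generated rather than typed. A sketch under the assumption that the other two classes use exactly the same URI pattern as the GO_BiologicalProcess example. The function name go_uri_bridges is hypothetical.

```shell
# Hypothetical generator: print one uri PropertyBridge per GO class,
# following the GO_BiologicalProcess example (the pattern for the other
# two classes is assumed to be the same).
go_uri_bridges() {
  for cls in GO_BiologicalProcess GO_CellularComponent GO_MolecularFunction; do
    cat <<EOF
map:${cls}_property_uri a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:${cls};
    d2rq:property vocab:uri;
    d2rq:uriPattern "http://purl.org/obo/owl/GO#GO_@@${cls}.accession@@";
    .
EOF
  done
}
```

The output can be appended with go_uri_bridges >> reactomemapping25.n3.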
- Now we can dump the RDF. In the D2RQ directory:
sh dump-rdf -m ~/neuro/convert/reactome/reactomemapping25.n3 -o reactome25.nt \
  -f N-TRIPLE -b http://purl.org/commons/record/reactome/
- Now generate the OWL file for the schema. In Lisp, load reactome-frames.lisp (I added |person| and |entityOnOtherCell| to the list of properties at the top). Then load frames-to-owl.lisp and serialize with:
(write-rdfxml reactome-record "reactome-records-25.owl")
