Common Naming Project
Goal: URIs for things that we need to talk about in life sciences, such as PubMed and Entrez Gene records, that meet the stated URI requirements and, where necessary, implement a separation of concerns between naming and publishing.
To ensure that the infrastructure (domain name, purl forwarding, server behavior, or whatever) is accountable to the wider community, we envision a steering committee including community members from a representative variety of organizations. To guard against infrastructure neglect, several independent parties would hold the power to change the DNS configuration (should we decide to secure a domain name) or the purl.org redirects. Science Commons controls the configuration of the current prototype, but should others choose to join this effort we would recruit them to the steering committee, as we don't want this to be exclusively controlled (or supported) by Science Commons, or by anyone else.
Current approach: The new URIs begin with http://purl.org/commons/. We may decide that a separate domain name is needed, although to exploit the redirect and authorization services at purl.org, the new domain name might end up just pointing to purl.org.
Example: http://purl.org/commons/record/ncbi_gene/1003064
Provisional URIs coined so far are described here. In sum:
- URIs are of the form http://purl.org/commons/type/database/key
- 'type' may be record (the database record without commitment to representation), xml (an XML version of the record), html, or occasionally the type of the thing described (e.g. 'article' for journal articles)
- 'key' is usually the native key or identifier for the particular record within its database
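The type/database/key pattern above can be sketched in a few lines of Python. This is only an illustration, not project tooling: the function names are made up for this example, and percent-encoding the key is an assumption (the scheme as described does not say how keys containing reserved characters are handled).

```python
from urllib.parse import quote, unquote

# Prefix taken from the text; everything else here is illustrative.
BASE = "http://purl.org/commons/"


def make_uri(type_, database, key):
    """Build a URI of the form BASE + type/database/key."""
    for name, part in (("type", type_), ("database", database)):
        if not part or "/" in part:
            raise ValueError(f"{name} must be a non-empty path segment: {part!r}")
    if not str(key):
        raise ValueError("key must be non-empty")
    # Assumption: keys are percent-encoded so that a '/' inside a key
    # cannot be mistaken for a path separator.
    return BASE + "/".join([type_, database, quote(str(key), safe="")])


def parse_uri(uri):
    """Split a URI of the above form back into (type, database, key)."""
    if not uri.startswith(BASE):
        raise ValueError(f"not a {BASE} URI: {uri!r}")
    parts = uri[len(BASE):].split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected type/database/key after the prefix: {uri!r}")
    type_, database, key = parts
    return type_, database, unquote(key)


if __name__ == "__main__":
    print(make_uri("record", "ncbi_gene", "1003064"))
    # http://purl.org/commons/record/ncbi_gene/1003064
```

Keeping the three components separate in code makes it cheap to try the alternative orderings that are still under discussion.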
These URIs seem to be catching on, so although we say they are experimental, we may be shamed into saying that they are the real thing. The choice of type/database as opposed to database/type is one point of doubt. The documentation for these URIs leaves much to be desired.
The specific 'xml', 'html', 'asn' etc. URIs forward directly to pages on various servers. The 'record' URI leads to a page generated by a Neurocommons server. It does not lead to the record itself, but rather gives (via a 303 redirect) a basic description of the record and a suite of useful links, including links to the specific record encodings and to third-party information sources (such as scripts providing RDF renderings of the records). This meta-page is (or rather will be) provided in both RDF and human-readable form.
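A client can tell the two behaviors apart by looking at the first HTTP response without following it. The sketch below uses Python's standard http.client, which does not follow redirects on its own. The function names are invented for this example, and the set of status codes treated as a plain forward (301, 302, 307) is an assumption, since the text only states that the 'record' URI answers with a 303.

```python
import http.client
from urllib.parse import urlsplit


def fetch_without_following(uri, timeout=10.0):
    """Issue one GET and return (status, Location header or None)."""
    parts = urlsplit(uri)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def interpret(status, location):
    """Classify a response as a description, a forward, or neither."""
    if status == 303 and location:
        # "See Other": the target describes the named thing; it is not the thing.
        return ("description", location)
    if status in (301, 302, 307) and location:
        # Assumption: encoding URIs ('xml', 'html', ...) forward with one of these.
        return ("forward", location)
    if 200 <= status < 300:
        return ("direct", None)
    return ("other", None)


if __name__ == "__main__":
    # Requires network access; the result depends on the live purl.org setup.
    status, location = fetch_without_following(
        "http://purl.org/commons/record/ncbi_gene/1003064")
    print(interpret(status, location))
```

Separating the network call from the classification keeps the decision logic usable offline, for instance against recorded responses.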
Variants under consideration:
- using a dedicated host name as an alias for purl.org, so that we can use the purl.org server for the time being without being tied to it indefinitely (this is the approach taken by OBO Foundry)
- writing database/type instead of type/database to better support delegation within purl.org's authorization framework
- writing key.type instead of type/key to support programs that determine file type from file extension (this is not currently supported by purl.org)
- writing .../FOO_key instead of .../key to support the use of QNames in RDF/XML and Turtle (the local part of a QName, the part after the colon, cannot begin with a digit) (this is not currently supported by purl.org)
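To compare the variants side by side, here is a rough Python sketch that renders one record under each of them. None of this is settled: the alias host name is a placeholder, the position of the database segment in the key.type form is a guess (the variant as stated does not say where it goes), and "FOO" stands in for whatever prefix might eventually be chosen.

```python
BASE = "http://purl.org/commons/"
ALIAS_BASE = "http://names.example.org/"  # hypothetical dedicated host name


def current_form(type_, database, key, base=BASE):
    """type/database/key, the form in use today."""
    return f"{base}{type_}/{database}/{key}"


def database_first_form(type_, database, key, base=BASE):
    """database/type/key, which would ease per-database delegation."""
    return f"{base}{database}/{type_}/{key}"


def extension_form(type_, database, key, base=BASE):
    """key.type, with the database placed first (an assumption)."""
    return f"{base}{database}/{key}.{type_}"


def qname_safe_key(key, prefix="FOO"):
    """FOO_key, so that the local name no longer starts with a digit."""
    return f"{prefix}_{key}"


def starts_with_digit(local_name):
    """True if the name would be rejected as the local part of a QName."""
    return local_name[:1].isdigit()


if __name__ == "__main__":
    args = ("xml", "ncbi_gene", "1003064")
    print(current_form(*args))
    print(current_form(*args, base=ALIAS_BASE))
    print(database_first_form(*args))
    print(extension_form(*args))
    print(current_form("record", "ncbi_gene", qname_safe_key("1003064")))
```

With a prefix such as ex: bound to http://purl.org/commons/record/ncbi_gene/, the name ex:1003064 is the problematic case described above, while ex:FOO_1003064 is not.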
Suggested reading:
- Separation of concerns
- Common Naming report 2007
- The Life Sciences Semantic Web is Full of Creeps
- Life Science Record Name (LSRN)
Technical detail: We are relying (for the time being) on purl.org to field requests for documentation for URIs that name encodings (e.g. XML) of database records; purl.org forwards to the encoding without going through our server. It is therefore a challenge to direct clients to URI documentation ("metadata") via the URI documentation protocol. If purl.org were kind enough to insert Link: headers in its 30x responses, then a link to URI documentation could be specified, but this service is not yet implemented by purl.org. Until a direct solution becomes available, clients that care about documentation for these URIs might apply a documentation source override to obtain it.
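If Link: headers were ever added to the 30x responses, a client could pick the documentation link out of them roughly as follows. This is a hypothetical sketch: purl.org does not send such headers today, the relation name "describedby" is an assumption made for this example rather than something fixed here, and the parser handles only the common header shapes, not every corner of the Link header syntax.

```python
import re

# One link-value: <target> followed by zero or more ;name=value parameters.
_LINK = re.compile(
    r'<([^>]*)>\s*((?:;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^\s;,]+)\s*)*)')
_PARAM = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^\s;,]+))')


def find_documentation_link(link_header, relation="describedby"):
    """Return the first target whose rel includes `relation`, else None."""
    for match in _LINK.finditer(link_header or ""):
        target, params = match.group(1), match.group(2)
        for name, quoted, bare in _PARAM.findall(params):
            if name.lower() != "rel":
                continue
            # A rel parameter may carry several space-separated relation types.
            relations = (quoted or bare).lower().split()
            if relation.lower() in relations:
                return target
    return None


if __name__ == "__main__":
    # Made-up header value, for illustration only.
    header = '<http://example.org/doc/1003064>; rel="describedby"'
    print(find_documentation_link(header))
    # http://example.org/doc/1003064
```

A client would apply this to the headers of the 30x response itself, before following the redirect to the encoding.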
