Bundles/hla

Bundle: hla
Graph: <http://purl.org/science/graph/hla>

This bundle is derived from the file hla.dat, exported by the IMGT/HLA database.

The database, like so many others, is a set of records, each with its own accession id. Each record describes one HLA allele. For example, record HLA00007 gives information related to allele HLA-A*0202.

The HLA records are given pseudo-common-naming-system URIs, e.g. <http://purl.org/commons/record/hla/HLA00007>. The following query lists the properties and values stored for that record:

select distinct ?p ?o
where 
 {
   graph <http://purl.org/science/graph/hla> { <http://purl.org/commons/record/hla/HLA00007> ?p ?o. }
 }

However, for now the records themselves are not of much use. Each record links to its allele via foaf:primaryTopic (just as Medline records link to journal articles); the alleles are classes with URIs defined in the MaHCO bundle Bundles/mahco-hla. The important links in the hla bundle run from papers indexed by Medline to alleles. The following query lists the subjects and predicates that point at allele A*0202:

select distinct ?s ?p
where 
 {
   graph <http://purl.org/science/graph/hla> { ?s ?p <http://purl.org/stemnet/HLA#A_0202>. }
 }

The articles are related to the alleles via both IAO "mentions" <http://purl.obofoundry.org/obo/IAO_0000142> (for uniformity with Bundles/medline/alleles) and IAO "is about" <http://purl.obofoundry.org/obo/IAO_0000136>. The second reflects the much stronger relationship implied by manual curation: IMGT/HLA lists the article as a reference for that particular allele.
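
To restrict results to the manually curated references, a query can fix the predicate to IAO "is about" instead of leaving it open. The sketch below is one way to build such a query string in Python. The helper name articles_query and its curated_only flag are illustrative and not part of the bundle, and the generated query has not been run against a live endpoint.

```python
GRAPH = "http://purl.org/science/graph/hla"
IAO_MENTIONS = "http://purl.obofoundry.org/obo/IAO_0000142"
IAO_IS_ABOUT = "http://purl.obofoundry.org/obo/IAO_0000136"


def articles_query(allele_uri: str, curated_only: bool = True) -> str:
    """Build a SPARQL query listing articles linked to one allele.

    Hypothetical helper: curated_only=True selects the stronger
    "is about" links, False selects the weaker "mentions" links.
    """
    predicate = IAO_IS_ABOUT if curated_only else IAO_MENTIONS
    return (
        "select distinct ?article\n"
        "where\n"
        " {\n"
        f"   graph <{GRAPH}> {{ ?article <{predicate}> <{allele_uri}>. }}\n"
        " }\n"
    )


if __name__ == "__main__":
    # Allele URI taken from the A*0202 query in this page.
    print(articles_query("http://purl.org/stemnet/HLA#A_0202"))
```

Passing curated_only=False yields the same query with the "mentions" predicate.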

As of August 2009 this bundle provides 2848 links from article to allele.

In the future we might extract other information from the flat file, such as gene and/or protein sequences, or UniProt references.


Reference Sequences and Perturbations

The IMGT also provides sequence alignments; for example, B_prot.txt for HLA-B:

HLADB-2.26.0-Jul 2009
HLA-B Protein Sequence Alignments
Sequences Aligned: 17 July 2009
Steven G. E. Marsh, Anthony Nolan Research Institute.

Prot. Pos.        -30        -20        -10                10         20
B*070201                MLVM APRTVLLLLS AALALTETWA GSHSMRYFYT SVSRPGRGEP
...
B*080101                ---- ---------- ---------- --------D- AM--------
...
...
B*1404                  **** ********** ********** *-----H--- A---------
...
B*15010102N             -R-T ---------- G--------- -ECGVGREMA --G-SEGTAG
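
In these alignment files the first row (B*070201 above) is the reference sequence. In the other rows a dash marks a residue identical to the reference and an asterisk marks a position that has not been sequenced. The sketch below reads the perturbations off one aligned row. It is an illustration only, not the code used to build the bundle: the function name and the leader_len parameter are assumptions, with 24 read off the excerpt (the number of residues before position 1), and insertion/deletion markers are not handled.

```python
# Rows copied from the B_prot.txt excerpt, without the allele names.
REFERENCE = "MLVM APRTVLLLLS AALALTETWA GSHSMRYFYT SVSRPGRGEP"
B_1404 = "**** ********** ********** *-----H--- A---------"


def perturbations(reference: str, aligned: str, leader_len: int = 24):
    """Return (position, reference residue, allele residue) tuples.

    leader_len is an assumption taken from the excerpt: the residues
    before position 1 are numbered -leader_len..-1, with no position 0.
    """
    ref = reference.replace(" ", "")
    aln = aligned.replace(" ", "")
    found = []
    for i, (r, a) in enumerate(zip(ref, aln)):
        if a in "-*":  # identical to the reference, or not sequenced
            continue
        pos = i - leader_len if i < leader_len else i - leader_len + 1
        found.append((pos, r, a))
    return found


if __name__ == "__main__":
    print(perturbations(REFERENCE, B_1404))
    # prints [(7, 'Y', 'H'), (11, 'S', 'A')]
```

Within the stretch of sequence shown in the excerpt, B*1404 differs from the reference at position 7 (tyrosine to histidine) and position 11 (serine to alanine).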

The items in the reference sequences are represented as

 {
  ?locus sc:primary_allele ?allele.
  ?allele ro:has_part ?item.
  ?item rdf:type ?x
 }

for each item, where ?x is sc:SequenceGap or a ChEBI amino acid class, and ?locus is an HLA gene/locus (HLA-B in the data above).

Perturbations in other alleles are represented as

 {
  ?allele sc:perturbation ?item.
  ?item sc:reference_sequence_item ?ritem.
 }

where ?ritem is the item from the reference sequence corresponding to ?item.
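
As a rough illustration of this shape, the sketch below builds the two triples for a single perturbation as plain Python tuples. Several things here are assumptions rather than facts about the bundle: the item URI pattern is copied from perturbation URIs such as http://purl.org/science/hla/pos/B*1404/0007, the reference item is assumed to follow the same pattern under the reference allele, the allele URI used in the example is a guess by analogy with the A*0202 URI, and only positive positions are considered.

```python
SC = "http://purl.org/science/owl/sciencecommons/"
POS_BASE = "http://purl.org/science/hla/pos/"


def item_uri(allele_label: str, pos: int) -> str:
    """ASSUMED URI pattern: <base>/<allele label>/<4-digit position>."""
    return f"{POS_BASE}{allele_label}/{pos:04d}"


def perturbation_triples(allele_uri, allele_label, ref_label, pos):
    """Return the triples linking an allele to one perturbed item.

    Hypothetical helper; the reference item URI is an assumption.
    """
    item = item_uri(allele_label, pos)
    ritem = item_uri(ref_label, pos)  # ASSUMPTION: same pattern as item
    return [
        (allele_uri, SC + "perturbation", item),
        (item, SC + "reference_sequence_item", ritem),
    ]


if __name__ == "__main__":
    # Hypothetical allele URI, by analogy with .../HLA#A_0202.
    for triple in perturbation_triples(
        "http://purl.org/stemnet/HLA#B_1404", "B*1404", "B*070201", 7
    ):
        print(triple)
```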

For example, the following query finds the perturbations in B*1404:

 prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
 prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 prefix sc:   <http://purl.org/science/owl/sciencecommons/>
 prefix ro:   <http://www.ifomis.org/bfo/1.1/ro#>
 
 select ?pos ?from ?p ?to
 where 
  {
   # ?p is a perturbed item of B*1404; ?to is the label of its residue class
   ?a rdfs:label "B*1404".
   ?a sc:perturbation ?p.
   ?p sc:reference_sequence_item ?item.
   ?p rdf:type [ rdfs:label ?to ].
   # ?item is the matching reference item; ?from is the label of its residue class
   ?item rdf:value ?pos.
   ?item rdf:type [ rdfs:label ?from ].
  }
 order by ?pos


pos  from                   p                                            to
7    tyrosine residue       http://purl.org/science/hla/pos/B*1404/0007  histidine residue
11   serine residue         http://purl.org/science/hla/pos/B*1404/0011  alanine residue
66   isoleucine residue     http://purl.org/science/hla/pos/B*1404/0066  asparagine residue
67   tyrosine residue       http://purl.org/science/hla/pos/B*1404/0067  cysteine residue
69   alanine residue        http://purl.org/science/hla/pos/B*1404/0069  threonine residue
70   glutamine residue      http://purl.org/science/hla/pos/B*1404/0070  asparagine residue
71   alanine residue        http://purl.org/science/hla/pos/B*1404/0071  threonine residue
97   serine residue         http://purl.org/science/hla/pos/B*1404/0097  tryptophan residue
113  histidine residue      http://purl.org/science/hla/pos/B*1404/0113  tyrosine residue
114  aspartic acid residue  http://purl.org/science/hla/pos/B*1404/0114  asparagine residue
116  tyrosine residue       http://purl.org/science/hla/pos/B*1404/0116  phenylalanine residue
131  arginine residue       http://purl.org/science/hla/pos/B*1404/0131  serine residue
156  arginine residue       http://purl.org/science/hla/pos/B*1404/0156  leucine residue
163  glutamic acid residue  http://purl.org/science/hla/pos/B*1404/0163  threonine residue
171  tyrosine residue       http://purl.org/science/hla/pos/B*1404/0171  histidine residue
177  aspartic acid residue  http://purl.org/science/hla/pos/B*1404/0177  glutamic acid residue
178  lysine residue         http://purl.org/science/hla/pos/B*1404/0178  threonine residue
180  glutamic acid residue  http://purl.org/science/hla/pos/B*1404/0180  glutamine residue
