x + 3

Dereferencing URIs

It’s a topic that has come up countless times in discussions of the Semantic Web (e.g.), and it came up recently on #code4lib: should all URIs be dereferenceable, or is it worthwhile to use non-HTTP URI schemes or non-resolving HTTP URIs?

The consensus among Semantic Web developers seems to be that URIs need not be dereferenceable, which makes a certain amount of sense. If you give me the URI “http://jonathan.brinley.name/”, what would you put at the location “http://jonathan.brinley.name/”? If it’s a description of me, that description also has the URI “http://jonathan.brinley.name/”, giving us two resources with the same URI. With this data now in our system, we can make absurd statements like:
<http://jonathan.brinley.name/> <#describes> <http://jonathan.brinley.name/> .
This is all very ambiguous, since it could be saying:

  1. I’m describing myself
  2. I’m describing the document at “http://jonathan.brinley.name/”
  3. The document at “http://jonathan.brinley.name/” is describing me
  4. The document at “http://jonathan.brinley.name/” is describing itself

Thus the GIGO principle rears its ugly head. If you give two separate resources the same URI (which is supposed to be a globally unique identifier, remember), then you should expect ambiguity to follow. If you want to identify something uniquely, and that something is not on the web, you should give it a distinct URI from something that is on the web.
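To make the ambiguity concrete, here is a minimal Python sketch. It is only an illustration: the `interpretations` helper, the "the person" / "the document" labels, and the two `example.org` URIs are hypothetical names introduced for this example. It enumerates every reading of a statement when one URI may denote two things, then shows that a single reading remains once each thing has its own URI.

```python
# The URI from the post, assumed here to denote both a person and a document.
SHARED = "http://jonathan.brinley.name/"

# Hypothetical distinct URIs, invented for illustration only.
PERSON = "http://example.org/id/jonathan"
DOCUMENT = "http://example.org/doc/jonathan"


def interpretations(triple, referents):
    """Return every (subject, object) reading of a triple.

    `referents` maps each URI to the list of things it might denote.
    """
    subject, _predicate, obj = triple
    return [(s, o) for s in referents[subject] for o in referents[obj]]


# One URI, two possible referents: the statement has four readings.
ambiguous = interpretations(
    (SHARED, "#describes", SHARED),
    {SHARED: ["the person", "the document"]},
)

# Two URIs, one referent each: exactly one reading survives.
unambiguous = interpretations(
    (DOCUMENT, "#describes", PERSON),
    {DOCUMENT: ["the document"], PERSON: ["the person"]},
)

for reading in ambiguous:
    print("ambiguous:", reading)
print("unambiguous:", unambiguous)
```

The four ambiguous readings correspond, in order, to the four numbered interpretations listed above.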

So, that answered, we turn to the second half of the problem: is it worthwhile to use non-HTTP URI schemes or non-resolving HTTP URIs?

The recent discussion started with a mention of “info” URIs. These can be used to uniquely identify resources, but have the (potential) drawback of not being dereferenceable. As established above, non-dereferenceability is not inherently bad. If one simply wants to identify something uniquely, the “info” scheme will work, as will several other schemes.

But there is a certain utility in dereferenceability. As edsu asked: “if you were processing an xml file that included a particular namespace wouldn’t it be nice to get a document that describes that namespace without resorting to google?” This is a place where the HTTP scheme can still be useful, even if the resource itself isn’t available online. Nothing says a server has to respond to an HTTP GET request with either a 200 “OK” or a 404 “Not Found”. A 303 “See Other” is a perfectly reasonable response to a request for a resource when all the server can provide is a description of that resource. The server can then point to the URI where the description resides, which will be distinct from the URI of the resource it describes.
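As a rough sketch of what such a server might look like, the following uses Python’s standard `http.server` module. The paths `/id/jonathan` and `/doc/jonathan`, the `SeeOtherHandler` class, the `make_server` helper, and the description text are all assumptions made for illustration; nothing here reflects how any particular site is actually configured.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical paths, chosen for illustration only.
THING_PATH = "/id/jonathan"   # identifies something that is not a web document
DOC_PATH = "/doc/jonathan"    # identifies a document describing that thing

DESCRIPTION = b"A description of the resource identified by /id/jonathan.\n"


class SeeOtherHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == THING_PATH:
            # The thing itself cannot be sent over HTTP, so point the
            # client at the separate URI where its description lives.
            self.send_response(303)
            # Relative reference; clients resolve it against the request URI.
            self.send_header("Location", DOC_PATH)
            self.send_header("Content-Length", "0")
            self.end_headers()
        elif self.path == DOC_PATH:
            # The description is an ordinary document, so serve it directly.
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(DESCRIPTION)))
            self.end_headers()
            self.wfile.write(DESCRIPTION)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the example quiet; remove to see request logs


def make_server(port=0):
    """Create a server bound to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), SeeOtherHandler)
```

Calling `make_server(8000).serve_forever()` would start the sketch locally. A GET for `/id/jonathan` should then come back as a 303 with a `Location` header pointing at `/doc/jonathan`, which in turn answers with a 200 and the description.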

Author: Jonathan Brinley | Posted on: 2007-05-08, 2008-09-27 | Categories: Metadata | Tags: dereferencing, HTTP, Semantic Web, URIs
