Uniform Resource Identifiers (URIs)

Uniform Resource Identifier, or URI, is the general term for object identifiers on the World Wide Web. The most familiar type of URI is the Uniform Resource Locator, or URL. A URL is commonly recognized by the http prefix on a web location, so the web addresses you enter in a browser are URLs.

A less common form of URI is the Uniform Resource Name, or URN. A URN looks very different from a URL: it begins with 'urn', and its parts are separated by colons rather than slashes. So a URN might look like


A URN can have many of the same components as a URL, but it is not required to follow the same structure. Only the first component after urn: has a prescribed meaning, as it identifies the URN namespace under which the URN is constructed. The URN namespace describes how the URN is constructed, and each URN namespace is described in the formal request made to register that namespace.
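The structure described above can be sketched in a few lines of Python. This is only an illustration: the function name split_urn and the sample URN are hypothetical (the sample uses the "example" namespace, which is reserved for documentation), and the sketch does not validate the full URN syntax.

```python
def split_urn(urn):
    """Split a URN into (namespace identifier, namespace-specific remainder)."""
    scheme, _, rest = urn.partition(":")
    if scheme.lower() != "urn":
        raise ValueError(f"not a URN: {urn!r}")
    # The first component after "urn:" names the URN namespace.
    nid, _, remainder = rest.partition(":")
    if not nid:
        raise ValueError(f"missing namespace identifier: {urn!r}")
    # Everything after the namespace identifier is interpreted by the
    # rules of that namespace, so it is returned unparsed.
    return nid.lower(), remainder


# Hypothetical URN, for illustration only.
nid, remainder = split_urn("urn:example:cf:air_temperature")
print(nid)        # example
print(remainder)  # cf:air_temperature
```

The remainder is deliberately left as one opaque string, since only the namespace's own definition says how (or whether) it breaks down further.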

Whereas web browsers know how to find, or resolve, URLs, there are no global rules about how to look up a URN. The key characteristic of a URN is that it represents a unique object name, but a URN is not necessarily designed to provide a way to learn more about that name. 
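A small sketch can make this difference concrete. It assumes that only http and https count as schemes a generic web client can dereference; that set, the function name has_generic_resolver, and both sample identifiers are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Schemes a generic web client knows how to dereference. This set is an
# assumption made for the sketch, not an exhaustive list.
RESOLVABLE_SCHEMES = {"http", "https"}


def has_generic_resolver(uri):
    """Return True if the URI's scheme alone tells a client how to fetch it."""
    return urlsplit(uri).scheme in RESOLVABLE_SCHEMES


# Hypothetical identifiers; neither is a real URI.
print(has_generic_resolver("http://vocab.example.org/cf/air_temperature"))  # True
print(has_generic_resolver("urn:example:cf:air_temperature"))               # False
```

A False result does not mean the name is unusable, only that looking it up would require knowledge specific to its namespace.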

There are many other types of URIs in addition to URLs and URNs, but these two forms, especially URLs, are by far the most widely used ways to reference a particular entity on the web. In general, URLs refer to web pages and URNs refer to non-digital concepts, but as you will see, either type of URI can serve just about any resource identification requirement on the web. Which one is best for referencing vocabulary terms?

URNs versus URLs

Two examples of URIs representing the CF standard variable [4] air_temperature are as follows (neither of these is a real URI):

The common belief about the difference between URNs and URLs is that a URL should be resolvable, while a URN does not have to be. This may be due in part to the fact that according to their designers, URNs "are intended to serve as persistent, location-independent resource identifiers." This also implies to many that a URN is more persistent, since it is not dependent on a web domain that can change or disappear over time.

However, RFC 3986 [7] from the IETF (Internet Engineering Task Force) explains that a URI can be a “name,” a “locator,” or both, and that “although many URI schemes [e.g. URLs] are named after protocols, this does not imply that use of these URIs will result in access to the resource via the named protocol. URIs are often used simply for the sake of identification” (emphasis added). In essence, this means that URLs can be used just to identify and name, much as URNs are used, without having to resolve to a live web address. In other words, the use of URLs is not restricted by resolvability issues.
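As a minimal sketch of identification without resolution, the following Python compares term identifiers as plain strings and never opens a network connection. The URL, the dataset records, and the function name same_term are all hypothetical.

```python
# Hypothetical identifier for illustration; it is not a real, resolvable URL.
AIR_TEMPERATURE = "http://vocab.example.org/cf/air_temperature"

# Two made-up dataset records that tag a variable with a term URI.
dataset_a = {"variable": "t_air", "term": AIR_TEMPERATURE}
dataset_b = {"variable": "TEMP", "term": "http://vocab.example.org/cf/air_temperature"}


def same_term(left, right):
    """Two records refer to the same term when their URIs match exactly."""
    return left["term"] == right["term"]


print(same_term(dataset_a, dataset_b))  # True
```

The comparison works whether or not anything is published at that address, which is the sense in which a URL can act purely as a name.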

For many purposes, the IETF statement gives URLs an advantage over URNs: both provide the identifier mechanism, but URLs can also provide easy navigation to the resource being identified, letting programs and users get more information in a transparent way. (A developer, scientist, or application could potentially obtain more information about a resource in just a few seconds by entering its URL in any web browser.) This substantially increases the value of the URL, as it gives users instant gratification.

The advantage is reinforced by the reality that URNs can also suffer from persistence and ownership issues (although any such issues caused by organizational change are likely to be more limited: because the steps required to obtain and manage a URN are relatively complex, organizations that do so are likely to be more stable). A further advantage of URLs is that URNs cannot be confidently resolved, or learned about, through a single common mechanism or resolving service. Finally, it remains non-trivial to obtain a URN 'namespace identifier' that lets an organization create official URNs.

At the same time, URLs clearly suffer from greater transience overall, and their use in situations that do not provide URI resolution will likewise cause frustration, as users enter a URL into their browser and get a "404: resource not found" message. So choosing between the two mechanisms is not automatic.

MMI recommends URLs for situations where they are useful, and intends in its own repository to provide a "resolving service" to offer additional metadata about terms (resources). (This assumes URNs are not specified as the resource identifier of choice.) If most of the URIs will not be resolvable, or if URNs are mandated by applicable organizations, and if URNs can be constructed to meet the needs of the user, then a URN can be an appropriate choice for specifying terms and related resources.

At the time of this update, no existing URN namespaces have both of the following qualities: they are explicitly for the purpose of defining URNs for vocabulary terms, and they have a good set of practices for creating large numbers of URNs. We therefore do not yet recommend the regular use of URNs to represent terms on the semantic web.

A discussion [5] about the Semantic Web [2] and URIs explains the idea that all resources in the Semantic Web (which are identified by URIs) are still 'web stuff' (visible and usable on the web), so providing URIs as URLs is a good strategy. We would like the URI-encoded resources to be easily found by search engines, and to provide some kind of versioning and other metadata about both the referenced resources and their encoding. The use of URLs provides an intuitive and self-contained link to additional resource information, essentially allowing each resource to be as fully self-documenting as desired.

For more discussion of this topic see Identifying Web Resources [10].