RDFa initial context & one dc:

One of the nice things to come out of RDFa 1.1 is its initial context: a list of vocabularies with predefined prefixes, which may be used without having to declare them locally. In other words, just write, e.g., property="dc:title" without first having to write prefix="dc: http://purl.org/dc/terms/".
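To make the mechanism concrete, here is a minimal sketch of how a consumer might expand a prefixed name against a predefined prefix list. The three mappings are a hand-copied subset of the initial context, and `INITIAL_CONTEXT` and `expand_curie` are names made up for illustration, not part of any RDFa library.

```python
# Hand-copied subset of the RDFa 1.1 initial context (assumption: the real
# list is much longer and should be fetched from its machine-readable form).
INITIAL_CONTEXT = {
    "dc": "http://purl.org/dc/terms/",
    "dcterms": "http://purl.org/dc/terms/",
    "dc11": "http://purl.org/dc/elements/1.1/",
}


def expand_curie(curie, local_prefixes=None):
    """Expand 'prefix:reference' to a full URI.

    Locally declared prefixes take precedence over the predefined ones.
    """
    prefixes = dict(INITIAL_CONTEXT)
    prefixes.update(local_prefixes or {})
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"no prefix mapping for {curie!r}")
    return prefixes[prefix] + reference


if __name__ == "__main__":
    # No local declaration needed for dc:
    print(expand_curie("dc:title"))
    # A local declaration can still override the predefined mapping.
    print(expand_curie("dc:title", {"dc": "http://purl.org/dc/elements/1.1/"}))
```

The same lookup works outside RDFa, for example on prefix:property column headers, as long as everyone agrees on which prefix list is in force.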

In addition to making RDFa a lot less painful to use, the list is a good starting place for figuring out what vocabularies to use (if you must), perhaps even for non-RDFa applications; the list is machine-readable, of course. I was reminded to write this post when giving feedback on a friend’s proposal to use prefix:property headers in a CSV file for a custom application, and by a recent announcement of the addition of three new predefined prefixes.

Survey data such as Linked Open Vocabularies can also help figure out what to use. Unfortunately LOV and the RDFa 1.1 initial context don’t agree 100% on prefix naming, and neither provides much in the way of guidance. I think there’s room for a highly opinionated and regularly updated guide to what vocabularies to use. I’m no expert, so it probably already exists; please inform me!

dc:

The first thing I’d put in such an opinionated guide is to start one’s vocabulary search with Dublin Core. Trivial, right? But there is an under-documented subtlety which I find myself pointing out whenever a friend runs something like the aforementioned proposal by me: DC means DC Terms. While it’s obvious that DC Terms is a superset of DC Elements, it’s harder to find evidence that using the former is best practice for new applications, and that the latter is no longer the canonical vocabulary to start with. What I’ve gathered on this follows. I realize that the URIs for individual properties and classes, the prefixes used to abbreviate those URIs, and the documents which define (in English and RDF) properties and classes are distinct but interdependent. Prefixes are surely the most trivial and uninteresting, but for most people I imagine they’re important signals and documentation, thus I go on about them…

Namespace Policy for the Dublin Core Metadata Initiative (DCMI) (emphasis added):

The DCMI namespace URI for the collection of legacy properties that make up the Dublin Core Metadata Element Set, Version 1.1 [DCMES] is: http://purl.org/dc/elements/1.1/

Dublin Core Metadata Element Set, Version 1.1 (emphasis added):

Since 1998, when these fifteen elements entered into a standardization track, notions of best practice in the Semantic Web have evolved to include the assignment of formal domains and ranges in addition to definitions in natural language. Domains and ranges specify what kind of described resources and value resources are associated with a given property. Domains and ranges express the meanings implicit in natural-language definitions in an explicit form that is usable for the automatic processing of logical inferences. When a given property is encountered, an inferencing application may use information about the domains and ranges assigned to a property in order to make inferences about the resources described thereby.

Since January 2008, therefore, DCMI includes formal domains and ranges in the definitions of its properties. So as not to affect the conformance of existing implementations of “simple Dublin Core” in RDF, domains and ranges have not been specified for the fifteen properties of the dc: namespace (http://purl.org/dc/elements/1.1/). Rather, fifteen new properties with “names” identical to those of the Dublin Core Metadata Element Set Version 1.1 have been created in the dcterms: namespace (http://purl.org/dc/terms/). These fifteen new properties have been defined as subproperties of the corresponding properties of DCMES Version 1.1 and assigned domains and ranges as specified in the more comprehensive document “DCMI Metadata Terms” [DCTERMS].

Implementers may freely choose to use these fifteen properties either in their legacy dc: variant (e.g., http://purl.org/dc/elements/1.1/creator) or in the dcterms: variant (e.g., http://purl.org/dc/terms/creator) depending on application requirements. The RDF schemas of the DCMI namespaces describe the subproperty relation of dcterms:creator to dc:creator for use by Semantic Web-aware applications. Over time, however, implementers are encouraged to use the semantically more precise dcterms: properties, as they more fully follow emerging notions of best practice for machine-processable metadata.

The first two paragraphs explain why a new vocabulary was minted: so that the more precise definitions of properties already in DC Elements would not change the behavior of existing implementations. (Had only new terms and classes been added, maybe they could have been added to the DC Elements vocabulary; but maybe this is ahistorical, as many of the additional “qualified” DC Terms have existed since 2000.) The third paragraph explains that DC Terms should be used for new applications. Unfortunately the text informally (the prefixes aren’t used anywhere) notes the prefixes dc: and dcterms:, which I’ve found is not helpful in getting people to focus only on DC Terms.
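The subproperty relation described in the quoted text can be sketched as a one-step inference over plain tuples. This is only an illustration of the idea, not how any particular reasoner works; `SUBPROPERTY_OF` and `infer_superproperties` are hypothetical names, and the table is built from the fifteen element names rather than read from the DCMI RDF schemas.

```python
DCTERMS = "http://purl.org/dc/terms/"
DC11 = "http://purl.org/dc/elements/1.1/"

# The fifteen element names of the Dublin Core Metadata Element Set 1.1.
ELEMENT_NAMES = [
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
]

# Each property in the terms namespace is a subproperty of its legacy twin.
SUBPROPERTY_OF = {DCTERMS + name: DC11 + name for name in ELEMENT_NAMES}


def infer_superproperties(triples):
    """Return the input triples plus those implied by the subproperty table."""
    inferred = set(triples)
    for subject, predicate, obj in triples:
        if predicate in SUBPROPERTY_OF:
            inferred.add((subject, SUBPROPERTY_OF[predicate], obj))
    return inferred


if __name__ == "__main__":
    data = {("http://example.org/post", DCTERMS + "creator", "http://example.org/me")}
    for triple in sorted(infer_superproperties(data)):
        print(triple)
```

The inference only runs one way: data published with DC Terms stays visible to consumers that look for the legacy properties, while legacy data implies nothing about the more precise ones.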

Expressing Dublin Core metadata using the Resource Description Framework also notes the dc: and dcterms: prefixes for use in the document’s examples (which don’t ever actually use dc:).

Some of these documents have been updated slightly, but I believe their current versions are little changed from about 2008, a year after the proposal of the DC Terms refinements.

How to use DCMI Metadata as linked data uses the dc: and dcterms: prefixes and is clear about the ranges of properties of each: there is no incorrect usage of, e.g., purl.org/dc/elements/1.1/creator because it has neither a defined range nor a defined domain, while the value of purl.org/dc/terms/creator must be a non-literal, a purl.org/dc/terms/Agent. Perhaps this makes DC Terms seem scarier and partially explains the persistence of DC Elements. More likely, I’d guess, few know about the difference, and lots of DC Terms properties with non-literal ranges are used with literals in the wild (I might be guilty on occasion).
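As a rough illustration of the difference, a lint-style check might flag literal values on properties whose range is a non-literal class. This is a sketch under stated assumptions: the `Literal` wrapper, the `NON_LITERAL_RANGE` table, and `literal_range_warnings` are invented for the example, and the table holds only the one property discussed above.

```python
from dataclasses import dataclass

DCTERMS = "http://purl.org/dc/terms/"
DC11 = "http://purl.org/dc/elements/1.1/"


@dataclass(frozen=True)
class Literal:
    """Hypothetical wrapper marking a value as a plain string literal."""
    value: str


# Illustrative subset: only the property discussed in the text.
NON_LITERAL_RANGE = {DCTERMS + "creator": DCTERMS + "Agent"}


def literal_range_warnings(triples):
    """List a warning for each literal used where a non-literal is expected."""
    warnings = []
    for subject, predicate, obj in triples:
        if predicate in NON_LITERAL_RANGE and isinstance(obj, Literal):
            expected = NON_LITERAL_RANGE[predicate]
            warnings.append(
                f"{subject}: {predicate} expects a {expected}, got literal {obj.value!r}"
            )
    return warnings


if __name__ == "__main__":
    data = [
        ("http://example.org/post", DCTERMS + "creator", Literal("Some Name")),
        ("http://example.org/post", DC11 + "creator", Literal("Some Name")),
    ]
    for warning in literal_range_warnings(data):
        print(warning)
```

Only the first triple is flagged; the legacy property has no range, so nothing can be said against the second.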

FAQ/DC and DCTERMS Namespaces:

It is not incorrect to continue using dc:subject and dc:title — alot of Semantic Web data still does — and since the range of those properties is unspecified, it is not actually incorrect to use (for example) dc:subject with a literal value or dc:title with a non-literal value. However, good Semantic Web practice is to use properties consistently in accordance with formal ranges, so implementers are encouraged to use the more precisely defined dcterms: properties.
Update, December 2011: It is worth noting that the Schema.org initiative is taking a pragmatic approach towards the formal ranges of their properties:

We also expect that often, where we expect a property value of type Person, Place, Organization or some other subClassOf Thing, we will get a text string. In the spirit of “some data is better than none”, we will accept this markup and do the best we can.

What constitutes “best practice” in this area is bound to evolve with implementation experience over time.

There you have people supplying literals for properties expecting non-literals. Schema.org RDF mappings do not formally condone this pragmatic approach, otherwise you’d see the likes of (the addition being xsd:string in the range):

schema:creator a rdf:Property;
    rdfs:label "Creator"@en;
    rdfs:comment "The creator/author of this CreativeWork or UserComments. This is the same as the Author property for CreativeWork."@en;
    rdfs:domain [ a owl:Class; owl:unionOf (schema:UserComments schema:CreativeWork) ];
    rdfs:range [ a owl:Class; owl:unionOf (schema:Organization schema:Person xsd:string) ];
    rdfs:isDefinedBy ;
    rdfs:isDefinedBy ;

Also from 2011, a discussion of what prefixes to use in the RDFa initial context. Decision (Ivan Herman):

For the records: after having also discussed on yesterday’s telecom, I have made the changes on the profile files yesterday evening. The prefix set in the profile for http://purl.org/dc/terms/ is set to ‘dc’.

Read the expert input of Dan Brickley, Mikael Nilsson, and Thomas Baker. The initial context defines both dc: and dcterms: as prefixes for DC Terms, relegating DC Elements to dc11: as follows:

dc http://purl.org/dc/terms/ Dublin Core Metadata Terms DCMI Metadata Terms
dcterms http://purl.org/dc/terms/ Dublin Core Metadata Terms DCMI Metadata Terms
dc11 http://purl.org/dc/elements/1.1/ Dublin Core Metadata Element Set, Version 1.1 Dublin Core Metadata Element Set, Version 1.1

I found the above discussion via LOV’s entries for DC Terms and DC Elements, which use the dcterms: and dce: prefixes respectively:

(2013-03-07) Bernard Vatant: Prefix restored to dcterms

(2013-06-17) Bernard Vatant: Although “dc” is often used as the prefix for this vocabulary, it’s also sometimes used for DC terms, so we preferred to use the less ambiguous “dce” and “dcterms” in LOV. See usage at http://prefix.cc/dc, http://prefix.cc/dce, http://prefix.cc/dcterms, and more discussion at http://bit.ly/uPuUTT.

I think the discussion instead supports using dc: and dc11:, because that’s what the RDFa initial context uses. LOV doesn’t currently have a public source repository or issue tracker, but I understand it eventually will.
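Since the two prefix conventions name the same namespace URIs, moving a prefixed name from one to the other can be sketched by going through the full URI. The two tables below contain only the mappings quoted in this post; `reprefix` is a hypothetical helper, and picking the first matching prefix in the target table is an arbitrary choice for the example.

```python
DCTERMS = "http://purl.org/dc/terms/"
DC11 = "http://purl.org/dc/elements/1.1/"

# Prefix conventions as quoted in this post (subsets only).
RDFA_PREFIXES = {"dc": DCTERMS, "dcterms": DCTERMS, "dc11": DC11}
LOV_PREFIXES = {"dcterms": DCTERMS, "dce": DC11}


def reprefix(curie, source, target):
    """Rewrite a prefixed name from one prefix convention to another."""
    prefix, _, reference = curie.partition(":")
    namespace = source[prefix]  # KeyError if the prefix is unknown
    for target_prefix, target_namespace in target.items():
        if target_namespace == namespace:
            # First match wins; dicts keep insertion order.
            return f"{target_prefix}:{reference}"
    raise KeyError(f"no target prefix for namespace {namespace}")


if __name__ == "__main__":
    print(reprefix("dce:creator", LOV_PREFIXES, RDFA_PREFIXES))
    print(reprefix("dcterms:title", LOV_PREFIXES, RDFA_PREFIXES))
    print(reprefix("dc:title", RDFA_PREFIXES, LOV_PREFIXES))
```

The danger the sketch sidesteps is a bare dc: with no table attached, which could mean either namespace depending on who wrote it.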

Now I have this grab-bag blog post to send to friends who propose using DC Elements. Please correct me if I’m wrong, and especially if a more concise (on this topic) and credible document exists, so I can send that instead; perhaps something like the opinionated guide to vocabularies mentioned way above.

Another topic such a guide might cover, perhaps as a coda, would be what to do if you really need to develop a new vocabulary. One thing is that you really need to ask for help. The W3C now provides some infrastructure for doing this. Or, for some qualified dissent, see a hugely entertaining blogger called Brinxmat.

Some readers of my blog who have bizarrely read through this post, or skipped to the end, might enjoy Brinxmat’s Attribution licences for data and why you shouldn’t use them (another future issue report for LOV, which uses CC-BY?); I wrote a couple posts in the same blogversation; also a relevant upgrade exhortation.
