[rdfweb-dev] advocating use of rdf:ID / rdf:about attributes
danbri at w3.org
Sun Aug 17 10:16:49 UTC 2003
Hi David, and welcome :)
You raise issues that are, as you note, somewhat interconnected. There's
a lot of history here, and navigating through the various design
decisions isn't as simple as it should be. I'll make an attempt!
You ask why we use rdf:nodeID or a plain blank node when describing
people (especially when describing ourselves). You ask why deployed
RDF/XML FOAF files contain more rdf:ID than rdf:nodeID attributes,
contrary to what one might expect. You also raise a concern with using mailbox-based
properties (foaf:mbox, foaf:mbox_sha1sum) to indirectly identify
someone. You're also concerned that we are somehow abusing rdfs:seeAlso.
History first. rdf:nodeID is relatively new. I wrote a longish note to
rdfweb-dev about this recently, giving some backstory; let me dig that
out. FOAF is quite deployable without using rdf:ID or rdf:nodeID or
rdf:about on <Person> elements; it is only recently that I've become
satisfied that enough RDF parsers can deal with nodeID for it to be
reasonable to deploy in FOAF documents. So that explains the fact that
it is relatively rare.
Ah, here is my note on "where did this rdf:nodeID thing come from",
...this doc should give basic background, though it sounds like you
might have that.
Some more unwritten history. Where did this FOAF thing come from?
Since '99 I've chaired the RDF Interest Group, building on the RDF-DEV
list I set up the year before. I've watched certain discussions churn
seemingly endlessly, contributing to RDF's reputation for being difficult,
academic, researchy. Two topics were particularly painful
in that regard about 3 years ago (and to a lesser extent subsequently).
Anything to do with "reification" (RDF's mechanism for having one
RDF statement describe another, supposedly for representing the
who-said-what of data provenance); and also anything to do with URIs and
identification, particularly questions about identifying real world
entities (eg. people, companies). FOAF was created in spring of 2000,
when we didn't have a 'live' RDF working group at W3C, when the current
RDF spec was a bit mysterious on certain topics, and when W3C hadn't yet
set up its 'Technical Architecture Group' (TAG,
http://www.w3.org/2001/tag/) to deal with hard architectural questions.
In particular, when I started thinking about FOAF it was in the context
of dealing with RDF's approach to identification. This document from
the www-rdf-interest archives gives a good flavour of the debate:
and shows the beginnings of the FOAF design (at the time I was thinking
of it in terms of the 'ABC' project I was involved in, but that took a
...at that time, we were just figuring out some missing bits of the RDF
design which have since been clarified by the RDFCore Working Group
which Brian McBride and I co-chair (though he's done most of the work).
RDF's notion of 'anonymous' or 'bNode' (as we now call them) nodes in the
graph was a bit vague. We knew that RDF/XML syntax allowed things to be
mentioned without being URI-named, but the RDF model didn't explicitly
say how to represent this in the abstract triples view. More recent RDF
specs do this.
So so so... my point. FOAF came into being prior to rdf:nodeID, prior to
a well worked out notion of 'blank nodes', and at a time when we were
trying out various possible design avenues that RDF might go down, and
various deployment conventions for putting that theory into practice.
The first FOAF design went down the route you recommend. For each person
in the FOAF network (then called RDFWebRing) we made up an abstract
identifier, eg var:1231516235, var:genid:danbri or whatever. All the
other FOAF files that mentioned this person had to use exactly that same
URI to identify them. The dataset as of spring 2001 was still pretty
tiny (since we weren't encouraging anyone to use this stuff yet) and is
captured for all time in the old RDFWeb t-shirt, see
What did we find? That it was depressingly fragile. That looking up
"the" (or even "a") URI for a person was undeployably annoying, and that we
desperately needed some conventions for merging together descriptions of
people without requiring everyone to use exactly the same string to
identify any given person.
Early 2001 I wrote up a bit of a description of this problem in
http://rdfweb.org/2001/01/design/smush.html and both Libby and I set
about hacking our RDF tools into a more robust form that could merge RDF
graph nodes together based on 'uniquely identifying' properties.
Meanwhile the DAML+OIL (now -> OWL) language was under way, which conveniently
provided a way of identifying which RDF properties had the 'at most one'
semantic needed to license such data merging. When the new RDF Core
group had its first face to face meeting just over 2 years ago (July
2001) the first thing it nailed down was a confirmation that RDF graph
nodes could indeed be 'blank' (ie. not labelled with URIs). Round about
the same time W3C's TAG got going, and people started to pester them
with questions like "Can two URIs denote the same thing? Can a thing
have two URIs? How many angels can fit on the URI of a pinhead?" and so
An overarching concern with FOAF has been to find practical and
pragmatic ways of deploying RDF in the public Web, without ignoring or
being slowed down by these important yet frustrating discussions.
FOAF doesn't need to wait for agreement on 'URIs for people' before
FOAF doesn't need everyone to know everyone else's URI or URIs.
FOAF doesn't rule out the possibility that a URIs-for-people consensus might
sweep through the Web community. If this happens, FOAF vocabulary,
doc formats and tools should 'just work'. Meanwhile, things just work
A longer account of FOAF's approach to identifying people and other
is at http://rdfweb.org/mt/foaflog/archives/000039.html -- I hope the
combination of this email, that article, and the other things I've cited
will help explain our current design. I also strongly recommend R.V.Guha
and Rob McCool's TAP writeup at http://tap.stanford.edu/sw002.html as a
good primer on why RDF and the Semantic Web need 'identification by
To address a couple of specific points you raised:
i) you claim FOAF's use of rdfs:seeAlso is broken.
As an editor of both FOAF and RDFS specs, I'm pretty sure it isn't.
However, the customer is always right; it may be that clarification would be
useful, either in the RDFS spec or in supporting materials. In that spec we
tried to be open and pluralistic to allow people to experiment with various
deployment modes (eg. RDF inside XHTML, XMP embedding, maybe alternate RDF
If the overall impression you're left with is that it is /wrong/ to
anticipate RDF/XML as a common format for rdfs:seeAlso'd data, then we've
given the wrong impression. RDF/XML is a great use of rdfs:seeAlso, its major
anticipated use on my understanding. The fact that you might not always
get some RDF doesn't count against that.
I've just written a longer piece on this in the ESW Wiki, and would
appreciate it if folks could take a look. See:
On the specific claim that we are somehow abusing rdfs:seeAlso as if it
were an identifying property (ditto foaf:name), that is an
FOAF tools should only treat uniquely identifying properties as being
uniquely identifying. Neither foaf:name nor rdfs:seeAlso is described
in their respective schemas as being an owl:InverseFunctionalProperty.
So there is no reason to expect them to have that characteristic. The
FOAF spec certainly doesn't claim that they are uniquely identifying.
If it gives that impression we need to fix it!
This brings me to:
ii) suggestion that foaf:mbox isn't uniquely identifying because
people may share or lose mailboxes or have many such mailboxes.
This is a common confusion.
FOAF uses RDFS and OWL to define its terminology. We say that several
properties are of type owl:InverseFunctionalProperty. This means that
there should be at most one thing in the world with any given value for
such a property. These include foaf:homepage, foaf:mbox,
foaf:mbox_sha1sum, foaf:aimChatID etc.
Take a look through http://rdfweb.org/mt/foaflog/archives/000039.html
for a longer explanation.
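To make the merging idea a bit more concrete, here is a rough sketch of
'smushing' in Python. It is only an illustration under some loud
assumptions: triples are plain (subject, property, value) string tuples,
property names are written as prefixed strings, the IFPS set is typed in
by hand rather than read from the FOAF schema, and the smush() function,
the node labels and the example.org addresses are all made up for this
note. It is not how any particular FOAF tool does it.

```python
# Hypothetical sketch: merge graph nodes that share a value for an
# inverse functional property (IFP). Not taken from any real FOAF tool.

# Assumption: this set stands in for "properties the schema types as
# owl:InverseFunctionalProperty". foaf:name and rdfs:seeAlso are
# deliberately absent, so they never trigger a merge.
IFPS = {"foaf:mbox", "foaf:mbox_sha1sum", "foaf:homepage", "foaf:aimChatID"}


def smush(triples, ifps=IFPS):
    """Map every subject node to a representative of its merged group."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexically smaller label as the representative.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    first_seen = {}  # (property, value) -> first subject seen with it
    for subject, prop, value in triples:
        find(subject)  # register the subject even if nothing merges
        if prop in ifps:
            key = (prop, value)
            if key in first_seen:
                union(first_seen[key], subject)
            else:
                first_seen[key] = subject
    return {node: find(node) for node in parent}


# Made-up data: four blank nodes from four imaginary FOAF files.
triples = [
    ("_:a", "foaf:name", "Alice Example"),
    ("_:a", "foaf:mbox", "mailto:alice@example.org"),
    ("_:b", "foaf:mbox", "mailto:alice@example.org"),
    ("_:b", "foaf:homepage", "http://example.org/alice/"),
    ("_:c", "foaf:homepage", "http://example.org/alice/"),
    ("_:d", "foaf:name", "Alice Example"),  # same name, no shared IFP value
]
merged = smush(triples)
```

In this toy data, _:a, _:b and _:c end up in one group because they are
chained together by a shared foaf:mbox and a shared foaf:homepage, while
_:d stays separate: sharing a foaf:name is not grounds for merging.
Nobody had to agree on a URI for the person.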
In short, with mailboxes:
- there are many mailboxes (eg. shared ones) which are not the
foaf:mbox of anything
- there are many things which have a foaf:mbox relationship to
For example, mailto:rdfweb-dev at vapours.rdfweb.org is not the foaf:mbox
of anything, even though it is an Internet mailbox.
Also, mailto:danbri at w3.org and mailto:danbri at rdfweb.org and
mailto:daniel.brickley at bristol.ac.uk are all values of foaf:mbox for
something (ie. me). Furthermore, if the University of Bristol
decides to re-assign the 'daniel.brickley at bristol.ac.uk' mail address to someone
else, that does not affect the truth of the claim that
'mailto:daniel.brickley at bristol.ac.uk' is a foaf:mbox of mine, since we
have defined the property carefully to be in terms of its first user.
It can still be my foaf:mbox even if someone else subsequently takes it
OK I've typed too much again. I hope all this goes some way to
explaining our current design, and shows that we have explored the other avenues you
mention. If you want to include a URI (whether URN, http URI, or
whatever you like) within your FOAF file as an identifier for yourself,
nothing's stopping you. It may even turn out to be useful. What I'm
pretty sure we do know from earlier experiments is that for FOAF to be a
successfully decentralised, loosely coupled system, additional
conventions (especially use of varied uniquely-identifying properties)
will also be needed.
If this note makes sense I'll link it from the FOAF FAQ since these topics
come up frequently, although that hasn't been clear from the Web site.
(I've also linked the FAQ from the rdfweb.org and
www.foaf-project.org homepages, doh! for not having done so before).
Thanks again for your questions, suggestions and interest in this project,