[rdfweb-dev] Timeout. Chill.

Dan Brickley danbri at w3.org
Fri Aug 8 11:12:31 UTC 2003


A few comments.

Firstly, everyone, chill out. Everyone. Please.

Secondly, what we have here is perfectly natural: it's the growing pains
often seen when a small group's work gets thrust into the limelight, and
shared assumptions and context suddenly need making explicit. As I hope
you've noticed, I've been spending a lot of my spare non-work time on progressing
that in the last couple of months.

What we haven't done enough of is talk about some of the ways in which
we've been experimenting with FOAF, and about the social implications of
RDF's semistructured design. In short, mixed namespace tag-soup in XML is a
chaotic jungle. In RDF, it has just enough structure imposed on it by the
over-arching framework; enough structure, in fact, to allow us to
experiment with new FOAF property ideas by chucking experimental
properties and classes into FOAF files and getting the feel for how they
work. Now this worked quite well for us for a while, and we could always
check the scutterstats to see (a) which properties were out there, and (b)
which ones had caught on and which had flopped.
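
As a rough illustration of that kind of check (this is not the actual
scutterstats code; the triples, the ex: property and the function name
are all made up for the example), counting how often each property
turns up in harvested data might look like:

```python
from collections import Counter

# Hypothetical harvested statements as (subject, property, value) triples.
# ex:favouriteCheese stands in for an experimental property.
triples = [
    ("_:a", "foaf:name", "Alice"),
    ("_:a", "foaf:mbox", "mailto:alice@example.org"),
    ("_:b", "foaf:name", "Bob"),
    ("_:b", "ex:favouriteCheese", "Stilton"),
]


def property_counts(triples):
    """Count how many statements use each property."""
    return Counter(prop for _, prop, _ in triples)


counts = property_counts(triples)

# Most widely used properties first; rarely seen ones sink to the bottom.
for prop, n in counts.most_common():
    print(prop, n)
```

Nothing here needs to know the properties in advance, which is the
point: new ones simply show up in the tally.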

There is a whole layer of RDF infrastructure which is perfectly happy
with content changes at this level, just as whole layers of XML
infrastructure don't care at all if the words you use between your tags 
change, but freak out if the words between '<' and '>' change. The RDF 
harvesters ('scutters') we use don't care what tags are used, just so
long as they're in RDF syntax so we can extract statements encoded as
triples and merge them into our data stores. The data merging
conventions we use ('smushing', owl:InverseFunctionalProperty etc) also
don't care how we describe things, but are capable of using schema
annotations so that newly-deployed identifying properties can 'just
work' without need for software updates. 
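
To make the merging idea concrete, here is a minimal sketch of smushing
in plain Python. It is an illustration under assumptions, not the code
any of our tools actually run: the smush function, the node ids, the
example.org addresses and the use of foaf:mbox as the identifying
property are all chosen for the example. The set of identifying
properties is read from schema statements rather than hard-coded.

```python
def smush(triples, ifps):
    """Merge subjects that share a value for any inverse functional property."""
    parent = {}  # union-find forest over node ids

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the lexically smaller id as the representative.
            parent[max(ra, rb)] = min(ra, rb)

    seen = {}  # (property, value) -> first subject seen with that pair
    for s, p, o in triples:
        if p in ifps:
            if (p, o) in seen:
                union(seen[(p, o)], s)
            else:
                seen[(p, o)] = s

    def canon(node):
        # Only nodes that took part in a merge are rewritten.
        return find(node) if node in parent else node

    return {(canon(s), p, canon(o)) for s, p, o in triples}


# Hypothetical schema annotation declaring an identifying property.
schema = [("foaf:mbox", "rdf:type", "owl:InverseFunctionalProperty")]
ifps = {s for s, p, o in schema
        if p == "rdf:type" and o == "owl:InverseFunctionalProperty"}

# Hypothetical statements harvested from three different files.
triples = [
    ("_:a1", "foaf:mbox", "mailto:alice@example.org"),
    ("_:a1", "foaf:name", "Alice"),
    ("_:a2", "foaf:mbox", "mailto:alice@example.org"),
    ("_:a2", "foaf:schoolHomepage", "http://school.example.org/"),
    ("_:b1", "foaf:name", "Bob"),
    ("_:b1", "foaf:knows", "_:a2"),
]

merged = smush(triples, ifps)
for triple in sorted(merged):
    print(triple)
```

Adding another identifying property then means adding a line to the
schema data, with no change to smush itself.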

All this is good: we want to describe more kinds of thing (Classes),
attributes of those things and relationships amongst those things (Properties) 
than we can anticipate in one big up-front schema design. See
http://c2.com/cgi/wiki?BigDesignUpFront for software engineering reasons why 
trying to design everything in one go is problematic. These arguments
hold all the stronger in a distributed Web environment where multiple
parties are working to different schedules, agendas, goals. One schema
does not fit all, so RDF's free-wheeling schema mixing approach is a
natural fit for the Web environment.

But yes, freedom inevitably brings flexibility, and the 'how the hell do
I code to this' reaction is not an uncommon one. Perhaps think of it
this way: the expressive freedom provided by RDF provides a platform
where we can reach tighter agreements (perhaps in sub-communities)
without having to have one big schema design committee determine the
one-size that fits all. People that want to use FOAF for homepage
searching, school reunion mapping, opensource collaboration support, blogroll
interchange, addressbook interchange, photo description, spam filtering, 
trust applications, Weblog discovery, file sharing, collaborative
filtering, ... all have different (though related) agendas. I spend my 
working life in formal standardisation processes, and wouldn't wish on
anyone the task of picking 'the' schema for person description. Data
mixing and vocabulary combination are essential here, and not something
to be thrown away lightly.

Jim, I remember not so long ago your being quite vocally an RDF skeptic
on various mailing lists. I wonder if you could comment on what made it
'click' for you?

Julian, w.r.t. RDF being a "write only" format, I think that's an
overstatement; in fact it's (in my experience) almost backwards. I've
come to think of RDF as being somewhat akin to an SQL 'view'. In SQL
systems you can take multiple tables of information and project them
into a single, virtual composite which draws info from several sources
and hooks them together based on uniquely identifying pieces of content.
With RDF, as with SQL views, we have this sense of an aggregation
mechanism that is good for taking (and merging) scattered info. But just
as with SQL views, with RDF you often get into a situation where, having
pulled together a single RDF graph by merging from multiple sources, you wonder
what doing an update/insert into that graph would mean. So RDF (at least
in FOAF-related contexts) often gets used in a search-engine-like way.
You hoard up a whole bunch of it, then do some canonicalisation on the 
merged dataset (most essentially with respect to merging
identities/nodes, but also property inverses etc) and then search 
around it for whatever your application is concerned with. 
A common processing style is to treat it 
as a black box that you'll rummage in with a particular goal in mind
(eg. schoolHomepage and name and chatID). Typically you won't try to
make use of every fragment of data you've hoarded, except to make it
available to (human and machine) apps which may want to query for it. So
RDF graph API and query languages are a key part of the puzzle... The
Jena tutorial and Ian Davis's FOAF/PHP doc give some examples of the way 
RDF tools allow such processing.
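
For a feel of the "rummage in the black box" style without any
particular toolkit, here is a small sketch in plain Python. It is not
the Jena or PHP API; the match and describe helpers, the node ids and
the data are hypothetical, and the property names are just the ones
mentioned above without namespaces.

```python
def match(graph, pattern):
    """Yield triples matching an (s, p, o) pattern; None is a wildcard."""
    for triple in graph:
        if all(want is None or want == got
               for want, got in zip(pattern, triple)):
            yield triple


def describe(graph, wanted):
    """Collect the wanted properties per subject, ignoring everything else.

    Only subjects that have every wanted property are returned. If a
    property has several values, this sketch keeps just the last one seen.
    """
    found = {}
    for prop in wanted:
        for s, _, o in match(graph, (None, prop, None)):
            found.setdefault(s, {})[prop] = o
    return {s: props for s, props in found.items()
            if len(props) == len(wanted)}


# Hypothetical merged dataset, already canonicalised.
graph = [
    ("person1", "name", "Alice"),
    ("person1", "schoolHomepage", "http://school.example.org/"),
    ("person1", "chatID", "alice99"),
    ("person1", "favouriteCheese", "Stilton"),  # hoarded but never asked for
    ("person2", "name", "Bob"),                 # no schoolHomepage or chatID
]

result = describe(graph, ["schoolHomepage", "name", "chatID"])
print(result)
```

The application only ever touches the three properties it cares about;
the rest of the hoard stays available for some other query.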

A point of FOAF was to have a way of freely mixing data about
people with data about other things people care about, books, places, 
restaurants, organisations, documents; things they're buying, selling, 
interested in, describing... I fail to see how we can achieve this
without some conventions for free mixing of independently created
vocabularies. But I do acknowledge that we have our work cut out
figuring how to make these freedoms practical...


