FOAFCorp/RDF data format

Dan Brickley danbri at w...
Wed Dec 12 22:38:09 UTC 2001



(sorry, missed these messages as I read the list thru the Web; btw Josh,
any chance you could configure YahooGroups to make the message archives
publicly visible, so we can point people without YahooGroup logins at
these discussions)

This is a bit long, and perhaps a bit technical for some here. I've copied
the RDFWeb-dev list too, since that's where we discuss the FOAF stuff.
Context: http://rdfweb.org/people/danbri/2001/09/foafcorp/
http://www.theyrule.net/


Replying to bits of http://groups.yahoo.com/group/theyrule/message/43
From: "josh on" <josh at f...>
Date: Wed Dec 12, 2001 7:05 pm
Subject: Re: [theyrule] Re: theyrule.net offline? / RDF data format ideas


[[
In the foaf rdf version (which I think is a great idea) it makes the
connections (as far as I can tell) algorithmically. I agree with you that
this is something that may be better to actually manually put into the
data.

I think this for two reasons:
1. speed of display (may not be significant).
2. Accuracy.
]]


What I've done isn't really about display; I just happened to have a neat
tool to hand (http://rdfweb.org/people/damian/RDFAuthor) that made
pictures easy to generate.

In RDF, there are basically two strategies for uniquely identifying
things.

(a) by URI (Uniform Resource Identifier)
The first is by direct use of URIs (a generalisation of URLs).
There are URI schemes for lots of things, phone numbers, Web stuff
(http:*), Java classes etc. There is an informal index of URI schemes at
http://www.w3.org/Addressing/schemes.html

(b) by description
Lots of things (esp. politically sensitive, real world things like people
and companies) don't have well known URI names. For example, there is no well
established way of naming me with a string like 'urn:people:uk:nx93xyzb'.
So, in lots of cases RDF applications will mention something (eg. a
company, a person) and instead of using a URI to identify it, they'll
simply mention some uniquely identifying properties of that thing instead.
So for companies, RDF will usually say things like "the company whose
stockcode is XYZ" or "the company whose homepage is blahblah". For people,
in the FOAF testbed, we have used a couple of identifying properties. One,
'foaf:mbox', is an internet mailbox owned/controlled by one person. The
other, 'foaf:homepage', is a homepage.

So RDF apps allow you to assign arbitrary properties to arbitrary things.
But some properties have this extra characteristic of helping you identify
just which thing you're talking about.

What we've done with FOAF and photos might make this a bit clearer. We
wanted to build a distributed database of photos, where we markup
unambiguously some information about who is in the photographs.

http://swordfish.rdfweb.org/discovery/2001/08/codepict/ for demo or
see http://swordfish.rdfweb.org/discovery/2001/08/codepict/scutterplan.txt
for a list of data files describing photos and other stuff.

If you take a look at one example, it describes a photo with a few people
in it. http://rdfweb.org/people/danbri/rdfweb/lib-ecdl.rdf

First we describe the photo,
<rdf:Description
rdf:about="http://swordfish.rdfweb.org/photos/2001/09/08/000791.JPG">
<dc:hasVersion
rdf:resource="http://swordfish.rdfweb.org/photos/2001/09/08/thumb-000791.JPG"
/>
<dc:title>In a German Pub</dc:title>
<dc:description>Libby, Michael and Rob at ECDL 2001 in Germany.
</dc:description>

<dc:date>2001-09-08</dc:date>
<dc:coverage>Germany</dc:coverage>
<foaf:location>
<foaf:Geo rdf:value="Germany" />
</foaf:location>
</rdf:Description>



...and then we describe the people depicted in the photo:

<foaf:Person>
<foaf:name>Libby Miller</foaf:name>
<foaf:mbox rdf:resource="mailto:libby.miller at b..." />
<foaf:depiction rdf:resource="http://swordfish.rdfweb.org/photos/2001/09/08/000791.JPG"
/>
</foaf:Person>
..etc


So, to connect this back to They Rule...

In FOAF we assume that the property we call 'foaf:mbox' is *uniquely
identifying*. This is expressed in a format machines can use; they know
that whenever two descriptions of something (a person or whatever) have
the same foaf:mbox value, those descriptions are talking about the same
thing. This trick avoids the need to have a central registry of people.
We also have a property 'foaf:name', but we don't treat that as uniquely
identifying: multiple people could have the same name.
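
To make the merging idea concrete, here is a minimal sketch in Python. It is
not the FOAF tools' actual code: descriptions are modelled as plain
dictionaries rather than RDF graphs, the function name is made up, and the
mailboxes and photo URL are invented example.org values. Descriptions that
share a foaf:mbox are folded into one record; foaf:name is carried along but
never used for matching.

```python
from collections import defaultdict

# Properties treated as uniquely identifying (an assumption for this sketch).
IDENTIFYING = {"foaf:mbox"}


def merge_descriptions(descriptions):
    """Fold together descriptions that share an identifying property value."""
    merged = {}      # (property, value) -> merged description
    anonymous = []   # descriptions with no identifying property at all
    for desc in descriptions:
        # Use the first identifying property present as the merge key.
        key = next(((p, desc[p]) for p in sorted(IDENTIFYING) if p in desc), None)
        if key is None:
            anonymous.append({p: {v} for p, v in desc.items()})
            continue
        target = merged.setdefault(key, defaultdict(set))
        for prop, value in desc.items():
            target[prop].add(value)
    return [dict(d) for d in merged.values()] + anonymous


# Hypothetical data: two people with the same name but different mailboxes.
descriptions = [
    {"foaf:name": "Alice Example", "foaf:mbox": "mailto:alice@example.org"},
    {"foaf:mbox": "mailto:alice@example.org",
     "foaf:depiction": "http://example.org/photo1.jpg"},
    {"foaf:name": "Alice Example", "foaf:mbox": "mailto:other.alice@example.org"},
]
people = merge_descriptions(descriptions)
```

The first two descriptions end up as one record because their mailboxes
match; the third stays separate even though the name is identical.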

What I did when I mocked up FOAFCorp was to try to find a way around my
not knowing the email addresses of all these corporate bigwigs. Instead of
mailbox, for the corporate example I *did* use a name property, 'fc:name',
since I decided to assume that the names from They Rule were managed in a
way that made them effectively unique, in a way that wasn't true of the
looser, messier foaf:name data.

A similar strategy would be to invent a set of, e.g., numerical identifiers
for use with They Rule, so the RDF systems could organise data based on a
property called something like 'theyRule:personID'.


Anyway this could all get too theoretical and boring rather swiftly. So I
propose an experiment. Let's have a few of us go and collect up some additional
information about some of the board members mentioned in They Rule.
Specifically, I'd like to try cataloguing photographs of the board
members, where photos are available on the various corporate websites.

For example, in the Coca Cola site I found

http://www2.coca-cola.com/ourcompany/board.html
-> http://www2.coca-cola.com/ourcompany/img/circ_pic_robinson.gif


I'll use this to show how we might make some experimental RDF files that
describe how to catalogue images for each entry:

This is the current RDF describing Coca Cola...

<fc:Company fc:name="Coca Cola co.">
<homepage>
<Document dc:title="Coca-Cola" web:about="http://www.cocacola.com/"/>
</homepage>
<fc:stock>KO</fc:stock>
<fc:board>
<fc:Committee>
<!-- snip snip removed other board members for readability -->

<fc:member><Person fc:name="James D. Robinson III"/></fc:member>

<!-- ... -->
</fc:Committee>
</fc:board>
</fc:Company>



What we need is one that includes photos (plus perhaps a controlled
theyRuleID or something). For example:

<fc:Company fc:name="Coca Cola co.">
<homepage>
<Document dc:title="Coca-Cola" web:about="http://www.cocacola.com/"/>
</homepage>
<fc:stock>KO</fc:stock>
<fc:board>
<fc:Committee>
<!-- snip snip removed other board members for readability -->

<fc:member>
<Person fc:name="James D. Robinson III">
<theyRule:personID>p12246</theyRule:personID>
<foaf:depiction rdf:resource="http://www2.coca-cola.com/ourcompany/img/circ_pic_robinson.gif"/>
</Person>
</fc:member>

<!-- ... -->

</fc:Committee>
</fc:board>
</fc:Company>

...or similar. I could make a perl script or something available so
non-RDF geeks could get the data back out again.
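
As a rough idea of what such a script might do, here is a sketch in Python
rather than Perl. It treats the file as plain XML using the standard library
instead of a real RDF parser, so it only copes with data shaped like the
example above. The rdf and foaf namespace URIs are the published ones; the
fc, theyRule and default namespace URIs are placeholders I've invented for
the sketch, as is the function name.

```python
import xml.etree.ElementTree as ET

# rdf and foaf are the published namespace URIs; fc and theyRule are
# placeholders invented for this sketch.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "fc": "http://example.org/foafcorp#",
    "theyRule": "http://example.org/theyrule#",
}

# Cut-down copy of the Coca Cola example, wrapped so that it is well-formed.
# The default namespace is assumed to be the same as fc.
SAMPLE = """<rdf:RDF
    xmlns="http://example.org/foafcorp#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:fc="http://example.org/foafcorp#"
    xmlns:theyRule="http://example.org/theyrule#">
  <fc:Company fc:name="Coca Cola co.">
    <fc:board>
      <fc:Committee>
        <fc:member>
          <Person fc:name="James D. Robinson III">
            <theyRule:personID>p12246</theyRule:personID>
            <foaf:depiction rdf:resource="http://www2.coca-cola.com/ourcompany/img/circ_pic_robinson.gif"/>
          </Person>
        </fc:member>
      </fc:Committee>
    </fc:board>
  </fc:Company>
</rdf:RDF>"""


def name_image_pairs(rdf_xml):
    """Return (name, image URL) tuples for every Person with a depiction."""
    root = ET.fromstring(rdf_xml)
    pairs = []
    for person in root.iter("{%s}Person" % NS["fc"]):
        name = person.get("{%s}name" % NS["fc"])
        for depiction in person.findall("foaf:depiction", NS):
            pairs.append((name, depiction.get("{%s}resource" % NS["rdf"])))
    return pairs


for name, image in name_image_pairs(SAMPLE):
    print(name, image)
```

A proper RDF parser would be the safer choice once the files vary in how
they are written, since the same RDF can be serialised in several XML shapes.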


If folk are interested in having a go at this, I'll write up a howto and
we can see about cataloguing a couple of boards. Mostly this is just
extracting name / image URL pairs from corporate websites.

gotta run. hope i'm making some sense!

Dan

ps. does the theyrule database have IDs for each person it knows about?

--
http://rdfweb.org/people/danbri/



