[rdfweb-dev] foaf:Person identification

Nick Knouf nknouf at MIT.EDU
Thu Jan 8 03:09:21 UTC 2004


On Wednesday, January 7, 2004, at 06:36 PM, Mike Higginbottom wrote:

> On Wed, 7 Jan 2004 22:08:01 -0000, Jim Ley <jim at jibbering.com> wrote:
>
>> mbox is just one of many things which can be used, in fact, weblog and
>> homepage are also commonly used (judging by the time smushing takes 
>> on them)
>>
> Sure, but they suffer from two problems.  Firstly, none of them 
> necessarily act as a unique representation of a single person.  
> Secondly, they aren't congruent with a sensible definition of a 
> person's identity (see below).

Until we can completely digitize ourselves, something has to stand in 
as a proxy :-)

Seriously, though, the mbox seems to be the best way to go.  By using 
mbox, I can have FOAF files that correspond to different aspects of my 
life, such as personal and work.  I think this is important (and in 
fact is why I have two separate PGP public keys, one for personal 
e-mail, and one for work e-mail).  As long as people keep a record of 
their e-mail addresses (maybe not the easiest thing to do, granted) and 
enter them in their FOAF file, then you can smoosh the data together.
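
A minimal Python sketch of that smooshing step, under some loud assumptions: each FOAF description has already been parsed into a plain dict of property name to set of values, mbox values are compared as exact strings, and the function name `smush` and the example addresses are hypothetical. It is not how any particular FOAF tool does it, just one way to merge records that share an mbox.

```python
from collections import defaultdict


def smush(records):
    """Merge person records that share at least one mbox value.

    `records` is a list of dicts mapping a property name to a set of
    values (an assumed, simplified stand-in for parsed FOAF data).
    """
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the representative record index.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    # Link every record to the first record seen with the same mbox.
    first_seen = {}
    for idx, rec in enumerate(records):
        for mbox in rec.get("mbox", ()):
            if mbox in first_seen:
                union(idx, first_seen[mbox])
            else:
                first_seen[mbox] = idx

    # Pool the property values of all records in the same group.
    merged = defaultdict(lambda: defaultdict(set))
    for idx, rec in enumerate(records):
        root = find(idx)
        for prop, values in rec.items():
            merged[root][prop].update(values)

    return [
        {prop: sorted(vals) for prop, vals in merged[root].items()}
        for root in sorted(merged)
    ]


records = [
    {"name": {"Nick"}, "mbox": {"mailto:nick@home.example"}},
    {"mbox": {"mailto:nick@home.example", "mailto:nick@work.example"}},
    {"name": {"N. K."}, "mbox": {"mailto:nick@work.example"}},
    {"name": {"John Smith"}, "mbox": {"mailto:john.smith@w00t.example"}},
]
people = smush(records)
```

The second record is the "record of e-mail addresses" that ties the personal and work descriptions together; without it the first and third records would stay separate people.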

Any way you look at it, you have to have _some_ ID that you can refer 
to.  An mbox is thousands of times better than a GUID: if I know 
someone's e-mail address (john.smith at w00t.com), then I can find their 
FOAF file.  And if that fails, then, well, I can search for all the 
John Smiths in FOAFspace.  Not ideal, but there's not much else you 
can do (except for searching other data that might exist, including 
what Stephen has recently talked about).

As an aside, this is a similar problem to what I'm facing at work.  I'm 
designing an RDF system to manage information on subjects that come in 
and do psychology experiments (don't worry, no shocking :-) ).  Because 
of governmental regulations, I can't use "identifying" information to 
refer to these subjects in the database; that means no name, birthdate, 
postal code, and a huge list of other things.  (The rationale is that 
if you had a raw dump of the data you wouldn't be able to connect a 
record to an individual.)  Yet I need some way for a user (a lab 
member) to get information about a particular subject; say you need to 
know how well a subject performed on a previous experiment to know if 
they can participate in a subsequent experiment.

The way around this is similar to using mbox in FOAF.  What I do is 
take a SHA-1 hash of the givenName, a SHA-1 hash of the surname, and a 
site-specific hash of the same length as the SHA-1 hashes.  I then 
combine all three hashes using some set of boolean operations to get 
the final hash that I use as the key.
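
A rough Python sketch of that key construction, with the parts the description leaves open filled in as assumptions: the "set of boolean operations" is taken to be a plain bytewise XOR, names are trimmed and lower-cased before hashing, and the site-specific hash is derived by hashing a secret string.  The function names and the example secret are hypothetical.

```python
import hashlib


def sha1_bytes(text):
    """Return the 20-byte SHA-1 digest of a string (UTF-8 encoded)."""
    return hashlib.sha1(text.encode("utf-8")).digest()


def subject_key(given_name, surname, site_secret):
    """Combine three equal-length hashes into one lookup key.

    Assumptions (not from the original description): names are
    normalized by trimming and lower-casing, the site-specific hash is
    the SHA-1 of a secret string, and the hashes are combined with XOR.
    """
    given = sha1_bytes(given_name.strip().lower())
    family = sha1_bytes(surname.strip().lower())
    site = sha1_bytes(site_secret)
    combined = bytes(g ^ f ^ s for g, f, s in zip(given, family, site))
    return combined.hex()


key = subject_key("John", "Smith", "lab-secret")
```

One thing this particular choice makes visible: XOR is symmetric, so swapping givenName and surname yields the same key.  Whether that matters depends on which boolean operations are actually used and on how many subjects the lab sees.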

Thus, if I have a user in lab who needs information about a subject, 
they'll be able to enter in the subject's name and get back the records 
that pertain to that subject.  Yet if you had a raw dump of the data, 
you wouldn't be able to figure that out (assuming that you keep the 
site-specific hash secure, which of course entails a whole other level 
of security...)

Thus I have a key to refer to the subject that is based on something 
that Real Live Human Beings (tm) know.

This method doesn't work across the board, of course, because in a 
larger realm there will be namespace clashes with givenNames and 
surnames.  But in the limited realm of psychological experiments in a 
certain lab at a certain university, it works.

Whew!  I guess that was a very long-winded way of saying why I think 
having a hash based on something that Real Live Human Beings (tm) use 
is a good thing :-)

Nick

>>> Would a GUID not be more suitable for this purpose?  The only
>>> problem I see with this is that most people can't generate GUIDs 
>>> locally.
>>
>> I am not a number...
>>
> But you're not your e-mail addy, your home page or your blog either 
> ;).  None of the existing properties fit the bill but there is an 
> implication that they do.  I can't help finding this a bit of a 
> kludge.  In effect we're saying 'There's no single thing that truly 
> represents a person's identity so we'll take something and squish it 
> and pretend that it does'.  It would seem more honest to me to say 
> 'There's no single thing that truly represents a person's identity so 
> we'll use something completely abstract that can have no semantic 
> meaning attached to it accidentally or otherwise'.
>
>> GUIDs suffer many of the same problems, but more importantly generally
>> aren't things that 3rd parties know or can usefully do anything with,
>> for example if I'm marking up my photo album and want to say this is a
>> picture of Danbri, I've got to go and find a suitable GUID for him,
>> that'll take me too long I'm sure (even though I'll have a large local
>> foaf database.)
>>
> Yeah, agreed, sort of.  If you're marking up a picture of Danbri then 
> you're naturally going to use his name.  That's the way we identify 
> people in practical terms.  I can see plenty of situations though 
> where that might not be enough.  What if you know two Danbris?  What 
> if, a couple of years down the line, we've got to a stage where 
> everybody is FOAFd up and our FOAF aware apps go away and find all the 
> Danbris in the world, present you with a list and ask you to select 
> which Danbri you mean exactly?  Going beyond that, I just think it 
> would be very useful to have a truly unambiguous way of referring to 
> someone.  It seems like a fundamental building block of any system 
> that is likely to become automated in some, as yet undefined, fashion. 
>  This is why we have bank account numbers rather than just bank 
> account names for example.  A GUID approach is not necessarily 
> something that is going to be the primary query mechanism in a user 
> interface but I'm sure it would be useful as a back-end thing and 
> possibly even as a user interface element where ambiguities need 
> resolving.
>
> My 2c
>
> Mike
> -- 
> Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/



