[foaf-dev] beyond foaf:mbox_sha1sum

Mischa Tuffield mischa.tuffield at garlik.com
Tue Dec 22 13:00:59 CET 2009

Hello All, 

Firstly, a quick note to say that I have updated the foaf validator I wrote [1] to work with the newest version of FOAF.

As for the use of foaf:mbox_sha1sum, I agree that it shouldn't be marked as archaic, since lots of people use it in the wild, but its use should be discouraged at every opportunity. Around this time last year we built the foaf builder [2], a UI for writing out FOAF, and we decided to force any mbox_sha1sums into a user's private FOAF data, which is housed behind an OAuth endpoint.
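For anyone unsure how the value is derived: per the FOAF spec, foaf:mbox_sha1sum is the SHA1 of the full mailbox URI, including the mailto: prefix. A minimal sketch in Python follows; the function name and the example address are made up for illustration, and no normalisation (case, whitespace) is applied, which is an assumption rather than anything the spec mandates.

```python
import hashlib


def mbox_sha1sum(email: str) -> str:
    """Return the hex SHA1 of the mailto: URI for an email address.

    Hypothetical helper: the input is hashed exactly as given,
    with only the "mailto:" scheme prepended.
    """
    uri = "mailto:" + email
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Example address is illustrative only.
    print(mbox_sha1sum("alice@example.org"))
```

Since the hash is deterministic, anyone who already knows or can guess an address can recompute it, which is why we treat the hashed form as private data too.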

We will have to go through the foafbuilder UI to make sure it fits with the current FOAF ontology; I will hopefully get round to that soon.

As for the blacklist of IFPs, we too have a blacklist of IFP values, which we flag up in the FOAF validator. We also found that many people use the foaf:homepage property to point to their browser's homepage. As a result we had to blacklist http://www.google.com, yahoo.com, and bbc.co.uk, to name a few.
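A rough sketch of what such a check might look like in Python. This is not the validator's actual code: the names are hypothetical, the host list contains only the three sites named above, and matching only the site root (not deeper paths) is an assumption. The two blacklisted hashes cover the empty-string case raised in the quoted thread below, again as an assumption about what a blacklist could contain.

```python
import hashlib
from urllib.parse import urlparse

# Hypothetical blacklist: only the three hosts named in this message.
BLACKLISTED_HOMEPAGE_HOSTS = {"google.com", "yahoo.com", "bbc.co.uk"}

# Hashes produced when the email field was empty (with or without "mailto:").
BLACKLISTED_SHA1SUMS = {
    hashlib.sha1(b"").hexdigest(),
    hashlib.sha1(b"mailto:").hexdigest(),
}


def _split(url: str):
    """Return (host without leading 'www.', path) for a URL or bare host."""
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host, parsed.path


def is_blacklisted_ifp(prop: str, value: str) -> bool:
    """Flag IFP values that are too common to identify one person."""
    if prop == "foaf:homepage":
        host, path = _split(value.strip())
        # Assumption: only the site root is flagged, not pages beneath it.
        return host in BLACKLISTED_HOMEPAGE_HOSTS and path in ("", "/")
    if prop == "foaf:mbox_sha1sum":
        return value.strip().lower() in BLACKLISTED_SHA1SUMS
    return False
```

A consumer would skip smushing on any value for which is_blacklisted_ifp returns True, rather than merging every person who lists the same homepage.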

Example output of the foaf:validator can be seen here, sorry danbri :)


Comments, feedback, suggestions welcomed, and a final note saying good work with the FOAF upgrade danbri and libby!



[1] http://foaf.qdos.com/validator
[2] http://foafbuilder.qdos.com/

On 19 Dec 2009, at 18:05, Gregory Williams wrote:

> On Dec 19, 2009, at 12:50 PM, Richard Cyganiak wrote:
>> On 19 Dec 2009, at 14:42, Dan Brickley wrote:
>>> Time to gently retire it?
>>> http://ebiquity.umbc.edu/blogger/2009/12/17/foafmbox_sha1sum-considered-harmful/
>>> etc
>>> Thoughts on ways forward?
>>> 1. mark foaf:mbox_sha1sum as archaic
>> No. It's in wide use, and it has valid uses, although it's perhaps  
>> overused ATM.
>>> 2. rewrite http://xmlns.com/foaf/spec/#term_mbox_sha1sum to more
>>> clearly emphasise the risks, and that decision to publish shouldn't be
>>> made for others
>> It's certainly good to emphasize the risks. There could be something  
>> like: “A service that promises to keep users' email addresses private,  
>> should not publish the sha1-obfuscated form either.”
>> I still think that the property is useful for translating mailing list  
>> archives to RDF, for example. The text should not be so alarmist that  
>> it discourages such uses.
> I agree with Richard here. It's in wide use and it has legitimate uses, so I wouldn't like to see it marked as archaic.
>>> 3. perhaps remove the owl:InverseFunctionalProperty typing (this will
>>> help with OWL DL compatibility too)
>> +1. In practice, doing IFP smushing on this property according to the  
>> OWL spec is a recipe for disaster anyway [1].
> I'm not convinced about this "recipe for disaster" stuff. The pedantic web page you link to suggests to me that people should just be careful when using this term, not that they shouldn't use it. The fact that some sites don't properly protect against exporting the hash of an empty string (where an email address should have been) doesn't strike me as a reason that the sites that do use it properly shouldn't benefit from its current use.
> As for the issue of it not being a true IFP since SHA1 can collide, I'm not terribly convinced by this either (modulo the issue of sites improperly exporting email fields as hashes). Has anyone ever bothered to analyze the potential for collision on mailto: IRIs? It's got to be much smaller than the general collision probability since email addresses have syntax restrictions, and I wouldn't think this is a case where we're worried about the ease of generating collisions (since I could simply bypass the generation stage and just assert bad data by claiming to have the same sha1_sum as somebody else).
>>> 4. encourage data publishers to assign URIs to account holders
>>> directly, to indicate openID URIs and other identifying properties as
>>> users permit
> Agreed.
>> Tangent: I find mbox_sha1sum useful for adding former email addresses  
>> that I no longer use to my FOAF file. The hashes can still be used for  
>> smushing, but no one would mistake the old email addresses as being  
>> current. That's something I could not do with foaf:mbox alone. Is  
>> there a case for a new property or some sort of new idiom here?
> Yeah, this is how I use it as well.
> .greg
> _______________________________________________
> foaf-dev mailing list
> foaf-dev at lists.foaf-project.org
> http://lists.foaf-project.org/mailman/listinfo/foaf-dev

Mischa Tuffield
Email: mischa.tuffield at garlik.com
Homepage - http://mmt.me.uk/
Garlik Limited, 2 Sheen Road, Richmond, TW9 1AE, UK
+44(0)20 8973 2465  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
