[foaf-dev] beyond foaf:mbox_sha1sum

Steve Harris steve.harris at garlik.com
Tue Dec 22 16:43:27 CET 2009


On 22 Dec 2009, at 15:16, Norman Gray wrote:
>
> Steve, hello.
>
> On 2009 Dec 22, at 14:39, Steve Harris wrote:
>
>> On 22 Dec 2009, at 14:26, Norman Gray wrote:
>>>
>
>>> I don't think there's any need for a convention.  The URI
>>> http://foo is necessarily identically equivalent in function to the
>>> URI http://foo/, by virtue of the HTTP spec.  Thus although they
>>> don't compare equal as strings, they are explicitly noted as
>>> equivalent in section 6.2.3 of RFC 3986 (which includes equivalent
>>> in the owl:sameAs sense, though I doubt it would be either
>>> necessary or useful to state this explicitly).
>>
>> That's not my reading of RFC 3986.
>>
>> It says: “Implementations may use scheme-specific rules, at further  
>> processing cost, to reduce the probability of false negatives. For  
>> example...” then goes on to give an example of some HTTP-specific  
>> normalisations.
>
> But that paragraph ends "the following four URIs are equivalent" --  
> not "might be", or "can be taken to be", but "are".  And it's not  
> itself mandating this equivalence -- I think it's just remarking on  
> an equivalence which is deducible from RFC 2616.

But the word "equivalent" has many meanings, and it's all in the
context of a "may" anyway.

> There's a rabbithole opening up here...

Indeed :)

> It's not that I have a particular attachment to declaring root URIs  
> without trailing slashes (I have no desire to make things hard for  
> you!).  I suppose I'm simply remarking that the equivalence is one  
> of the list of hassles that RDF-consuming software has to deal with  
> (which this broader mbox_sha1sum thread is illustrating is a long  
> and aggravating list!).

I don't believe that there's anything in the RDF specs which suggests  
that <http://foo.com> and <http://foo.com/> are equal. They are  
different URIs, even if they are equivalent in some sense. In SPARQL  
I'm pretty sure that they're not equal.
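
To make the distinction concrete, here is a minimal Python sketch. It
assumes that an RDF store compares URIs as plain strings, and that a
consumer wanting the RFC 3986 section 6.2.3 style of equivalence has to
normalise explicitly. The function name normalise_http_uri and the
DEFAULT_PORTS table are made up for illustration, and only a handful of
the possible HTTP-specific normalisations are covered.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical helper table: default ports per scheme.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalise_http_uri(uri):
    """Apply a few scheme-specific normalisations to an http(s) URI.

    Illustrative only: lower-cases scheme and host, drops a default
    port, and turns an empty path into "/". Anything else is returned
    unchanged.
    """
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    if scheme not in DEFAULT_PORTS or parts.hostname is None:
        return uri
    if "@" in parts.netloc:
        # Leave URIs with userinfo alone rather than guess at them.
        return uri
    try:
        port = parts.port
    except ValueError:
        # Malformed port: do not try to repair it.
        return uri
    host = parts.hostname  # urlsplit already lower-cases the host
    if ":" in host:
        host = "[" + host + "]"  # restore brackets for IPv6 literals
    if port is not None and port != DEFAULT_PORTS[scheme]:
        host = host + ":" + str(port)
    path = parts.path or "/"
    return urlunsplit((scheme, host, path, parts.query, parts.fragment))


a = "http://foo.com"
b = "http://foo.com/"

# Plain string comparison: two different URIs.
print(a == b)
# After normalisation on the consumer side they compare equal.
print(normalise_http_uri(a) == normalise_http_uri(b))
```

Whether a consumer ought to apply this at all is exactly the point under
discussion; the sketch only shows that the equality has to be built in
by hand.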

In our FOAF systems we do a lot of pre-processing anyway; it would
just be nice if it weren't so necessary. It's likely to put people off
implementing FOAF services.

The normalisation text equally applies to content creators, so if you  
think it's an absolute requirement then it stands to reason that you  
shouldn't be emitting them without the trailing / :) Enough with the  
RFC lawyering though, sorry!

> For what it's worth, it's the same situation with
> <mailto:foo@example.com> and <mailto:foo@Example.COM> -- the two URIs
> are scheme-specific equivalent, and it lands to you as the data
> consumer to handle that equivalence.  Doing so by normalising the
> URIs during ingestion, or handling it with owl:sameAs triples, both
> have difficulties I can appreciate.

Yes, it's all tricky.
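
A rough Python sketch of why the mailto: case bites for this thread in
particular: foaf:mbox_sha1sum is the SHA1 of the whole mailto: URI, so
two forms that differ only in the case of the domain hash to different
values. The helper names normalise_mailto and mbox_sha1sum are
hypothetical, and lower-casing only the domain (leaving the local part
alone) is an assumption about what a cautious consumer might do, not an
agreed convention.

```python
import hashlib


def normalise_mailto(uri):
    """Lower-case the scheme and the domain of a mailto: URI.

    Illustrative only: the local part is left untouched because it may
    be case-sensitive, and any ?query part is passed through as-is.
    """
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "mailto":
        return uri
    address, qmark, query = rest.partition("?")
    if "@" not in address:
        return uri
    local, _, domain = address.rpartition("@")
    return "mailto:" + local + "@" + domain.lower() + qmark + query


def mbox_sha1sum(mailto_uri):
    """SHA1 hex digest of the mailto: URI, prefix included."""
    return hashlib.sha1(mailto_uri.encode("utf-8")).hexdigest()


a = "mailto:foo@example.com"
b = "mailto:foo@Example.COM"

# Hashing the URIs as given yields two unrelated digests.
print(mbox_sha1sum(a) == mbox_sha1sum(b))
# Normalising before hashing makes them line up.
print(mbox_sha1sum(normalise_mailto(a)) == mbox_sha1sum(normalise_mailto(b)))
```

This only helps if the provider and the consumer normalise the same way
before hashing, which is the difficulty already noted above.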

> But it seems to me that the normalisation effort has to be on your  
> side as the data consumer, rather than on mine as the data provider  
> (the syntactical difference doesn't matter to me -- I'm making true  
> statements whichever form I use).

I think we should both be putting the effort in.

> All the best (and happy christmas/New Year!),

And the same to you.

- Steve

