[foaf-dev] [foaf-protocols] FOAF sites offline during cleanup

Kingsley Idehen kidehen at openlinksw.com
Wed Apr 29 13:20:54 CEST 2009


Hugh Glaser wrote:
> Hi again.
> A problem I have is that you seem to be encouraging people to use your store
> to do research (for example of the sort we have been talking about on
> percentage bnodes) thinking it is on LD, LOD, SW, Web of Data or whatever,
> having read claims such as:
> "What we have right now is the LOD-Cloud Warehouse".
> Such analysis might be deeply flawed, because they don't understand your
> term "Warehouse", and I have seen no validation of your claim.
> I don't have time to keep going to find out what you have, but a very quick
> perusal of the areas I have some knowledge of suggests there is a lot
> missing.
> Even looking at your voiD for the rkb stuff (perhaps I am looking to do
> research into voiD), I find you only have about 10 rkb stores, whereas we
> publish voiD descriptions of more than 30.
> Looking at the triple count, you report 19703, whereas we report (
> http://southampton.rkbexplorer.com/models/void.ttl) 322555.
> Also, looking for some of the other bubbles as voiD descriptions, I can't
> find them. So exploring a bit (using Nick Gibbins' ECS Southampton bubble,
> rather than one of mine) I find that I (21) seem to be the only URI of type
> http://id.ecs.soton.ac.uk/person/11234
> you have.
>
> Don't get me wrong - I think you have done a great job getting all this
> stuff together, and providing a facility for people to work with and
> publicise, and it is a really interesting exercise to investigate the
> interaction of the Web of Data and the Cloud.
> But I am seriously concerned that people may be misled into thinking there
> is more there than there is, when this will always be the nature of the
> activity.
>   
Hugh,

I absolutely understand your concern.

To cut a long story short, how would you suggest we describe what we 
have? What about the following:

1. A collection of most of the data from the LOD-Cloud pictorial
2. LOD-Cloud sample

> I really don't want to be reviewing/seeing papers in a few months' time where
> people are presenting analysis they claim to have done of the "LOD cloud" or
> similar, and they have based their data gathering on the misconception that
> all they have to do is look at your cloud.
>   
Neither do I, but I have expressly called out to everyone that has 
contributed to the LOD-Cloud (warehouse) to verify what's been loaded so 
far. Sadly, there is deafening silence until we make any kind of claim. As you
know, this work is non-trivial (in all respects).

It would be really sad if the easy part of providing dataset 
verification feedback for our instance becomes the reason for it to 
stagnate and ultimately wither away (we do have a zillion other things 
to do with our time, seriously).
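
As a rough illustration of the kind of verification feedback being asked for, the sketch below compares the triple count a publisher states in its own voiD description with the count a warehouse reports for the same dataset. It assumes both descriptions expose the count through `void:triples`, and it uses a regular expression rather than a real Turtle parser. The helper names and the inline snippets are hypothetical; only the two counts (19703 and 322555) come from this thread.

```python
import re

# Assumption: the count is published as `void:triples <integer>`, either as a
# bare number or as a quoted literal. A real check should use an RDF parser.
TRIPLES_RE = re.compile(r'void:triples\s+"?(\d+)"?')


def total_triples(turtle_text: str) -> int:
    """Sum every void:triples value found in the given Turtle text."""
    return sum(int(n) for n in TRIPLES_RE.findall(turtle_text))


def coverage(loaded: int, published: int) -> float:
    """Fraction of the publisher's stated triples that the warehouse reports."""
    if published <= 0:
        raise ValueError("published triple count must be positive")
    return loaded / published


# Hypothetical snippets standing in for the two voiD descriptions.
publisher_void = "<#southampton> a void:Dataset ; void:triples 322555 ."
warehouse_void = "<#southampton> a void:Dataset ; void:triples 19703 ."

published = total_triples(publisher_void)
loaded = total_triples(warehouse_void)
print(f"loaded {loaded} of {published} triples "
      f"({coverage(loaded, published):.1%})")
```

A dataset publisher could run a comparison like this against each of their stores and report back only the ones where the ratio is well below 1.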

The goal of what we call the LOD-Cloud instance is to provide the Linked 
Data Web with a powerful faceted browsing and entity information lookup 
solution based on Linked Data. To date we haven't even seen DBpedia 
replicas let alone what we now have. Both are significant validators of 
the Linked Data Web in general.

I can assure you, I didn't have academic papers in mind when 
commissioning either of these endeavors.

Kingsley

> Best
> Hugh
>
>
>
> On 29/04/2009 02:31, "Kingsley Idehen" <kidehen at openlinksw.com> wrote:
>
>   
>> Peter Williams wrote:
>>     
>>> almost incomprehensible - to the layman. But, I believed you - to about 51%.
>>>  
>>>       
>> The LOD-Cloud pictorials:
>>
>> 1. http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-03-27.html
>> 2.
>> http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-03-27_colored.png
>>
>> Problems:
>>
>> 1. The black and white clickable version does not really group the bubbles
>> 2. Neither pictorial provides clarity as to what's constructed from
>> physical RDF dumps (as per LOD community best practices), "on the fly"
>> RDFization, or Progressive Crawling.
>>
>>
>> Thus, when I say: we have a Virtuoso instance hosting the LOD-Cloud [1],
>> someone can always come along and question the accuracy of the claim (as
>> Hugh has just done).
>>
>> In anticipation of the problem I describe above, I sought to partition
>> the LOD-Cloud along the following lines: Warehouse (stuff loaded from
>> dumps) and Dynamic (RDFized Data). Then I could say with accuracy, bar
>> inadvertent omission, that we have an instance hosting the Warehouse
>> component of the LOD-Cloud.
>>
>> Hugh: So to be precise, we are claiming to host the LOD-Cloud Warehouse
>> :-) The Graph Group IRIs used in the VoiD graph [2] reflect most of the
>> partitioning you see in the colored pictorial re. the stuff available as
>> Linking Open Data community dumps [3][4].
>>
>>
>> Links:
>>
>> 1. http://lod.openlinksw.com
>> 2. http://lod.openlinksw.com/void/Dataset
>> 3. http://esw.w3.org/topic/DataSetRDFDumps
>> 4. http://esw.w3.org/topic/HCLSIG/LODD/Data
>>
>>
>> Kingsley
>>
>>     
>>> ________________________________________
>>> From: foaf-dev-bounces at lists.foaf-project.org
>>> [foaf-dev-bounces at lists.foaf-project.org] On Behalf Of Kingsley Idehen
>>> [kidehen at openlinksw.com]
>>> Sent: Tuesday, April 28, 2009 5:16 PM
>>> To: Hugh Glaser
>>> Cc: Semantic Web; foaf-dev Friend of a
>>> Subject: Re: [foaf-dev] [foaf-protocols]  FOAF sites offline during cleanup
>>>
>>> Hugh Glaser wrote:
>>>  
>>>       
>>>> Hi Kingsley.
>>>> It is great for people to be able to find a lot of the LOD cloud at your
>>>> site, but please be careful about your claims concerning the data you have
>>>> crawled from LOD.
>>>> To say "our actual VoiD graph for LOD cloud" is to mislead readers into
>>>> thinking that it captures more than it does.
>>>>
>>>>    
>>>>         
>>> Yes, and No.
>>>
>>> Remember, I did try to partition the LOD-Cloud by Warehouse,
>>> Sponged/RDFized, and Crawled, but nobody would have it.
>>>
>>> What we have right now is the LOD-Cloud Warehouse. Also note, when you
>>> look at the VoiD graph you are seeing Graph Group IRIs (containers of
>>> Graphs that contain Triples), so you need to drill down a level or two.
>>>
>>> Also, if you feel a dataset dump is missing from the LOD-Cloud
>>> pictorial, please don't hesitate to holler.
>>>
>>> BTW - I don't regard the LOD-Cloud pictorial as being equivalent to the
>>> Linked Data Web :-)
>>>
>>> Kingsley
>>>  
>>>       
>>>> Best
>>>> Hugh
>>>>
>>>>
>>>> On 28/04/2009 13:10, "Kingsley Idehen" <kidehen at openlinksw.com> wrote:
>>>> As for the % re. FOAF, I think that can be determined from our actual
>>>> VoiD graph for LOD cloud [1]. I don't know off the top of my head if
>>>> FOAF is up to 50%.
>>>>
>>>>    
>>>>         
>>>>> The "Linked" part of the name implies that crawling is a valid tactic
>>>>> to gather the data to me.
>>>>>
>>>>>      
>>>>>           
>>>> Not disputing that, just describing what we have in the instance :-)
>>>> Remember, we've sponged (crawled and RDFized) data since inception of
>>>> our participation in this space.
>>>>
>>>> Links:
>>>>
>>>> 1. http://lod.openlinksw.com/void/Dataset
>>>>
>>>> Kingsley
>>>>
>>>>
>>>>    
>>>>         
>>> --
>>>
>>>
>>> Regards,
>>>
>>> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
>>> President & CEO
>>> OpenLink Software     Web: http://www.openlinksw.com
>>>
>>>
>>>
>>>
>>>
>>>  
>>>       
>>
>>
>>
>>
>>     
>
>
>   


-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com





