class/property occurrence: stats from rdfweb harvester

Dan Brickley danbri at w...
Fri Jun 28 19:16:57 UTC 2002


I've made a start on generating stats from Scutter's postgres-based
RDF store. Currently this just dumps a list of all the classes and
properties in the aggregated store. There are many ways we might want
to extend this, mostly towards a richer description of the database
contents (class-contextualised property occurrence stats, tokenising
of literal content, etc). I want each RDFWeb aggregator to expose such
content back to the wider Web, for purposes such as service discovery /
description and query routing.
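
To make the kind of stats concrete, here is a rough sketch of the
counting involved. It is not the Scutter code: it assumes a hypothetical
one-row-per-statement table called "triples" (subject, predicate,
object), uses an in-memory sqlite3 database in place of postgres, and
the sample rows are made up. The real schema and queries may look quite
different.

```python
import sqlite3

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"

# Made-up sample statements, for illustration only.
SAMPLE = [
    ("_:a", RDF_TYPE, FOAF + "Person"),
    ("_:a", FOAF + "name", "Alice"),
    ("_:a", FOAF + "mbox", "mailto:alice@example.org"),
    ("_:b", RDF_TYPE, FOAF + "Person"),
    ("_:b", FOAF + "name", "Bob"),
    ("_:d", RDF_TYPE, FOAF + "Document"),
]


def make_store(rows):
    """Build the hypothetical 'triples' table (an assumed layout)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
    db.executemany("INSERT INTO triples VALUES (?, ?, ?)", rows)
    return db


def property_counts(db):
    """Number of statements per property, most frequent first."""
    return db.execute(
        "SELECT predicate, COUNT(*) FROM triples "
        "GROUP BY predicate ORDER BY COUNT(*) DESC, predicate"
    ).fetchall()


def class_counts(db):
    """Number of distinct typed resources per class (via rdf:type)."""
    return db.execute(
        "SELECT object, COUNT(DISTINCT subject) FROM triples "
        "WHERE predicate = ? "
        "GROUP BY object ORDER BY COUNT(DISTINCT subject) DESC, object",
        (RDF_TYPE,),
    ).fetchall()


def class_property_counts(db):
    """Class-contextualised counts: which properties occur on instances
    of which class. Self-join on subject; rdf:type itself is skipped."""
    return db.execute(
        "SELECT t.object, p.predicate, COUNT(*) "
        "FROM triples t JOIN triples p ON p.subject = t.subject "
        "WHERE t.predicate = ? AND p.predicate <> ? "
        "GROUP BY t.object, p.predicate ORDER BY t.object, p.predicate",
        (RDF_TYPE, RDF_TYPE),
    ).fetchall()


db = make_store(SAMPLE)
for row in class_property_counts(db):
    print(row)
```

The third query is the "class-contextualised" extension: it says not
just that foaf:name occurs, but that it occurs on things typed as
foaf:Person. On a large store the self-join would likely want an index
on subject.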

Right now, I'm just dumping some raw stats manually into the www server's
CVS space. My scutter installation isn't crontab'd yet (though it is
nearly Debian packaged). At the moment there are only 3 or 4 RDFWeb
harvesters/aggregators, and they have basically the same dataset to play
with. When (not if ;-) there is more RDF out there than any single node
can happily manage, we'll want smarter harvesting policies, and more data
such as this that describes each collection/query point. This should
happen imho pretty fast, once some indexing tools are made easily
available. Hence the renewed attention to server self-description.

Anyhow, see http://rdfweb.org/2002/foaf/scutterstats-20020628.txt
for the stats, and http://rdfweb.org/2002/foaf/gen_scutterstats.sh for
the queries that generate them. Not rocket science.


Per today's RDFIG chump,
http://rdfig.xmlhack.com/2002/06/28/2002-06-28.html#1025290578.394214
...this is something I've been working on (er, thinking about; I forget
the difference) for a few years. See:
http://www.w3.org/TandS/QL/QL98/pp/distributed.html
http://www.dlib.org/dlib/january98/01kirriemuir.html
...for more (pre-RDF, pre-P2P, pre-WebServices) context.


Suggestions (and better SQL queries) welcomed...

Dan



