[rdfweb-dev] RE: Schemarama revisited

Victor Lindesay victor at vicsoft.co.uk
Tue Aug 12 07:13:34 UTC 2003


Roll on Schemarama (or similar) then.

I appreciate that RDF is still 'young' but these issues have to be
tackled and solved for real world use.

Consider these two scenarios. 
A paramedic calling up medical records on a PDA at an accident scene.
A fund manager gambling with your pension using real time financial
feeds.
If RDF is to be used in these types of scenario (and why not?) where data
is critical, then it must fit into the accepted conventions and
practices we use in software development, where data and validation are
like, as it says in the song, a horse and carriage.

> -----Original Message-----
> From: Dan Brickley [mailto:danbri at w3.org] 
> Sent: 12 August 2003 00:26
> To: Victor Lindesay
> Cc: rdfweb-dev at vapours.rdfweb.org
> Subject: Schemarama revisited
> 
> 
> Hi 
> 
> * Victor Lindesay <victor at vicsoft.co.uk> [2003-08-11 22:57+0100]
> > You seem to be painting a picture of RDF as a way to represent data
> > and exchange data that's so loose that any form of validation is a
> > waste of time. In the real world that means unreliable and unusable.
> > No wonder RDF has slow take up.
> > 
> > I thought that RDF had RDF schemas.
> 
> RDF schemas and XML schemas work differently.
> 
> XML schemas are all about the rules for whether some chunk of
> wellformed XML counts as being a particular kind of XML document (ie.
> document typing). They're all about those things in the world that are
> XML documents.
> 
> RDF schemas are about everything else. 
> 
> When you see 'ShippingOrder', 'Address', 'Person' etc. in an XML
> schema, the schema isn't telling you about shipping orders, addresses,
> or persons; it's telling you about a particular XML data format for
> describing such things. And it gives you rules for figuring out when a
> chunk of XML is so broken (eg. wrong tag structure, or missing info) it
> couldn't possibly be a sensible description.
> 
> RDF schemas make claims couched in terms of a (cartoonified)
> representation of the world we're describing in our XML documents.
> They say things like "all People are Mammals"; "'livesNear' is a
> relationship that holds between people and places"; "if something is
> the rss1:title of something it is also the dc:title", and so on. They
> describe patterns of constraints about true descriptions of the world,
> rather than mandating particular tagging structures within XML
> documents that describe the world.
> 
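A minimal sketch of the kind of inference those RDFS statements license,
assuming triples are held as plain Python tuples. The ex: names and the
rdfs_closure helper are made up for illustration; only the two rules
(subClassOf and subPropertyOf) come from the examples quoted above.

```python
# Triples are (subject, predicate, object) tuples; all ex: names are hypothetical.
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"

def rdfs_closure(triples):
    """Apply the subClassOf and subPropertyOf rules until nothing new appears."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                # ?x rdf:type ?c, ?c rdfs:subClassOf ?d  =>  ?x rdf:type ?d
                if p == TYPE and p2 == SUBCLASS and s2 == o:
                    new.add((s, TYPE, o2))
                # ?x ?p ?y, ?p rdfs:subPropertyOf ?q  =>  ?x ?q ?y
                if p2 == SUBPROP and s2 == p:
                    new.add((s, o2, o))
        if new <= facts:
            return facts
        facts |= new

data = {
    ("ex:Person", SUBCLASS, "ex:Mammal"),      # "all People are Mammals"
    ("rss1:title", SUBPROP, "dc:title"),       # rss1:title implies dc:title
    ("ex:dan", TYPE, "ex:Person"),
    ("ex:item1", "rss1:title", "Schemarama revisited"),
}
inferred = rdfs_closure(data)
```

Note that nothing here can fail: the rules only ever add triples, which
is the sense in which RDFS alone cannot flag bad data.
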
> RDFS alone is pretty weak: you can't even contradict yourself, and
> hence can't do much to check RDF data for obvious screwups. So we've
> boosted the expressive power of RDF by creating OWL, the "Web Ontology
> Language". OWL has all sorts of machinery for making more
> sophisticated claims about the world. In OWL, you can say things like:
> "Nothing can be both a Person and a Document, as those classes are
> mutually disjoint"; "foaf:depiction and foaf:depicts are inverses, if
> you see ?x and ?y related by the one, you can infer the inverse";
> "Something is a W3CTeamPerson if it is a Person AND it has a
> foaf:workplaceHomepage of http://www.w3.org"...
> 
> I could go on with OWL examples, since there's a lot else it can do,
> but the point is that it describes constraints about the world, not
> about XML documents. OWL, unlike RDFS alone, gives you plenty of ways
> to contradict yourself, and hence to produce machine-checkably bad
> data.
> 
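A rough sketch of what "machine-checkably bad data" could look like for
the disjointness example, again assuming triples as plain tuples. The
function name and the ex: resources are hypothetical, and this checks
only directly stated types; a real OWL reasoner does far more.

```python
TYPE = "rdf:type"
DISJOINT = "owl:disjointWith"

def disjointness_violations(triples):
    """Return (thing, classA, classB) for each thing typed with two disjoint classes."""
    types = {}
    for s, p, o in triples:
        if p == TYPE:
            types.setdefault(s, set()).add(o)
    violations = []
    for a, p, b in triples:
        if p != DISJOINT:
            continue
        for thing, classes in types.items():
            if a in classes and b in classes:
                violations.append((thing, a, b))
    return sorted(violations)

data = [
    ("foaf:Person", DISJOINT, "foaf:Document"),
    ("ex:dan", TYPE, "foaf:Person"),
    ("ex:thing", TYPE, "foaf:Person"),
    ("ex:thing", TYPE, "foaf:Document"),   # contradicts the disjointness claim
]
violations = disjointness_violations(data)
```

Here ex:thing is reported and ex:dan is not, since only ex:thing is
stated to belong to both classes.
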
> If you're wondering whether there's a gap where a proper explanation
> of the relationship between RDF and XML schema languages should be,
> you're right.
> 
> My 'FOAF contradictions' writeup at 
> http://rdfweb.org/mt/foaflog/archives/000040.html is possible fodder
> towards this end, as is this brief note.
> 
> They should be complementary pieces of the puzzle. I believe there are
> compelling arguments for working at the RDF level, but we still lack
> certain things at the RDF level that we enjoy when working at the XML
> level. (Similarly, I sometimes work with XML content using 'grep'
> level tools.) In particular, we lack a way for RDF people to talk
> about the expected information payload of a particular RDF/XML
> document, or class of documents. It's all very well RDF/RDFS/OWL
> allowing us to say "Human beings have two parents", but what if we
> want to describe a document format which demands that each
> person-description include the full name of both parents of each
> person mentioned? That, to my mind, is where we lack both
> machine-friendly and human-friendly conventions for describing our
> expectations about what a particular class of documents will tell us.
> 
> Libby and I did a little work in this area a couple of years ago,
> inspired by the XPath-based Schematron system by Rick Jelliffe and a 
> big XML-DEV thread on schema language pluralism (nicely written up 
> by Leigh Dodds at 
> http://www.xml.com/lpt/a/2001/02/07/schemarama.html).
> Our experiment was named 'Schemarama' after the schema pluralism 
> debate and in tribute to Rick's elegant system. The basic idea was to 
> find a way of re-using RDF query technology to express constraints on 
> the expected content of certain kinds of documents. The way we
> actually did this was conceptually the same as Schematron's, but
> implemented (in a rough'n'ready manner) on top of Squish (an RDF query
> language) instead of on XPath.
> 
> http://ilrt.org/discovery/2001/02/schemarama/ points to the demos,
> which still (Libby, I never doubted for a second ;) seem to work.
> 
> The Jobs Schemarama example (linked from there) may be of interest,
> since that usage scenario has cropped up again recently in RSS/Atom
> discussions. What we try to show there is an RDF-based way of
> asserting that, in our particular document format (or workflow
> context(*)), we want to be able to find a match for each of:
> 
>   (job::advertises ?item ?job)  
>   (job::title ?job ?title)  
>   (job::salary ?job ?salary)  
>   (job::currency ?job ?currency)  
>   (job::orgHomepage ?job ?orghp) 
> 
> ...wherever the graph has a thing of type rss::item.
> 
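A minimal sketch of that check, assuming triples as plain Python tuples
rather than a real Squish engine. The property names are copied from the
pattern above and treated as opaque strings; check_items, objects and
the ex: resources are hypothetical names for illustration only.

```python
TYPE = "rdf:type"
REQUIRED_JOB_PROPERTIES = [
    "job::title", "job::salary", "job::currency", "job::orgHomepage",
]

def objects(triples, subject, predicate):
    """All objects ?o such that (subject, predicate, ?o) is in the graph."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def check_items(triples):
    """Return {item: [problem, ...]} for every rss::item that fails the pattern."""
    problems = {}
    items = sorted(s for s, p, o in triples if p == TYPE and o == "rss::item")
    for item in items:
        jobs = objects(triples, item, "job::advertises")
        if not jobs:
            problems[item] = ["no job::advertises"]
            continue
        # The item passes if at least one advertised job has every property.
        missing_per_job = [
            [prop for prop in REQUIRED_JOB_PROPERTIES
             if not objects(triples, job, prop)]
            for job in jobs
        ]
        if all(missing_per_job):
            problems[item] = ["missing " + prop for prop in missing_per_job[0]]
    return problems

data = [
    ("ex:item1", TYPE, "rss::item"),
    ("ex:item1", "job::advertises", "ex:job1"),
    ("ex:job1", "job::title", "RDF hacker"),
    ("ex:job1", "job::salary", "30000"),
    ("ex:job1", "job::currency", "GBP"),
    ("ex:job1", "job::orgHomepage", "http://example.org/"),
    ("ex:item2", TYPE, "rss::item"),
    ("ex:item2", "job::advertises", "ex:job2"),
    ("ex:job2", "job::title", "Schema wrangler"),
]
result = check_items(data)
```

The rules live apart from the vocabulary: the same job:: terms could be
checked against a different required list in another context.
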
> One other thing worth mentioning here, and that I admire very much
> about Rick's Schematron system, is that it de-couples data checking
> from vocabulary definition. With Schematron (and Schemarama, our
> version) you get to express rules about how you want your XML
> documents to be. But you don't mix that up with the task of defining
> the concepts and terms which your XML documents use to describe the
> world. This makes a lot of sense to me since it allows those terms to
> be used freely across various classes of XML document, without there
> being only one set of rules for their use. I should also mention we
> did the Schemarama work before OWL and DAML+OIL were really on the
> scene, and it probably needs re-thinking in the light of the
> facilities offered by OWL.
> 
> So...
> 
> Yes, there's a need to characterise for machines and humans our
> expectations about the information payload of XML documents, and to do
> machine checking of that data. But... this doesn't mean RDF is
> inherently too permissive to achieve this. With a combination of
> contradiction detection (OWL) and Schematron-like testing of document
> contents (eg. Schemarama) we have a fair toolset to play with. On top
> of that, it is possible to write XML schemas (W3C XML Schema or
> RELAX-NG etc) which use a more traditional approach to saying what a
> doc will contain (in terms of element/attribute trees), yet whose
> instance data also fits the rules of RDF/XML syntax. There's plenty in
> the tool cupboard, we just have our work cut out figuring a story for
> hooking it all together...
> 
> Dan
> 
> 