[foaf-dev] parser error with bio: schema - uneven XML in rdfs:comment

Gregory Williams greg at evilfunhouse.com
Thu Jun 9 06:42:00 CEST 2011


On Jun 8, 2011, at 3:37 PM, Dan Brickley wrote:

> Parsing the bio: schema with the RDF::Trine Perl parser (see
> http://pastebin.com/raw.php?i=WZxpxHiR), I run into an error:
> http://chatlogs.planetrdf.com/swig/2011-06-08#T19-31-36
> 
> 21:26 danbri: kasei, it seems parser doesn't like the bio: schema,
> even if rapper does
> 21:26 danbri: 'Cannot canonicalize XMLLiteral: :2: parser error :
> Extra content at the end of the document '
> 21:28 danbri: ah, the fragment is illformed
> 21:28 danbri: $VAR1 = '<p>Based on information at <a
> href="http://en.wikipedia.org/wiki/Henry_VIII_of_England">Wikipedia</a>
> and <a href="http://www3.dcs.hull.ac.uk/cgi-bin/gedlkup/n=royal?royal00828">Hull
> University</a></p>
> 21:28 danbri: <pre>
> 21:28 danbri: interesting
> 
> ... it seems the parser doesn't like the XML - presumably not being a
> complete document. I tried tweaking it but just managed to generate
> more errors (which suggest the entity-escaping is being undone and the
> < and > causing trouble - Greg, could you take a look?).

Hi Dan,

I'll take a look at this, but I think Ian is right that the problem is triggered by the XML fragment not having a single root element. Either way, having looked at the Perl code, I think this should be treated as a warning rather than an error: you should still end up with all of the right triples, despite the loud messages suggesting otherwise.
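
To illustrate the single-root point, here is a minimal sketch in Python using only the standard library. It is not the RDF::Trine code path, and the fragment string and helper names (parses_as_document, parse_fragment) are made up for illustration; the real literal in the bio: schema is longer. It shows that a fragment with two sibling top-level elements is rejected when parsed as a complete document, while the same text parses once wrapped in a synthetic root.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the rdfs:comment literal: two sibling
# top-level elements (<p> and <pre>), so there is no single root.
fragment = (
    '<p>Based on information at <a href="http://example.org/">Wikipedia</a></p>\n'
    '<pre>details</pre>'
)

def parses_as_document(xml_text):
    """Return True if xml_text is a well-formed XML document (one root)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for content found after the first root element closes.
        return False

def parse_fragment(xml_text):
    """Parse a multi-root fragment by wrapping it in a synthetic root."""
    wrapper = ET.fromstring("<wrapper>" + xml_text + "</wrapper>")
    return list(wrapper)  # the fragment's original top-level elements

print(parses_as_document(fragment))
print([child.tag for child in parse_fragment(fragment)])
```

Whether a parser should wrap such literals itself or just report them is a design choice; the sketch only demonstrates why a document-level parse complains about content after the first element.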

thanks,
.greg


