[foaf-protocols] P2P FOAF search
melvincarvalho at gmail.com
Sun Jun 14 12:41:53 CEST 2009
On Thu, Jun 11, 2009 at 12:34 PM, Luke Maurits<luke at maurits.id.au> wrote:
> Okay! Time to make some of the ideas that have been floating around a
> bit more concrete!
> Here's a sort of "meta-protocol", a very high level description of the
> overall system which could be implemented in countless ways. We can
> work our way up from small scale implementations up to betters ones,
> some of which could eventually become the actual standard protocols:
> P2P FOAF search Meta Protocol 0.1
> The NETWORK is composed of a number of NODES.
> Each NODE is a software agent which has:
> 1) A STORE of RDF triples relevant to a set of foaf:People
> 2) A LIST of other NODES which it has the ability to send MESSAGES to.
> Each MESSAGE has:
> 1) A TYPE
> 2) A finite number of PARAMETERS (key/value pairs)
> 3) A BODY
> (think of these as being analagous to the method, headers and entity of
> a HTTP request)
> We define two TYPES of MESSAGE:
> 1) SEARCH messages which contain:
> i) An "origin" PARAMETER whose value is some set of information
> sufficient to enable sending a MESSAGE to a NODE ii) A "ttl" PARAMETER
> whose value is a non-negative integer iii) A "searchid" PARAMETER whose
> value is a 128 bit string. iv) A BODY which is some representation of
> (perhaps but not necessarily a literal string) a SPARQL query
> 2) RESPONSE messages which contain:
> i) A "searchid" PARAMETER whose value is a 128 bit string.
> ii) A BODY which is some representation of a set of RDF triples
> relevant to a foaf:Person
> The behaviour of the NODES is as follows:
> To search for information about a foaf:Person, the NODE constructs a
> SEARCH MESSAGE with:
> 1) An "origin" PARAMETER specifying its own contact details
> 2) A "ttl" PARAMETER of some fixed positive integer
> 3) A random "searchid" PARAMETER
> 3) A BODY which is a representation of a SPARQL query for retrieving
> the desired information
> and sends this MESSAGE to some subset of the NODES in its LIST.
> The NODE then listens for RESPONSE MESSAGE from other NODES which have
> a "searchid" PARAMETER matching that of the SEARCH MESSAGE it sent.
> For each such RESPONSE MESSAGE, the information contained in the BODY
> is added to the set of results for this search. The listening ends
> after a fixed period of time has passed since the receipt of the last
> RESPONSE (or since the sending of the SEARCH if no RESPONSES are
> Upon receipt of a SEARCH MESSAGE from another NODE, a NODE may
> optionally (at its own discretion) convert the BODY of the SEARCH to a
> SPARQL query and apply that query to its STORE. If the result set is
> non-empty, then the NODE may optionally (at its own discretion)
> construct a RESPONSE for each foaf:Person in the result set with:
> 1) A "searchid" PARAMETER whose value is the same value as the
> "searchid" PARAMETER of the SEARCH MESSAGE.
> 2) A BODY which is some representation of the the RDF triples
> pertaining to that foaf:Person
> and send that RESPONSE to the NODE specified by the SEARCH MESSAGE's
> "origin" PARAMETER.
> If the "ttl" PARAMETER of the SEARCH is non-zero, the NODE may then
> optionally (at its own discretion) send to some subset of the NODES in
> its LIST a SEARCH message which is identical to the one received with
> the exception that the value of the "ttl" PARAMETER is decremented by 1.
> Things to note:
> * The details of message transport are completely open, meaning the
> protocol could be implemented in many different ways (I'll sketch two
> below later). Some client software could support multiple ways,
> effectively bridging what are technically distinct NETWORKS
> * Responses to SEARCH messages are optional at every stage, so client
> software could filter them based on:
> 1) The value of the "origin" PARAMETER
> 2) Any property of the SPARQL query
> 3) Any property of the result set it computers for the query
> * A NODE's STORE is not required to be static - it may be changed
> after receipt of a SEARCH and before the execution of a query or
> sending of RESPONSES, to enable the sort of logic shown in Toby
> Inkster's email.
> To sketch a concrete but lightweight implementation of this
> Handling transport by XMPP:
> * Each NODE could be associated with a XMPP account.
> * Each NODE's LIST could be the friend list associated with its XMPP
> * MESSAGES could be inserted in XMPP messages.
> * The "origin" PARAMETER values could be XMPP account ids (user at domain)
> Super-simple search representation:
> * The representation of a SPARQL query could be a comma separated list
> of "search strings" for foaf:Name, foaf:mbox and foaf:mbox_sha1sum,
> where search strings support wildcard characters. So: "Steve*,*,*" is
> converted to a SPARQL query requesting that foaf:Name beings with Steve
> and nothing else.
> Super-simple result representation:
> * A set of RDF triples about a foaf:Person could be represented by a
> URI which dereferences to a foaf:PersonalProfileDocument
> This feels pretty achievable. XMPP does all the hard transport work,
> and libraries are available in just about any programming language.
> The querying abilities are very limited but a complete cinch to
> implement. Proof of concept would only require 3 running NODES with
> specially crafted STORES and LISTS such that:
> 1) Node A wants information about foaf:Person X
> 2) Node C knows information about foaf:Person X
> 3) Node B is on Node A's LIST but Node C is not
> 4) Node C is on Node B's LIST
> Two or three people should be able to do this demonstration. If there
> is interest I may take a stab at a Python implementation of this
> implmenetation of the meta-protocol in the near future. Other people
> are of course welcome to try implementations in other languages - XMPP
> and the super-simple search representation should make it very easy to
> get them working together.
> 1) You need an XMPP account just to search for other FOAFers, which is
> far too restrictive to be a general search solution.
> 2) You need to know somebody else on the network just to search for
> other FOAFers, which is again too restrictive.
> 3) Searches are not very anonymous (many NODES will know (i) what you
> searched for and (ii) Your XMPP account details). It's easy to
> envisage applications of FOAF where search anonymity is desirable.
> 4) Others?
> To sketch a heavier implementation of the protocol which wouldn't have
> downsides 1-3 above but would need (1) an extension of the protocol and
> (2) a lot more coding work to implement in software:
> Handling transport by TCP/IP:
> * It should be easy to serialise MESSAGES in a format *very* similar to
> HTTP, which can just be sent around as per usual over TCP connections.
> * The "origin" PARAMETER values could also be (IP address, port number)
> * Each NODE's LIST could be a list of (IP address, port number) pairs
> Each NODE's LIST would have to be populated in some way. This could be
> done by defining extra MESSAGE TYPES for discovery/exploration. I know
> there *are* ways to do this, but I'm not very clear on them at the
> moment. More reading required!
> Search representation:
> * Full blown SPARQL?
> This seems like the right direction for a "proper" implementation but
> it's too much work to aim for right off the bat.
> Alright! Sorry for the high level of detail in this email, but what do
> people think?
Sounds good, definitely keen to run a few nodes, on client machines
and web server. Not sure how much overlaps with other projects such
as distributed SPARQL (im not 100% up to date on things like this),
but I'm sure that will become clearer over time ...
More information about the foaf-protocols