3/14/08

The problems of SPARQL

Below I summarise the main challenges that may hamper the utility of SPARQL:

  1. Formulating a SPARQL query requires the writer to understand the structure of the RDF data being queried, which is usually done by reading through ("eye-parsing") the data itself. Since RDF resources are typically large and full of long, technical terms, eye-parsing is not practical, and this certainly hampers the utility of SPARQL. Writing a SQL query also requires the writer to look up and understand the underlying database schema; however, database schemas are typically concise and manageable, even in large organizations. In addition, the people who design an organization's database are usually the same people who write its SQL queries. In other words, unlike RDF, the world of a database is closed.
  2. Learning SPARQL (and RDF) is not easy for the majority of the IT community, and SPARQL is not yet known outside its own small community. Although RDF and SPARQL are indeed simple technologies, I have found that the majority of IT people (including many senior researchers) cannot understand or author simple RDF documents. I believe this is because the intuition of representing knowledge as directed labelled graphs and graph patterns is not familiar from IT education. The intuition and logic of relations behind databases and SQL have been taught in every IT program for more than 40 years; the description logic underpinning RDF and SPARQL, even though it is simple, still needs many years of killer applications and tools before it takes off.
  3. SPARQL's expressivity seems unsatisfactory. Although SPARQL is an expressive language, the SPARQL literature shows that many authors are requesting extensions to enrich it in different ways. On the one hand, I found that only a few of these suggestions are fundamental to the core of SPARQL; the majority are conveniences, i.e. they do not affect the functionality or the expressiveness of SPARQL, since they can be emulated somehow, but they would make SPARQL more natural, practical, and concise for different usage scenarios. On the other hand, the sheer number of proposed extensions, even if they are only conveniences, indicates that the expressivity of the SPARQL operators is not satisfactory in practice. In my opinion, SPARQL should not be extended unless this is really necessary, as such extensions may harm its scalability and optimization. Instead, I believe one or more extra layers on top of SPARQL are needed to handle such extensions.
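
To make the first point concrete, here is a minimal sketch of what a query writer would like to have instead of eye-parsing: a summary of which predicates are actually used on the instances of each class. It is only an illustration under stated assumptions. The triples, the prefixed names such as ex:alice, and the function summarise_structure are all hypothetical, and the data is held in a plain Python list rather than a real RDF store.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"

# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = [
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", RDF_TYPE, "foaf:Person"),
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:acme", RDF_TYPE, "foaf:Organization"),
    ("ex:acme", "foaf:name", "Acme"),
]


def summarise_structure(triples):
    """Map each class to the sorted predicates used on its instances."""
    # First pass: record the declared classes of every subject.
    types = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            types[s].add(o)

    # Second pass: attach every other predicate to the subject's classes.
    summary = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        for cls in types.get(s, ()):
            summary[cls].add(p)

    return {cls: sorted(preds) for cls, preds in summary.items()}


summary = summarise_structure(TRIPLES)
for cls in sorted(summary):
    print(cls, summary[cls])
```

On a real dataset the summary itself can become large, which is exactly the difficulty described in point 1; the sketch only shows the kind of structural knowledge a query writer needs before a graph pattern can be written.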



1 comment:

Prateek said...

Great post! I always had these issues related to SPARQL in my mind. You summarized them really well!