Earlier today Chris Webb posted a blog entry entitled OData and Microsoft BI where he remarked upon the increasing importance of Microsoft’s burgeoning OData protocol for accessing data over HTTP. Chris also linked to Douglas Purdy’s blog post OData: The Movie which listed the following Microsoft and non-Microsoft technologies that support (or soon will support) OData either as a data producer or a data consumer:
- .NET client
- AJAX Client
- PHP client
- Java client
- Visual Studio
- SQL Server (via Reporting Services)
- Codename "Dallas"
To that list I would add Windows Azure, which exposes data from its storage engine using OData, and the forthcoming Live Framework technology, which is expected to surface user-centric data from Microsoft’s various Live Services properties (e.g. SkyDrive, Messenger, Calendar).
Chris suggested that a pervasive format for data on the web would be a big step forward because any consumer that understood that format could talk to any producer that chose to expose it; win-win-win indeed. Unfortunately life is never that simple. As is so often the case, there are many competing technologies in this space including (but, I suspect, not limited to) Microsoft’s OData, Google’s GData and the World Wide Web Consortium’s Resource Description Framework (RDF).
This is nothing new of course; we can draw parallels with Atom and RSS which were recent competing technologies used to expose blog syndication feeds. Interestingly both of these technologies generally sit side-by-side quite happily these days (read: Will the Online Identity War turn out like the XML Syndication War?) and it will be interesting to see whether the same will happen in the OData/GData/RDF space; will organisations opt to expose multiple service heads for their various databases? I await the answer to that one with interest.
RDF is of particular interest chiefly because one of its main proponents is Sir Tim Berners-Lee, the inventor of the World Wide Web. RDF has been knocking around for about 10 years now and Berners-Lee sees it as the technology that will underpin the growth of the fabled Semantic Web. Indeed, he has been heavily involved with the recently announced service http://data.gov.uk, which exposes data held by the UK government, and it is no surprise that the service exposes its data using RDF.
I don’t know too much about GData but I do have a working knowledge of both OData and RDF, so perhaps an overview and comparison of the two might be useful. To this observer it appears as though OData is closely tied to the tables-rows-and-columns paradigm that anybody familiar with traditional relational databases will understand. An OData service exposes multiple entities (roughly analogous to tables) and the relationships between them.
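To make the entities-and-relationships idea a little more concrete, here is a minimal Python sketch of how OData-style URLs address an entity set, a single entity by key, and a related set reached through a relationship. The service root, the entity names (Customers, Orders) and the entity_url helper are all hypothetical, invented purely for illustration; only the general URL shape and the "$"-prefixed query options follow the OData URL conventions as I understand them.

```python
from urllib.parse import quote, urlencode

# Hypothetical service root; this is not a real endpoint.
SERVICE_ROOT = "https://example.com/odata.svc"


def entity_url(entity_set, key=None, navigation=None, **options):
    """Build an OData-style URL (illustrative helper, not a library API).

    entity_set  -- name of the entity set, roughly a table
    key         -- optional key selecting one entity, roughly a row
    navigation  -- optional relationship to follow from that entity
    options     -- query options such as top or filter, sent with a "$" prefix
    """
    path = f"{SERVICE_ROOT}/{quote(entity_set)}"
    if key is not None:
        path += f"({key})"
    if navigation:
        path += f"/{quote(navigation)}"
    # Keep "$" literal in option names and encode spaces as %20.
    query = urlencode(
        {f"${name}": value for name, value in options.items()},
        safe="$",
        quote_via=quote,
    )
    return f"{path}?{query}" if query else path


if __name__ == "__main__":
    print(entity_url("Customers"))
    print(entity_url("Customers", key=1, navigation="Orders", top=5))
    print(entity_url("Orders", filter="Total gt 100"))
```

The second call reads much like a join in relational terms: start at one row of Customers and follow its relationship to Orders.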
RDF, however, is built around the concept of triples, which I assert is more analogous to a conceptual Entity-Attribute-Value model (that assertion is something I hope to explore in a future blog post). The key difference between RDF and OData, though, is that an RDF document can contain links to RDF documents that are hosted on other services, and attributes of entities are defined in terms of those other services. Put more simply, RDF data is one massively distributed, non-centralised, interlinked web of data. If that sounds an awful lot like the mass of HTML documents which form the World Wide Web then you shouldn’t be too surprised, given that Berners-Lee is so heavily involved. The notion of linking data together in this way has also given rise to the very descriptive moniker "linked data", which you may well hear mentioned in the same sentence as “semantic web” from time to time.
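As a rough illustration of the triple idea, and of why it resembles Entity-Attribute-Value, the following Python sketch models a few triples as plain tuples. It is not a real RDF parser or library. The example.org subjects are hypothetical, the predicates are borrowed from the FOAF vocabulary, and the DBpedia URI stands in for "a resource hosted on another service". The function names are mine.

```python
from collections import defaultdict
from typing import NamedTuple
from urllib.parse import urlparse


class Triple(NamedTuple):
    subject: str    # URI of the thing being described (the entity)
    predicate: str  # URI naming the attribute or relationship
    obj: str        # a literal value, or the URI of another resource


ALICE = "http://example.org/people/alice"  # hypothetical resource
FOAF = "http://xmlns.com/foaf/0.1/"

TRIPLES = [
    Triple(ALICE, FOAF + "name", "Alice"),
    Triple(ALICE, FOAF + "knows", "http://example.org/people/bob"),
    # The object lives on a different host: this is the "linked" part.
    Triple(ALICE, FOAF + "based_near", "http://dbpedia.org/resource/London"),
]


def as_entity_attribute_value(triples):
    """Group triples by subject, giving an EAV-like view of each entity."""
    entities = defaultdict(lambda: defaultdict(list))
    for t in triples:
        entities[t.subject][t.predicate].append(t.obj)
    return {subject: dict(attrs) for subject, attrs in entities.items()}


def external_links(triples):
    """Return object URIs hosted somewhere other than their subject's host."""
    links = []
    for t in triples:
        target = urlparse(t.obj)
        is_uri = target.scheme in ("http", "https")
        if is_uri and target.netloc != urlparse(t.subject).netloc:
            links.append(t.obj)
    return links


if __name__ == "__main__":
    print(as_entity_attribute_value(TRIPLES))
    print(external_links(TRIPLES))
```

The grouping function shows the Entity-Attribute-Value reading of the data, while external_links picks out the one thing the OData sketch of tables and relationships has no direct equivalent for: an attribute whose value is defined by somebody else's service.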
This has been a rather rambling blog post and really it's just me dumping some thoughts out onto your computer screen, as you can tell from the rather unimaginative title. Nonetheless there is some interesting stuff going on here that I hope to explore in the future, so if you're interested please return some time. If you have any thoughts I'd love to read them in the comments section below.