RDF
RDFa in Drupal 7: last call for feedback before alpha release
The first alpha release of Drupal 7 will be created next Friday, Jan 15th. We've already incorporated most of the feedback we received from the semweb community so far, but I wanted to give the community a last chance to review the RDFa markup and the default RDF mappings before it's too late. I should emphasize that all the markup and default RDF mappings that we ship in core will be pretty much set in stone after the stable release of Drupal 7, hence this call for feedback. Site administrators who care about semantics will be able to alter these mappings by installing extra modules, but many people (read: several tens of thousands of sites) will just install Drupal 7 and not care about the semantics it generates. We therefore want to make sure the RDFa generated by Drupal out of the box is reasonably correct and does not make folks from the semantic/pedantic web community angry :) That is why we've tried to keep the semantics as generic as possible.
RDF mappings
I've created a diagram representing the default semantics of the core data structures as they have been committed, and I would appreciate feedback on the RDF terms we've used.
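To give a rough idea of what a field-to-term mapping does, here is a minimal sketch in Python. It is not Drupal's PHP API, and the field names and RDF terms below are assumptions picked for illustration; the diagram remains the reference for the terms actually committed to core.

```python
from html import escape

# Hypothetical field-to-term mapping, for illustration only.
# The committed core mappings may use different fields and terms.
EXAMPLE_MAPPING = {
    "title": {"predicates": ["dc:title"]},
    "created": {"predicates": ["dc:date", "dc:created"], "datatype": "xsd:dateTime"},
    "body": {"predicates": ["content:encoded"]},
}


def rdfa_attributes(field, mapping=EXAMPLE_MAPPING):
    """Return the RDFa attribute string for a mapped field, or '' if unmapped."""
    entry = mapping.get(field)
    if entry is None:
        return ""
    attrs = ['property="%s"' % " ".join(entry["predicates"])]
    if "datatype" in entry:
        attrs.append('datatype="%s"' % entry["datatype"])
    return " ".join(attrs)


def render_field(field, value, mapping=EXAMPLE_MAPPING):
    """Wrap an HTML-escaped value in a span carrying the field's RDFa attributes."""
    attrs = rdfa_attributes(field, mapping)
    opening = "<span %s>" % attrs if attrs else "<span>"
    return opening + escape(value) + "</span>"


print(render_field("title", "Hello RDFa"))
```

The point of keeping the mapping as plain data is that a site administrator (or an extra module) can swap terms without touching the rendering code.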

RDFa markup
To make the RDFa markup review process easier, I've updated the usual testing site at http://drupalrdf.openspring.net/. It features a blog post with some comments, which represents a typical Drupal 7 page annotated with RDFa. Some other pages have been randomly generated so that the tracker, which acts as a very simple sitemap in RDFa, can be tested.
Note that the URI for the resources of type node, comment, term and user is the URI of the page which describes them. This was decided in order to keep things simpler, after careful discussion with some members of the community. Hash URIs for identifying things distinct from the page describing them can be implemented quite easily, but this case hasn't emerged in core (though it will in the modules people will build and use).
For those willing to try out the software themselves, I've uploaded an unstable version of Drupal 7 core you can download, which includes some of the latest RDF patches still under review. Please file any bug you encounter at http://drupal.org/node/add/project-issue/drupal or leave a comment below.
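The difference between the two URI styles can be sketched in a few lines of Python. This is only an illustration: the base URL `http://example.org/`, the path `node/1` and the `#this` fragment are assumptions, not what any Drupal module actually emits.

```python
def page_uri(base, path):
    """URI of the page itself; in core, this also identifies the resource."""
    return base.rstrip("/") + "/" + path.lstrip("/")


def hash_uri(base, path, fragment="this"):
    """Hash URI identifying a thing distinct from the page that describes it.

    The default fragment name is a hypothetical choice for this example.
    """
    return page_uri(base, path) + "#" + fragment


# Core approach: the node is identified by the page URI.
print(page_uri("http://example.org/", "node/1"))

# What a contributed module could do instead.
print(hash_uri("http://example.org/", "node/1"))
```

Since a hash URI is just the page URI plus a fragment, a module that needs the distinction can add it later without changing the pages themselves.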
Status of RDF in Drupal (November 09) and wrap up of ISWC2009
I had the pleasure of presenting the paper "Produce and Consume Linked Data with Drupal!" at ISWC2009, and I was very honored that we won the Best Semantic Web in Use Paper award! The 30 minutes of presentation and Q/A passed very quickly, and after describing the inner workings of the modules we developed I didn't have much time to expand on the status of RDF in Drupal 7 vs. Drupal 6. I'm sure this will also interest some people beyond the attendees.
First of all, the current stable version of Drupal is Drupal 6 (the latest version at the time of this writing being Drupal 6.14). This is the version on which we started to implement the contributed modules presented at ISWC2009, namely RDF CCK, RDF external vocabulary importer (Evoc), SPARQL Endpoint and RDF SPARQL Proxy. Contributed modules are not included in the core Drupal package, but people can download them from drupal.org for free and drop them on their server to extend Drupal core. These 4 modules work pretty well on Drupal 6: you can get RDF export in RDF/XML, N-Triples, Turtle and JSON. However, generating RDFa is not very easy, as it requires patching CCK, the module on which we rely to generate the content pages and store the various field data.
We made sure this would not be a problem in the next version of Drupal (Drupal 7), which is still under development and due to be released sometime next year. While we were at it, we also worked on porting one of the features of the RDF CCK and Evoc modules to Drupal 7 core: the ability to map the data structure to RDF and expose it in RDFa. This means that, by default and without requiring any knowledge of RDF from their administrators, Drupal 7 sites will expose the following elements as RDFa: title, date, author, content, comments, terms, users, etc. Of course, only publicly available data will be exposed as RDFa; whatever is private (like user email addresses) will remain private. This will be part of Drupal 7 core.
Needless to say, the rest of the functionality offered by the existing RDF contributed modules for Drupal 6 will also be available for Drupal 7 once these modules have been ported. We're starting to port them to Drupal 7 next Sunday, as part of the #D7CX Contrib upgrade code sprint in Boston. If you plan to use RDF in your next site and can wait until Drupal 7 is released, I'd strongly encourage you to start looking at the new Drupal APIs and features. Some RDF features which were not addressed in Drupal 6 will be much easier to achieve in Drupal 7. Try the latest development snapshot of Drupal 7 and report any bug you encounter.
Produce and Consume Linked Data with Drupal!
Produce and Consume Linked Data with Drupal! is the title of the paper I will be presenting next week at the 8th International Semantic Web Conference (ISWC 2009) in Washington, DC. I wrote it at the end of my M.Sc. at DERI, in partnership with the Harvard Medical School and the Massachusetts General Hospital, which is where I am now working.
It presents an approach for using Drupal (or any other CMS) as a Linked Data producer and consumer platform. Some parts of this approach were used in the RDF API that Dries committed to Drupal core a few days ago. I have attached the full paper, and here is the abstract:
Currently a large number of Web sites are driven by Content Management Systems (CMS) which manage textual and multimedia content but also - inherently - carry valuable information about a site's structure and content model. Exposing this structured information to the Web of Data has so far required considerable expertise in RDF and OWL modelling and additional programming effort. In this paper we tackle one of the most popular CMS: Drupal. We enable site administrators to export their site content model and data to the Web of Data without requiring extensive knowledge on Semantic Web technologies. Our modules create RDFa annotations and - optionally - a SPARQL endpoint for any Drupal site out of the box. Likewise, we add the means to map the site data to existing ontologies on the Web with a search interface to find commonly used ontology terms. We also allow a Drupal site administrator to include existing RDF data from remote SPARQL endpoints on the Web in the site. When brought together, these features allow networked RDF Drupal sites that reuse and enrich Linked Data. We finally discuss the adoption of our modules and report on a use case in the biomedical field and the current status of its deployment.
RDFa in Drupal: Bringing Cheese to the Web of Data
"RDFa in Drupal: Bringing Cheese to the Web of Data" is the title of our short paper which was recently accepted at the 5th Workshop on Scripting and Development for the Semantic Web. It seems that the topic of food on the semantic web is the new black as this paper comes out at the same time as Boris Mann's announcement about the Open Restaurants aka "BaconPatioBeer".
This paper illustrates how a CMS like Drupal can be used on the Semantic Web and make every Drupal site part of the growing Web of Data. We created a cheese review site as a use case. It relies on the RDF API and the RDF CCK modules.
The good news is that we are working to get this RDF goodness into Drupal core! We are organizing an RDF code sprint. This sprint builds on Dries' ideas expressed in his recent posts Drupal, the semantic web and search and RDFa and Drupal. With RDF in the core of Drupal and RDFa output by default, tens of thousands of websites will all of a sudden start publishing their data as RDF.
So far, Stéphane Corlosquet, Florian Loretan, Benjamin Melançon and Rolf Guescini have signed up. How about you?
Some others are willing to come but cannot afford the trip until some funding is secured. To help us fund the sprint and bring more Drupal rockstars on board, please consider making a donation using the ChipIn widget on this page. The money will be used to cover flight, food and hotel costs for the sprinters. All sprinters are generously donating their time to make this happen. It would also be great to fly in a few additional people with extensive testing and Fields experience. Any excess money will be used to add more people, or will be donated to the Drupal Association.
Report on my recent trip to the US: Harvard, DrupalCon...
During my 5-week stay in the US, I was based at Harvard's Initiative in Innovative Computing, where I worked on the Drupal-based Science Collaboration Framework (SCF) project with Tim Clark, Sudeshna Das and Benjamin Melançon.
Screencast on RDFa in Drupal - examples and use cases
This is the video which was presented during DrupalCon DC 2009 at the Practical Semantic Web and Why You Should Care session.
The Semantic Web strikes again
Exciting times for the Semantic Web in Drupal...
Harvard IIC and SCF

Today is my first day at Harvard's Initiative in Innovative Computing, where I'll work on the Drupal-based Science Collaboration Framework (SCF) project with Tim Clark, Sudeshna Das and Benjamin Melançon. I had the chance to meet Tim last year when he visited DERI and presented the SCF project. We'll work on aligning the efforts which were put into SCF with the Drupal community's efforts on RDF. We will see what requirements emerge from a project such as SCF and contribute them back to the Drupal community.
Talking about the Semantic Web and Drupal next week at DrupalCon Szeged 2008
Following up on the interest of the Drupal Semantic Web group, I'll present my ideas on the Semantic Web, in an update of the talk I gave in Barcelona. I will also present a project which I co-started a few months ago: Neologism.
First RDF Schema for a Semantic Web enabled Drupal
As a semantic web researcher and developer, my goal is to bring these technologies to lay people. The main problem is the common chicken-and-egg dilemma: semantic web technologies need semantic data to become truly useful and powerful, but nobody wants to produce such data until they can see how powerful the semantic web is.
There is an immense amount of data available on the internet, spread over millions of HTML pages, PDF documents, et cetera. These formats were designed to make documents understandable for people, but not for machines. This is where RDF comes in, as a language to describe data and the relationships within the data. From a web of documents we evolve to a web of pieces of data, i.e. concepts, items, ideas, events, people, you name it. Each of them can be identified by its own Uniform Resource Identifier (URI), and the web becomes a global database.
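To make the idea of "pieces of data identified by URIs" a bit more concrete, here is a small Python sketch that writes out one statement in the N-Triples syntax. The `example.org` URIs and the literal value are made up for illustration, and the escaping only covers the most common cases, so treat it as a sketch rather than a complete serializer.

```python
def escape_literal(text):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriple(subject, predicate, obj, obj_is_uri=False):
    """Serialize one (subject, predicate, object) statement as an N-Triples line."""
    if obj_is_uri:
        obj_term = "<%s>" % obj
    else:
        obj_term = '"%s"' % escape_literal(obj)
    return "<%s> <%s> %s ." % (subject, predicate, obj_term)


# A post with a title, and a link from the post to its author.
print(to_ntriple("http://example.org/node/1",
                 "http://purl.org/dc/terms/title",
                 "Hello"))
print(to_ntriple("http://example.org/node/1",
                 "http://rdfs.org/sioc/ns#has_creator",
                 "http://example.org/user/3",
                 obj_is_uri=True))
```

Each line is one row in the "global database": a thing, a relationship, and either a value or another thing.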

