RDF in Drupal 7 and the Web at large

(notes on a session at DrupalCon 2010)

Lin Clark of DERI Galway, heavy D7 contributor

  • RDF is based on very basic principles
  • programs and sites can exchange information
  • search engines can display more relevant info in results because of structured data
  • data mashers can combine different datasets to reveal new things
  • "machine understandable": information explicitly marked up by what it means rather than relying on context
  • RDF = Resource Description Framework
    • a resource is a uniquely named thing. rather than writing out a full URI, you can declare a namespace prefix and build a CURIE (compact URI) from it.
    • a resource can also have a type, defined by another CURIE.
    • relationships among resources can also have types defined in CURIEs.
  • "Giant Global Graph" = what the Web will become when everyone uses RDF
  • SPARQL is the query language for retrieving RDF information
  • Drupal has internal structure similar to RDF, but in the past it hasn't been marked up externally in the HTML.
  • Drupal's field names have been idiosyncratic in the past; RDF will make them universally dereferenceable.
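
To make the CURIE and typed-relationship ideas above concrete, here is a minimal sketch in Python. It is not Drupal code: the site URL, the node and user numbers, and the helper names expand_curie and match are made up for illustration. The namespace URIs are the published ones for each vocabulary. The match helper is only a simple stand-in for the kind of pattern lookup SPARQL performs, not a SPARQL implementation.

```python
# Prefix -> namespace URI. A CURIE such as "foaf:name" is shorthand
# for the namespace URI followed by the part after the colon.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sioc": "http://rdfs.org/sioc/ns#",
    "dc": "http://purl.org/dc/terms/",
}


def expand_curie(curie, namespaces=NAMESPACES):
    """Expand a CURIE like 'dc:title' into a full URI."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in namespaces:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return namespaces[prefix] + reference


# Hypothetical site; node/1 and user/1 follow the "cool URI" pattern.
SITE = "http://example.org/"

# Each statement is a (subject, predicate, object) triple.
# Both the resource types and the relationships are named by CURIEs.
TRIPLES = [
    (SITE + "node/1", expand_curie("rdf:type"), expand_curie("sioc:Item")),
    (SITE + "node/1", expand_curie("dc:title"), "Hello RDF"),
    (SITE + "node/1", expand_curie("sioc:has_creator"), SITE + "user/1"),
    (SITE + "user/1", expand_curie("rdf:type"), expand_curie("sioc:UserAccount")),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every argument that is not None."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# Who created node/1?
creators = match(TRIPLES, subject=SITE + "node/1",
                 predicate=expand_curie("sioc:has_creator"))
print(creators)
```

Because every name expands to a full URI, two sites that both use dc:title are talking about the same property, which is what lets separate datasets be combined.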

History of RDF in Drupal

  • Dries wrote an rdf.php for Drupal back in 2000
  • RDF module written in 2007, RDF CCK in 2008
  • RDF is serialized as RDFa (a = attributes) so it can be embedded in XHTML and other XML documents.
  • RDFa is standardized in XHTML 1.1 and HTML 5.
  • use the property, typeof, rel attributes of HTML tags to assign RDF properties to data
  • RDF markup is credited with a 30% increase in search traffic, and it immediately improves search results
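
As a rough illustration of the property, typeof and rel attributes mentioned above, the following Python sketch builds a small annotated HTML fragment. The function name rdfa_article and the sample values are hypothetical, and the markup is not the exact output Drupal 7 produces. It also uses the RDFa about attribute, which the notes do not mention, to name the resource being described.

```python
from html import escape


def rdfa_article(about, rdf_type, title, creator_uri, creator_name):
    """Render a tiny HTML fragment annotated with RDFa attributes.

    about    -> the resource being described
    typeof   -> the type of that resource (a CURIE)
    property -> a literal-valued property of the resource
    rel      -> a relationship pointing at another resource
    """
    return (
        f'<div about="{escape(about)}" typeof="{escape(rdf_type)}">\n'
        f'  <h2 property="dc:title">{escape(title)}</h2>\n'
        f'  <a rel="sioc:has_creator" href="{escape(creator_uri)}">'
        f'{escape(creator_name)}</a>\n'
        "</div>"
    )


# Sample values are invented for the example.
fragment = rdfa_article(
    about="/node/1",
    rdf_type="sioc:Item",
    title="Tom & Jerry",
    creator_uri="/user/1",
    creator_name="admin",
)
print(fragment)
```

The visible page is unchanged by these attributes. An RDFa-aware parser, however, can read the same fragment as statements about /node/1: its type, its title, and its creator.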

How to use it in D7

  • Drupal is pioneering RDFa adoption
  • all core entity types are marked up with RDF attributes: title, date, fields, comment count, reply to, creator.
  • all core entities have "cool URIs": node/#, comment/#, user/#, taxonomy/term/#
  • D7 supports the FOAF (people), SIOC (authorship), SKOS (terms), and DC (generic properties) vocabulary namespaces out of the box.
  • use page caching to avoid a performance hit, especially on pages with lots of comments.

What's coming up

  • There will be a mapping user interface for D7 - not yet complete.
  • There will also be a SPARQL endpoint, so anyone will be able to query a site's data.
  • An RDF proxy will allow remote data to be updated automatically when the source changes.

RDF in D7 for developers

  • hook_rdf_namespaces() adds a namespace/vocabulary to the page
  • rdf_mapping_save($array) saves a customized RDF mapping