RDF

Data wrangling, feature engineering, and dada

And surrealism, and impressionism...

In my data science glossary, the entry for data wrangling gives this example: “If you have 900,000 birthYear values of the format yyyy-mm-dd and 100,000 of the format mm/dd/yyyy and you write a Perl script to convert the latter to look like the former so that you can use them all together, you’re doing data wrangling.” Data wrangling isn’t always cleanup of messy data; it can also be more creative, downright fun work that qualifies as what machine learning people call…
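The glossary example mentions a Perl script; as a rough illustration of the same kind of conversion, here is a minimal Python sketch. The function name normalize_date and the sample values are made up for this example and do not come from the original post.

```python
from datetime import datetime


def normalize_date(value: str) -> str:
    """Return the date in yyyy-mm-dd form.

    Accepts values that are already yyyy-mm-dd as well as mm/dd/yyyy values.
    (Hypothetical helper, named here only for illustration.)
    """
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            # strptime raises ValueError when the value does not match fmt
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {value!r}")


# Made-up sample values in both formats
for raw in ["1976-07-04", "07/04/1976"]:
    print(raw, "->", normalize_date(raw))
```

Trying the target format first means values that are already in yyyy-mm-dd form pass through unchanged, and anything matching neither pattern raises an error instead of being silently guessed at.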

Properties

Children's edition.

Going through some old files, I found a homework assignment that my younger daughter did seven or eight years ago. When doing RDF-related data modeling you put a lot of thought into properties, and I remember getting a kick out of this introduction to the concept when she brought it home.

Simple federated queries with RDF

A few more triples to identify some relationships, and you're all set.

Once, at an XML Summer School session, I was giving a talk about semantic web technology to a group that included several presenters from other sessions. This included Henry Thompson, who I’ve known since the SGML days. He was still a bit skeptical about RDF, and said that RDF was in the same situation as XML—that if he and I stored similar information using different vocabularies, we’d still have to convert his to use the same vocabulary as mine or vice versa before we could use our…

Playing with SPARQL Graph Store HTTP Protocol

GETting, POSTing, PUTting, and DELETEing named graphs.

One of the new SPARQL 1.1 specifications is the SPARQL 1.1 Graph Store HTTP Protocol, which is currently still a W3C Working Draft. According to its abstract, it “describes the use of HTTP operations for the purpose of managing a collection of graphs in the REST architectural style.” Recent releases of Sesame support it, so I used that to try out some of the operations described by this spec. I managed to do GET, PUT, POST, and DELETE operations with individual named graphs, so that…
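For a sense of what those four operations look like on the wire, here is a minimal Python sketch that builds (but does not send) one request per method. It assumes the protocol's indirect graph identification, where the graph name is passed in a graph query parameter. The endpoint URL, graph URI, and sample triple are hypothetical placeholders, not the Sesame setup from the post.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values; substitute your own store's endpoint and graph name.
GRAPH_STORE = "http://localhost:8080/rdf-graph-store"
GRAPH_URI = "http://example.org/graphs/demo"

# A single made-up triple in Turtle to use as a request body
TURTLE = b'<http://example.org/s> <http://example.org/p> "o" .\n'


def graph_request(method, graph_uri, body=None):
    """Build a request addressing a named graph via the ?graph= parameter."""
    url = GRAPH_STORE + "?" + urlencode({"graph": graph_uri})
    headers = {}
    if body is not None:
        headers["Content-Type"] = "text/turtle"  # format of the payload we send
    elif method == "GET":
        headers["Accept"] = "text/turtle"  # format we would like back
    return Request(url, data=body, headers=headers, method=method)


operations = [
    graph_request("PUT", GRAPH_URI, TURTLE),   # create or replace the graph
    graph_request("POST", GRAPH_URI, TURTLE),  # merge triples into the graph
    graph_request("GET", GRAPH_URI),           # retrieve the graph
    graph_request("DELETE", GRAPH_URI),        # remove the graph
]

for req in operations:
    print(req.get_method(), req.full_url)
```

Passing any of these objects to urllib.request.urlopen would send it to a running graph store. The sketch stops short of that so it can run without a server.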

RDFa can be so simple

Despite claims to the contrary.

I got so tired of hearing people complain about how confusing RDFa is that while I was on hold during a recent phone call I threw together a demo of just how simple it can be. The document has the two basic kinds of triples: one with a literal for an object, with data typing thrown in for good measure, and one with a resource URI as its object. A View Source of that document will show this in its head element (namespaces are declared earlier):