I originally planned to title this “Partial schemas!” but as I assembled the example I realized that in addition to demonstrating the value of partial, incrementally-built schemas, the steps below also show how inferencing with schemas can implement transformations that are very useful in data integration. In the right situations this can be even better than SPARQL, because instead of using code—whether procedural or declarative—the transformation is driven by the data model…
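Here is a minimal sketch of that idea in Turtle, with all prefixes, property names, and data invented for illustration: two datasets name the same property differently, and a two-triple partial schema maps both to a common one.

    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix d1:   <http://example.com/dataset1/> .
    @prefix d2:   <http://example.com/dataset2/> .
    @prefix hr:   <http://example.com/hr/> .

    # Instance data from two sources that name the same property differently.
    d1:emp3042 d1:employeeName "Heidi Smith" .
    d2:staff98 d2:staffName   "Ranjit Singh" .

    # A two-triple partial schema: no classes, no constraints, just a mapping.
    d1:employeeName rdfs:subPropertyOf hr:name .
    d2:staffName    rdfs:subPropertyOf hr:name .

A triplestore doing RDFS inferencing will then treat d1:emp3042 and d2:staff98 as both having hr:name values, so a single query on hr:name integrates the two datasets with no transformation code at all.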
For several years I thought of “knowledge graphs” as the buzzphrase that had partially replaced “Linked Data”, which was the buzzphrase that had partially replaced “Semantic Web”. In a 2012 blog entry I explained how Hadoop and the new-at-the-time NoSQL databases had convinced me that even if a technology has a funny name, selling it based on the problems it solves makes more sense and ages better than selling a buzzphrase vision and then, if that goes well,…
Over a year ago, in Querying geospatial data with SPARQL: Part 1, I described my dream of pulling geospatial data down from OpenStreetMap, loading it into a local triplestore, and then running queries against it that conformed to the GeoSPARQL standard. At the time, I tried several triplestores and data sources and never quite got there. When I tried it again recently with Ontotext’s free version of GraphDB, it all turned out to be quite easy.
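Once the data is loaded, a GeoSPARQL query can look like the following sketch, which assumes that features point to their geometries with the standard geo:hasGeometry and geo:asWKT properties; the point and the 1,000-meter radius are made up for illustration.

    PREFIX geo:  <http://www.opengis.net/ont/geosparql#>
    PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
    PREFIX uom:  <http://www.opengis.net/def/uom/OGC/1.0/>

    # Find features within 1,000 meters of a point.
    # GeoSPARQL WKT literals put longitude before latitude.
    SELECT ?feature ?wkt
    WHERE {
      ?feature geo:hasGeometry/geo:asWKT ?wkt .
      FILTER(geof:distance(?wkt,
             "POINT(-77.03 38.89)"^^geo:wktLiteral,
             uom:metre) < 1000)
    }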
I recently needed to join two datasets at work, cross-referencing one property in a spreadsheet with another in a JSON file. I used a combination of jq, perl, sort, uniq, and… I won’t go into details.
I had heard that Go (also known as “golang”) was an increasingly popular newish programming language before I migrated my blog generation from handmade XSLT scripts on snee.com to the Hugo platform on bobdc.com. Hugo is written in Go, which was invented at Google (get it?) by three people, two of whom had contributed to the development of C, Unix, and important related technology at Bell Labs. Go provides an excellent basis for a website generation…
Something that happens to me now and then: I’ll hear that an organization with a lot of interesting data (science, music, whatever) makes the data available on a SPARQL endpoint. I send my browser to the URL listed as the SPARQL endpoint and I see a web form. I enter a simple query on the web form to retrieve a few random triples, click the form’s button, and the results of my query appear. Then I enter fancier queries to explore the endpoint’s data.
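The “few random triples” query is rarely anything fancier than this:

    # Retrieve ten arbitrary triples to confirm that the endpoint works.
    SELECT ?s ?p ?o
    WHERE { ?s ?p ?o }
    LIMIT 10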
I have seen several tools for converting spreadsheets to RDF over the years. They typically try to cover so many different cases that learning how to use them has taken more effort than just writing a short perl script that uses the split() function, so that’s what I usually ended up doing. (Several years ago I did come up with another way that was more of a cute trick with Turtle syntax.)
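As a sketch of what such a one-off conversion produces, here is a hypothetical tab-separated row and the Turtle that a script like that might emit for it; the column layout, prefixes, and URIs are all invented.

    # Input row:   e101 <tab> Alice Chen
    # split() breaks it into an ID field and a name field.
    @prefix d: <http://example.com/data/> .
    @prefix v: <http://example.com/vocab/> .

    d:e101 v:name "Alice Chen" .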