SPARQL

Querying machine learning movie ratings data with SPARQL

Well, movie ratings data popular with machine learning people.

August 22, 2015

While watching an excellent video about the pandas Python data analysis library recently, I learned that the University of Minnesota’s GroupLens project has made a large amount of movie rating data from the MovieLens website available. Their download page lets you pull down 100,000, one million, ten million, or 100 million ratings, including data about the people doing the rating and the movies they rated.

Visualizing DBpedia geographic data

With some help from SPARQL.

July 15, 2015

I’ve been learning about Geographical Information System (GIS) data lately. More and more projects and businesses are doing interesting things by associating new kinds of data with specific latitude/longitude pairs; this data might be about air quality, real estate prices, or the make and model of the nearest Uber car.

SPARQL: the video

Well, a video, but a lot of important SPARQL basics in a short period of time.

May 3, 2015

Spark and SPARQL; RDF Graphs and GraphX

Some interesting possibilities for working together.

March 29, 2015

In “Spark Is the New Black” in IBM Data Magazine, I recently wrote about how popular the Apache Spark framework is for both Hadoop and non-Hadoop projects these days, and how for many people it goes so far as to replace one of Hadoop’s fundamental components: MapReduce. (I still have trouble writing “Spar” without writing “ql” after it.) While waiting for that piece to be copyedited, I came across “5 Reasons Why Spark Matters to Business” by my old XML.com editor Edd…

R (and SPARQL), part 2

Retrieve data from a SPARQL endpoint, graph it and more, then automate it.

January 20, 2015

In part 1 of this series, I discussed the history of R, the programming language and environment for statistical computing and graph generation, and why it’s become so popular lately. The many libraries that people have contributed to it are a key reason for its popularity, and the SPARQL one inspired me to learn some R to try it out. Part 1 showed how to load this library, retrieve a SPARQL result set, and perform some basic statistical analysis of the numbers in the result set. After I…

R (and SPARQL), part 1

Or, R for RDF people.

January 13, 2015

R is a programming language and environment for statistical computing and graph generation that, despite being over 20 years old, has gotten hot lately because it’s an open-source, cross-platform tool that brings a lot to the world of Data Science, a recently popular field often associated with the analytics aspect of the drive towards Big Data. The large, active community around R has developed many add-on libraries, including one for working with data retrieved from SPARQL endpoints, so…

Querying aggregated Walmart and BestBuy data with SPARQL

From structured data in their web pages!

November 9, 2014

The combination of microdata and schema.org seems to have hit a sweet spot that has helped both to get a lot of traction. I’ve been learning more about microdata recently, but even before I did, I found that the W3C’s Microdata to RDF Distiller written by Ivan Herman would convert microdata stored in web pages into RDF triples, making it possible to query this data with SPARQL. With major retailers such as Walmart and BestBuy making such data available on—as far as I can tell—every…

Dropping OPTIONAL blocks from SPARQL CONSTRUCT queries

And retrieving those triples much, much faster.

October 6, 2014

While preparing a demo for the upcoming Taxonomy Boot Camp conference, I hit upon a trick for revising SPARQL CONSTRUCT queries so that they don’t need OPTIONAL blocks. As I wrote in the new “Query Efficiency and Debugging” chapter in the second edition of Learning SPARQL, “Academic papers on SPARQL query optimization agree: OPTIONAL is the guiltiest party in slowing down queries, adding the most complexity to the job that the SPARQL processor must do to find the relevant…

Filtering foreign literals out of SPARQL query results

Parsing JSON with Python

Amazon's failed folksonomy and Kevin Federline

RDF serialization formats

Selecting all the triples from all the graphs

Editing schemas, ontologies, and SKOS taxonomies with VocBench

SPARQLing anything

Querying for audio on Wikidata

Use SPARQL to query for movies, then watch them

SPARQL queries of the Billboard Hot 100
