Wikidata's excellent sample SPARQL queries
Learning about the data, its structure, and more.
First (SPARQL-oriented) steps.
I’ve written so often about DBpedia here that a few times I considered writing a book about it. As I saw Wikidata get bigger and bigger, I kept postponing the day when I would dig in and learn more about this Wikipedia sibling project. I’ve finally done this, starting with a few basic steps and one extra fun one…
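For a flavor of what those first SPARQL-oriented steps can look like, here is a minimal sketch (not taken from the post) that runs a query modeled on Wikidata's introductory "cats" example against the public endpoint, using Python's SPARQLWrapper package. The item and property IDs are the ones Wikidata uses for "house cat" (wd:Q146) and "instance of" (wdt:P31).

```python
# A minimal sketch: run one small query against the public Wikidata
# endpoint with SPARQLWrapper (pip install sparqlwrapper). The wd:, wdt:,
# wikibase:, and bd: prefixes are predefined on the endpoint.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://query.wikidata.org/sparql"

query = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .                 # instance of: house cat
  SERVICE wikibase:label {                # endpoint's label service
    bd:serviceParam wikibase:language "en" .
  }
}
LIMIT 10
"""

sparql = SPARQLWrapper(ENDPOINT, agent="example-script/0.1")  # Wikidata wants a descriptive user agent
sparql.setQuery(query)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

for binding in results["results"]["bindings"]:
    print(binding["item"]["value"], binding["itemLabel"]["value"])
```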
Disney! Apple! Amazon! MasterCard!
Almost three years after I wrote “Experience in SPARQL a plus” about SPARQL appearances in job postings, I still find myself pointing people to it to show them that SPARQL is not some academic, theoretical thing but a popular tool in production use at well-known companies.
With just a bit of Python to frame it all.
In a recent blog entry for my employer titled “GeoMesa analytics in a Jupyter notebook,” I wrote…
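The “bit of Python to frame it all” can be as small as a helper that sends a SPARQL query and hands the results to pandas for display in a notebook cell. The sketch below is illustrative only, not the code from that post; the endpoint and query in the usage comment are placeholders.

```python
# A sketch of the "bit of Python to frame it all" idea: send a SPARQL
# SELECT query from a notebook cell and turn the JSON results into a
# pandas DataFrame for display. Assumes SPARQLWrapper and pandas are
# installed.
import pandas as pd
from SPARQLWrapper import SPARQLWrapper, JSON

def sparql_dataframe(endpoint, query):
    """Run a SELECT query and return the result bindings as a DataFrame."""
    client = SPARQLWrapper(endpoint)
    client.setQuery(query)
    client.setReturnFormat(JSON)
    bindings = client.query().convert()["results"]["bindings"]
    # Flatten {"var": {"type": ..., "value": ...}} into {"var": value}.
    rows = [{var: cell["value"] for var, cell in row.items()} for row in bindings]
    return pd.DataFrame(rows)

# Example use in a notebook cell (placeholder endpoint and query):
# sparql_dataframe("https://dbpedia.org/sparql",
#                  "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5")
```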
If emojis have Unicode code points, then we can...
I knew that emojis have Unicode code points, but it wasn’t until I saw this goofy picture in a chat room at work that I began to wonder about using emojis in RDF data and SPARQL queries. I have since learned that the relevant specs are fine with it, but as with the simple display of emojis on non-mobile devices, the tools you use to work with these characters (and the tools used to build those tools) aren’t always as cooperative as you’d hope.
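As a quick, made-up illustration of the specs being fine with it (this is not code from the post), rdflib will happily parse a Turtle literal containing an emoji and match it in a SPARQL query; the ex:mood property here is invented for the example.

```python
# A small illustrative check: both Turtle and SPARQL allow the full range
# of Unicode in string literals, so an emoji works like any other character.
from rdflib import Graph

turtle_data = """
@prefix ex: <http://example.org/> .
ex:bob ex:mood "😀" .
"""

g = Graph()
g.parse(data=turtle_data, format="turtle")

query = """
PREFIX ex: <http://example.org/>
SELECT ?who WHERE { ?who ex:mood "😀" . }
"""

for row in g.query(query):
    print(row.who)   # -> http://example.org/bob
```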
Especially inferencing.
I’ve been hearing more about the Blazegraph triplestore (well, “graph database with RDF support”), especially its support for running on GPUs, and because they also advertise some degree of RDFS and OWL support, I wanted to see how quickly I could try that after downloading the community edition. It was pretty quick.
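As a tool-agnostic illustration of the kind of RDFS inferencing being tested (this sketch uses rdflib and the owlrl package rather than Blazegraph, and the tiny vocabulary is made up), computing the RDFS closure of a graph adds the triple saying that an instance of a subclass is also an instance of the superclass:

```python
# Not Blazegraph itself: a generic sketch of RDFS inferencing using
# rdflib plus owlrl (pip install rdflib owlrl).
import owlrl
from rdflib import Graph

turtle_data = """
@prefix ex:   <http://example.org/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:Employee rdfs:subClassOf ex:Person .
ex:jane a ex:Employee .
"""

g = Graph()
g.parse(data=turtle_data, format="turtle")

# Compute the RDFS closure: afterward the graph also says ex:jane a ex:Person.
owlrl.DeductiveClosure(owlrl.RDFS_Semantics).expand(g)

query = """
PREFIX ex: <http://example.org/>
ASK { ex:jane a ex:Person . }
"""
print(g.query(query).askAnswer)   # True once the inferred triple is present
```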
Well, movie ratings data popular with machine learning people.
While watching an excellent video about the pandas Python data analysis library recently, I learned how the University of Minnesota’s GroupLens project has made a large amount of movie rating data from the MovieLens website available. Their download page lets you pull down 100,000, one million, ten million, or twenty million ratings, including data about the people doing the rating and the movies they rated.
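As a minimal sketch of working with the smallest of those downloads (assuming the classic ml-100k layout, in which u.data holds tab-separated ratings and u.item holds pipe-separated movie titles), pandas can join the two files and compute average ratings:

```python
# Load the 100,000-rating MovieLens download with pandas and compute
# average ratings per movie. File paths assume the unpacked ml-100k folder.
import pandas as pd

# u.data: user id, movie id, rating, timestamp (tab-separated).
ratings = pd.read_csv(
    "ml-100k/u.data",
    sep="\t",
    names=["user_id", "movie_id", "rating", "timestamp"],
)

# u.item: movie id, title, and more (pipe-separated, not UTF-8).
movies = pd.read_csv("ml-100k/u.item", sep="|", encoding="latin-1", header=None)
movies = movies[[0, 1]]
movies.columns = ["movie_id", "title"]

# Average rating per movie, highest first.
averages = (
    ratings.merge(movies, on="movie_id")
           .groupby("title")["rating"]
           .mean()
           .sort_values(ascending=False)
)
print(averages.head())
```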
With some help from SPARQL.
I’ve been learning about Geographic Information System (GIS) data lately. More and more projects and businesses are doing interesting things by associating new kinds of data with specific latitude/longitude pairs; this data might be about air quality, real estate prices, or the make and model of the nearest Uber car.
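As a small, made-up illustration of that "help from SPARQL": the W3C Basic Geo (WGS84) vocabulary attaches latitude/longitude pairs to resources, and a SPARQL FILTER can then act as a crude bounding-box query over them.

```python
# Illustrative data only: two fictional air-quality readings with Basic Geo
# lat/long values, queried for a rough bounding box around New York City.
from rdflib import Graph

turtle_data = """
@prefix ex:  <http://example.org/> .
@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .

ex:sensor1 ex:airQualityIndex 42 ;
    geo:lat 40.7484 ;
    geo:long -73.9857 .

ex:sensor2 ex:airQualityIndex 17 ;
    geo:lat 51.5007 ;
    geo:long -0.1246 .
"""

g = Graph()
g.parse(data=turtle_data, format="turtle")

query = """
PREFIX ex:  <http://example.org/>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?sensor ?aqi WHERE {
  ?sensor ex:airQualityIndex ?aqi ;
          geo:lat ?lat ;
          geo:long ?long .
  FILTER (?lat  > 40.4 && ?lat  < 41.0 &&
          ?long > -74.3 && ?long < -73.6)
}
"""

for row in g.query(query):
    print(row.sensor, row.aqi)   # only the New York reading matches
```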