I was catching up with my old friend Paul Prescod the other day. We have not only known each other since the early days of XML, but actually before that: “since XML was a four-letter word”, to quote Paul.
I’ve been thinking about which machine learning tools can contribute the most to the field of digital humanities, and an obvious candidate is document embeddings. I’ll describe what these are below, but I’ll start with the fun part: after using some document embedding Python scripts to compare the roughly 560 Wikibooks recipes to each other, I created an “If you liked…” web page that shows, for each recipe, which other recipes were calculated to be most similar to that…
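The post's scripts aren't shown here, but the core idea of that "If you liked…" page can be sketched in a few lines: once each document has a vector, "most similar" just means ranking the other documents by cosine similarity. The hand-written three-dimensional vectors below are a toy stand-in for real learned embeddings, and the recipe titles are hypothetical.

```python
from math import sqrt

# Toy stand-in for learned document embeddings: in a real pipeline each
# recipe's text would be turned into a dense vector by an embedding model.
embeddings = {
    "Pancakes":  [0.9, 0.1, 0.0],
    "Waffles":   [0.8, 0.2, 0.1],
    "Guacamole": [0.0, 0.9, 0.4],
}

def cosine(a, b):
    """Cosine similarity: dot product over the product of vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm

def most_similar(title):
    """Rank every other document by cosine similarity to `title`."""
    target = embeddings[title]
    others = ((t, cosine(target, v)) for t, v in embeddings.items() if t != title)
    return sorted(others, key=lambda pair: pair[1], reverse=True)

print(most_similar("Pancakes")[0][0])  # → Waffles
```

Generating the "If you liked…" page is then just a loop over every title, emitting the top few neighbors for each.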
When I wrote Semantic web semantics vs. vector embedding machine learning semantics, I described how distributional semantics, whose machine learning implementations are very popular in modern natural language processing, are quite different from the kind of semantics that RDF people usually talk about. I recently learned of a fascinating project that brings RDF technology and distributional semantics together, letting our SPARQL query logic take advantage of entity similarity as rated…
When I presented “intro to the semantic web” slides in TopQuadrant product training classes, I described how people talking about “semantics” in the context of semantic web technology mean something specific, but that other claims for computerized semantics (especially, in many cases, “semantic search”) were often vague attempts to use the word as a marketing term. Since joining CCRi, though, I’ve learned plenty about machine learning applications that…