My new job
Lots of cutting-edge technologies, 18 minutes from my home.
Simple copyediting things.
I’ve done some copyediting as part of my job, especially with marketing material. Certain basic mistakes come up so often that I made a list that I’ve been tempted to give to whoever gave me the original content and say “please make sure that it doesn’t have any of these problems first!” I didn’t, but for those who are interested, following these simple rules will make your writing look more professional. The nice thing about these is that, unlike with truly good writing, no skill and very…
And surrealism, and impressionism...
In my data science glossary, the entry for data wrangling gives this example: “If you have 900,000 birthYear values of the format yyyy-mm-dd and 100,000 of the format mm/dd/yyyy and you write a Perl script to convert the latter to look like the former so that you can use them all together, you’re doing data wrangling.” Data wrangling isn’t always cleanup of messy data, but can also be more creative, downright fun work that qualifies as what machine learning people call…
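The glossary example describes a Perl script, but the same kind of conversion can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function name `normalize_date` and the sample values are made up for the example, and the only formats handled are the two mentioned above (yyyy-mm-dd and mm/dd/yyyy).

```python
from datetime import datetime

def normalize_date(value: str) -> str:
    """Return value as yyyy-mm-dd; accepts yyyy-mm-dd or mm/dd/yyyy input.

    Hypothetical helper for illustration, not the script from the glossary.
    """
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            # Parse with the candidate format, then write out in the target format.
            return datetime.strptime(value.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # Not this format; try the next one.
    raise ValueError(f"unrecognized date format: {value!r}")

# Made-up sample values in the two formats.
mixed = ["1967-04-23", "04/23/1967", "12/01/1980"]
cleaned = [normalize_date(v) for v in mixed]
print(cleaned)
```

Values that match neither format raise an error instead of passing through silently, which is usually what you want when the goal is to use all the values together afterward.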
Complete with a dot org domain name.
Lately I’ve been studying up on the math and technology associated with data science because there are so many interesting things going on. Despite taking many notes, I found myself learning certain important terms, seeing them again later, and then thinking “What was that again? P-values? Huh?”
Well, movie ratings data popular with machine learning people.
While watching an excellent video about the pandas Python data analysis library recently, I learned about how the University of Minnesota’s GroupLens project has made a large amount of movie rating data from the MovieLens website available. Their download page lets you pull down 100,000, one million, or ten million ratings, including data about the people doing the rating and the movies they rated.
With some help from SPARQL.
I’ve been learning about Geographical Information System (GIS) data lately. More and more projects and businesses are doing interesting things by associating new kinds of data with specific latitude/longitude pairs; this data might be about air quality, real estate prices, or the make and model of the nearest Uber car.
Especially machine learning.
Earlier this month I tweeted “When people write about AI like it’s this brand new thing, should I be amused, feel old, or both?” The tweet linked to a recent Harvard Business Review article called Data Scientists Don’t Scale about the things that Artificial Intelligence is currently doing, which just happened to be the things that the article author’s automated prose-generation company is doing.
Well, a video, but a lot of important SPARQL basics in a short period of time.