IMDB + Google + Netflix + RSS = Goodness?

I did a short programming project over the long weekend, building on a script I wrote last summer — it’s an IMDB+Netflix mash-up. Once a week, typically on Tuesdays, my computer at home goes through all the new DVD releases for the week, looks up the user ratings on IMDB, and then builds a post here on JakeSavin.com with links to all the movies in IMDB and Netflix, including links to add to my Netflix queue. Why, you ask? Read on…

Netflix and RSS

I love Netflix, and while the Netflix recommendation system is really cool, it doesn’t help me to find new releases in a timely manner. Now, Netflix does have some RSS feeds, in particular one that lists all the new DVD releases each week, which is a cool feature, but it doesn’t tie into the recommendation system, so its value is minimal as far as my Netflix experience is concerned. After all, I can get a list of new DVD releases in lots of places online and off, including the wall behind the counter at my local Blockbuster or Hollywood Video store.

The trouble with user ratings

Now, while I love Netflix recommendations, their user ratings have the same problems for me that Amazon’s and Yahoo’s do — most movie watchers have less discriminating tastes than I do, and a 5-star rating system doesn’t capture enough granularity to make the rating for a single movie work for me. In aggregate, it works great for the Netflix recommendation engine, but for a single movie it’s hit-or-miss at best. It may also be that there just aren’t enough users rating movies on these systems for the rating to carry much meaning, especially for movies that have just been released. This is where IMDB fits in.

IMDB has really high-quality user ratings

If you’re not familiar with IMDB, it’s a long-lived public database of movies and TV shows, with user ratings and tons of other data on just about any movie you can think of. Aside from its vast capability for settling movie trivia arguments at parties, the thing I like most about IMDB is the user ratings.

For whatever reason, IMDB users’ ratings match my tastes and Cindy’s much more closely than do the ratings on Netflix or Amazon or Yahoo. This may be partly because more users provide ratings on IMDB, but the more likely reason is that ratings have been accumulating on IMDB for longer by the time a DVD is released: movies enter the IMDB system as soon as their existence is known, rather than near the DVD release date.

My guess is that the IMDB user ratings will be more accurate than Netflix’s for other discriminating movie watchers as well.

The IMDB + Google + Netflix + RSS mash-up

So a while back, armed with this knowledge and some pretty good scripting skills, I hacked together a mash-up that loads the Netflix new release RSS feed, looks up all the titles using IMDB’s search engine, scrapes the user ratings from the HTML in the search result, and builds an outline (in OPML) on my computer with links to movies with ratings over 6.0. It worked pretty well for a while, and found lots of great movies I would otherwise have passed by. But it had some problems:

  • Many times, IMDB would return multiple results, and not necessarily for the movie I wanted
  • Sometimes the IMDB search would return the wrong movie — their search isn’t as good as Google’s
  • The resulting document was only useful to me: Cindy didn’t want to get on my computer to look at it, so when I got busy with work or whatever, the new movies didn’t make it into our queue, not to mention that nobody else could see them at all
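
As a rough illustration of the first version's pipeline (read the feed, collect titles, keep anything rated over 6.0), here is a minimal Python sketch. It is not the original script: the sample feed, the function names, and the shape of the ratings dictionary are all assumptions made for illustration, and the IMDB lookup itself is left out because it depended on scraping search-result HTML.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for the Netflix new-release feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>New Releases (sample)</title>
    <item><title>Example Movie One</title><link>http://example.com/1</link></item>
    <item><title>Example Movie Two</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def feed_titles(rss_text):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("title", "").strip(), item.findtext("link", "").strip())
        for item in root.iter("item")
    ]


def keep_well_rated(ratings, minimum=6.0):
    """Keep titles rated strictly over `minimum`; drop titles with no rating."""
    return {
        title: rating
        for title, rating in ratings.items()
        if rating is not None and rating > minimum
    }


if __name__ == "__main__":
    print(feed_titles(SAMPLE_FEED))
    # Ratings here are invented; a real run would fill them in from IMDB.
    print(keep_well_rated({"Example Movie One": 7.1, "Example Movie Two": None}))
```

A title with no rating (`None`) is dropped rather than treated as zero, so a failed lookup never shows up as a bad movie.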

So over the weekend I updated my hack. Here’s what I changed:

  • Now it uses Google to find the movies on IMDB, using a search like "movie title" 2006 inurl:title site:imdb.com — this usually gets to the correct page on IMDB
  • Once all the results have been found and the IMDB user ratings compiled, it builds the HTML for a post on my blog and sends it up to my New DVD Releases category, which is linked under Good Flicks in my site navigation. The HTML now includes links straight to the Netflix AddToQueue page for each movie
  • As a bonus, my site is now checking the user-agent, so when serving to IE on Windows Mobile, all the Netflix links go to the appropriate page on the Netflix Mobile site, to make for happy mobile Netflix’ing
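
The Google search from the first bullet above is easy to build programmatically. This is a minimal sketch in Python, not the script's actual code; the function names are hypothetical, and it only constructs the query and a search URL without fetching or parsing anything.

```python
from urllib.parse import urlencode


def build_query(title, year):
    """Build a query like: "movie title" 2006 inurl:title site:imdb.com"""
    # Remove double quotes from the title so they cannot break the quoted phrase.
    clean_title = title.replace('"', "")
    return '"{}" {} inurl:title site:imdb.com'.format(clean_title, year)


def build_search_url(title, year):
    """Return a Google search URL with the query safely URL-encoded."""
    return "https://www.google.com/search?" + urlencode({"q": build_query(title, year)})


if __name__ == "__main__":
    print(build_query("Beer League", 2006))
    print(build_search_url("Beer League", 2006))
```

Quoting the title keeps Google from matching the words separately, and `inurl:title` together with `site:imdb.com` narrows the results to IMDB title pages.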

The end result: A page where anyone who has a Netflix account can go to find great new movies, sorted by IMDB user ratings. Just log into Netflix, go to my Good Flix page, and click the title of any movie to add it to your Netflix queue.
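
To make the page-building step concrete, here is a hedged Python sketch that sorts movies by rating and emits a simple HTML list of add-to-queue links. Everything specific in it is an assumption: the link templates use example.com placeholders because the real Netflix URLs are not given here, the movie dictionaries are an invented shape, and the "Windows CE" test is only a rough way to spot IE on Windows Mobile. It also folds the user-agent check into the same function for brevity, whereas on the real site that check happens when the page is served.

```python
from html import escape

# Hypothetical link templates; the real Netflix URLs are not given in the post.
DESKTOP_ADD_URL = "http://example.com/AddToQueue?movieid={movie_id}"
MOBILE_ADD_URL = "http://example.com/mobile/AddToQueue?movieid={movie_id}"


def is_windows_mobile_ie(user_agent):
    """Rough check (an assumption): look for 'Windows CE' in the user-agent."""
    return "windows ce" in user_agent.lower()


def build_post_html(movies, user_agent=""):
    """Return an HTML list of movies, highest rating first."""
    template = MOBILE_ADD_URL if is_windows_mobile_ie(user_agent) else DESKTOP_ADD_URL
    rows = []
    for movie in sorted(movies, key=lambda m: m["rating"], reverse=True):
        link = template.format(movie_id=movie["id"])
        rows.append(
            '<li><a href="{}">{}</a> ({:.1f})</li>'.format(
                escape(link, quote=True), escape(movie["title"]), movie["rating"]
            )
        )
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


if __name__ == "__main__":
    sample = [
        {"id": 1, "title": "Beer League", "rating": 6.1},
        {"id": 2, "title": "Tweek City", "rating": 8.2},
    ]
    print(build_post_html(sample))
```

Escaping the title and link with `html.escape` keeps a stray `&` or `<` in a movie title from breaking the generated post.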

There’s still the occasional wrong movie returned by Google (for example, right now it’s showing IMDB information for Cars instead of a new release called The Route), but far fewer than before. Also, movies that have been rated by only a few people, say under 400 or so, may not actually be any good, since the first raters seem to be biased towards higher ratings, for whatever reason. For example, Tweek City is probably not worthy of an 8.2 out of 10, but that’s what the first 15 raters seem to think. I wonder how many of them are affiliated with the making of the movie. On the other hand, movies with high IMDB user ratings from 500 or more people are almost always worth watching. For example, Beer League might actually be funny, with a 6.1 from 652 votes. At least the movies my mash-up finds are likely to be worth watching by me, and if you like them too, then all the better.
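
The vote-count rule of thumb above could be turned into a filter. The sketch below is an assumption about how one might do that, not something the mash-up currently does: the function name is hypothetical, and the defaults (rating over 6.0, at least 500 votes) are simply the figures mentioned in this post.

```python
def is_trustworthy(rating, votes, min_rating=6.0, min_votes=500):
    """True when the rating is over min_rating and backed by enough votes."""
    return rating > min_rating and votes >= min_votes


if __name__ == "__main__":
    # Figures quoted in the post: (title, rating, number of votes).
    for title, rating, votes in [("Tweek City", 8.2, 15), ("Beer League", 6.1, 652)]:
        verdict = "keep" if is_trustworthy(rating, votes) else "too few votes or too low"
        print("{}: {}".format(title, verdict))
```

With those defaults, Tweek City (8.2 from 15 votes) is filtered out and Beer League (6.1 from 652 votes) is kept.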

This would all work much better if IMDB had a reliable API for finding movies and getting their user rating data, but as far as I know they don’t, so for now it’s a hack. Maybe they’ll open up their database more in the future to stay competitive. After all, IMDB is owned by Amazon, and they’re definitely hip to web APIs in general. We’ll see…

So have fun, and let me know what you think of this hack!

5 Comments

  1. Dan W. said:

    Pretty Cool, Jake. I notice that your project doesn’t seem to have its own RSS feed — no auto-detect meta tag, and the button on the page is a feed for your general blog, not this one. Any way to make that work, so I can subscribe to this feed as well as your blog?

    January 2, 2007
  2. Jake Savin said:

    Thanks, good catch.

    That the page didn’t link to its own RSS feed was a bug in my template — now fixed. The actual feed is here.

    That said, you’ll only see one post per week in the above feed, instead of one post per movie. This is sub-optimal IMO, and I may address it in the future. It shouldn’t be too hard, and will also allow me to remove mis-matched movies if/when I catch them.

    Cheers man, and thanks again!

    January 2, 2007
  3. Dan W. said:

    Cool. Now all you need is a thumbnail of the movie poster, so I can … well, judge a DVD by its cover … and an embedded QuickTime trailer 🙂

    January 3, 2007
  4. Aaron said:

    Hi Jake,

    This is a very nice mash up. I wish Amazon would help straighten out the API for IMDB as you mentioned. What I would like is an RSS feed of my own ratings… because I agree that the 5 star system is too limited. I like publishing my recent ratings on my blog so my friends can see movies I’ve watched recently and what I thought of them. The nice thing about using Amazon for this purpose as I do currently is that they also include music and book ratings in the same feed (everything you’ve reviewed lately).

    December 3, 2007