Inside the Netflix Recommendation Engine
Netflix's business depends on getting subscribers to add plenty of DVDs to a list of discs that will later be mailed out. In theory, the more discs on that list, the longer the subscriber will stay with the service, since new movies just keep coming. So a big part of Netflix's business is recommending titles to subscribers based on what they've previously enjoyed. Netflix calls its recommendation system "Cinematch™."
In October 2006, Netflix announced The Netflix Prize, a $1 million cash award to anyone who could improve Cinematch™'s recommendation accuracy by 10%. Here, "recommendation accuracy" means how well the system predicts what a given user will think of a given movie, based on that user's prior movie preferences; a winning entry has to make those predictions 10% better than Cinematch™ does. Netflix asks users on its site to rate the movies it recommends (on a scale of 1 to 5 stars), so it can mine this kind of data from daily usage.
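To make that 10% target concrete: the contest scored entries by root-mean-square error (RMSE) between predicted and actual star ratings, so "10% better" means an RMSE 10% lower than Cinematch™'s. Here is a minimal sketch in Python of that arithmetic. The ratings, predictions, and function names below are made up for illustration; they are not Netflix's data or code.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length lists of ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def percent_improvement(baseline_rmse, new_rmse):
    """How much lower new_rmse is than baseline_rmse, as a percentage."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse

# Hypothetical data: what five users actually rated a movie (1 to 5 stars),
# and what two different systems predicted they would rate it.
actual_ratings = [4, 1, 5, 3, 2]
baseline_predictions = [3.0, 2.0, 4.0, 3.0, 3.0]    # stand-in for the existing system
challenger_predictions = [3.5, 1.5, 4.5, 3.0, 2.5]  # stand-in for a contest entry

baseline = rmse(baseline_predictions, actual_ratings)
challenger = rmse(challenger_predictions, actual_ratings)
improvement = percent_improvement(baseline, challenger)

print(f"baseline RMSE:   {baseline:.4f}")
print(f"challenger RMSE: {challenger:.4f}")
print(f"improvement:     {improvement:.1f}%")
```

Because the errors are squared before averaging, a few badly missed predictions hurt the score more than many small misses, which is one reason a single hard-to-predict movie can hold an entry back.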
Two weeks ago, The New York Times ran a fantastic article on Cinematch™ and The Netflix Prize. The Times profiled various programmers who are trying to improve the recommendation system's accuracy. Here's a snippet:
Each time he or his kids think of a new approach, [Len] Bertoni writes a computer program to test it. Each new algorithm takes on average three or four hours to churn through the data on the family's "quad core" Gateway computer. Bertoni's results have gradually improved. When I last spoke to him, he was at No. 8 on the leader board; his program was 8.8 percent better than Cinematch. The top team was at 9.44 percent. Bertoni said he thought he was within striking distance of victory. But his progress had slowed to a crawl. The more Bertoni improved upon Netflix, the harder it became to move his number forward. This wasn't just his problem, though; the other competitors say that their progress is stalling, too, as they edge toward 10 percent. Why? Bertoni says it's partly because of "Napoleon Dynamite," an indie comedy from 2004 that achieved cult status and went on to become extremely popular on Netflix. It is, Bertoni and others have discovered, maddeningly hard to determine how much people will like it. ...
Read the rest (and be sure to watch the accompanying video) for a surprisingly technical, but very readable, look into the technology behind recommendations.