What We Learned from 5 Million Books


In this fourteen-minute TEDxBoston talk, uber-geeks Erez Lieberman Aiden and Jean-Baptiste Michel talk about what they've learned from processing 5 million books (or 500 billion words) via the Google Ngram Viewer. This tool searches text scanned from books for specific terms and phrases and charts how often they appear year by year, so you can figure out historical patterns of language usage. This may sound really geeky, and it is, but it's presented in a really sweet way. On stage we have two guys who are geeks like us, sharing some fun examples of what they've learned. Even better, the audience is equally geeky and laughs at all the right stuff. I found this delightful.
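
For the curious, here's a rough sketch of the basic idea behind counting phrases by year. It's a toy, not how Google or the speakers actually do it: the function names (ngrams, yearly_frequency) and the two-sentence "corpus" are made up for illustration, and it ignores punctuation and everything else a real pipeline has to handle.

```python
from collections import Counter


def ngrams(text, n):
    """Return every run of n consecutive words in text, lowercased."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def yearly_frequency(corpus, phrase):
    """Map each year to the share of that year's n-grams that equal phrase.

    corpus is a dict of {year: [text, text, ...]} (a hypothetical layout).
    """
    phrase = phrase.lower()
    n = len(phrase.split())
    result = {}
    for year, texts in sorted(corpus.items()):
        counts = Counter()
        for text in texts:
            counts.update(ngrams(text, n))
        total = sum(counts.values())
        # Guard against years with no text at all.
        result[year] = counts[phrase] / total if total else 0.0
    return result


# Made-up sentences, for illustration only (not real Ngram data).
toy_corpus = {
    1850: ["the town throve and the farms throve"],
    1950: ["the town thrived and the farms thrived"],
}

print(yearly_frequency(toy_corpus, "throve"))
print(yearly_frequency(toy_corpus, "thrived"))
```

Dividing by the year's total matters because far more books were printed in later years, so raw counts alone would make nearly every word look like it's on the rise.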

A representative quote: "We were thinking, the best way to learn is to read all these millions of books. Now, if there's a scale for how Awesome that is, it has to rank extremely, extremely high. The problem is, there's an x-axis for that, which is the Practical axis, and this is very, very low."

Discussed: thrived vs. throve; how we lose interest in the past more rapidly with each passing year; career advice for people wishing to be famous; mathematical observations of censorship and propaganda; culturomics; Anrgh!

If you want to play with the Ngram Viewer, here are some interesting starting points: Star Trek, 1940-2008, and Star Wars vs. Star Trek, 1940-2008. See also: Reagan vs. Clinton.