What Does I.Q. Really Test?

Back in 2007, Malcolm Gladwell wrote a terrific article for The New Yorker on the history of I.Q. At its core, the article is a review of James Flynn's book What Is Intelligence? In it, Gladwell discusses a series of surprising facts about I.Q. and explains long-running debates about differences in I.Q. among populations (the most famous being the brouhaha over The Bell Curve). The salient facts are:

1. I.Q. scores are periodically renormed. In general, the entire population taking the test tends to get better at the test over time -- this is called the Flynn effect, and on average we're getting 0.3 points better per year. To account for the Flynn effect, the I.Q. tests are periodically renormalized so that the average score remains at 100. This means that someone who scores a 100 on a new test has tested better than someone who got a 100 on the old test.
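To make the renorming arithmetic concrete, here is a rough sketch in Python. It assumes the drift is perfectly linear at the 0.3 points per year quoted above, which is a simplification; real renorming is done against fresh standardization samples, not a formula. The function name `rescore` and its parameters are made up for illustration.

```python
# Assumption: a constant, linear Flynn effect of 0.3 points per year.
# This is the rough average cited in the article, not an exact law.
FLYNN_POINTS_PER_YEAR = 0.3


def rescore(score, old_norm_year, new_norm_year, rate=FLYNN_POINTS_PER_YEAR):
    """Re-express a score earned on old norms in terms of newer norms.

    The population average drifts upward by `rate` points per year, so a
    score that was average on the old norms sits below average on the new ones.
    """
    drift = rate * (new_norm_year - old_norm_year)
    return score - drift


if __name__ == "__main__":
    # An average (100) scorer on 1900 norms, measured against 2000 norms.
    print(round(rescore(100, 1900, 2000), 1))
```

Under that linear assumption, a century of drift comes to about 30 points, so an average score on 1900 norms lands at roughly 70 on 2000 norms.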

Why does this matter? Well, I'll let Gladwell delve into the details, but the core issue is that you can't easily compare scores across long spans of time, because both the tests and their scoring have been changing. Where it gets really weird is when you look at how low scores are classified (specifically, who counts as "mentally retarded," to use the term from the article) and correct for norming over time. As Gladwell writes:

...the Flynn effect puts the average I.Q.s of the schoolchildren of 1900 at around 70, which is to suggest, bizarrely, that a century ago the United States was populated largely by people who today would be considered mentally retarded.

Now, clearly this was not the case. So what exactly does the Flynn effect mean?

2. I.Q. tests measure culturally specific cognition, not core intelligence. Although I've heard this argument for years (and agreed with it in principle), I'd never heard a succinct explanation of what's actually going on in the tests. In what specific way is I.Q. a test of cultural characteristics? Here's what Gladwell writes (in part):

The very fact that average I.Q.s shift over time ought to create a "crisis of confidence," Flynn writes in "What Is Intelligence?" (Cambridge; $22), his latest attempt to puzzle through the implications of his discovery. "How could such huge gains be intelligence gains? Either the children of today were far brighter than their parents or, at least in some circumstances, I.Q. tests were not good measures of intelligence."

The best way to understand why I.Q.s rise, Flynn argues, is to look at one of the most widely used I.Q. tests, the so-called WISC (for Wechsler Intelligence Scale for Children). The WISC is composed of ten subtests, each of which measures a different aspect of I.Q. Flynn points out that scores in some of the categories -- those measuring general knowledge, say, or vocabulary or the ability to do basic arithmetic -- have risen only modestly over time. The big gains on the WISC are largely in the category known as "similarities," where you get questions such as "In what way are 'dogs' and 'rabbits' alike?" Today, we tend to give what, for the purposes of I.Q. tests, is the right answer: dogs and rabbits are both mammals. A nineteenth-century American would have said that "you use dogs to hunt rabbits." "If the everyday world is your cognitive home, it is not natural to detach abstractions and logic and the hypothetical from their concrete referents," Flynn writes. Our great-grandparents may have been perfectly intelligent. But they would have done poorly on I.Q. tests because they did not participate in the twentieth century's great cognitive revolution, in which we learned to sort experience according to a new set of abstract categories. In Flynn's phrase, we have now had to put on "scientific spectacles," which enable us to make sense of the WISC questions about similarities.

To say that Dutch I.Q. scores rose substantially between 1952 and 1982 was another way of saying that the Netherlands in 1982 was, in at least certain respects, much more cognitively demanding than the Netherlands in 1952. An I.Q., in other words, measures not so much how smart we are as how modern we are.

Gladwell's review is a fascinating read and raises a variety of interesting questions about the nature of intelligence and what exactly I.Q. tests measure. If you're interested in cultural complexity and cognition in general, I highly recommend Steven Johnson's Everything Bad Is Good for You, whose thesis is not that crappy TV makes you smarter, but that modern culture is cognitively complex.