Posted by Dalliard
As an online discussion about IQ or general intelligence grows longer, the probability of someone linking to statistician Cosma Shalizi’s essay “g, a Statistical Myth” approaches 1. Usually the link is accompanied by an assertion to the effect that Shalizi offers a definitive refutation of the concept of general mental ability, or psychometric g.
In this post, I will show that Shalizi’s case against g appears strong only because he misstates several key facts and because he omits all the best evidence that the other side has offered in support of g.
Attention Conservation Notice: About 11,000 words on the triviality of finding that positively correlated variables are all correlated with a linear combination of each other, and why this becomes no more profound when the variables are scores on intelligence tests. ... To summarize what follows below ..., the case for g rests on a statistical technique, factor analysis, which works solely on correlations between tests. Factor analysis is handy for summarizing data, but can't tell us where the correlations came from; it always says that there is a general factor whenever there are only positive correlations.
One of the examples in my data-mining class is to take a ten-dimensional data set about the attributes of different models of cars, and boil it down to two factors which, together, describe 83 percent of the variance across automobiles.  The leading factor, the automotive equivalent of g, is positively correlated with everything (price, engine size, passengers, length, wheelbase, weight, width, horsepower, turning radius) except gas mileage. It basically says whether the car is bigger or smaller than average. The second factor, which I picked to be uncorrelated with the first, is most positively correlated with price and horsepower, and negatively with the number of passengers — the sports-car/mini-van axis.
In this case, the analysis makes up some variables which aren't too implausible-sounding, given our background knowledge. Mathematically, however, the first factor is just a weighted sum of the traits, with big positive weights on most variables and a negative weight on gas mileage. That we can make verbal sense of it is, to use a technical term, pure gravy. Really it's all just about redescribing the data.
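To make the quoted description concrete, here is a minimal sketch of what "the first factor is just a weighted sum of the traits" means in practice. It uses principal components on synthetic data rather than the car data set from the quote, which is not reproduced here. The five attributes, their loadings, and the helper name `first_component` are all assumptions chosen for illustration; the fifth attribute is given a negative loading to play the role of gas mileage.

```python
import numpy as np

def first_component(data):
    """Leading principal component of the correlation matrix.

    Returns (weights, scores, explained_fraction). The scores are
    literally a weighted sum of the standardized columns of `data`.
    """
    z = (data - data.mean(axis=0)) / data.std(axis=0)  # standardize each column
    corr = np.corrcoef(z, rowvar=False)                # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)            # eigenvalues in ascending order
    w = eigvecs[:, -1]                                 # eigenvector of the largest eigenvalue
    if w.sum() < 0:                                    # eigenvector sign is arbitrary; fix it
        w = -w
    scores = z @ w                                     # weighted sum of standardized traits
    return w, scores, eigvals[-1] / eigvals.sum()

# Synthetic data (an assumption, not the data from the quote):
# four attributes that rise together and one that falls.
rng = np.random.default_rng(0)
n = 500
common = rng.normal(size=n)                            # hypothetical shared source of variation
loadings = np.array([0.8, 0.7, 0.9, 0.6, -0.7])        # last one stands in for gas mileage
noise = rng.normal(size=(n, loadings.size))
data = common[:, None] * loadings + noise * np.sqrt(1 - loadings**2)

weights, scores, frac = first_component(data)
print("weights:", np.round(weights, 2))
print("share of variance on first component:", round(float(frac), 2))
```

In this sketch the weights come out positive for the four attributes that move together and negative for the fifth, which matches the pattern the quote describes. Nothing in the computation refers to what the attributes mean; whether the resulting component corresponds to anything real is a separate question that the arithmetic alone does not settle.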