January 28, 2013

Regression toward the mean and Francis Galton

It's difficult to hold in one's mind the notion that two opposed statements can be true at the same time: that the glass, for example, is both partly full and partly empty. 

The concept of regression toward the mean is one of the more paradoxical notions ever discovered. Although it's a fundamental aspect of everyday life, it went unrecognized until Sir Francis Galton identified it in the late 19th century.

In 2005, Jim Holt wrote an excellent review in The New Yorker of a tendentious biography of Galton, putting Galton's historic achievement in proper perspective:
Galton might have puttered along for the rest of his life as a minor gentleman scientist had it not been for a dramatic event: the publication of Darwin’s “On the Origin of Species,” in 1859. Reading his cousin’s book, Galton was filled with a sense of clarity and purpose. One thing in it struck him with special force: to illustrate how natural selection shaped species, Darwin cited the breeding of domesticated plants and animals by farmers to produce better strains. Perhaps, Galton concluded, human evolution could be guided in the same way. But where Darwin had thought mainly about the evolution of physical features, like wings and eyes, Galton applied the same hereditary logic to mental attributes, like talent and virtue.

In other words, Galton's thinking evolved from qualitative to quantitative.
“If a twentieth part of the cost and pains were spent in measures for the improvement of the human race that is spent on the improvements of the breed of horses and cattle, what a galaxy of genius might we not create!” he wrote in an 1864 magazine article, his opening eugenics salvo. It was two decades later that he coined the word “eugenics,” from the Greek for “wellborn.”
Galton also originated the phrase “nature versus nurture,” which still reverberates in debates today. (It was probably suggested by Shakespeare’s “The Tempest,” in which Prospero laments that his slave Caliban is “A devil, a born devil, on whose nature / Nurture can never stick.”) At Cambridge, Galton had noticed that the top students had relatives who had also excelled there; surely, he reasoned, such family success was not a matter of chance. His hunch was strengthened during his travels, which gave him a vivid sense of what he called “the mental peculiarities of different races.” Galton made an honest effort to justify his belief in nature over nurture with hard evidence. In his 1869 book “Hereditary Genius,” he assembled long lists of “eminent” men—judges, poets, scientists, even oarsmen and wrestlers—to show that excellence ran in families. To counter the objection that social advantages rather than biology might be behind this, he used the adopted sons of Popes as a kind of control group. His case elicited skeptical reviews, but it impressed Darwin. “You have made a convert of an opponent in one sense,” he wrote to Galton, “for I have always maintained that, excepting fools, men did not differ much in intellect, only in zeal and hard work.” Yet Galton’s labors had hardly begun. If his eugenic utopia was to be a practical possibility, he needed to know more about how heredity worked. His belief in eugenics thus led him to try to discover the laws of inheritance. And that, in turn, led him to statistics. 
Statistics at that time was a dreary welter of population numbers, trade figures, and the like. It was devoid of mathematical interest, save for a single concept: the bell curve. The bell curve was first observed when eighteenth-century astronomers noticed that the errors in their measurements of the positions of planets and other heavenly bodies tended to cluster symmetrically around the true value. A graph of the errors had the shape of a bell. In the early nineteenth century, a Belgian astronomer named Adolph Quetelet observed that this “law of error” also applied to many human phenomena. Gathering information on the chest sizes of more than five thousand Scottish soldiers, for example, Quetelet found that the data traced a bell-shaped curve centered on the average chest size, about forty inches. 
As a matter of mathematics, the bell curve is guaranteed to arise whenever some variable (like human height) is determined by lots of little causes (like genes, health, and diet) operating more or less independently. For Quetelet, the bell curve represented accidental deviations from an ideal he called l’homme moyen—the average man. When Galton stumbled upon Quetelet’s work, however, he exultantly saw the bell curve in a new light: what it described was not accidents to be overlooked but differences that revealed the variability on which evolution depended. His quest for the laws that governed how these differences were transmitted from one generation to the next led to what Brookes justly calls “two of Galton’s greatest gifts to science”: regression and correlation. 
Although Galton was more interested in the inheritance of mental abilities, he knew that they would be hard to measure. So he focussed on physical traits, like height. The only rule of heredity known at the time was the vague “Like begets like.” Tall parents tend to have tall children, while short parents tend to have short children. But individual cases were unpredictable. Hoping to find some larger pattern, in 1884 Galton set up an “anthropometric laboratory” in London. Drawn by his fame, thousands of people streamed in and submitted to measurement of their height, weight, reaction time, pulling strength, color perception, and so on. ... 
After obtaining height data from two hundred and five pairs of parents and nine hundred and twenty-eight of their adult children, Galton plotted the points on a graph, with the parents’ heights represented on one axis and the children’s on the other. He then pencilled a straight line through the cloud of points to capture the trend it represented. The slope of this line turned out to be two-thirds. What this meant was that exceptionally tall (or short) parents had children who, on average, were only two-thirds as exceptional as they were. In other words, when it came to height children tended to be less exceptional than their parents. The same, he had noticed years earlier, seemed to be true in the case of “eminence”: the children of J. S. Bach, for example, may have been more musically distinguished than average, but they were less distinguished than their father. Galton called this phenomenon “regression toward mediocrity.”
Regression analysis furnished a way of predicting one thing (a child’s height) from another (its parents’) when the two things were fuzzily related. Galton went on to develop a measure of the strength of such fuzzy relationships, one that could be applied even when the things related were different in kind—like rainfall and crop yield. He called this more general technique “correlation.” 
The result was a major conceptual breakthrough. Until then, science had pretty much been limited to deterministic laws of cause and effect—which are hard to find in the biological world, where multiple causes often blend together in a messy way. Thanks to Galton, statistical laws gained respectability in science. 
His discovery of regression toward mediocrity—or regression to the mean, as it is now called—has resonated even more widely. Yet, as straightforward as it seems, the idea has been a snare even for the sophisticated. The common misconception is that it implies convergence over time. If very tall parents tend to have somewhat shorter children, and very short parents tend to have somewhat taller children, doesn’t that mean that eventually everyone should be the same height? No, because regression works backward as well as forward in time: very tall children tend to have somewhat shorter parents, and very short children tend to have somewhat taller parents. The key to understanding this seeming paradox is that regression to the mean arises when enduring factors (which might be called “skill”) mix causally with transient factors (which might be called “luck”). Take the case of sports, where regression to the mean is often mistaken for choking or slumping. Major-league baseball players who managed to bat better than .300 last season did so through a combination of skill and luck. Some of them are truly great players who had a so-so year, but the majority are merely good players who had a lucky year. There is no reason that the latter group should be equally lucky this year; that is why around eighty per cent of them will see their batting average decline. 
To mistake regression for a real force that causes talent or quality to dissipate over time, as so many have, is to commit what has been called “Galton’s fallacy.” In 1933, a Northwestern University professor named Horace Secrist produced a book-length example of the fallacy in “The Triumph of Mediocrity in Business,” in which he argued that, since highly profitable firms tend to become less profitable, and highly unprofitable ones tend to become less unprofitable, all firms will soon be mediocre. A few decades ago, the Israeli Air Force came to the conclusion that blame must be more effective than praise in motivating pilots, since poorly performing pilots who were criticized subsequently made better landings, whereas high performers who were praised made worse ones. (It is a sobering thought that we might generally tend to overrate censure and underrate praise because of the regression fallacy.) More recently, an editorialist for the Times erroneously argued that the regression effect alone would insure that racial differences in I.Q. would disappear over time. 
Did Galton himself commit Galton’s fallacy? Brookes insists that he did. “Galton completely misread his results on regression,” he argues, and wrongly believed that human heights tended “to become more average with each generation.” Even worse, Brookes claims, Galton’s muddleheadedness about regression led him to reject the Darwinian view of evolution, and to adopt a more extreme and unsavory version of eugenics. Suppose regression really did act as a sort of gravity, always pulling individuals back toward the average. Then it would seem to follow that evolution could not take place through a gradual series of small changes, as Darwin envisaged. It would require large, discontinuous changes that are somehow immune from regression to the mean.  
Such leaps, Galton thought, would result in the appearance of strikingly novel organisms, or “sports of nature,” that would shift the entire bell curve of ability. And if eugenics was to have any chance of success, it would have to work the same way as evolution. In other words, these sports of nature would have to be enlisted to create a new breed. Only then could regression be overcome and progress be made. 
In telling this story, Brookes makes his subject out to be more confused than he actually was. It took Galton nearly two decades to work out the subtleties of regression, an achievement that, according to Stephen M. Stigler, a statistician at the University of Chicago, “should rank with the greatest individual events in the history of science—at a level with William Harvey’s discovery of the circulation of blood and with Isaac Newton’s of the separation of light.” By 1889, when Galton published his most influential book, “Natural Inheritance,” his grasp of it was nearly complete. He knew that regression had nothing special to do with life or heredity. He knew that it was independent of the passage of time. Regression to the mean held even between brothers, he observed; exceptionally tall men tend to have brothers who are somewhat less tall. In fact, as Galton was able to show by a neat geometric argument, regression is a matter of pure mathematics, not an empirical force. Lest there be any doubt, he disguised the case of hereditary height as a problem in mechanics and sent it to a mathematician at Cambridge, who, to Galton’s delight, confirmed his finding.

Read more.

Keep in mind that Galton was born in 1822 and was in his later 60s by the time he published "Natural Inheritance." Indeed, the book "The Wisdom of Crowds" begins with an anecdote about a major conceptual breakthrough that Galton came up with when he was 85.
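
For readers who want to see the skill-and-luck account from Holt's review in action, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration: the names `simulate` and `select_above`, the sample size, the cutoff, and the choice to give skill and luck equal, normally distributed weight. It is not Galton's data or method. It only shows that when two measurements share a stable component plus independent noise, extreme values on either one go with less extreme values on the other, in both directions, while the overall spread stays put.

```python
import random
from statistics import mean, stdev


def simulate(n=20_000, seed=0):
    """Two noisy measurements of the same underlying 'skill' (hypothetical model)."""
    rng = random.Random(seed)
    skill = [rng.gauss(0, 1) for _ in range(n)]
    first = [s + rng.gauss(0, 1) for s in skill]   # skill + this round's luck
    second = [s + rng.gauss(0, 1) for s in skill]  # same skill, fresh luck
    return first, second


def select_above(chosen_on, other, cutoff):
    """Average both measurements over the cases where `chosen_on` beats the cutoff."""
    picked = [(a, b) for a, b in zip(chosen_on, other) if a > cutoff]
    return mean(a for a, _ in picked), mean(b for _, b in picked)


first, second = simulate()

# Forward: standouts in the first measurement, and how they do in the second.
hi_first, their_second = select_above(first, second, cutoff=2.0)

# Backward: standouts in the second measurement, and how they did in the first.
hi_second, their_first = select_above(second, first, cutoff=2.0)

print(f"forward:  {hi_first:.2f} -> {their_second:.2f}")
print(f"backward: {hi_second:.2f} -> {their_first:.2f}")
print(f"spread:   {stdev(first):.2f} vs {stdev(second):.2f}")
```

With equal weight on skill and luck the expected slope is one-half rather than Galton's two-thirds; shrinking the luck term relative to skill moves it closer to one. Either way the two spreads come out nearly identical, which is the sense in which regression does not imply convergence.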

52 comments:

Anonymous said...

It's difficult to hold in one's mind the notion that two opposed statements can be true at the same time: that the glass, for example, is both partly full and partly empty.

Steve, how are those statements opposed? I don't think they are. Arguably, the statement "the glass is both full and empty" (sans "partly" or "half") is more opposed but even that would depend on how one defines "full" and "empty."

Socially Extinct said...

Regression to the mean is God, or it can certainly be construed as such by the resourceful and fanatical theist.

Anonymous said...

http://www.spiegel.de/international/reversing-population-decline-germany-s-new-immigrant-influx-a-880038.html

Anonymous said...

http://www.spiegel.de/international/europe/nationalism-and-populism-decided-czech-presidential-election-a-880011.html

Anonymous said...

http://blogs.spectator.co.uk/fraser-nelson/2013/01/see-no-crime-hear-no-crime-and-speak-no-crime/

misty said...

Thanks for the link. Galton was a lot more fun than the stuffy Victorian gent I'd imagined him to be.

Though it could be argued he was a bit too adventurous to be bred.

Mingo the Merciless said...

Say it five times before breakfast tomorrow; more important, understand it as the center of a network of implication: “Avian equality is a contingent fact of history.” Equality is not given a priori; it is neither an ethical principle (though equal treatment may be) nor a statement about norms of social action. It just worked out that way. A hundred different and plausible scenarios for avian history would have yielded other results (and moral dilemmas of enormous magnitude). They didn’t happen. “Species” does not exist. We are all the same under the feathers. – Stephen Jay Guckoo, The Flamingo’s Smile: Reflections in Natural History, 1985.

Anonymous said...

"Steve, how are those statements opposed? I don't think they are."

More specifically, they are contrary statements. Not contradictory. Can contrary statements be considered opposing statements? No. The opposite of full is empty.

Anonymous said...

For the confused:

http://infoproc.blogspot.com/2008/10/regression-to-mean.html

http://infoproc.blogspot.com/2010/07/assortative-mating-regression-and-all.html

http://infoproc.blogspot.com/2009/11/mystery-of-nonshared-environment.html

Anonymous said...

Could someone recommend a textbook with all the equations and references to the experimental literature?
Robert Hume

dearieme said...

"I have always maintained that, excepting fools, men did not differ much in intellect, only in zeal and hard work.” Even Darwins have their blind spots.

P.S. What a pleasure it is to see 'mediocrity' used correctly, rather than as a euphemistic cliché.

Georgia Resident said...

The article asserts that Galtonian Eugenics were based on bad science, but where exactly did they go wrong? Like a great many other heritable traits, like height, intelligence is distributed along a bell curve in populations. Additionally, twin studies in the 20th century, and recent genetic studies, have established a heritability of at least 50% for IQ. It would seem that Galton has been largely vindicated. By contrast, "designer babies", exactingly modified for higher intelligence, good looks, and vigorous health, are still largely a pipe dream.

candid_observer said...

Over the years, I've come to realize that there is no definitive text on regression to the mean as it applies to traits such as IQ. Everything I have ever read is either highly incomplete or confused, and usually both.

Galton struggled for decades on the subject; we still seem to struggle. My guess is that it requires both mathematical insight and skill and an almost philosophical insight and skill, and those are in rare combination.

ben tillman said...

Galton made an honest effort to justify his belief in nature over nurture with hard evidence.

Or, more charitably, the reviewer might have said that Galton stated the facts that led him to his conclusion.

Pincher Martin said...

What is the best biography of Francis Galton?

Eric Rasmusen said...

Does he have the record for oldest age at which someone has made a major mathematical discovery?

Luke Lea said...

I liked this line near the end of the review:

"If its technologies are used to shape the genetic endowment of children according to the desires—and financial means—of their parents, the outcome could be a “GenRich” class of people who are smarter, healthier, and handsomer than the underclass of “Naturals.”"

Isn't that the kind of "breeding" the upper and upper-middle classes of Britain practiced for centuries? The aristocracy certainly thought of itself as -- and perhaps was -- "smarter, healthier, and handsomer" than the commoners.

Luke Lea said...

My brain is old and creaky, but isn't the reason regression to the mean doesn't lead to everyone being dead average is because sheer chance produces individuals who are far from the mean? Perhaps the author says that but I didn't see it.

Anonymous said...

The article asserts that Galtonian Eugenics were based on bad science, but where exactly did they go wrong? Like a great many other heritable traits, like height, intelligence is distributed along a bell curve in populations.


I'm really starting to despair of the reading ability of iSteve commenters. As the article makes clear, "Galtonian Eugenics" were an early notion of Galton which he himself largely refuted in his later work on regression.

Anonymous said...

The aristocracy certainly thought of itself as -- and perhaps was -- "smarter, healthier, and handsomer" than the commoners.


No, it was not. Francis Galton was not a member of the aristocracy. Neither were Charles Darwin, Adam Smith, Thomas Newcomen, James Watt, or the overwhelming majority of other prominent scientific men in 18th and 19th century Britain.

jody said...

i'm thinking of a few brothers in sports where one couple produced more than one son of world class ability.

klitschkos
gasols
mannings
barbers
harbaughs
molinas
staals
lopezs
borlees

although the borlee brothers and lopez brothers are twins.

then there are many more examples of brothers where only one of them made it to the highest level, like brian urlacher, blake griffin, michael vick. there are thousands of those guys.

sports ability to the level where you actually know a guy's name is very rare. that level of talent is much more rare and constrained than run of the mill very high intelligence. this kind of stuff easily falls under regression to the mean, because name recognition level sports ability is many, many SDs out to the right. it is almost never passed on.

jody said...

"Over the years, I've come to realize that there is no definitive text on regression to the mean as it applies to traits such as IQ. Everything I have ever read is either highly incomplete or confused, and usually both."

indeed.

without additional mathematical principles fleshing out and expanding on the ideas here, the math, as presented, does absolutely, positively posit that all the very smart people will decline into relatively average people after a few generations. at every IQ SD step, the math says, you lose IQ points when you reproduce.

the math herein provides no mechanism by which new very smart people and even geniuses may be produced at the volumes we observe. quite the contrary. it says not only that all smart people will have few children of intelligence even equal to themselves, but actually, most of their kids will be dumber, no matter how hard they try to find a smart mate and make smart children.

it's hopeless, says the math. most of your kids will be 1 SD dumber or a little more, so just deal with it. it says this at every SD step of the way down, not just for geniuses up at IQ 160.

jody said...

"My brain is old and creaky, but isn't the reason regression to the mean doesn't lead to everyone being dead average is because sheer chance produces individuals who are far from the mean? Perhaps the author says that but I didn't see it."

precisely. so where are all the smart people coming from? until there is math which describes how that happens, this regression to the mean math does not accurately predict what we observe.

in fact, without adequate mathematical principles to explain how very smart people or geniuses come into existence, we're left to rely on lottery math. that is to say, you only get new IQ 160 geniuses if you have millions and millions of average people having sex and banging out kids. then a couple of them will produce geniuses by a genetic roll of the dice. heck, the math even says you can't reliably reproduce IQ 145 people. or even IQ 130 people.

but what lottery math predicts is that the greater the number of people pumping out kids in your population, the more lottery winner children will be produced. so if it's the year 1900 and the population of your nation is 20 million people, your nation might only produce a few geniuses by random chance. but if it's the year 2000 and the population of your nation has swelled to 100 million people, all those average people out there in all the towns and villages are humpin' away and making 5 times as many geniuses.

yet we NEVER observe this. we never observe a linear increase in the number of people in the very smart to genius range, derived strictly from population growth, ESPECIALLY in the third world. mexico produced no geniuses in 1900 and by 2000 it still had not produced any geniuses despite its population exploding from about 15 million in 1900 to about 100 million in 2000.

jody said...

in a sense, you end up with a subatomic particle collider hypothesis. where it's simply a matter of the number of sheer, er, genital collisions, if you will, akin to particle collisions in a physics experiment. millions of collisions are mundane events producing nothing interesting. it's only when increasing the volume up to tens of millions or a hundred million collisions, and picking out 3 or 4 unusual events, where things get interesting.

well, several nations are undergoing population explosion, but producing no geniuses. thailand, egypt, philippines, indonesia, pakistan, turkey. with 80 million people, it shouldn't matter much that their mean IQ is 85 or 90 or whatever it is. with THAT many citizens, these nations should be randomly producing a few people in the IQ 160 range and certainly numerous people in the IQ 145 range. but they aren't.

i even eliminated african nations for obvious reasons, but if the "lottery event" math were accurate, you would expect to see many people in the IQ 130 to 145 range in nigeria, ethiopia, south africa, tanzania as well. but we don't.

instead, we observe what common sense tells us. dumb people make more dumb people and very, very rarely produce a genius from out of the blue. smart people mostly make other smart people, and the size of the talent pool of the very smart people in your nation is what constrains production volumes of the next generation of very smart people. this is exactly what smart fraction hypothesis predicts.

despite exploding populations and hundreds of millions of citizens, third world nations are not supercolliders for producing people who are in the IQ 145 to 160 range.

Ex Submarine Officer said...

I find it surprising that the average chest size of a bunch of Scottish soldiers in the early 19th century was 40 inches.

During WWII, the average was 36 or so for U.S. soldiers, I believe.

David said...

Holt's review is definitely excellent. Thanks for posting it.

Anonymous said...

Regression to the mean does not mean that tall parents will have shorter children or that smart parents will have less smart children. It means that the expected height of the children of taller than average parents - in the absence of supplementary information - will be less than the height of their parents, and greater than the population mean.

However, we generally have supplementary information, such as the heights of ancestors and other genetically related individuals. Given this information, we may conclude that both parents are beneath their "family mean" height, and that their children would have expected heights greater than their parents. The past is the best predictor of the present.

Regression to the mean does not preclude eugenics or other selection programs. However, as has been mentioned, the reproduction process includes a great deal of "luck", i.e. unexplained variance or "noise". As the development of the line proceeds, this noise must be removed through eliminating undesirable individuals from the breeding pool. This action is objectionable on numerous moral fronts, with high risk of blowback and "unforeseen consequences".

Finally, why do you want to select for IQ? A few very smart people go a long way in a semi-functional society. As it is we have to quarantine the super-smart at places like Cal Tech. Be careful what you wish for.

Neil Templeton

Jeff said...

Most of the world seems to have a strange fascination with LA's dark side. I think that goes a long way towards explaining Chandler's success.

Anonymous said...

if thousands of 160 folks have 1/44 chance to produce a similar IQed kid, the odds are still much better than the normal population where a 160IQ is a 1 in 10000 4sigma score.
The millions of above average but not brilliant folks have much lower probability of producing einsteins, but their numbers make sure they make up the remaining 43/44. If they are reproducing of course.


btw interesting start to the article:

"In the eighteen-eighties, residents of cities across Britain might have noticed an aged, bald, bewhiskered gentleman sedulously eying every girl he passed on the street while manipulating something in his pocket. What they were seeing was not lechery in action but science. Concealed in the man’s pocket was a device he called a “pricker,” which consisted of a needle mounted on a thimble and a cross-shaped piece of paper. By pricking holes in different parts of the paper, he could surreptitiously record his rating of a female passerby’s appearance, on a scale ranging from attractive to repellent. After many months of wielding his pricker and tallying the results, he drew a “beauty map” of the British Isles. London proved the epicenter of beauty, Aberdeen of its opposite."

He would have got kanazawa'ed today.

Pincher Martin said...

Jody,

"precisely. so where are all the smart people coming from? until there is math which describes how that happens, this regression to the mean math does not accurately predict what we observe."

If you read what other people posted here, you would know that regression to the mean does roughly predict what we observe.

"yet we NEVER observe this. we never observe a linear increase in the number of people in the very smart to genius range, derived strictly from population growth, ESPECIALLY in the third world. mexico produced no geniuses in 1900 and by 2000 it still had not produced any geniuses despite its population exploding from about 15 million in 1900 to about 100 million in 2000."

Bringing up Mexico is an entirely different issue.

First, regression to the mean describes what takes place within a breeding population. Different breeding populations *may* have different means to regress to. There's no evidence that most Mexicans and most Americans, for example, revert to the same IQ mean.

If that's the case, then you need to be careful about what mean you're measuring and what you're trying to deduce from it. Suppose the number of pygmies has quadrupled over the last half century. Does that mean the number of pygmies tall enough to play NBA basketball should also quadruple?

Second, since you were not around to observe the number of geniuses in Mexico in the early 1900s, and I'm guessing you aren't well versed in Mexican high culture at the beginning of the twentieth century, how do you know that country hasn't produced a proportionate number of geniuses in relation to its growing population over the last hundred years?

I'm guessing that you don't even know contemporary Mexico well enough to appreciate how many geniuses have recently been born within its borders. I certainly don't. If you do know, I'd like to know how.

Third, you seem to assume that a person who tests as a genius (however defined) will inevitably become famous. But however you measure genius - whether it's eligibility for membership in one of the prestigious high IQ societies or some accomplishment that merits inclusion - most high IQ men and women either accomplish very little of note or they remain anonymous to all but a small number of peers in their fields. Some high IQ people's greatest accomplishment is their test scores.

That kind of puts a crimp in your seat-of-the-pants analysis about failing to observe many geniuses in Mexico.

Anonymous said...

The best kind of eugenics for a pleasant society is negative: weed out the idiots and destructive people via sterilization.

Anonymous said...

we never observe a linear increase in the number of people in the very smart to genius range, derived strictly from population growth, ESPECIALLY in the third world


Damn, son, it's like you're being deliberately stupid. Alas, I suspect you're just naturally stupid.

Yes, we would expect that - all else being equal - the number of people who are at "genius" level of intelligence would increase with the size of the overall population. And yes, this does happen. It even happens in the Third World! Why you think it doesn't happen is just one more of the mysteries of your thought processes, along with your curious notion that you're a brave free-thinker for not using capitalization.

Anonymous said...

without additional mathematical principles fleshing out and expanding on the ideas here, the math, as presented, does absolutely, positively posit that all the very smart people will decline into relatively average people after a few generations. at every IQ SD step, the math says, you lose IQ points when you reproduce.


There is no "math, as presented".

For the umpteenth time, the principle of regression to the mean does not say what you think it says.

It certainly does not say that "all the very smart people will decline into relatively average people after a few generations". I'm reasonably certain that after a few generations all the reasonably smart people of today will be, you know, dead.

Regression to the mean does tell us that the descendants of today's very smart people will be less intelligent on average than today's very smart people. But - and here is the point which continually eludes you - that has no impact on the size of the right side of the overall population's IQ curve, because other high IQ people are always being born to parents of lesser IQ.

Their descendants in turn will regress to the mean.

JeremiahJohnbalaya said...

No one here seems to understand that the use of the phrase "regression to the mean" here is precisely equivalent to the definition of a time-invariant random variable. The use of families or the subsequent test takers in the wiki link are irrelevant in that subsequent measurements are, by definition, independent of previous ones. That isn't true of red noise of course, which doesn't regress to a mean.

I mean come on. Think about how stupid that wikipedia example is. The top 10% of random guessers are chosen to ... randomly guess again???

Of course in the real world, test takers don't guess. And genes are passed along. Obviously the probability distribution of a child's IQ is a function of his parents (and their parents and their parents).

Anonymous said...

@jody
"without additional mathematical principles fleshing out and expanding on the ideas here, the math, as presented, does absolutely, positively posit that all the very smart people will decline into relatively average people after a few generations. at every IQ SD step, the math says, you lose IQ points when you reproduce."


Here is some math to explain what is going on. I will just simplify things by assuming there are three groups of people, the geniuses, the normals, and the stupids. How it generalizes to the case of continuous variation will hopefully be clear.

Suppose there are 10,000 normal people, 100 stupids and 100 geniuses in a population. Regression towards the mean tells us most of the children of the geniuses will be normal. Assume 90% of their children will be normal. Also, regression towards the mean will say 90% of the kids of the stupids will be normal.

There is also natural variation, so not all the kids of the normals will be normal. In this example, let's say 0.1% of the kids of the normals will be geniuses and 0.1% will be stupid. (The reason for the choice of number will be clear soon)

Also, assume the population is just replacing itself, with no difference in fertility. Then the number of geniuses in the next generation will be:
90%*100 +0.1%*10,000 = 90 + 10 =100
Which is exactly the same as the first generation. You can also see that the number of normals and stupids is the same in the next generation.

The reason this happens is because there will always be more people closer to the average, so even if a small fraction of their kids move far from the average, this will be enough to keep their numbers stable.

The situation would stabilize eventually no matter what numbers I picked. To see how this happens, see the wikipedia article on the eigenvalue problem.

Anonymous said...

Sorry about that, I wrote the example wrong. You need 0.9% of the normals to have genius kids, and 0.9% stupid kids.

Then the number of geniuses in the next generation will be
10%*100 + 0.9%*10,000 = 10 + 90 =100
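
The corrected three-group example can be checked with a short Python sketch. The transition shares below are the comment's illustrative numbers (10% of geniuses' and stupids' children stay in their parents' group, 0.9% of normals' children land in each extreme); the names `TRANSITION`, `next_generation` and the `lopsided` starting population are hypothetical, introduced only for this illustration.

```python
# Rows: parent group. Columns: share of children landing in each group.
# These shares are the commenter's illustrative numbers, not empirical estimates.
TRANSITION = {
    "genius": {"genius": 0.10, "normal": 0.90, "stupid": 0.00},
    "normal": {"genius": 0.009, "normal": 0.982, "stupid": 0.009},
    "stupid": {"genius": 0.00, "normal": 0.90, "stupid": 0.10},
}


def next_generation(counts):
    """Expected group sizes one generation later, assuming replacement fertility."""
    children = {group: 0.0 for group in counts}
    for parent_group, n_parents in counts.items():
        for child_group, share in TRANSITION[parent_group].items():
            children[child_group] += n_parents * share
    return children


# The population from the comment: it reproduces itself exactly.
start = {"genius": 100, "normal": 10_000, "stupid": 100}
after_one = next_generation(start)

# An assumed starting point with no geniuses or stupids at all:
# repeated application drifts toward the same stable mix.
lopsided = {"genius": 0, "normal": 10_200, "stupid": 0}
for _ in range(20):
    lopsided = next_generation(lopsided)

print(after_one)
print(lopsided)
```

Under these assumed shares, the 100 / 10,000 / 100 split is a fixed point, and the all-normal population converges to that same split, which is what the eigenvalue remark in the earlier comment is pointing at.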

DoJ said...

Suppose there are 10,000 normal people, 100 stupids and 100 geniuses in a population. Regression towards the mean tells us most of the children of the geniuses will be normal. Assume 90% of their children will be normal. Also, regression towards the mean will say 90% of the kids of the stupids will be normal.

There is also natural variation, so not all the kids of the normals will be normal. In this example, let's say 0.1% of the kids of the normals will be geniuses and 0.1% will be stupid. (The reason for the choice of number will be clear soon)

Also, assume the population is just replacing itself, with no difference in fertility. Then the number of geniuses in the next generation will be:
90%*100 +0.1%*10,000 = 90 + 10 =100


You meant "0.9%", and "10%*100 +0.9%*10,000 = 10 + 90 = 100".

Holt's description is pretty good, but I know that I had to spend a bit of time figuratively putting pencil to paper a few years ago to properly understand this concept, so I expect most other people will have to actually reproduce a bit of math as well. I'll provide a few more guideposts here.

(The following assumes that IQ has mean 100 and standard deviation 15, and everything can be accurately described by simple Gaussian distributions. This is not actually true, but once you master the basics you have a much better shot at effectively modeling reality's messier bits.)

First, consider the two extreme cases:
1. Zero additive heritability among humans. Then everyone can be said to have a "genetic IQ" of 100 by virtue of being human, and all observed IQ variation among living people has nothing to do with "genetic IQ".

2. Near-100% additive heritability. In this case, the "genetic IQ" distribution is essentially synonymous with the actual IQ distribution, and everyone's IQ is very close to the average of their parents'.

The reality is, of course, in between. One less obvious, but crucial, consequence of this is that the distribution of "genetic IQ" is SIGNIFICANTLY NARROWER than the distribution of actual IQs. For example, if additive heritability is 64%, the "genetic IQ" distribution has a standard deviation of 12 (i.e. sqrt(0.64) * 15), rather than 15: a person with a "genetic IQ" of 148 is as rare as a person with an actual IQ of 160. (This is why almost everyone with 160 IQ has to be both lucky AND have good genetics.) In this case, adding standard-deviation-12 "genetic IQ" to standard-deviation-9 "luck" produces standard-deviation-15 actual IQ.

If you know nothing about a 160 IQ person's ancestry, your best estimate of their "genetic IQ" (assuming 64% additive heritability) is 138.4; you'd expect them to benefit from a hefty 21.6 luck points. (Recall that, if additive heritability was zero, they'd be benefiting from 60 luck points; so it makes sense for this number to scale linearly with heritability.) So if two such people marry, you'd expect their kids to cluster around 138.4. But if these kids marry other "regressed kids of 160 IQ parents", there will be no further regression to the mean. These folks are clustered around 138.4 "genetic IQ" and zero bonus points from luck, so they will tend to breed true.

Similarly, if you know 160 IQ person A's parents also averaged 160 IQ, your estimate of person A's "genetic IQ" should be significantly higher than 138.4. (How much higher is a question for Bayes' theorem and a computer.)
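
The arithmetic in the 64% heritability example can be reproduced in a few lines of Python. This is only a sketch under the same simplifying assumptions stated above (mean 100, standard deviation 15, independent Gaussian "genetic" and "luck" components); the function names are hypothetical.

```python
import math

MEAN, SD = 100.0, 15.0  # assumed IQ scale from the comment


def genetic_sd(h2):
    """Standard deviation of the "genetic IQ" component for additive heritability h2."""
    return math.sqrt(h2) * SD


def luck_sd(h2):
    """Standard deviation of the remaining "luck" component."""
    return math.sqrt(1.0 - h2) * SD


def expected_genetic_iq(observed_iq, h2):
    """Best estimate of "genetic IQ" given only the observed IQ."""
    return MEAN + h2 * (observed_iq - MEAN)


h2 = 0.64
g_sd = genetic_sd(h2)                       # about 12
l_sd = luck_sd(h2)                          # about 9
total_sd = math.hypot(g_sd, l_sd)           # about 15: the two components recombine
estimate = expected_genetic_iq(160.0, h2)   # about 138.4
luck_points = 160.0 - estimate              # about 21.6

print(g_sd, l_sd, total_sd, estimate, luck_points)
```

Setting `h2` to 0 gives an estimate of 100 and 60 luck points, matching the zero-heritability extreme described above.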

Depressingly, this logic justifies very un-American attitudes toward class and race, at least when it comes to marriage. Instead of trying to shut it down, progressives should really be trying to push human genetics research forward as quickly as they possibly can, because that's the only real way to render the old attitudes obsolete.

Anonymous said...

"The best kind of eugenics for a pleasant society is negative: weed out the idiots and destructive people via sterilization."

I don't think you can have a pleasant society where people are forcibly sterilized when they haven't done anything.

A strict (but sensibly designed) criminal justice policy would allow a person's own actions to decide their reproductive potential.

(Necessarily combined with a strict but sensible immigration policy.)

.
"Isn't that the kind of "breeding" the upper and upper-middle classes of Britain practiced for centuries? The aristocracy certainly thought of itself as -- and perhaps was -- "smarter, healthier, and handsomer" than the commoners."

Not really. They were standard arranged marriages based on wealth and status so with almost no direct selective element at all except when a rich man married a non-rich beauty.

The real selective mating in England was among the middle strata that eventually produced the Franklins and the Darwins.

.
"My brain is old and creaky, but isn't the reason regression to the mean doesn't lead to everyone being dead average is because sheer chance produces individuals who are far from the mean?"

I think it implies that it's not just the genes themselves but also whether they get expressed or not? So a family might pass down the same genes but which ones are set to on or off has a random element?

.
"I find it surprising that the average chest size of a bunch of Scottish soldiers in the early 19th century was 40 inches."

Mountain lungs?

.
"Finally, why do you want to select for IQ?"

Agree. Select for IQ at the bottom and health at the top.

.
"The millions of above average but not brilliant folks have much lower probability of producing einsteins, but their numbers make sure they make up the remaining 43/44. If they are reproducing of course."

Yep. Increase the mass in the middle and they'll produce all the outliers you'll need.

Anonymous said...

"I think it implies that it's not just the genes themselves but also whether they get expressed or not? So a family might pass down the same genes but which ones are set to on or off has a random element?"

Doh.

Or as someone mentioned on another thread - recessives.

dumb of me not to think of that.

Anonymous said...

"where a 160IQ is a 1 in 10000 4sigma score."

a 1 in 30,000 score apparently.
1 in 10000 corresponds to roughly 3.72sigma.
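
These tail figures can be checked with Python's standard-library `statistics.NormalDist`, assuming IQ is exactly normally distributed (an idealization). The helper names `one_in` and `sigma_for_one_in` are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, sd 1


def one_in(z):
    """Rarity (1 in N) of scoring at least z standard deviations above the mean."""
    return 1.0 / (1.0 - std_normal.cdf(z))


def sigma_for_one_in(n):
    """How many standard deviations above the mean a 1-in-n score sits."""
    return std_normal.inv_cdf(1.0 - 1.0 / n)


four_sigma_rarity = one_in(4.0)             # roughly 1 in 31,600
ten_thousand_z = sigma_for_one_in(10_000)   # roughly 3.72

print(round(four_sigma_rarity), round(ten_thousand_z, 2))
```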

Anonymous said...

Eugenics does not have to be forcible. Voluntary programs in India caused many thousands of men to get vasectomies for a transistor radio, for instance. Low IQ kids could be offered various incentives.

Sterilization could also be made a condition of parole for certain offenders.

Anonymous said...

"Low IQ kids could be offered various incentives."

That's the point though - tricking people into trading something priceless for something of only nominal value. Practically speaking you may be right but I don't think that kind of society is likely to be pleasant.

Now tricking low IQ people into sterilization *after* one or two kids is something else.

NOTA said...

Regression to the mean is a mathematical phenomenon, not a genetic one. That's why it works on IQ scores, test scores, racing times, performance flying, etc. It's just telling you how correlated numbers work. The underlying assumption is that the two variables (say, fathers' height and sons' height) each fall on a normal distribution, and there is a linear relationship between a father's height and the mean of his sons' height. Regression to the mean just says that if my dad is +1 sigma, the mean of his sons' distribution is less than +1 sigma but more than the sons' average.

Suppose my father is exactly of average height. Then your knowledge about my height is a bell curve centered on the average of the sons' height distribution. That is, if you needed to make a bet about my height, you could do no better than to use the probability tables from a normal distribution. If someone wants to know how likely it is that I will be 6'5", if you know the mean and standard deviation of the sons, you can calculate that probability.

Now, suppose you move my father to +1 sigma--one standard deviation above the fathers' mean. What do you know about my height? It's also a bell curve, but the center of it--the mean--is moved a little to the right. All regression to the mean says is that you move the center of my bell curve, but by less than one sigma to the right. That's it.

So, if my dad is 6'3", you should bet on me being taller than the average, but not as tall as he is. Sometimes, you will lose that bet, but it's the best one you can make. And this all has to add up so that we see the distributions of fathers and sons that we already know.

Imagine that we separated all the fathers into people above and below the average, and gave each father exactly one son. If there are any tall sons of short fathers, then there *must* be some short sons of tall fathers, just to make the numbers come out right. The fact that short men can have tall sons means that tall men must be able to have short sons. That's the key insight into regression to the mean--upward mobility (sons who are taller or smarter than their dads[1]) requires downward mobility (sons who are shorter or dumber than their dads).

[1] This has to be true in terms of rankings among fathers and sons. If everyone is getting smarter or taller thanks to nutrition or something, then most sons will be taller than their fathers, but some shorter than average men will have taller than average sons, and some smarter than average dads will have dumber than average sons, and so on.
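
The father/son picture in this comment can be illustrated with a small simulation. This is a sketch, not a model of real height data: heights are in standard-deviation units, and the father-son correlation `r = 0.5` is an assumed value chosen only for illustration. The function name `simulate_pairs` is hypothetical.

```python
import math
import random
import statistics


def simulate_pairs(n, r, seed=0):
    """Draw n (father, son) pairs, each standard normal, with correlation r."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        father = rng.gauss(0.0, 1.0)
        # Son = shared part (r * father) plus independent noise scaled so
        # that sons also end up with standard deviation 1.
        son = r * father + math.sqrt(1.0 - r * r) * rng.gauss(0.0, 1.0)
        pairs.append((father, son))
    return pairs


r = 0.5  # assumed correlation, for illustration only
pairs = simulate_pairs(100_000, r)

# Sons of fathers who are roughly +1 sigma.
sons_of_tall = [son for father, son in pairs if 0.9 < father < 1.1]
mean_son_of_tall = statistics.fmean(sons_of_tall)

# Spread of all sons: it does not shrink from one generation to the next.
sons_sd = statistics.pstdev([son for _, son in pairs])

# Upward and downward mobility both occur.
tall_sons_of_short = sum(1 for f, s in pairs if f < 0 and s > 0)
short_sons_of_tall = sum(1 for f, s in pairs if f > 0 and s < 0)

print(mean_son_of_tall, sons_sd, tall_sons_of_short, short_sons_of_tall)
```

With these assumptions, the sons of +1 sigma fathers average about half a sigma above the mean, while the sons as a whole remain as spread out as the fathers, and the two "mobility" counts come out roughly equal.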

NOTA said...

Re eugenics:

The dysgenic pattern of fertility we see now (smart women have fewer kids, later, than dumb women) is the result of people responding to incentives:

a. Smart ambitious people mostly don't want to have kids till they're done with school and somewhat established in the world. It takes longer to do that if you become a doctor than if you become a waitress.

b. Smarter people and people from better backgrounds tend to be better at making life choices like "should I be careful about not getting pregnant?" and "should I shack up with this exciting badass lowlife?"

c. Smarter people are better at accomplishing what they set out to do, including using their birth control properly to avoid pregnancy.

d. There are a lot of built-in incentives for or against having kids that are quite different at the top and bottom. Extra kids at the bottom may mean more government assistance (though not enough to cover the costs--single mothers of multiple kids are, as a group, dirt poor and struggling all the time), where at middle income and above they mean more expenses (private school or expensive house in a good school district, college, tutoring, braces, swimming lessons, summer camp) without more help.

Some of the incentives can be reversed without any more coercion than is already in place. How about agreeing to a rise in the top tax rate if it is exactly offset in revenue terms by an increase in the per-child tax credit? This would benefit Republican voters, but not donors or think-tank owners, so the GOP probably wouldn't support it, but it would have a positive eugenic effect, as well as helping middle class families. How about simply making long-term easily-used birth control easy to get for everyone including the poor, with public service announcements pushing it? Anything that makes it easier for middle class people to get good schools at lower cost is similarly a win, as it decreases the cost of raising kids to middle-class standards. Fixing the student loan/tuition bubble so we stop having the marginal kids with no degree and big debts, and the medium-smart kids with a degree and a pile of debt, would also help.

I think Steve's term for this is affordable family formation. Making it more affordable to have kids at middle class payscales and standards would be mildly eugenic.

NOTA said...

Is there data that follows IQs from three generations?

If you know my parents were smart, you should predict that I'm smart. If you know my grandparents were smart, too, that should increase your estimate of my intelligence. Similarly, if you know my siblings are smart, that should increase your prediction about my intelligence.

If someone has built up a regression model of this kind, it would be interesting to see, so we could put some numbers on this. Maybe we could get a reasonable first cut here by some kind of weighted average of ancestors? There must be a standard way of doing this....

Steve Sailer said...

"Regression to the mean is a mathematical phenomenon, not a genetic one."

Yes ... but ... sexual reproduction sees more regression toward the mean than asexual reproduction (cloning). That's why farmers want to clone their top producing specimens of livestock.

Anonymous said...

If you know my parents were smart, you should predict that I'm smart.


No, I should not. There should be a degree of probability that you (and your siblings) will be smart. That probability is not going to be terribly high. It will be far less than a certainty or near certainty. Anyone who says "I predict that the offspring of these two smart parents will be smart" is simply revealing their ignorance of genetics.

Anonymous said...

"Regression to the mean is a mathematical phenomenon, not a genetic one."


It's both a mathematical and (within the context of this discussion of genetically determined traits) a genetic phenomenon. Mathematics is simply a language we use to describe various phenomena. Don't confuse that with mathematics being the phenomenon in question.

BK201 said...

Maybe Steve needs to do a "genetics 101" column, because it strikes me that a lot of his readers are surprisingly hazy on basic high school biology level genetics.

In particular the chapter on Recessive Genes needs attention.

People are "carriers" for a vast amount of genetic information, only a fraction of which is expressed in each of us as individuals. With respect to intelligence, there are large numbers of people who are not of unusually great (or unusually weak) intelligence themselves but who carry within them the genetic potential for great(er) intelligence. And/or for weaker intelligence.

By the same token people of great intelligence - and of lesser intelligence - carry the genetic blueprints for making children of more "normal" intelligence. The operative word there is "carry". We are "carriers" for lots of traits which we do not manifest ourselves. A couple's potential offspring could in theory exhibit any of a wide variety of states - different hair colors, different eye colors, different heights.

Assume the existence of a normal Euro-American couple of somewhat above average attributes. He's 6'1", she's 5'7". They're both IQ 115. Both are slightly more attractive than average. Now assume that they have a lot of children - by which I mean 10,000 or so children. (We can do this in Theoretical Land)

If we carefully study those 10,000 children we'd find that they display a lot of variety. Some will be much taller than Mom and Dad, others much smaller. Some will be much more intelligent than Mom and Dad, some much less so. But the average should be - excluding all environmental factors - pretty close to their parents.

Pretty close, but somewhat less than. Less than, because their parents (being normal Euro-Americans) have normal Euro-American DNA. In the case of the parents that normal DNA happened to throw up slightly abnormal results (just as their own DNA as manifested in their 10,000 kids threw up some deviations from the norm) but their DNA is still such that its common or normal expression is of 5'10" men, 5'4" women, and people of IQ 100. So their children, on average, will be closer to that norm than their parents.

We all have "active genes" and "latent genes". We all have an "active genetic IQ" and a "latent genetic IQ". Your active genes may give you a genetic IQ of 125, but your latent genes are what govern the probable IQ of your offspring, and they will point to a number less than 125.

NOTA said...

Steve: Isn't that just another way of saying clones' and identical twins' traits are more strongly correlated than siblings' or parents'/children's? Twin studies tell us that much already, which is why cloning Albert Einstein will likely produce a very smart kid, but probably won't give you someone who will revolutionize physics the way Einstein did.

Anon: Regression to the mean is seen in all kinds of correlated measurements, whether genes are involved or not. So explaining it in terms of genes is interesting, but mainly because it explains why inheritance of intelligence or height more or less follows this nice mathematical model. There isn't some explanation involving recessive genes that explains why the kids who did best on the first calculus exam will usually do a little worse on the second one, or why the mutual funds that get the best returns this year will usually not do as well next year.

Anonymous said...

Regression to the mean is seen in all kinds of correlated measurements, whether genes are involved or not. So explaining it in terms of genes is interesting, but mainly because it explains why inheritance of intelligence or height more or less follows this nice mathematical model...


You've got it backwards. Regression to the mean is explained in terms of mathematics. But it is not mathematics. Gravitational attraction is explained using mathematics. But hopefully you would not say that gravitational attraction is an interesting something which is used to explain mathematical principles?

We construct nice mathematical models and theories to help explain observed phenomena. (As Galton did with regression to the mean) The phenomena do not "follow" the mathematical models, not in any sense of the word "follow".
