May 2, 2014

White House: Big Data causes big disparate impact

The brighter folks within the Obama Administration are starting to figure out what I've been saying for some time: that a lot of the hype over Big Data and apps and the like is businesses trying to get around traditional regulations, including regulations against discrimination. For example, one of the costs that government imposes on licensed taxi drivers is that they are supposed to drive customers wherever they want. But with the new ride sharing apps, drivers can just look up possible gigs offered on their smartphones and say, "Florence and Normandie? Let me Google that ... uh-oh. Re-ject-ed! Ventura and Laurel Canyon? Accepted!"

Back in 1982, my Advanced Marketing Models professor in B-School got to talking about predictive systems used by lenders, insurance companies, and the like. Somebody asked if they really work to identify bad risks. Oh, sure, they really worked, he replied. The problem is that the use of truly powerful predictive factors, like race, has been outlawed and the government is leery of the use of approximate factors like zip code. So they don't work as well as they did a few years ago. This is a quiet way for the white majority to subsidize the black and brown minority in terms of mortgage defaults, insurance rates, etc.

But Big Data to the rescue of Big Business! You don't actually need to know that somebody is black if you know she drinks grape soda, smokes Kools, loves Tyler Perry, etc. All that stuff correlates with being a worse risk (and with being black). John Podesta has just released a White House report on this Menace. From the NYT:
Call for Limits on Web Data of Customers 
By DAVID E. SANGER and STEVE LOHR  MAY 1, 2014 
But the most significant findings in the report focus on the recognition that data can be used in subtle ways to create forms of discrimination — and to make judgments, sometimes in error, about who is likely to show up at work, pay their mortgage on time or require expensive treatment. The report states that the same technology that is often so useful in predicting places that would be struck by floods or diagnosing hard-to-find illnesses in infants also has “the potential to eclipse longstanding civil rights protections in how personal information is used in housing, credit, employment, health, education and the marketplace.” 
The report focuses particularly on “learning algorithms” that are frequently used to determine what kind of online ad to display on someone’s computer screen, or to predict their buying habits when searching for a car or in making travel plans. Those same algorithms can create a digital picture of person, Mr. Podesta noted, that can infer race, gender or sexual orientation, even if that is not the intent of the software. 
“The final computer-generated product or decision — used for everything from predicting behavior to denying opportunity — can mask prejudices while maintaining a patina of scientific objectivity,” the report concludes. 
Mr. Podesta said the concern — he suggested the federal government might have to update laws — was that those software judgments could affect access to bank loans or job offers. They “may seem like neutral factors,” he said, “but they aren’t so neutral” when put together. The potential problem, he added, is that “you are exacerbating inequality rather than opening up opportunity.”
   
Stop noticing!

41 comments:

Anonymous said...

"You don't actually need to know that somebody is black if you know she drinks grape soda, smokes Kools, loves Tyler Perry, etc."

Grape soda? People actually drink grape soda, (and HiC, Hawaiian Punch, etc) over the age of nine?

Why?

Oh, had to finish the sentence, 'to know that somebody is' ....wow.

Now if it had read that they continue to drink the Kool-Aid, that would've been a different matter entirely.

Anonymous said...

The imminent ban on racist learning algorithms will really throw a monkey wrench in the quest for general AI.

And I was going to quip in another thread that Sterling's real crime was not finding a sufficiently indirect way of expressing his prejudices. "Stop hanging around Tyler Perry fans" gets the point across while allowing him enough room to wriggle or recriminate.

Anonymous said...

I'm white and I like grape soda. Therefore, all statistics are invalid!

Anononymous said...

North Korea releases list of U.S. ‘human rights abuses’: ‘The U.S. is a living hell’
"Under the citizenship act, racialism is getting more severe in the U.S. The gaps between the minorities and the whites are very wide in the exercise of such rights to work and elect."

Robert in Arabia said...

http://www.youtube.com/watch?v=GhxqIITtTtU IQ Test

Butter said...

"The potential problem, he added, is that “you are exacerbating inequality rather than opening up opportunity.”"

No, the problem, Mr. Podesta, is black people.

Al said...

One strategy for evaluating disparate impact would be to compute the total derivative w.r.t. an indicator variable for race.

http://en.wikipedia.org/wiki/Total_derivative

Essentially you're asking how much the prediction will change on average if we change the indicator variable for race and change all other variables associated with that indicator.
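
A rough sketch of how that audit might look, under loud assumptions: the scoring function, its coefficients, the feature names `x1` and `x2`, and the four toy records are all hypothetical and purely illustrative. Because the indicator is binary, a finite difference between groups stands in for the derivative. The partial effect flips only the indicator and holds everything else fixed; the total effect lets the associated features move with it.

```python
from statistics import mean

def score(row, group):
    """Hypothetical scoring model: uses two features and ignores `group` directly."""
    return 0.5 * row["x1"] + 2.0 * row["x2"] + 0.0 * group

# Toy records, invented for illustration only (not real data).
records = [
    {"group": 0, "x1": 1.0, "x2": 0.0},
    {"group": 0, "x1": 3.0, "x2": 1.0},
    {"group": 1, "x1": 2.0, "x2": 1.0},
    {"group": 1, "x1": 4.0, "x2": 2.0},
]

def partial_effect(rows):
    """Flip only the indicator for every record, holding the other features fixed."""
    return mean(score(r, 1) - score(r, 0) for r in rows)

def total_effect(rows):
    """Difference in mean score between groups, with each group keeping its own feature values."""
    g1 = [score(r, 1) for r in rows if r["group"] == 1]
    g0 = [score(r, 0) for r in rows if r["group"] == 0]
    return mean(g1) - mean(g0)

print("partial effect:", partial_effect(records))
print("total effect:", total_effect(records))
```

In this toy setup the model never reads the indicator, so the partial effect is zero, yet the total effect is not, because the features it does read differ between the two groups. That gap is the quantity a disparate-impact audit of this kind would report.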

biff said...

>a lot of the hype around Big Data and the like is businesses trying to get around traditional regulations-

Looked to me like the DotCom Boom was capital flight from various industries smashed by trad regulations. Then Clinton took a couple hundred million from Microsoft's competitors to cut Microsoft down to size, and oops, busted the boom.

ResnoTemperedBell said...

" ... and to make judgments, sometimes in error, ... "

and oftentimes not.

Oswald Spengler said...

Big Data allows users to notice certain facts without (for now) others noticing that one is noticing.

Anonymous said...

The government is a couple of decades late on this issue. The barn door has been wide open for decades.

At least in auto insurance, Zip Code + Occupation + Credit Score will weed out undesirable risks of all races. And it is legal.

There are companies that specialize in finding the small percentage (but large number) of people with bad credit that are actually desirable -- like some people with a bankruptcy.

Race might be a proxy for a lot of characteristics, but there are even better predictors for risk. There is money to be made by further subdividing the traditionally undesirables.

In fact, why not do some data mining and correlate IQ with use of menthol cigarettes, grape soda, gangster rap, &c?

Or figure out exactly what products are favored by meth head free booting Okies and Arkies. Like purchasing large quantities of decongestants, buying little or no food, etc.

Fuck race. Let's go straight to alphas, betas, etc.

Anonymous said...

Coupla typos...
leery of the use OF approximate factors subsidize the bLack and brown

Jokah Macpherson said...

If the Obama campaign is using big data to identify "likely Democratic voters", though, and remind them to get their asses out to the polls on Tuesday, it's a good thing.

You're allowed to notice things if doing so helps the right people.

Anonymous said...

Mr. Podesta says "The potential problem, he added, is that “you are exacerbating inequality rather than opening up opportunity.”" No mention of the social good in reducing losses to the big data users? I wonder why?

I think all businesses using big data should be forced to sell their ownership interests to Magic Johnson and the Guggenheim partnership. After all, they are doing the same thing Sterling did . . . using race to avoid business losses. Who knew taking business losses could be such a noble enterprise?

Anonymous said...

Hey Steve, what happened to your WePay?

Anonymous said...

Remember the days when they would argue that your stats must be wrong or that you've cherry picked the data? Now they just declare that bad outcomes aren't allowed.

scottlocklin said...

Speaking as someone who has done "big data" -you can tell all kinds of crap about people's activities, simply from knowing their IP address. Add cookies, and you can tell more than they may know about themselves.
I doubt any of this gets used in trading MBS. I don't know what the regulatory overhead is in giving out loans, but for trading them, you don't need to get real fancy to understand what the default probabilities are.

DR said...

"Grape soda? People actually drink grape soda, (and HiC, Hawaiian Punch, etc) over the age of nine?"

I think it was Half Sigma that conjectured that intelligence was correlated with a propensity to liking bitter foods. The observation being that the rich and educated seem to be the primary consumers of unsweetened black coffee, IPA beers and bitter liqueurs. The lower classes prefer sweet tasting foods and drinks, which probably helps to explain the higher propensity to diabetes.

(This also makes them like children, who have a higher relative concentration of bitter tastebuds because they die off faster than other taste buds. A small amount of bitterness is very noticeable to a child).

Whiskey said...

biff, the first part of your thought, the Dotcom Boom at least in part a flight from traditional highly regulated industries by investment capital, is a pretty good take.

However, Clinton did very little to Microsoft which was both a menace to the DotCom boom (which depended on servers and various platform agnostic services such as video delivered to customers -- MS was poor in server software and tried to suppress Linux/Unix rivals) and irrelevant.

Microsoft was late to the internet, famously, full of security and performance holes, and had the following problems:

1. It stopped its phenomenal growth as everyone who had a pc bought one and plateaued in both OS and Office Suite software sales.

2. It stopped being able to avoid expensing stock options, and therefore ceased offering them to highly ambitious programmers and ...

3. The stock ceased its relentless climb based on astonishing 1980s to early 1990s growth in sales anyway, being stagnant and ...

4. Microsoft went cheap on labor and hired loads of H1Bs who as PMs or various senior programmers hired their cousins and cousins' friends from villages in China and India and kicked the talented White programmers out as ...

5. Microsoft used the stack ranking process: basically only the "top 30%" of programmers hired in the year, judged by managers on very nebulous standards, were retained. This meant that helping another team get their product ready by fixing bugs in, say, a shared library that the product depended on got you fired, while brown-nosing over meaningless Dilbert-style PowerPoint decks for your manager got you retained.

But your main point of the Dotcom Boom being at least in part informed by capital fleeing to big growth opportunities unbounded by growth deflating regs is solid.

Much of the air was let out when cable/fiber companies just could not complete the last mile, and home video delivery and other stuff just was not available. Remember, a lot of people even in 2000 were on dial-up.

Anonymous said...

"-you can tell all kinds of crap about people's activities, simply from knowing their IP address. Add cookies, and you can tell more than they may know about themselves. "

That's been my dream. Instead of stupid junk mail and junk popup ads, I want marketers to know what I want before I know it.

The predictive stuff on amazon, iTunes and netflix doesn't really work for me. Yet.

I like about 20% of the stuff on the list of 125 stuff that white people like. http://stuffwhitepeoplelike.com/full-list-of-stuff-white-people-like/

As Henry Ford didn't say, "If I asked people what they wanted, they would have said faster horses."

Contaminated NEET said...

Wow, Big Data vs. Big Tolerance. It's like Godzilla vs. Mothra: the winner is just going to destroy us all anyway, so I don't know who to root for.

Auntie Analogue said...

The use and abuse of Big Data boil down to Mr. Sailer's favorite apothegms: Who? Whom?

Farang said...

Anonymous said at 7:06 PM:
The imminent ban on racist learning algorithms will really throw a monkey wrench in the quest for general AI.

The ban will be effective only in the USA. If AI is ever invented, it will be invented elsewhere, presumably in Japan, South Korea or China.

What's happening in the West nowadays is similar to what happened in the Arab world centuries ago: as black genes spread into their gene pool, their civilization lost its inventiveness.

Jonathan Silber said...

Americans: blind, ignorant, & helpless, on principle--because it's only fair!

Anonymous said...

Wow, Big Data vs. Big Tolerance. It's like Godzilla vs. Mothra: the winner is just going to destroy us all anyway, so I don't know who to root for.

I'm rooting for Gamera to come in unexpectedly and give us all a damn good whacking!

Jonathan Silber said...

Americans are to notice nothing, know nothing, say nothing, and act on nothing: it's why we have College for Everyone.

Big Bill said...

The "who likes Tyler Perry + Grape Soda + Kools?" problem shows up everywhere.

In genetics it appears as the "who has Allele X + Allele Q + Allele R?" problem.

Given enough datasets, no matter how disparate, Reality will always surface.

Big Bill said...

They “may seem like neutral factors,” he said, “but they aren’t so neutral” when put together.

It sounds like the Feds are affirming the accuracy of stereotypes. I can just hear the Federal regulator hassling a marketeer about his "neutral factors":

"Goddammit, Bill, don't play dumb with me! Everybody knows that Negroes like Tyler Perry movies, grape soda and Kools! Just because you don't use 'race' doesn't mean you aren't using race!"

Anonymous said...

Obama sure didn't mind Big Data targeting minorities when it won him two national elections and was clearly the difference in both races.

Felix said...

What's happening in the West nowadays is similar to what happened in the Arab world centuries ago: as black genes spread into their gene pool, their civilization lost its inventiveness.

The Arab world was never possessed of inventiveness.

Kevin Michael Grace said...

Newports long ago replaced Kools as the preferred cigarette of Black Americans.

Anonymous said...

Another cost of diversity: they need to know every detail about our personal lives as a proxy to the real predictive detail they want to get at.

"Under the citizenship act, racialism is getting more severe in the U.S. The gaps between the minorities and the whites are very wide in the exercise of such rights to work and elect." - cmon shark jumping, we need to run this into the ground.

Anonymous said...

What Uber might say is that a cab driver in a high crime area can get a fare whose identity has been verified. My guess is people with a cellphone, credit cards, a driver's license, bank account, etc. are equally likely to kill you, black, white or Asian. So race discrimination is not the issue. Class discrimination may be amplified.

NOTA said...

Given more in-depth knowledge about people, you can do much better predicting things about them than you could with just race. The key thing about race is that it's easy to see and correlates with a lot of stuff you care about. But as an example, if you can find a way to sort prospective employees by intelligence, work ethic, and honesty, then race probably doesn't add much to your predictions. Unfortunately, if blacks do worse than whites on average in those things, then your race-blind model will still statistically discriminate, leading to hiring fewer blacks than their proportion in the population.

This is basically what happens with tests. Probably some employers tried to find tests that would let them keep discriminating against blacks, but even employers and universities who intend to be race-blind in hiring will find that standardized test score averages aren't the same for blacks and whites. Hire the best 10% of scorers on your test, and you will end up with an almost-all-white fire department, as various city fire departments have found in practice.

Mr. Anon said...

"Jonathan Silber said...

Americans are to notice nothing, know nothing, say nothing, and act on nothing:"

We are becoming the Sergeant Schultz society.

Anonymous said...

"What's happening in the West nowadays is similar to what happened in the Arab world centuries ago: as black genes spread into their gene pool, their civilization lost its inventiveness."

I think a more likely explanation of Arab decline is the Muslim propensity for inbreeding. Centuries upon centuries of it have taken its toll on Arab inventiveness.

Anonymous said...

The Wire beat that menthol cigarette and grape soda drum for its entire run.

Starbucks is having another Happy Hour promotion for their $5, 500-calorie frappuccino drinks. According to the manager of the local franchise, his overweight women and black customer percentages double.

Anonymous said...

I can't help but picture the orthodoxy mandating that the nerds program the orthodox errors into their AIs.

And the subsequent knock-on errors causing the AIs to run amok and destroy all of humanity for the demented liars they are. Or more likely, smashing Yellows and Whites because equality.

Svi

Anonymous said...

It sounds like the Feds are affirming the accuracy of stereotypes. I can just hear the Federal regulator hassling a marketeer about his "neutral factors":

"Goddammit, Bill, don't play dumb with me! Everybody knows that Negroes like Tyler Perry movies, grape soda and Kools!"

LOL! This thread's just chock full o' laffs. The Godzilla vs. Mothra one was good, too.

Svi

Harry Baldwin said...

Starbucks is having another Happy Hour promotion for their $5, 500-calorie frappuccino drinks. According to the manager of the local franchise, his overweight women and black customer percentages double.

I think the people buying those drinks rationalize that they're just buying a cup of coffee. Starbucks is a coffee shop, after all. It's genius marketing.

Anonymous said...

I assume this won't affect the gov's secret bulk data collection spying program.