A Blog by Jonathan Low

 

Oct 20, 2016

How Big Data Algorithms Manipulate Us

Our unerring faith in the objectivity of data can be, is, and will be used against us. It's not personal, it's just business. JL

Cathy O'Neil reports in Wired:

The age of Big Data has generated new tools and ideas on an enormous scale, with applications spreading from marketing to Wall Street, human resources, college admissions, and insurance. At the same time, Big Data has opened opportunities for a whole new class of professional gamers and manipulators, who take advantage of people using the power of statistics.
I should know. I was one of them.
In 2007 I was a hedge fund quant, working at D.E. Shaw. I was trained to anticipate what others would do so I could do the opposite. It was my profession to game the system. Was this stock just mentioned on TV? Bet on the bounce. Are mutual funds forced to true up their accounts by the end of the quarter? Bet on the market impact of their predictable trading. Is everyone using the same risk model? Anticipate the outcomes. There was money to be made finding patterns and exploiting them, some of it coming from pension funds and 401(k)s, some from nameless and faceless heaps of international wealth. We thought we deserved it because we had the technology and because we were faster and better with numbers. It was never about market efficiency; it was dumb money versus smart money, and we took advantage whenever we could.
This behavior is by no means unique to finance. One of the early, influential algorithms is the college-ranking model from U.S. News & World Report. In fact, it’s been so influential, the role of many college administrators has gradually changed from improving academics to improving their college’s ranking, whatever the cost. And since the U.S. News model doesn’t track cost, the more the model is gamed, the more expensive colleges become. Over time the widespread gaming has led to outrageous tuitions, bloated administrations, and the death of the safety school, since high rejection rates are bad for rankings. The pool of high school kids hasn’t changed, but the system is in an arms race of prestige.
More recent big data college algorithms work on an individual student basis. Inside the college, admissions offices use algorithms that weigh each student on likelihood of acceptance and financial aid requirements. Outside the college, professional consultants charge parents more than $25,000 to help their child get an offer from, say, NYU based on predictive algorithms. Again, an expensive arms race, and the victims are the lower- and middle-class students who cannot afford either college or the college-admissions process.
Information is power, and in the age of corporate surveillance, profiles on every active American consumer mean that the system is slanted in favor of those with the data. This data helps build tailor-made profiles that can be used for or against someone in a given situation. Insurance companies, which historically sold car insurance based on driving records, have more recently started using such data-driven profiling methods. A Florida insurance company has been found to charge people with low credit scores and good driving records more than people with high credit scores and a drunk driving conviction. It’s become standard practice for insurance companies to charge people not what they represent as a risk, but what they can get away with. The victims, of course, are those least likely to be able to afford the extra cost, but who need a car to get to work.
Not all gaming involving big data is so obvious. Sometimes what looks reasonable on the outside has an exploitative underbelly, and sometimes the gamers aren’t totally aware of their gaming. Consider online personality tests, which are required of more than 60 percent of prospective American workers. The tests themselves are hard to game: They ask questions that don’t have obvious answers, and the prospective employees never learn their scores. And if job applicants fail the test, they never get called back for an interview.
Now consider the perspective of the employers who use such personality tests to filter their hires. The tests are cheap, and they save tons of money in human resource hires, but they are also almost entirely opaque. In other words, the personality test is treated as a money-saving black box, but it’s not clear what that black box is actually doing, and whether using it constitutes discriminatory hiring practices. Seven companies that used a personality test by the big data company Kronos are being sued for that reason; the lawsuit charges that the test constitutes a mental health exam, which is illegal under the Americans with Disabilities Act.
Big data profiling techniques are exploding in the world of politics. It’s estimated that over $1 billion will be spent on digital political ads in this election cycle, almost 50 times as much as was spent in 2008; this field is a growing part of the budget for presidential as well as down-ticket races. Political campaigns build scoring systems on potential voters—your likelihood of voting for a given party, your stance on a given issue, and the extent to which you are persuadable on that issue. It’s the ultimate example of asymmetric information, and the politicians can use what they know to manipulate your vote or your donation.
After the movie Wall Street came out in 1987, some people thought of Gordon Gekko as a villain, while others considered him a role model. The same dual promises exist in the current world of Big Data, where smart money armed with big data continually searches for opportunities to exploit dumb money—increasingly, individuals, swing voters, and even democracy itself. As long as the public trusts in the objectivity of statistical models, and as long as they remain intimidated by them, such opportunities will likely continue.
In 1954, Darrell Huff wrote a neat little book called How To Lie With Statistics, a how-to guide for marketers to mislead and manipulate people into believing just about anything with the right graphic and the well-placed stat. Computationally speaking, we’ve come a long way since 1954, but in one important way nothing has changed. There’s enormous opportunity for manipulation in big data, and we need to remain skeptical and vigilant.
