Bloomberg D4GX 2016 Keynotes: How Big Data Can Be a Tool for Good (Or Evil)

Oct 6, 2016 9:50 AM ET

Originally posted on Bloomberg.com

Can city governments use data analytics to help improve the lives of citizens? That was the question presented to attendees at Bloomberg’s annual Data for Good Exchange conference on Sunday, September 25, 2016, and the answers weren’t always easy to hear.

“Before data can start helping people, it first must stop hurting them,” said Cathy O’Neil, data scientist and author of the new book “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.” Throughout her career, O’Neil told the group in her opening keynote, she has witnessed big organizations repeatedly fail at data science by using formulae that harbored built-in bias – often doing more harm than good.

Frustratingly, these mistakes often go uncorrected for decades, buried in the layers of some larger software system. Even when they are found, vested interests can keep them from being fixed.

“If someone found a mistake of mine, I would thank them,” said O’Neil, who developed technology on Wall Street for D.E. Shaw and has also taught at Barnard College. “I went into finance and I found this wasn’t true. The AAA rating of mortgage-backed securities was a mathematical lie that was being used to obscure truth,” she said, referring to banks’ misjudgments before the 2008 financial crisis.

In addition to causing financial meltdowns, these mathematical lies can have smaller but more pernicious effects that sometimes fall on a single individual.

One painful example can be found in school systems that use value-added models to assess teachers’ effectiveness. “We try to use data to find bad teachers and get rid of them, with the goal of improving education,” said O’Neil, “but it’s very noisy, and it’s very, very inconsistent.” Proxies like annual standardized tests rarely yield statistically significant results, and when weighted equally with qualitative assessments, they can paint an erroneous picture of a teacher’s performance.
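
To see why a single year of scores is such a noisy signal, consider a minimal simulation (our illustration, not O’Neil’s analysis; the effect sizes are invented assumptions chosen only to make the point). Even when teachers genuinely differ, a class of 25 students is a small sample, and the classroom average is dominated by student-level noise:

```python
import numpy as np

rng = np.random.default_rng(0)
n_teachers, class_size = 200, 25

# Assumed effect sizes (in test-score standard deviations): teachers really
# do differ, but student-to-student variation is an order of magnitude larger.
true_effect = rng.normal(0.0, 0.1, n_teachers)

def estimated_value_added():
    # A teacher's one-year "value added" is the mean gain of a single class:
    # the true effect plus the average of noisy student-level outcomes.
    student_noise = rng.normal(0.0, 1.0, (n_teachers, class_size))
    return true_effect + student_noise.mean(axis=1)

year1, year2 = estimated_value_added(), estimated_value_added()
print(np.corrcoef(year1, year2)[0, 1])  # roughly 0.2 under these assumptions:
                                        # last year's score barely predicts this year's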

Despite these pitfalls, this method of teacher evaluation is used in more than half of U.S. states. “It’s being heralded, but it’s not accurate enough, and there’s no feedback loop,” said O’Neil. “I call these weapons of math destruction, and we’re doing triage here” to fix them.

Good algorithms, the kind that have been properly vetted, live in an ecosystem where their builders can try a change and measure the results. If a change doesn’t improve the model, it gets thrown out. That’s a feedback loop. “Today, in a lot of data science, there is no ground truth,” said O’Neil.
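
The loop O’Neil describes is simple to state in code. The sketch below is a minimal illustration, assuming hypothetical `candidate_changes` and `evaluate` callables standing in for whatever model tweaks and ground-truth measurement a real system would use:

```python
from typing import Callable, Iterable, TypeVar

Model = TypeVar("Model")

def feedback_loop(model: Model,
                  candidate_changes: Iterable[Callable[[Model], Model]],
                  evaluate: Callable[[Model], float]) -> Model:
    """Keep a candidate change only if it measurably improves the model.

    `evaluate` scores a model against ground-truth outcomes (e.g. held-out,
    labeled data) -- the ingredient O'Neil says many deployed systems lack.
    """
    best_score = evaluate(model)
    for change in candidate_changes:
        trial = change(model)        # try something...
        score = evaluate(trial)      # ...and measure the result
        if score > best_score:       # keep the change only if it helps;
            model, best_score = trial, score
        # otherwise it gets thrown out
    return model
```

Without a trustworthy `evaluate` – without ground truth – the loop cannot close, and bad changes accumulate unchecked.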

Often, these errors reside in systems used to segment groups of people automatically: a hospital intake system sorting patients into urgent and non-urgent cases, for example. Relevant historical information goes into the algorithm, and a result comes out. But O’Neil says programmers often forget to reassess the definition of “success” that was used to build the algorithm in the first place. People flagged by the system may simply have been drawn, in the past, from a pre-selected or self-selecting group, causing the model to mistake correlation for causation. This can be especially disastrous in hiring and firing policies, where true abilities take a back seat to some superficial factor.
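
A toy simulation makes the trap concrete (again our illustration with invented numbers; `pedigree` stands in for any superficial factor). If historical “successes” were screened on that factor before ability was ever observed, a model fit to the history rediscovers the screen, not the ability:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

ability = rng.normal(size=n)           # true ability -- invisible to the model
pedigree = rng.integers(0, 2, size=n)  # a superficial factor, e.g. school prestige

# Historical "success" labels: past hiring screened on pedigree first, so the
# labeled data is a pre-selected group, not a random sample of candidates.
hired = (pedigree == 1) & (ability > -0.5)

# Fit to this history, pedigree looks near-perfectly predictive of success --
# the old filter is learned back, and correlation masquerades as causation.
print("hire rate, pedigree=1:", hired[pedigree == 1].mean())  # ~0.7
print("hire rate, pedigree=0:", hired[pedigree == 0].mean())  # exactly 0.0
```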

Worse yet, because they’re the domain of technical fields, algorithmic decision systems can engender a devotion to scientism – rather than actual scientific rigor – that lulls administrators into complacency and even defensiveness.

“Algorithms have a mathematical veneer that protects them from scrutiny, and it’s not okay,” said O’Neil. “Sometimes we think we have superpowers. But we should think of ourselves as interpreters of the value judgments of society at large, asking who is harmed, and looking at the cost-benefit analysis of the people who benefit.”

Deciding which algorithms reach the threshold for re-evaluation can be one of the toughest practical obstacles to improving city systems. O’Neil says the importance of an algorithm to the public comes down to three factors: whether it affects a lot of people, whether the scoring is secret, and whether the results are being fed back into the algorithm to actually measure improvement.
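
Encoded as a checklist, her triage test might look like the sketch below (the field names and the example at the end are our assumptions, not O’Neil’s):

```python
from dataclasses import dataclass

@dataclass
class AlgorithmAudit:
    """O'Neil's three triage questions for a deployed scoring algorithm."""
    widespread: bool       # does it affect a lot of people?
    secret_scoring: bool   # is the score hidden from the people being scored?
    no_feedback: bool      # are results never fed back to measure improvement?

    def reevaluation_priority(self) -> int:
        # Each flagged criterion raises the priority; an algorithm tripping
        # all three is a candidate "weapon of math destruction."
        return sum([self.widespread, self.secret_scoring, self.no_feedback])

# The value-added teacher model discussed above arguably trips all three.
teacher_vam = AlgorithmAudit(widespread=True, secret_scoring=True, no_feedback=True)
assert teacher_vam.reevaluation_priority() == 3
```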

Whether or not data scientists will develop ethical standards for governance and create some sort of Hippocratic Oath remains to be seen. For her part, O’Neil thinks algorithmic systems shouldn’t be unleashed on an unsuspecting public without first being seen as suspect themselves. “Algorithms do not ask why, they do not have brains,” she told the audience. “They codify practices. If we already had a perfect way to hire people, maybe we’d want that.”
