Beka Steorts: Human Rights Meets Big Data

Duke statistician is helping interpret the death toll in Syria’s armed conflict

Statistician Beka Steorts is developing new techniques for a more accurate accounting of human rights abuses in Syria and other conflicts. Photo by Megan Mendenhall/Duke Photography

Measuring the human costs of war is notoriously difficult. Dangerous conditions and difficult access can overwhelm hospitals, morgues, law enforcement and non-governmental groups responsible for counting the dead.

These problems are compounded when counts cover different time periods, or when sources disagree about whether victims of conflict-related famine or disease should count the same as victims of direct acts of violence such as shootings or torture.

Such is the case with the ongoing armed conflict in Syria, where estimates of the number of people killed range from around 140,000 to more than 330,000.

Statistician Beka Steorts, who joined the Duke faculty this year, is developing state-of-the-art machine learning and statistical methods to help human rights groups get as close as possible to the true count, information the international community relies on when deciding whether and how to intervene.

“I like to work on real-world problems,” Steorts said.

Steorts has never been to Syria. But in 2013, while a visiting assistant professor in the Statistics Department at Carnegie Mellon University, she began collaborating with the Human Rights Data Analysis Group, a nonprofit organization.

At that time, the group had been charged by the United Nations with enumerating how many people had been killed in Syria since the conflict began in March 2011.

The project entailed analyzing hundreds of thousands of death records spread across multiple databases compiled by the Syrian government and various nongovernmental groups.

Each death record contains the victim’s name, date of death and where they were killed.

But because of typos, inconsistent spellings and missing information, merging these datasets and making sure that nobody is counted twice is easier said than done.

One dataset, for example, might list a victim as “John Doe, 20-40 years old, died on 8/7” and another might list “J Doe, 18-25 years old, died early August.”

To identify and weed out duplicate entries, Steorts teaches computers to comb through the available data and determine, for any pair of records, how likely it is that they both represent the same person.
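As a rough illustration of what scoring a pair of records can look like, here is a minimal Python sketch. It is not Steorts's method, which the article does not describe in detail; it is a simple heuristic using the standard library's difflib. The field names (name, age, month) and the rule that contradictory fields force a score of zero are assumptions made for this example.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1]; 1.0 means identical after lowercasing."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def ranges_overlap(a: tuple[int, int], b: tuple[int, int]) -> bool:
    """True if two inclusive age ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Hypothetical score: name similarity, or 0.0 if other fields conflict."""
    if not ranges_overlap(rec_a["age"], rec_b["age"]):
        return 0.0
    if rec_a["month"] != rec_b["month"]:
        return 0.0
    return name_similarity(rec_a["name"], rec_b["name"])


# Records modeled on the example above (month 8 stands for August).
a = {"name": "John Doe", "age": (20, 40), "month": 8}
b = {"name": "J Doe", "age": (18, 25), "month": 8}
c = {"name": "Jane Roe", "age": (60, 70), "month": 8}

print(match_score(a, b))  # high: similar names, compatible ages and month
print(match_score(a, c))  # 0.0: the age ranges do not overlap
```

A real system would estimate a probability from many such comparisons rather than rely on one hand-picked score.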

The work involves sifting through an enormous volume of information. Hunting for duplicates in two lists of 10,000 records, for example, could mean as many as 400 million pairwise comparisons.
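The size of that number depends on how pairs are counted, and the article does not say which convention it uses. The short Python sketch below works through three plausible conventions for two lists of 10,000 records; the function and key names are illustrative only.

```python
from math import comb


def comparison_counts(list_size: int) -> dict:
    """Count record pairs for two lists of equal size under three conventions."""
    pooled = 2 * list_size  # all records from both lists together
    return {
        # Each record in one list against each record in the other list.
        "cross_list": list_size * list_size,
        # Every unordered pair among the pooled records, counted once.
        "unordered_pooled": comb(pooled, 2),
        # Every pooled record paired with every pooled record (upper bound).
        "ordered_pooled": pooled * pooled,
    }


counts = comparison_counts(10_000)
for label, value in counts.items():
    print(f"{label}: {value:,}")
```

Under these assumptions the counts are 100 million, just under 200 million, and 400 million respectively. Whichever convention is used, the total grows with the square of the number of records.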

“It’s incredibly time-consuming,” Steorts said.

But by grouping similar records together, Steorts is able to tackle half a million records in 10 minutes.
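Grouping records before comparing them is often called blocking in the record linkage literature. The Python sketch below shows the general idea under stated assumptions: the grouping key (first letter of the last name token plus month of death) is hypothetical and is not taken from Steorts's work, which the article describes only as grouping similar records together.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record: dict) -> tuple:
    """Hypothetical key: first letter of the last name token, plus month."""
    surname = record["name"].split()[-1]  # assumes a non-empty name
    return (surname[0].lower(), record["month"])


def candidate_pairs(records: list[dict]) -> list[tuple[int, int]]:
    """Return index pairs of records that fall in the same block."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[blocking_key(rec)].append(idx)
    pairs = []
    for members in blocks.values():
        # Only records sharing a key are ever compared with each other.
        pairs.extend(combinations(members, 2))
    return pairs


records = [
    {"name": "John Doe", "month": 8},
    {"name": "J Doe", "month": 8},
    {"name": "Jane Roe", "month": 8},
    {"name": "Jon Doe", "month": 3},
]
pairs = candidate_pairs(records)
print(pairs)  # only the two August "Doe" records are paired
```

Comparing every record with every other would need six comparisons for these four records; the blocked version needs one. The trade-off is that true duplicates whose keys differ, for example because of a wrong month, are never compared.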

The number of records in each of the underlying datasets continues to grow as previously undocumented deaths come to light.

For her analyses of human rights data, Steorts was recently named one of the “Innovators Under 35” by MIT Technology Review, an honor she shares with Google cofounder Sergey Brin and Facebook’s Mark Zuckerberg.

“Human rights is an area I plan to work on for my entire career. It’s something that’s become important to me,” Steorts said.

Steorts was raised in Bluefield, W.Va., and Winston-Salem. It was a high school teacher who first kindled her interest in math.

“That’s when I realized that I love to solve problems,” Steorts said. “It surprised my parents. And it surprised me too, because I had never been really particularly good at math before.”

Steorts earned her undergraduate degree in mathematics from Davidson College in 2005. After graduating, she went on to Clemson University, where she received a master’s in mathematical sciences in 2007. Her interest in learning more about a branch of statistics called decision theory eventually took her to the University of Florida to pursue a doctorate in statistics.

Steorts is currently an assistant professor in the Department of Statistical Science with affiliations in the Social Science Research Institute and the Information Initiative at Duke.

In the coming years, she plans to use the data analysis techniques she has developed to merge medical data as well, to help doctors treat patients whose health histories might otherwise be hard to piece together because of changes over time in their name or address.

She is also working on similar problems for the U.S. Census Bureau, where she serves as a consultant.

Steorts was a Blue Devils fan long before landing a job at Duke. She has been to two national championship games, her face painted blue and white.

“My first basketball coach went to Duke,” Steorts said. “When I was in middle school I asked for a Duke Blue Devils starter jacket for Christmas.”

That didn’t sit too well with her father, who gave her a Washington Redskins jacket instead.

“I’m still a little bit angry about it.”

She also makes occasional guest appearances as a singer and songwriter for The Imposteriors, a band of Ph.D. statisticians known for performing parodies of popular hit music at the closing banquets of big statistics conferences. She made her debut performance in 2012 with a song she co-wrote called “Bayesian State of Mind,” a spoof based on “Empire State of Mind” by the rapper Jay-Z.

“I’m terrible at singing,” Steorts said.

Steorts lives in Durham’s Rockwood neighborhood with her dog, a miniature schnauzer named Noah.