From Spreadsheets to Stories: Data-Driven Journalism

A New York Times writer discusses journalistic implications of big data

Derek Willis of the New York Times speaks at Duke Monday, Oct. 7. Photo by Duke Photography

Derek Willis believes spreadsheets filled with data can be interviewed just as people can. Doing so is so important that Willis, an interactive news developer at the New York Times, fears journalism will hasten "its road to irrelevancy" if writers refuse to start analyzing data.

"Declining the opportunity to master data is a dereliction of our duties as journalists," Willis said. "We can't keep bringing knives to gun fights."

In his Monday talk "Dead as a Mutton Again: Journalism's Modernity Problem," Willis urged aspiring journalists to become more comfortable with digital tools to process large amounts of raw information. The talk was the first installment of "Data + Journalism," co-organized by Duke's DeWitt Wallace Center for Media and Democracy and the Department of Computer Science.

Knight Professor of Journalism Bill Adair, who also created a fact-checking website, said the series is an effort to dispel the notion that traditional journalism is dying. Rather, Adair and co-organizer Jun Yang, a computer science professor, want to demonstrate that digital media has made reporting even more critical -- if journalists can learn to turn massive amounts of information into compelling stories.

"I know that for a lot of people in journalism, this is a depressing time," Adair said. "I look upon it as a time of reinvention. Originally, data projects were primarily done by big news organizations because it took a lot of computer horsepower and expertise to do them. But now that we're well into the digital age, more journalists can create great new data-based projects."

Willis first dove into data analytics while a reporter on Capitol Hill, tracking politicians' voting records and poll standings. But data can be used far more broadly, he said Monday. He discussed how reporters in Fort Lauderdale were able to prove that police officers themselves exceed posted speed limits. Pulling hundreds of public-access toll booth spreadsheets from the Florida Department of Transportation's website, the reporters calculated the average speed of each squad car from the distance between toll booths and the time elapsed between them. Willis said the story is one of the best examples of data's utility in crafting an interesting story.
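The arithmetic behind that kind of analysis is simple: average speed is distance divided by elapsed time. The short Python sketch below illustrates the idea under stated assumptions. The toll plaza names, mile markers, timestamps and the 65 mph limit are all invented for illustration, and the function `average_speeds` is a hypothetical helper, not the reporters' actual method or data.

```python
from datetime import datetime

# Hypothetical toll readings for one vehicle: (plaza, mile marker, timestamp).
# Every value here is invented for illustration only.
records = [
    ("Plaza A", 0.0, "2013-01-15 08:00:00"),
    ("Plaza B", 15.0, "2013-01-15 08:10:00"),
    ("Plaza C", 27.0, "2013-01-15 08:22:00"),
]

SPEED_LIMIT_MPH = 65  # assumed posted limit, not taken from the story


def average_speeds(readings):
    """Return (start, end, mph) for each consecutive pair of toll readings."""
    fmt = "%Y-%m-%d %H:%M:%S"
    segments = []
    for (a, mile_a, t_a), (b, mile_b, t_b) in zip(readings, readings[1:]):
        elapsed = datetime.strptime(t_b, fmt) - datetime.strptime(t_a, fmt)
        hours = elapsed.total_seconds() / 3600
        if hours <= 0:
            continue  # skip readings that are out of order or duplicated
        miles = abs(mile_b - mile_a)
        segments.append((a, b, miles / hours))
    return segments


for start, end, mph in average_speeds(records):
    flag = "over limit" if mph > SPEED_LIMIT_MPH else "within limit"
    print(f"{start} -> {end}: {mph:.1f} mph ({flag})")
```

With these made-up readings, the first segment covers 15 miles in 10 minutes, an average of 90 mph, while the second covers 12 miles in 12 minutes, or 60 mph. An average is a conservative measure: a car that averages 90 mph between two booths must have been traveling at least that fast at some point along the way.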

"Data is decisive, not anecdotal," Willis said. "If we want to present certain facts, we've got to get rid of this hideous pseudo-intellectualism that journalists don't need math."

Willis urged all the aspiring journalists in the audience to experiment with data online, from mastering Microsoft Excel spreadsheets to understanding how websites are built.

"Journalists who can even partly manage the flow of information well will not only have a job but job security," Willis said. "As a profession we still have so many problems to tackle, and we have to get started before we go out of business."

Senior Julian Spector said Willis' talk has motivated him to increase his familiarity with basic programming and statistical science. He hopes to gain some exposure to the software and tools Willis mentioned in his talk, such as free web apps GitHub and Haduko, before embarking on his journalism career.

"Journalists as a whole have not done their due diligence with using data in their reporting," Spector said. "There's a real demand for more insightful data analysis, and if I want to be a journalist, I'm going to have to go back to the drawing board to gain some of these skills."
