How I learned to stop worrying and love data journalism
Many years ago, not long after the fall of the Berlin Wall, I spent a winter of long gray afternoons in the archives of the Stasi (the East German secret police), researching stories about the former German Democratic Republic. In order to view a document, you first had to know that it existed (there was no available index). Then you had to prove a need for it (as a journalist this meant providing a story outline). Then you waited weeks to months for an appointment in the reading room of the former Stasi offices, which was the only place the documents would be made accessible. Headlines at the time included the story of Vera Wollenberger, who, while going over her own file in that reading room, learned that her husband had first courted her on orders of the Stasi. This is also where the historian Timothy Garton Ash found the leads to uncover which of his friends had spied on him for the Stasi, a story he told in The File: A Personal History.
There were three possible outcomes after reading a file: you found the information you needed for a great story, you found nothing of interest, or you found a lead which would require requesting a new file and starting the entire process again. This was, of course, very time-consuming, and so as a freelancer I had to content myself with the easier feature-type stories about the Stasi itself (the athletic doping programs it oversaw, the bizarre snooping methods it employed, and so on).
Fast-forward 20 years. Thanks to new technologies and advances in access-to-information initiatives – many of which were accomplished through OGP commitments and action plans – governments around the world are now proactively putting documents online. This has made it significantly easier and more cost-efficient for journalists to access, research and compare documents. It’s also given rise to “data journalism,” a term that used to worry me, as it seemed to imply a degree of tech savvy that had little to do with the ink-stained journalist I’d been.
A crash course in data journalism has made me completely revise my opinion. Yes, data journalism is based on the premise that there are many important, untold stories hidden in the data that governments and other actors are providing today. But the tools involved are easy to master, the benefits mirror those of moving from pen to mouse, and the goal of writing an illuminating story has remained the same.
The event that changed my mind was the OGP Data Bootcamp, hosted by the OGP communications and civil society teams on the eve of our 2016 OGP Global Summit in Paris. At France’s premier journalism school, the Centre de Formation des Journalistes, close to 100 journalists and civil society activists from a dozen countries listened to presentations by data journalists Romina Colman, Cedric Lombion, Johannes Fiedrich, Christine Jeavans, Andres Snitcofsky and Cecile Gallegoat.
The most famous examples of data journalism are certainly the International Consortium of Investigative Journalists’ LuxLeaks and Panama Papers series, where leaked data was shared by dozens of journalists around the globe, resulting in scores of critical, far-reaching investigations. However, not all data journalism is investigative in nature – case studies presented during the Data Bootcamp ranged from analysis of Brexit and US election maps, to Florida’s failing schools, jihadist attacks, Global Forest Watch, and the existential: “will a robot take my job?”
In data journalism, data can be the main source of information, but it can also be a means of collating information, or the tool with which the story is told – or all three. This means that the breadth and speed of what can be researched become far greater, the ability to compare and contrast is stronger, and the possibilities for illustrating a story become much more interesting.
Imagine, for example, that the Stasi files I reviewed had been digitally archived at the time, and that information such as gender, age, financial transactions, and geolocation had been made easily available. Investigations by the German press into the activities of East German politicians might have been more thorough; journalists based around the world could have worked together to uncover Cold War trails of spying, bribery and state espionage. Just imagine the fascinating research that could have been carried out on the social impacts of the differing political systems in East and West Germany.
There remain many challenges to successful data journalism, not least of which is ensuring that information is provided in computer-readable, searchable data sets. During the Bootcamp, we were presented with several instances of governments boasting about having made data accessible, when in fact they had uploaded handwritten documents that were nearly impossible to decipher. Similar issues arise when data is uploaded in formats that can’t be opened or processed in Microsoft Excel or other data-sorting tools.
During OGP’s first five years, we’ve seen many governments commit to making information easily accessible to the public online. In our next five years, we need to make sure this information is put to use to improve the lives of people around the world. One way to do this is to ensure that governments proactively make the information they share machine-readable, and that data bootcamps are held frequently enough to demystify the process and keep people informed about it. Moreover, this all needs to happen quickly. The Stasi files were put online two years ago. Although of continued interest to historians and many others, the data is likely too old to be of significant use for political investigations or to benefit those who suffered under the East German regime.