The Ethics of Working With Open Data in Closed Societies
Hello everyone. If you haven’t been following our blog, we recently organised a conference called Open & Shut, where we brought together experts and data practitioners to discuss the challenges and opportunities posed by open data in closed societies.
This article is a summary of the discussion notes from our table on ‘Ethics and Open Data’.
Ethics of open data in closed societies
As a group, our discussion table agreed that the concept that sold open data was: “open all the data, people will do wonderful things, and the world will be a better place”. Has open data been sold as something simpler than it actually is?
We first asked: “which topics are relevant to this idea of ethics?”
The first two major issues that came to our minds were privacy and the ownership of data, both of which are obviously complex issues and could have a discussion table all to themselves!
But opening up data may also have negative and perhaps unintended or unexpected consequences. It is naive to assume that released data cannot cause harm. Could we be causing or contributing to these consequences by opening data, mapping things, and making it easier to access information that was previously difficult to find?
But open data is great! Right?
Is the rhetoric of the open data community realistic or is it aspirational? There is a lot of enthusiasm to release (unleash) data with the good-natured intent that information should be made available to everybody in the name of transparency.
But it is important to think about who is going to use the data that is published, and what the possible impact of opening it up could be. If we focus too much on opening up the data and not enough on its use, we might end up undermining the purpose of opening it in the first place.
Should we be doing risk assessments before releasing data? Should we be asking ourselves: “what could go wrong?”
One contributor at our table talked about the belief that government data should be open by default rather than closed by default. A subtle difference in phrasing perhaps, but in this model, governments need to justify why they are keeping something closed. But what should governments be allowed to hide? And what shouldn’t they collect in the first place? If they are collecting it and not sharing it, it could still be leaked.
Who should be the gatekeepers? Who should decide what gets published and not published?
We also talked about the additional challenge of ‘openwashing’. Openwashing means “having an appearance of open-source and open-licensing for marketing purposes, while continuing proprietary practices.”
If we push for governments to be ‘open by default’, does that provide a smoke screen? Does opening the floodgates of data provide more opportunity to hide corruption?
What about opening data, against copyright, but in the public interest?
This was a very important question at our table, as it relates to many of the ongoing projects to open up data in closed and restricted environments. Essentially, we wanted to find out whether it is ever okay to go against a particular country’s copyright law in order to improve access to information.
Scenario: the government of a particular country publishes a dataset in a hard-to-reach corner of the internet. The dataset is a downloadable .pdf and it has a copyright symbol on it. The open data advocates find this dataset, clean it, make it machine readable, and include it on their website, making it easier for journalists, researchers and academics to find and use.
We didn’t think it was inherently bad to go against copyright, and we felt it should be okay to publish ‘private’ information if there was a valid justification for the public to have access to that information. But we did wonder what the legal implications would be.
(What counts as private and what counts as open is both a political and a social issue.)
We would be interested in knowing whether there is a legal precedent for this, or whether such practices might come under fair use.
What about inaccurate data?
We also discussed the ethical implications involved in publishing data you know is inaccurate. We found this happening in many of our case study countries.
Governments in restricted and closed societies are publishing datasets that show them in a favourable light. Multiple versions of the same ‘data’ will be published with conflicting tallies, because different departments of the government have different targets to achieve.
What happens if we make this data easier to find and use?
We thought that people tend to ‘trust’ data simply because it is presented as data, so publishing flawed data is probably problematic. Again we came back to a catch-22: providing access to this information is important, yet making flawed data easier to find and use may lend it credibility it does not deserve. Open data platforms collecting and presenting datasets from various sources probably do have a duty to include some form of disclaimer about the quality, standard and accuracy of the data.
The Ethical Toolkit
In our table discussion we ended up asking more and more questions. There weren’t very many answers.
We talked about a lot of resources that are available and wanted to share some links with you. If you have any links to share with us, please add them in the comments!
We found a lot of toolkits and loved that they’re lists of questions, not just checklists where one answer fits all. We liked that they’re always about discussion and review, and not just about saying we’re ‘good to go’ because we’ve ticked some boxes.
- From “Open” to Justice #OpenCon2014 (hackeducation.com): the transcript and slides from a talk given at OpenCon 2014.
- Responsible Data Handbook (responsibledata.io): a primer on why Responsible Data is a relevant concern for international development work.
- The Data Ethics Canvas | Open Data Institute: a new approach developed by the ODI for organisations to identify and manage data ethics considerations.
Also, ‘Why We Need the Data Ethics Canvas’, ‘Open Data Government Index’ and ‘The Government Innovation Hype Cycle’ are some examples of work by organisations within the open source domain that are trying to map out unexplored territory in the world of data ethics.
We’re in the early days of discussing data ethics, and we need to ask more questions, discuss more with other organisations, share, and not be afraid to open ourselves up to tough conversations. We hope you’ll join us in starting them.
Pssst! Our previous blogpost has all the video sessions from the conference.
And if you want to be a part of the Open&Shut community, do check out #OpenShut on Twitter.