Principles of ‘Good Data’

Good Data, edited by Angela Daly, S. Kate Devitt and Monique Mann, will be published by INC in January 2019. The book launch will be on 24 January at Spui25. In anticipation of the publication, we are publishing a series of posts by some of the authors of the book.
“Moving away from the strong body of critique of pervasive ‘bad data’ practices by both governments and private actors in the globalized digital economy, this book aims to paint an alternative, more optimistic but still pragmatic picture of the datafied future. The authors examine and propose ‘good data’ practices, values and principles from an interdisciplinary, international perspective. From ideas of data sovereignty and justice, to manifestos for change and calls for activism, this collection opens a multifaceted conversation on the kinds of futures we want to see, and presents concrete steps on how we can start realizing good data in practice.”

The ‘Good Data’ Project
By S. Kate Devitt, Monique Mann and Angela Daly 
The Good Data project was initiated by us when we were all based together at Queensland University of Technology (QUT) in Brisbane/Meanjin – located on traditional Turrbal and Jagera land in what is now known as Brisbane, Australia – in late 2017. Each of us had been working on research engaging with social science aspects of data and digital technologies, and Angela and Monique had also been very active in digital rights activism. The situation in Australia was then, and still is, far from ‘best practice’ in data and digital issues. Australia lacks an enforceable constitutional right to privacy, the Australian government exhibits ongoing digital colonialism perpetuated against Indigenous peoples, refugees and other marginalised people, and there are a myriad of other unethical data practices being implemented (e.g., the robo-debt fiasco, #censusfail and the ‘war on maths’). However, these issues are not confined to Australia – ‘bad’ data practices permeate the digital society and economy globally.
While we had spent a lot of time and energy criticising bad data practices from both academic and activist perspectives, we realised that we had not presented a more positive alternative. This became our main focus, leading us to consider what this positive vision of data could look like. The concept of ‘Good Data’ was first interrogated in a Good Data workshop bringing together interdisciplinary academics, activists and information consultants, and then through soliciting contributions to a book on Good Data about to be published by the Institute of Network Cultures in Amsterdam.
In this blog post we present some of the lessons that we have taken away from the Good Data project. Many of the contributions we received for the book advocated data methods to dismantle existing power structures through the empowerment of communities and citizens. We have synthesised the contributions to present 15 (preliminary) principles of Good Data. Of course all principles are normative and their exact meaning and application differ depending on context (particularly as regards tensions between open data and data protection and privacy) – although these context-dependent distinctions should be fairly obvious.
Data for challenging colonial and neoliberal data practices
The first Good Data principle we advance is a critique of colonial and neoliberal data practices, economic systems and capitalist relations. In their Good Data book chapter Indigenous Data Sovereignty (IDS) scholars and advocates Lovett and colleagues examined how western-colonial data practices affect self-determination and autonomy within Indigenous and First Nations groups that have been subject to various forms of data domination. Their findings relate to the first Good Data principle:
Principle #1: Data collection, analysis and use must be orchestrated and mediated by and for data subjects, rather than determined by those in power.
Account must be taken of the specificities of particular data subjects and their communities, cultures and histories. Indigenous Data Sovereignty principles may overlap with aspects of western data protection laws but may also impose different and additional requirements beyond this legislation, which reflect First Nations peoples’ own laws and sovereignty rights.
In another contribution to the Good Data book, Ho and Chuang critique neoliberal data protection models which emphasize individual autonomy and choice through concepts such as consent and anonymization – indeed, this is an argument that has previously been made in the data protection literature. Instead, they propose that communal data sharing models present a good data alternative to the current widespread proprietary and extractive models. This would mean moving towards data cooperatives where the value of data, and the governance of the system, is shared by data subjects.
Principle #2: Communal data sharing can assist community participation in data-related decision-making and governance.
Such an approach presents an alternative means of governing and using data based on data subjects’ participation and power over their own data. It also moves away from an individualistic approach to data ownership, use and sharing. This is a paradigm shift from the current extractive model, which leaves data subjects largely without meaningful control over their own data.
As part of a bigger discussion around the use of data for societal sustainability, Kuch and colleagues argue that individuals and collectives should have access to the data about the energy they produce and consume (e.g. solar) to hasten take-up and implementation of sustainable energy in a sustainable, communal way. This also relates to Principle #2 above as regards communal data sharing and community participation in decision-making and governance.
Principle #3: Individuals and collectives should have access to their own data to promote sustainable, communal living.
Data for empowering citizens
The second set of principles we consider concerns methods for empowering citizens against governments and corporations that use data for social control and to perpetuate structural inequalities. Poletti and Gray suggest that Good Data can be used to critique power dynamics associated with the use of data, with a focus on the economic and technological environments in which data is generated.
Principle #4: Good data reveals and challenges the political and economic order.
Good Data can also improve the landscape for citizens vis-à-vis governments and corporations. One example can be found in Ritsema van Eck’s chapter, where he argues that data in smart cities needs to undergo Data Protection Impact Assessments (DPIAs) to identify risks at an early stage. These should be consultative so that representatives of local (disadvantaged) groups and citizen groups can proactively shape the environments in which they live. Valencia and Restrepo also argue for citizen-led data initiatives to produce bottom-up smart cities instead of top-down controlled environments. Taken together and at a higher level of abstraction, our fifth principle is that inclusive and citizen-led social data practices empower citizens, encompassing Good Data process and Good Data outcomes.
Principle #5: Citizen-led data initiatives lead to empowered citizens.
Ozalp, among many other authors including ourselves, argues for strong information security to support citizen activism and so produce good democracies, using the case study of the ByLock persecutions in Turkey. In this way, Good Data can contribute to achieving broader societal goals such as good democracy.
Principle #6: Strong information security, online anonymity and encryption tools are integral to a good democracy.
Open data is also a Good Data way of empowering citizens. Gray and Lämmerhirt examined how the Open Data Index (ODI) influences participation and data politics, comparing indexes to the political mobilization afforded by rallies, petitions and hashtags.
Principle #7: Open data enables citizen activism and empowerment.
Good Data does not have to be perfect data though. Gutiérrez advocates for ‘good enough data’ to progress social justice issues, i.e. the provision of sufficient data to sustain ongoing legal investigations while deficits are acknowledged (see her blog post on this).
Principle #8: Social activism must proceed with ‘good enough data’ to promote the use of data by citizens to impose political pressure for social ends.
Data for Justice
The third set of principles specifically relates to how data intersects with various conceptions of justice. Bosua and colleagues argue that users should be able to exert greater control over the collection, storage and use of their personal data. Personal data empowerment can be achieved through design that makes data flows more transparent to users.
Principle #9: Users must be able to understand and control their personal data.
Societal changes arising from reliance on connected devices within groups are examined in Flintham and colleagues’ research on interpersonal data. They note specific issues that arise when personal information is shared with other members of a group, with consequences for the ongoing relationships in intimate groups.
Principle #10: Data-driven technologies must respect interpersonal relationships (i.e. data is relational).
As regards genomic data, Arnold and Bonython argue that data collection and use must embody respect for human dignity, which ought to manifest, for instance, in truly consensual, fair and transparent data collection and use. This also relates to sovereignty and control over data – whether individual or group.
Principle #11: Data collection and use must be consensual, fair and transparent.
McNamara and colleagues examined algorithmic bias in recidivism prediction methods with the objective of identifying and rectifying racial bias perpetuated in the criminal justice system. In doing so they argue that the criteria and meanings of ‘fairness’ (and by extension other values) attributed to data or adopted in models should be explicit. What looks ‘fair’ or ‘just’ to a computer scientist looks different to a philosopher or a criminologist – that is, there are subjective meanings of ‘goodness’, and these should be explicit to enable evaluation.
Principle #12: Measures of ‘fairness’ and other values attributed to data should be explicit.
Good data practices
Trenham and Steer set out a series of ‘Good Data’ questions that data producers and consumers should ask, constituting three principles which can be used to guide data collection, storage, and re-use, including:
Principle #13: Data should be usable and fit for purpose.
Principle #14: Data should respect human rights and the natural world.
Data collection structures, processes and tools must be considered against potential human rights violations and impacts on the natural world, including environmental impacts (e.g. the energy impacts of mining cryptocurrencies).
Principle #15: Good data should be published, revisable and form useful social capital where appropriate to do so.
Good data should be open to enable the data activism and the communal data sharing practices outlined above, unless there are ethical reasons to withhold this information. We acknowledge the tension between open data and misuse of this data by institutions, corporations and governments to protect and retain power. Principle #6 defends security and encryption, and Principle #9 ensures individuals’ rights to their own data; both must be considered alongside, and aligned with, the broader goal of communal data practices discussed throughout.
In sum, Good Data must be orchestrated and mediated by and for data subjects (Principle 1), including communal sharing for community decision-making and self-governance (Principles 2 and 3). Good Data should be collected with respect to humans and their rights and the natural world (Principle 14). It is usable and fit for purpose (Principle 13); consensual, fair and transparent (Principles 9, 11 and 12), and must respect interpersonal relationships (Principle 10). Good Data reveals and challenges the existing political and economic order (Principle 4) so that data-empowered citizens can secure a good democracy (Principles 5, 6, 7 and 8). Dependent on context, and with reasonable exceptions, Good Data should be open / published, revisable and form useful social capital (Principle 15). Our 15 principles of ‘Good Data’ are presented in the table below.
We look forward to launching the Good Data book which includes the contributions we have drawn on above, and more, in late January 2019. Join us at our book launch on 24 January 2019 at Spui 25 in Amsterdam.
 
 
Forthcoming Good Data Chapters in A Daly, SK Devitt & M Mann (eds), Good Data. Amsterdam: Institute of Network Cultures.
Arnold, B. & Bonython, W. (2019). Not as Good as Gold? Genomics, Data and Dignity.
Bosua, R., Clark, K., Richardson, M. & Webb, J. (2019). Intelligent Warning Systems: ‘Technological Nudges’ to Enhance User Control of IoT Data Collection, Storage and Use.
Flintham, M., Goulden, M., Price, D., & Urquhart, L. (2019). Domesticating Data: Socio-Legal Perspectives on Smart Homes and Good Data Design.
Gray, D. & Lämmerhirt, D. (2019). Making Data Public? The Open Data Index as Participatory Device.
Gutiérrez, M. (2019). The Good, the Bad and the Beauty of ‘Good Enough Data’.
Ho, CH. & Chuang, TR. (2019). Governance of Communal Data Sharing.
Kuch, D., Stringer, N., Marshall, L., Young, S., Roberts, M., MacGill, I., Bruce, A., & Passey, R. (2019). An Energy Data Manifesto.
Lovett, R., Lee, V., Kukutai, T., Cormack, D., Carroll Rainie, S., & Walker, J. (2019). Good data practices for Indigenous Data Sovereignty and Governance.
Poletti, C. & Gray, D. (2019). Good Data is Critical Data: An Appeal for Critical Digital Studies.
Ritsema van Eck, G. (2019). Algorithmic Mapmaking in ‘Smart Cities’: Data Protection Impact Assessments as a means of Protection for Groups.
Trenham, C. & Steer, A. (2019). The Good Data Manifesto.
Valencia, JC. & Restrepo, P. (2019). Truly Smart Cities. Buen Conocer, Digital Activism and Urban Agroecology in Colombia.