Sourcing the 100 Questions that Matter: State of Play, Lessons Learned, and Next Steps

Alexandra Shaw and Stefaan Verhulst
The 100 Questions Initiative was launched in late May 2019 by The GovLab with initial funding from Schmidt Futures. We seek to map the world’s 100 most pressing, high-impact questions that could be answered if relevant datasets were leveraged in a responsible manner. To that end, we developed a new participatory methodology for defining and prioritizing questions that can steer the creation of purpose-driven data collaboratives. We capture the 100 Questions by creating and sourcing curated communities of “bilinguals”: practitioners across fields who possess both domain knowledge and data science expertise.

Now, over one year later, the program has reached its halfway point with the launch, in partnership with the OECD, of a fifth domain: disinformation. This domain joins the four areas previously launched (migration, gender, air quality, and the future of work), with one further area to be launched soon: governance. Reaching this milestone gives us an opportunity to take stock of our progress, identify lessons from our first five domains, and consider ways to strengthen the methodology.
Our work thus far would not have been possible without the collaboration and support of our partners. We are grateful to Schmidt Futures, the International Organization for Migration (IOM), the Joint Research Centre (JRC) at the European Commission, Data2X at the United Nations Foundation, the World Resources Institute, the Bertelsmann Foundation, and the Open Government Unit at the Organization for Economic Cooperation and Development (OECD). In seeking to reflect on our progress so far and the path forward, we have also sought their inputs, and asked them to provide highlights of their experiences as well as areas for possible improvement.
In what follows, we briefly describe four key areas with regard to The 100 Questions Initiative: 1) the current “state of play”, 2) its defining characteristics, 3) lessons learned, and 4) areas where we can advance.
I. Bilingual Analysis
Over the past eighteen months, The 100 Questions Initiative has engaged 455 bilinguals across the five domains. Out of this total, 207 bilinguals are men and 248 bilinguals are women.
Binary Gender Breakdown of Bilinguals 
As for numbers of bilinguals by cohort, migration and air quality are on par with one another, with 76 and 79 bilinguals respectively. The gender and future of work domains consist of 92 and 94 bilinguals, respectively. And finally, disinformation has the highest count, with a total of 114 bilinguals!
Number of Bilinguals per Domain 

From an organizational standpoint, academia represents 40% of our total number of bilinguals across all cohorts, followed by non-profits at 18%, think tanks at about 15%, multilateral organizations at 12%, government at 8%, and finally business at about 6.5%.
Bilingual Organizational Breakdown 

In terms of geographic representation, the largest share of our bilinguals is based in North America (38%), followed by Europe and Central Asia (27%). Latin America, South Asia, and the rest of Asia and the South Pacific each constitute roughly 8% of the bilingual community. Finally, a total of 12% of our bilinguals come from the Middle East and North Africa and the rest of the African continent. We intend to improve our geographic distribution as we move forward; more information about our upcoming strategies is shared later in this piece.
Geographic Distribution of Bilinguals 

II. Question Analysis 
And lastly, our bilinguals have sourced a total of 474 questions across all five domains thus far! The migration domain supplied 80 questions, our gender cohort contributed 94 questions, air quality bilinguals contributed 90 questions, the future of work cohort released 89 questions, and finally, the disinformation domain produced 121 questions! And the top 10 questions for migration and gender, the two domains that have completed public voting, reached a total of 1,700 votes from the public!  
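The counts reported in the two sections above can be cross-checked with a few lines of Python. This is only an illustrative sketch: the numbers are copied from this post, while the dictionary names and the `per_domain_summary` helper are our own hypothetical naming and not part of any GovLab tooling.

```python
# Counts as reported in this post; no new data is introduced here.
bilinguals = {
    "migration": 76,
    "gender": 92,
    "air quality": 79,
    "future of work": 94,
    "disinformation": 114,
}
questions = {
    "migration": 80,
    "gender": 94,
    "air quality": 90,
    "future of work": 89,
    "disinformation": 121,
}

total_bilinguals = sum(bilinguals.values())
total_questions = sum(questions.values())


def per_domain_summary():
    """Return (domain, share of all bilinguals, questions per bilingual)."""
    rows = []
    for domain, count in bilinguals.items():
        share = count / total_bilinguals
        ratio = questions[domain] / count
        rows.append((domain, share, ratio))
    return rows


if __name__ == "__main__":
    print(f"bilinguals: {total_bilinguals}, questions: {total_questions}")
    for domain, share, ratio in per_domain_summary():
        print(f"{domain:15s} {share:6.1%} of bilinguals, "
              f"{ratio:.2f} questions per bilingual")
```

The per-domain counts sum to the 455 bilinguals and 474 questions stated above, and the gender split (207 men and 248 women) also adds up to 455.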

Below we offer a few defining characteristics of the initiative’s approach, as identified by the partners and tested over the last few months.
I. Building Participatory Engagement
The 100 Questions Initiative seeks to provide an innovative approach to community collaboration. In particular it identifies and engages bilinguals to source useful and impactful questions. The effort subsequently seeks input from the public through an interactive online voting platform to help determine the most important questions confronting society. 
Emily Courey Pryor, Executive Director of Data2X, our partner for the gender domain, praised this two-pronged approach to participation. “The 100 Questions Initiative is powerful because of the way it prioritizes community engagement,” she said. “By convening some of the sharpest ‘bilingual’ minds on gender and data around the world and inviting the general public to vote on the most pressing questions about gender equality that data can answer, we are working to show that by making the demand for gender data democratic and inclusive, we can create more targeted and sustained change.” 
II. Data Supply vs. Data Demand
Too often, we rely on data holders or “owners” to frame the issues that matter to society. The questions policymakers and researchers ask are determined not by what really matters but simply by what data is available. Projects derived from this supply-led approach often offer insights of limited impact. The 100 Questions Initiative seeks to address this shortcoming by shifting the focus from data to questions.
Partners in the effort agree with the value of this approach. As Jessica Seddon, Global Lead on Air Quality at the World Resources Institute, our partner for the air quality domain, noted: “The move from the supply side—i.e. ‘I have data’ – to focus on understanding demand—i.e. ‘I have a question’—is definitely a much-needed shift in perspectives on investment in data.”
III. Determining Questions that Matter
Although it’s important to have a set of questions, how do we start with the “right question” that can be transformative and can be answered by data? The 100 Questions Initiative sources questions that are specific to each domain and that can provide actionable insight. Each question must result in one of the following four types of data insights: improved situational awareness, better cause and effect analysis, more accurate predictions, or enhanced impact assessment.
These efforts to identify the questions that matter have resonated with our partners. According to Jeffrey Brown, Head of Technology Policy at the Bertelsmann Foundation, our domain partner on the future of work: “The project has already succeeded in its topline goal: making the case that, rather than jumping to solutions, we should first pour our energy into deciding whether or not we are asking the right questions.”
IV. COVID-19 & Addressing Complex Questions
While our initiative started by focusing on identifying questions for known problems, the COVID-19 pandemic has alerted us to the critical importance of the challenges that we are less prepared for. As with much of the world, we have had to adapt to new realities, and have sought to update our approach to take account of COVID-19 and, more generally, the existence of complex, unpredictable challenges.
Our partners have found The 100 Questions Initiative a useful lens for dealing with the pandemic and other complex issues. Alessandro Bellantoni, Head of the Open Government Unit at the OECD, our partner for the disinformation domain, argued, “dealing with the fallout of the COVID-19 ‘infodemic’, which brought new urgency to the international challenge of disinformation, has made clear how little we still need to understand about this problem. The 100 Questions model in this sense brought the useful recognition that the current ecosystem for research and our approach to global issues may need a re-think and re-scale.”
In dealing with the aftermath of the pandemic in the coming months and years, we need more rapid and cross-functional evidence-based approaches to public problems. The various effects of COVID-19 have informed the questions sourced in our recent domains, and we look forward to seeing how our approach can inform meaningful solutions to disease spread that make use of new and untapped assets.
V. The Value of Data Collaboratives 
In seeking to find the right questions to address such challenges, we have found it useful to move beyond traditional data sources and methods of data sharing. In particular, we have focused on the role that data collaboratives can play in addressing challenges that straddle multiple areas of research. Data collaboratives are an emerging structure that allows for sharing of data across organizations, and often across sectors. By pooling data and combining insights from multiple sources, they allow for innovations in the types of problems addressed, and in the way they are addressed.
These formations allow us to build on our insights and translate those insights into action. For instance, the next phase of the migration domain is advancing through the Big Data for Migration Alliance (BD4M), a joint effort from IOM, JRC at the European Commission and The GovLab. BD4M facilitates new forms of public-private partnerships among the business, policy, non-profit and scientific communities, supports peer-to-peer learning, shares good practices and builds capacity on migration data innovation.
Marzia Rango, research and data officer at IOM’s Global Migration Data Analysis Centre, reflected, “through this work, we now have a greater understanding of where to collectively focus our data innovation and collaboration efforts through the BD4M.” Michele Vespe, project manager at the European Commission’s Joint Research Centre, also added, “these joint efforts ensure we have a list of credible, validated, and compelling prompts to address through data science, and we’re excited that the BD4M provides a promising path forward for determining solutions.”
The GovLab has conducted significant research on the value and limitations of data collaboration, finding that inter-sectoral collaboration can both increase access to information (for example, the vast stores of data held by private companies) and unleash the potential of that information to serve the public good.
The 100 Questions Initiative has reached an important milestone and undoubtedly made much progress. Nonetheless, there are evidently areas for improvement and refinement, and we look forward to working on these areas with our partners in the coming months.
Among the areas our partners identified is the need to make our methodology more multi-dimensional. Seddon suggests that the new science of questions could be taken one step further, arguing that “the process needs to accommodate the different reasons how or why a question can be important. For example, is it important for fundamental understanding? Or for accountability, or for tracking equity? And so on.”  
Some partners have also suggested that the domains we have selected may be too expansive, and that the initiative could benefit by narrowing its fields of focus into more specific sub-topics. For instance, Jeffrey Brown notes that “the future of work is notoriously hard to define and map. After analyzing the submitted questions, a more forceful definition of what the future of work is (and isn’t) is needed.”  Seddon also echoes this sentiment, arguing that “we need to put boundaries on very broad domains so that the topic can be covered more effectively.”
We recognize that our domain areas are indeed rather broad, and this is largely by design. Our goal has been to show the depth and breadth of these complex challenges. In addition, our assumption was that by taking a wider perspective, we would also be able to include a more diverse and inclusive bilingual community.
Nonetheless, over the course of the first year, we have realized that we may not be fully equipped to map the full extent of issues in each domain area. Among other challenges we have confronted, it is difficult to reach underrepresented groups in faraway geographies, and especially to achieve gender parity or greater organizational diversity.  Seddon spoke to these challenges in her assessment of the air quality domain, stating that “this bilingual group cannot be said to represent the broad spectrum of air quality.”
We therefore recognize the need to find new ways of incentivizing involvement and making room for others whose perspectives may differ from our own. For example, moving forward we plan to leverage the wide networks and issue-specific knowledge of our domain partners, who are more keenly aware of which experts need to be invited to help increase diversity within our bilingual communities. Similarly, bilinguals themselves are often eager to recommend and provide introductions to experts who are difficult to get hold of or are outside of our usual networks. And our international advisory board has also offered to make necessary introductions for new domain partnerships in areas such as food security, global health, transportation, etc. We greatly appreciate all of these offers of assistance and will continue to extend our reach in order to make The 100 Questions Initiative a truly global effort.
Though we’ve made great strides in the first half of The 100 Questions Initiative, and are taking stock of lessons learned, we’re eager to advance in a number of innovative ways that our advisory board (see members below) and others have recommended.
Thus far, in addition to sourcing questions that result in one of four types of data insights (improved situational awareness, better cause and effect analysis, more accurate predictions, or enhanced impact assessment), we have applied criteria typical of those used to assess research questions. These four criteria are 1) practical and scientific impact, 2) novelty, 3) feasibility and 4) quality. Members of the advisory board pushed us to consider how we could take these specifications one step further and ensure that all of the questions sourced by bilinguals are actionable. We will aim to do this by developing prompts at both the question sourcing and prioritization stages, and by adjusting our evaluation criteria to emphasize the real-world impact potential of the questions.
We also plan to explore opportunities at the city and community level to develop a customized 100 Questions Initiative. Additionally, we intend to connect with regional networks to co-develop new approaches to problem definition and policy consultation, using the 100 Questions methodology. Some ideas include experimenting with new tools and methodologies, such as network analysis, in order to identify patterns in the voting process, streamline topic mapping and curation of experts, and tag and cluster questions. Another related suggestion is the possible use of collaborative platforms to enable deliberation and (quadratic) voting among the bilinguals.
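For readers unfamiliar with quadratic voting, the idea is that each voter receives a fixed budget of credits, and casting n votes for a single option costs n squared credits, so expressing a strong preference becomes progressively more expensive. The Python sketch below illustrates that general rule only. It is not a description of any platform the initiative uses or plans to use, and the question labels, the budget of 16 credits, and the function names are all hypothetical.

```python
def quadratic_cost(votes: int) -> int:
    """Credits needed to cast `votes` votes on a single question."""
    return votes * votes


def tally(ballots, budget):
    """Sum votes per question, rejecting ballots that exceed the budget.

    `ballots` is a list of dicts mapping a question label to the number
    of votes one bilingual casts for it (non-negative integers only in
    this simplified sketch).
    """
    totals = {}
    for ballot in ballots:
        if any(v < 0 for v in ballot.values()):
            raise ValueError("negative votes are not supported in this sketch")
        spent = sum(quadratic_cost(v) for v in ballot.values())
        if spent > budget:
            raise ValueError(f"ballot spends {spent} credits, budget is {budget}")
        for question, votes in ballot.items():
            totals[question] = totals.get(question, 0) + votes
    return totals


if __name__ == "__main__":
    # Hypothetical ballots: three voters, 16 credits each.
    ballots = [
        {"Q1": 3, "Q2": 2},  # costs 9 + 4 = 13 credits
        {"Q1": 1, "Q3": 3},  # costs 1 + 9 = 10 credits
        {"Q2": 4},           # costs 16 credits, the whole budget
    ]
    print(tally(ballots, budget=16))
```

In this example the third voter spends the entire budget to place four votes on a single question, while the first voter spreads a similar number of credits across two questions and casts five votes in total.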
Finally, we are eager to begin working on the second half of The 100 Questions Initiative. Among the important projects we plan for the second half is a collaboration with The Asia Foundation to launch our sixth domain: governance. While the number of democratic governments has doubled compared to five decades ago, several observers are raising concerns about the current trend of global democracy, calling it “in retreat,” “backsliding,” or “in a state of malaise.” At the same time, the rise of the internet, big data, and artificial intelligence that started in the early 21st century is creating positive effects for many aspects of society, including access to more information and participation in policymaking. However, such advances in technology also come with a number of downsides, including autocracy, surveillance and censorship. These new challenges facing democratic institutions globally call for new explorations by scholars and practitioners into the ways in which principles of democratic governance can thrive in a changing world.
If you have further suggestions for bilinguals, partners, and/or domains for The 100 Questions Initiative, or for how we can improve our methodology, please contact Stefaan Verhulst, Co-Founder and Chief R&D Officer of The GovLab, at or Alexandra Shaw, project lead for The 100 Questions Initiative at
Our advisory board:
The 100 Questions Initiative is supported by a global advisory board comprising data science and subject matter experts from the public, corporate and non-profit sectors. Members include Ciro Cattuto, scientific director of ISI Foundation; Gabriella Gómez-Mont, founder and former director at Laboratorio Para La Ciudad; Thomas Kalil, Chief Innovation Officer at Schmidt Futures; Molly Jackman, Content Leader of Product Data Science and Engineering at Netflix; Vivienne Ming, founder of Socos Labs; Wilfred Ndifon, director of research at AIMS Global Network; Denice Ross, fellow at Georgetown University’s Beeck Center for Social Impact and Innovation; and Matthew Salganik, professor of sociology at Princeton University.