From Census to Cloud: Platformed Regimes of Quantification in Official Statistics

I met Oscar D’Alva, a researcher at the Brazilian Institute of Geography and Statistics (IBGE), at a bar in Botafogo during the pre-conference of the Association of Internet Researchers (AoIR), held in Rio de Janeiro in mid-October 2025. Oscar’s PhD thesis, entitled “Estatísticas oficiais e capitalismo de plataforma: a transição para um regime de dataficação no Brasil” (Official Statistics and Platform Capitalism: The Transition to a Datafication Regime in Brazil), has received several major social science awards in the region, including prizes from CAPES, Brazil’s federal agency for graduate education and research, from ANPOCS, the national association of graduate studies and research in the social sciences, and from AoIR. When Oscar realized that the conference was (modestly) funded by Microsoft, as it has been for the past twenty years, he decided to turn down the AoIR prize, sparking a debate inside a conference held for the first time in the Global South and whose main theme was anticolonial perspectives on digital sovereignty. The interview took place at a dim, wood-lined bar behind Praia do Flamengo.
What is your trajectory and how did you get interested in statistics and platform capitalism? 
I have been working at the Brazilian Institute of Geography and Statistics (Instituto Brasileiro de Geografia e Estatística, IBGE) for the past fifteen years. Official statistics is a very traditional exercise: we use structured methodologies and established research practices, and we carry a public service ethos. After a decade or so in that world, I began to notice how new technologies and new data sources were creeping into our routines: mobile phone data, satellite imagery, data from connected sensors… I wanted to understand whether these new sources and methods truly demanded a transformation in what we do.
That curiosity took me to a PhD in sociology on official statistics and platform capitalism. It let  me examine the meeting point between a long-standing public field and a newer, corporate-driven  data ecosystem. My earlier path had taken me somewhere else. I did a master’s in geography on  carnaúba palm extractivism in Ceará, where I am from, after years working with social movements  and an NGO focused on rural development. It sounds far from statistics, but it taught me to watch  how states and markets refashion old practices under new regimes. I studied an old activity, palm  extractivism, which goes back to the colonial period, and looked at how the state tried to manage  it through policy and how it fit into capitalist development and waves of change. With official  statistics you can tell a similar story. It is an old activity tied to the emergence and centralization of  modern states. What interests me is how the production of information changes over time and  whether we are entering a new regime of quantification. 
What is your thesis about, and what drew you to that topic?
The research is about official statistics in platform capitalism. Empirically, I focused on IBGE and  on the implementation in Brazil of a project initiated at the UN Statistical Commission, the UN  Big Data project. The plan envisioned four regional hubs in the Global South: Brazil for Latin  America; Rwanda for Africa; the United Arab Emirates for the Middle East; Indonesia for Asia.  The goal was to ‘modernize’ official statistics by introducing big data and AI into national statistical  offices. The hubs gave me an entry point into the present, where old methods and new  infrastructures overlap and collide.
To make sense of this I used the work of Pierre Bourdieu and his concepts of field, habitus, and capital. My hypothesis is that big data and machine learning created an intersection, often a collision, between two fields of practice. On one side is the statistical field, historically linked to nation states, with its public service orientation, the census, the autonomization of the state, and the work of centralizing informational capital. It also draws on the Foucauldian notion of biopolitics and the importance of statistics for the organization of the state. But this is a field with its own codes, structures, and ethos. The ‘state statistician’ is the emblematic figure embodied in this field, and it was oriented by a particular epistemology of statistics: the frequentist one.
On the other side is what I call the algorithmic field, linked primarily to private corporations, especially big tech but not only them, because it spans the whole chain of activities: data extraction, management, and analysis. It is where data science appears as a hybrid of statistics and computer science. These fields come from different genealogies and values. Where they overlap, we find a transition toward a new regime of quantification, which I call the datafication regime.
What do you mean by frequentist, and how does Bayesian thinking fit into this? 
In statistics there are mainly two branches: frequentist and Bayesian. It helps to consider how each field treats probability and evidence. Frequentist statistics, dominant in the official statistics of the 20th century, understands probability as something real and objective: the long-run frequency of events in a well-defined population. Its strength is structured data, designed samples, and clear frames. Bayesian reasoning treats probability as a degree of belief that is updated with new evidence. 
In Bayesian statistics there is a kind of inversion. You still talk about events that you observe, but you don’t know the cause, so you reason backwards from the observed data. It is often more comfortable with prediction, unstructured data, and contexts where the population is not well defined in advance. When you have new pieces of information you can add them. Those are conditions that are common in big data and AI. The growing use of model-based inference with data that were not designed for representation is a key part of the intersection I am describing. Bayesian ideas are very important to this kind of method.
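To make the contrast concrete, here is a minimal sketch in Python, my own illustration with invented numbers rather than anything from the thesis: a frequentist point estimate of a proportion from a designed sample, alongside a Bayesian Beta-Binomial update of a belief about the same proportion as batches of evidence arrive.

```python
import math

# Illustration only (hypothetical numbers): estimating the share of households
# with internet access, first in a frequentist way, then in a Bayesian way.

# Frequentist view: probability as long-run frequency in a well-defined population.
# With a designed sample of n households, the estimate is the sample proportion,
# and uncertainty is a statement about the sampling procedure (confidence interval).
n, successes = 1000, 820
freq_estimate = successes / n
se = math.sqrt(freq_estimate * (1 - freq_estimate) / n)
freq_ci = (freq_estimate - 1.96 * se, freq_estimate + 1.96 * se)

# Bayesian view: probability as a degree of belief, updated as evidence arrives.
# A Beta(a, b) prior over the proportion is updated by each new batch of data.
a, b = 1.0, 1.0  # uniform prior: no strong belief before seeing data
for batch_successes, batch_failures in [(400, 100), (250, 50), (170, 30)]:
    a += batch_successes  # conjugate update: successes are added to a
    b += batch_failures   # failures are added to b
posterior_mean = a / (a + b)

print(f"frequentist estimate: {freq_estimate:.3f}, "
      f"95% CI ({freq_ci[0]:.3f}, {freq_ci[1]:.3f})")
print(f"Bayesian posterior mean after sequential updates: {posterior_mean:.3f}")
```

The point of the sketch is only the difference in stance: the frequentist interval describes a repeatable sampling design, while the Bayesian posterior is a belief that keeps absorbing new, possibly unstructured, evidence.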
You develop a historical account of quantification regimes. Could you outline it? 
Before the current datafication regime, there were earlier statistical regimes. Building on Alain Desrosières’s work in the sociology of quantification, I argue that as the state’s actions, and the ideas and theories about the economy and society that inspire them, change over time, so do the statistical tools it uses. I also draw on the French Regulation School, which points out that capitalism has its periods and its crises, and that crises are good moments for understanding changes in patterns of capital accumulation.
In periods of crisis, the statistical regime is always asked to give answers. I trace five regimes. First, a pre-statist regime of accounting, where numbers serve monarchical finance and remain largely secret. With the rise of nation states and the bourgeois revolutions of the nineteenth century, we enter the enumeration regime, when censuses and descriptive statistics make populations legible within territories. Since political tools are always combined with epistemological tools, the census here is the tool, and the idea is to construct a Bernoulli urn equal to the state itself.
The grid through which to see reality.
Yes, but you have to enumerate everything; we are talking about descriptive, not yet inferential, statistics, which only arrives with the next regime. After the 1929 crash, in the 1930s and 1940s, inferential statistics and sampling theory became central as states tried to manage economies and societies at scale. They needed to intervene quickly. I call this the precision regime, consolidated after World War II alongside the creation of the UN and, two years later, of the Statistical Commission, which gathers all the national representatives. You also have the modernization of statistical offices, the use of computers, and national accounts. The system of official statistics was built in this period, which is the period of managed capitalism and Keynesian macroeconomics.
The crises of the 1970s and 1980s, the oil shocks, the petrodollar, and so on, together with the emergence of neoliberalism, create, as I understand it, a new regime shaped by the influence of financialization on all aspects of life. With finance as a driver of capitalism, you have this idea that everything needs to be quantified. There is a new politics of indicators that are used for everything.
Up to the OECD, which tells poor countries which indicators they need to use.
Yes, and the indicators that go into people’s lives and work. This is what I call the commensuration regime. The 2030 Agenda and Sustainable Development Goals (SDGs) are the mirror of this commensuration regime: the idea that you can solve big political problems by quantifying everything. We have measures for every single social problem. It’s interesting that this technocratic ideal can be traced back to the political creation of the statistical field in the mid-19th century by Adolphe Quetelet, who is the father of this kind of statistics. With the international statistical congresses beginning in the 1850s, the idea was very similar to the SDGs: that we can unite all the statisticians of the world and create technocratic government by measuring everything. It was a big failure. Finally, after the 2008–2011 crisis, we see the transition to a datafication regime: large-scale data infrastructures, AI/ML methods, and platformization. The UN Big Data Project, presented as a “data revolution for sustainable development”, is an important player in its diffusion.
And how did the UN Big Data Project play out in practice? 
After 2008 the UN became more dependent on private funding and corporate money. Big tech, especially Microsoft but also Google and Meta, gained influence in statistical initiatives. At the UN, the data revolution narrative argued that national statistical offices were not prepared to measure the SDGs and therefore needed help from private corporations in order to exploit the full potential of big data. The platformization agenda proposes to bring state statisticians together with corporate data scientists, often through public–private partnerships (PPPs), aiming to create new data markets in a field that has historically managed data as a public good. But PPPs often produce conflicts of interest in activities that are based on trust.
The UN has created a platform controlled by an NGO, the Global Partnership for Sustainable Development Data, based in the USA and linked to the UN Foundation established by Ted Turner, the founder of CNN; it comprises many big corporations that both make donations and lobby within the UN. The Global Partnership is the manager of the platform that comprises the four regional hubs in the Global South.
Can it get more colonial than that?  
It actually can. I did an interview with the person responsible for this idea. He was saying that they  use cutting-edge tools and experts from the Global North and from big tech, within a federated  cloud so agencies in the Global South can upload data and benefit from analysis. In return,  corporations and Northern institutions build APIs, products, and legitimacy for institutions around  that data. It is an alluring vision, but it produces conflicts of interest in a domain where public trust  is foundational. 
Are you pointing out a case of corporate capture? 
They are trying to do that, but the process is only beginning. This also happened in the first wave of neoliberalism, when companies tried it but there were reactions. But I see two main obstacles to this transition to the datafication regime. The first is ontological: what data is. In official statistics, data collected from people and firms to produce public information is treated as a public good. In the algorithmic field, data is a commodity to be extracted, traded, and optimized for private value, whether advertising, product development, or market advantage. When these ontologies collide, tensions and practical problems multiply.
The second is epistemological: how knowledge is produced. Official statistics proceeds deductively, from societal questions to carefully designed data collections that represent populations. Much of data science proceeds inductively, feeding models with large, often non-representative data, inferring patterns, and reconstructing representation after the fact through modeling. When statisticians and data scientists work side by side, they bring different logics, methods, and purposes, and the friction is political as well as technical.
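As a rough illustration of the two logics, here is a toy sketch with invented numbers, my own gloss rather than an example from the thesis: a design-based estimate from a probability sample whose weights come from the sampling design, next to a model-based estimate that takes a skewed convenience sample and reconstructs representation afterwards by post-stratifying to known population shares.

```python
# Toy sketch (invented numbers): two logics of producing a population estimate.
# Design-based (deductive): a probability sample is drawn so that its sampling
# weights alone recover the population quantity.
# Model-based (inductive): a non-representative sample is modeled, and
# representation is rebuilt afterwards using known population shares.

# Known population shares of two strata, e.g. urban and rural households.
population_shares = {"urban": 0.6, "rural": 0.4}

# --- Design-based estimation -------------------------------------------------
# Stratified probability sample: each unit carries a weight = 1 / inclusion probability.
design_sample = [
    # (stratum, observed value, sampling weight)
    ("urban", 12.0, 150.0), ("urban", 10.0, 150.0),
    ("rural", 6.0, 100.0),  ("rural", 8.0, 100.0),
]
weighted_sum = sum(w * y for _, y, w in design_sample)
total_weight = sum(w for _, _, w in design_sample)
design_estimate = weighted_sum / total_weight  # weighted (Hájek-style) mean

# --- Model-based estimation --------------------------------------------------
# Convenience sample, heavily skewed toward urban respondents.
convenience_sample = [
    ("urban", 12.0), ("urban", 11.0), ("urban", 10.0), ("urban", 13.0),
    ("rural", 7.0),
]
# "Model": estimate a mean per stratum from whatever data we have...
group_means = {}
for stratum in population_shares:
    values = [y for s, y in convenience_sample if s == stratum]
    group_means[stratum] = sum(values) / len(values)
# ...then reconstruct representation by weighting with known population shares.
model_estimate = sum(population_shares[s] * group_means[s] for s in population_shares)

print(f"design-based estimate: {design_estimate:.2f}")
print(f"model-based (post-stratified) estimate: {model_estimate:.2f}")
```

The design-based number is justified by the sampling procedure itself, while the model-based one depends on the model and on the auxiliary information used to rebuild representativeness, which is exactly where the two fields’ assumptions start to diverge.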
Are there counter-movements to this commodification? 
Following Karl Polanyi, I would say that commodification tends to trigger protective counter-movements. In Europe, early enthusiasm for public–private experiments with big data around 2014–2017 gave way to the realization that such pilots go nowhere without access to what is now called “privately held data” rather than big data. The European Statistical System has since pushed for a regulatory route: recognizing certain privately held data as public-interest data to which statistical authorities can have access under clear legal bases and with strong safeguards. The Data Act has a provision that goes in this direction. It’s imperfect and contested, but it signals a path that does not simply subordinate public statistics to market logic.
In Brazil, I also found internal resistance within IBGE: a preference for regulatory solutions—say,  to access mobile phone data—and for developing in-house capabilities, rather than relying  primarily on corporate partnerships. 
This introduces the topic of statistical sovereignty. How does it relate to digital sovereignty, especially in Latin  America? 
I am developing the concept now. The thesis touches on sovereignty as state capacity to produce public information, but the concept itself needs elaboration. In Latin America, with our histories of state-building and national development, the public character of official information carries a different weight than in many European debates, which often emphasize limiting state power in the name of civil liberties. My starting point is people’s sovereignty. Citizens are squeezed between governmental power and market power. Statistical information should be a right: high-quality, reliable public information about the economy, society, and the impacts of platforms on work and life, so that people can understand and govern their collective existence. If we lose a trusted public provider, or if access becomes priced, fragmented, or corporately captured, people’s sovereignty is diminished. Statistical sovereignty, understood as a right to public knowledge about our collective life, is a necessary strand within broader digital sovereignty.
Now, the debate on digital sovereignty has become much more concrete. It is clearer where corporations stand: they align with the US government, which can use this data infrastructure as a tool of power. The server can be shut down, and you don’t have access to your data anymore. This is a huge thing. With Trump’s government, it became crucial that governments build their own digital infrastructure.
But, for instance, what we are seeing in Brazil is that the whole sovereignty discourse from the government is being captured by the corporations again. We have a “sovereignty cloud” project with SERPRO, the federal data processing service. It is a federated cloud, very similar to the UN project, aggregating providers such as AWS, Azure, Google, and Huawei. The new president of IBGE is signing a contract to move large datasets (census microdata, survey files, economic statistics) from secure in-house data centers into this “sovereignty cloud”. If public data is migrated into big tech infrastructures under a sovereignty label, we should acknowledge the contradiction and debate it.
So when we say digital sovereignty, what do we mean? Keeping data within a certain territory? Legal guarantees of safety and access? Public infrastructures? Trustworthy use of international infrastructure under strong regulation? The answers vary and must be clarified; otherwise sovereignty becomes a slogan that anyone can appropriate.
In Europe we tried something similar to a sovereign cloud with Gaia-X, which was quite a failure. Finally, what do you think are practical steps a country can take in the next five years?
We have already missed opportunities. In Brazil, public universities had their own data centers and moved rapidly to corporate clouds, accelerated by the pandemic, without a coordinated public strategy. That solved short-term constraints but created long-term dependency. There was an opportunity to build a network of public services, but you need incentives. If you leave the decision to a single IT manager at a university, it is easier to go with Google.
That is also because platforms thrive by offering easy solutions where capitalism has already created structural  problems and a lack of resources.  
Yes, and in this case, we have lost opportunities. Now it is also happening in health. We should  have a public policy that at least understands the strategic statecraft areas that are important to  protect: defense, official statistics, health, education. I think that is why I decided not to receive the  AoIR prize: the goal of big tech is to anticipate critical thinking, and we need to work to avoid that.