National statistics offices (NSOs) are uniquely placed to promote and deliver on open data principles.
Critical actors in national data collection and dissemination efforts, NSOs are required to be responsive to the information and data demands of both policy makers and the public.
Globally, there have been several context-specific initiatives that have achieved openness in their local statistical systems. In Asia, Georgia, Mongolia and the Philippines have been the front-runners in developing more open statistics generation processes.
Making NSOs open is fraught with political, economic and technical challenges. As a result, long-term successes are still hard to come by.
The main responsibilities of national statistics offices (NSOs) are to uphold data collection standards, implement analysis methods, and disseminate data publicly. NSOs also host rich demographic data to create a record of social and economic conditions via censuses, surveys, and administrative records1. These datasets are crucial to tracking the SDGs, for example. While NSOs are often underfunded and under-resourced, their existing responsibilities could embody many key aspects of open government data practices.
At the same time, merely qualifying data as “public” does not guarantee compliance with open data principles. Embedding open data structures in an NSO involves practising transparent data collection processes, opening up collected data, and ensuring that data is non-proprietary, machine-readable, and reusable2. More specifically, NSOs should assess their strengths and weaknesses, prepare an implementation plan, and develop a process to monitor and evaluate progress3. Well-executed implementation plans can enable citizen engagement with NSOs, thereby granting citizens greater agency over the data being generated about them.
Because NSOs are public bodies and the data they produce is non-rivalrous and non-excludable, opening up data is economically efficient. As the data ecosystem expands, NSOs are expected to take on new roles and incorporate both private and public actors in their operations4. While the contexts in which NSOs operate vary, the political nature of statistical data is indisputable. Therefore, NSOs must actively engage with their clients, seek political support in the form of funding and open data policies, and redevelop data infrastructures and statistical capacity. Faced with slow progress and a lack of investment, open data advocates argue that open data principles pave the way for NSOs to innovate in their data practices and demonstrate their relevance to policy makers and the public5. With institutionalised support and shifting legal frameworks, NSO data has the potential to increasingly inform decision-making processes, thereby demonstrating the indispensability of open NSO data6.
Colonial legacies underpin the power struggles associated with NSO data collection today.
In India, British colonisers were eager to categorise the Indian population in order to tax widely and police antecedents7. The population was ordered according to an arbitrary social stratification, which had real impacts on the economic and social well-being of individual citizens. To legitimise this large-scale reorganisation, “scientific” methods were sought to standardise a formal process for national surveys8. The data collected was considered to be scientific knowledge acquired through scientific means; however, its “objectivity” was swayed by the subjectivity of political motivations and goals.
In post-colonial India, bodies responsible for national statistics have undergone several changes while still retaining colonial legacies. In 1950, the government launched the Indian National Sample Survey under P.C. Mahalanobis. Specifications of national surveys regarding employment and consumer spending were initiated throughout the remainder of the century. In 2005, the National Statistics Commission (NSC) was created to govern the National Sample Survey Organisation (NSSO), founded in 1970 as an autonomous body9.
In the past year, the resignation of NSC leaders, delays in releasing reports, and mergers of national statistical bodies have raised questions around the sustainability of the NSSO as an autonomous body10. Citing supposed data quality issues to justify withholding data, the government has limited the information with which legislative actions can be appraised. Simultaneously, no efforts to strengthen the NSSO have been initiated. Thus, political actions and motivations still play a large role in the ability of national bodies to collect information, the methods used to collect it, and, consequently, the very nature of the information produced.
Even so, we cannot discount the potential of bodies such as the Ministry of Statistics and Programme Implementation (MoSPI) to regulate and monitor quality statistical practices11. As the name suggests, the body simultaneously monitors best practices and implements these strategies in its own work. In collaboration with civil society, autonomous and open government bodies can allow citizens to better communicate whether or not their needs are being recognised by national statistics.
The open national statistics movement is marked by the launch of tools such as Data.gov in 2009 in the US. From 2013 onwards, several G8 and UN reports and events were geared towards addressing the significance of NSOs in open data agendas12.
Since 2013, several tools measuring the success of NSOs have been developed. Some of the most prominent include the Open Data Inventory (ODIN), which measures the openness of data produced by national statistical systems, and the Open Data Barometer (ODB) and the Global Open Data Index (GODI), which measure the openness of non-statistical datasets13. Additionally, initiatives for assessing best practices have been devised. Significant ones include the Open Data Readiness Assessment (ODRA), Common Assessment Framework, Statistical Data and Metadata Exchange, International Household Survey Network, and the Data Documentation Initiative14. These tools and initiatives work together to measure open data readiness while also compiling a set of resources to implement best practices and subsequently evaluate their performance.
Internationally, the Data Documentation Initiative and the Statistical Data and Metadata Exchange provide standards for conducting censuses and surveys and aggregating time-sensitive statistics15. Many countries have contextualised these standards to bolster their national data collection practices. In Australia, Austria, and Canada, the national initiatives are more focused on forging new partnerships, facilitating dialogue, and prioritising outreach16. Initiatives in Kenya, Namibia, and Sierra Leone have identified the critical need for open national statistics data and have established new government-run portals, increased the availability of open-format data, and strengthened the role of NSOs with increased representation in open data activities17.
Globally, Rwanda and Mexico18 set the stage for open NSOs. Rwanda has the highest ODIN score among low-income countries, while Mexico has managed to foster a strong culture of support for open data through sustained political support. According to their specific contexts, data is made available to work towards improving social conditions, whether this be land data or data on educational institutions19.
In Asia, Georgia, Mongolia, and the Philippines are exemplary in their commitments to openness and assessment of NSO strengths and weaknesses20. Georgia is focusing on the use and dissemination of statistical data, Mongolia is discussing the creation of a centralised database to make public data usable and accessible, and the Philippines is also discussing the creation of an open data portal with open metadata formats21. The Department of Statistics in Malaysia has developed an official open data portal as a “one-service” centre for citizens to access official datasets across all government sectors22.
When dealing with widely encompassing data portals, it is important to question who the totality of “national” statistics applies to. The Open Development Mekong organisation recently wrote a blog post outlining the significance of Indigenous Data Sovereignty (IDS). More specifically, applying the IDS framework to open data initiatives allows for “open data to be used as a tool for transparency and that [can] uphold equal rights for all”23. Upholding equal rights involves controlling the governance and disclosure of data, maintaining privacy and seeking consent, and allowing for self-determination of how the data is collected, interpreted, and implemented within policy24. According to Open Data Watch, under-counting and under-representation of indigenous communities are widespread across the globe25.
The Open Data Inventory (ODIN) assesses open data initiatives and data publication by how well they are disaggregated. When it comes to reporting on Indigenous demographics, India ranks 17th, the Philippines 20th, and Malaysia 98th26. Certainly, Asian countries whose populations are largely heterogeneous, relative to the global North, must seriously consider how indigenous data sovereignty can play a greater role in the collection, publication, and use of national official statistics27. Recognising this is especially important for parliamentarians in the global South who aim to disconnect from and disband colonial legacies that are so deeply and structurally rooted in our histories. If we acknowledge the populations that were commonly forgotten, perhaps we can visibilise realities within different Asian countries whose invisibilisation has been normalised.
To address the challenges listed below, organisations such as Open Data Watch produce Open Data Inventories annually to assess open practices specifically in developing countries, with a focus on the statistical systems in place28.
As the above cases make abundantly clear, solutions must be context-specific, and the challenges faced by NSOs likewise vary according to context. Despite a few successful cases and the proliferation of monitoring tools, open data indexes have not shown significant improvement in the last few years29. NSOs struggle with setting confidentiality protocols and managing data in a way that balances openness against public safety, acquiring and correctly employing technological innovations in data collection and publication, mitigating the risk of the misuse and de-contextualisation of data, and taking a structured approach to openness30.
Disaggregated readable data: Disaggregating data is difficult not only when data formats have been fixed and standardised for several years, but also when privacy and legal concerns about data confidentiality are raised. Making data available for reuse requires a robust anonymisation infrastructure, especially when the data is obtained from individuals or minority groups31. Furthermore, making data accessible requires developing a set of criteria to justify withholding some data and outlining the timeline for its future release. From a legal and policy standpoint, NSOs must distinguish between publicly available data and open data, as the latter can be republished while the former cannot.
Tensions between the central government and NSOs: Much of the tension between government bodies and NSOs is financial. As both parties are aware, large-scale data collection efforts come at a high cost, which is often borne by domestic taxpayers or covered by aid money. NSOs continue to release data as outputs in consistent fixed formats; changes to the data analysis or format are often available only for a fee. This pay structure acts as a barrier for the majority of potential users32. By contrast, implementing open data standards allows data to be reused, which can greatly amplify its value.
Additionally, the manifestations of power asymmetries in census and population data, which undergird national statistics to this day, remind users and citizens of the political nature of data. The operations of NSOs and the nature of the knowledge and data they produce can differ greatly depending on the political context. In climates where censorship, overvaluation, or misinformation are encouraged for political gain, NSOs, which are reliant on central government bodies, suffer.
SDGs and barriers to development: In addition to implementing open data principles, NSOs are responsible for the bulk of monitoring and for determining the extent of progress made towards SDG targets and goals. Although these bodies exist to create data, and have existed for many years, they have been largely absent from the larger open data movement and have not engaged the user ecosystem in a way that increases the need for their technical assistance and management skills. The generally meagre profile and influence of NSOs have adversely affected data quality, the efficiency of data collection methods, and the price of sharing data. Within this context, the SDGs, which already do not capture all necessary disaggregated data subsets, may not be monitored sufficiently where resources are scarce. Inaccurate data used to represent existing inequalities may further exacerbate them and continue to misrepresent realities on the ground.
Implementation and resources: Considering the challenges mentioned above, implementing open data necessitates major shifts in the operations and governance of NSOs. Participating in the open data movement requires a heavy initial investment, as resources, personnel, and data collection methods need to be overhauled. These large-scale reforms also include integrating tools and new technical expertise to support open data principles. For this shift to take hold, NSOs must position themselves as strong players in the open data movement and collaborate with multilateral, national, and local initiatives in order to enhance their own capabilities.