What is data? The fuel for sustainable development

August, 2021

Jamiil Touré Ali
Researcher in the Data for Sustainable Development Unit 
j.toure@cepei.org

Data is the backbone of many investigations. With the surge of new technology, its ubiquity is undeniable, as companies, institutions, and governments become increasingly data-driven. In the Sustainable Development Goals (SDG) ecosystem, data is the key tool for tracking progress on SDG targets, because it is what measures their indicators. So, what is data? And what is data in the context of sustainable development?

Data is the oil that is so big and powerful

According to the OECD, data are characteristics or information, usually numerical, that are collected through observation. In a more technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. Furthermore, statistical data refers to data from a survey or administrative source used to produce statistics. What does this mean in practice?

Personal information, news, and facts are some of the many forms in which we encounter data. Like an oil stain spreading across fabric, they generate information that constantly grows and that proves useful in different ways depending on our objectives. For example, when a person is born, they are issued a birth certificate. When they get married, their marital status is updated from single to married. When they have children, this information is updated again, from married to married with children. This kind of footprint information exists for almost all of the 7.9 billion people on Earth.

The 2030 Agenda, through its 17 goals (SDGs), 169 targets, and 232 unique indicators, seeks to collect such data to guarantee a better and safer world for all. Which data exactly? For instance, SDG 3 target 3.1 aims to reduce the global maternal mortality ratio to less than 70 per 100,000 live births by 2030. Progress toward it is tracked by collecting information about maternal deaths and births attended by skilled health personnel, which is then converted into indicators 3.1.1 (maternal mortality ratio) and 3.1.2 (proportion of births attended by skilled health personnel). As a result, the personal information and medical records of about 7.9 billion people are the big data that we need to achieve these targets.
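To make the conversion from collected records into indicators concrete, here is a minimal sketch in Python. The function names and the sample counts are hypothetical and chosen only for illustration; the denominators are simplified, and official SDG metadata defines the exact estimation methods.

```python
# Illustrative sketch only: function names and sample counts are hypothetical.

TARGET_MMR = 70  # target 3.1: fewer than 70 maternal deaths per 100,000 live births


def maternal_mortality_ratio(maternal_deaths: int, live_births: int) -> float:
    """Indicator 3.1.1 (simplified): maternal deaths per 100,000 live births."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return maternal_deaths * 100_000 / live_births


def skilled_attendance_share(attended_births: int, total_births: int) -> float:
    """Indicator 3.1.2 (simplified): percentage of births attended by skilled personnel."""
    if total_births <= 0:
        raise ValueError("total_births must be positive")
    return attended_births * 100 / total_births


def meets_target_3_1(mmr: float) -> bool:
    """True when the ratio is below the 2030 threshold stated in target 3.1."""
    return mmr < TARGET_MMR


if __name__ == "__main__":
    # Hypothetical counts for one country and one year.
    mmr = maternal_mortality_ratio(maternal_deaths=120, live_births=200_000)
    share = skilled_attendance_share(attended_births=180_000, total_births=200_000)
    print(f"MMR: {mmr:.1f} per 100,000 live births, target met: {meets_target_3_1(mmr)}")
    print(f"Births attended by skilled personnel: {share:.1f}%")
```

With these made-up counts the ratio falls below the threshold of 70, which is how a single number derived from many individual records can signal whether a target is on track.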

Data collection, or data gathering, is essential to achieving the 17 SDGs of the 2030 Agenda. That is why the UN Statistical Commission has a data calendar covering the different national entities that provide data. However, acquiring information and transforming it into data is not sufficient to produce information that can serve decision-making. Data quality is what makes decision-making possible.

Data quality is at the heart of decision-making

A simple way to reduce the death rate could be to provide good-quality air and food for all. Air is among the needs humankind most often overlooks, yet we cannot survive more than about three minutes without oxygen. Moreover, air quality should be assessed so that we do not fall sick from breathing polluted air. Similarly, data is omnipresent in our lives, and its quality matters greatly if decisions are to be trusted. What is data quality, then? And how does it impact decision-making?

Data quality measures the extent to which information is:

accurate: it reflects reality.

consistent: it agrees with the other available information sources.

complete: it covers all possible levels of disaggregation, such as age, sex, state, and zone (urban or rural).

up to date: it is the most recent available.
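The four dimensions above can be expressed as simple automated checks. The following Python sketch is an assumption-laden illustration, not an official method: the `Observation` record, the thresholds, and the function names are all hypothetical, and accuracy is approximated by a plausibility range because true accuracy can only be verified against reality.

```python
# Illustrative sketch only: record layout, thresholds, and names are hypothetical.
import math
from dataclasses import dataclass
from typing import Optional, Tuple

REQUIRED_DISAGGREGATIONS = ("age", "sex", "state", "zone")


@dataclass
class Observation:
    value: float
    year: int
    age: Optional[str] = None
    sex: Optional[str] = None
    state: Optional[str] = None
    zone: Optional[str] = None  # "urban" or "rural"


def is_plausible(obs: Observation, plausible_range: Tuple[float, float]) -> bool:
    """Proxy for accuracy: the value lies inside an expected range."""
    low, high = plausible_range
    return low <= obs.value <= high


def is_consistent(obs: Observation, other_source_value: float, rel_tol: float = 0.05) -> bool:
    """The value agrees with a second source within a relative tolerance."""
    return math.isclose(obs.value, other_source_value, rel_tol=rel_tol)


def is_complete(obs: Observation) -> bool:
    """Every required disaggregation field is filled in."""
    return all(getattr(obs, name) is not None for name in REQUIRED_DISAGGREGATIONS)


def is_up_to_date(obs: Observation, current_year: int, max_age_years: int = 2) -> bool:
    """The reference year is recent enough."""
    return current_year - obs.year <= max_age_years


def quality_report(obs, other_source_value, current_year, plausible_range) -> dict:
    """Run all four checks and return one flag per dimension."""
    return {
        "accurate": is_plausible(obs, plausible_range),
        "consistent": is_consistent(obs, other_source_value),
        "complete": is_complete(obs),
        "up_to_date": is_up_to_date(obs, current_year),
    }


if __name__ == "__main__":
    obs = Observation(value=60.0, year=2020, age="15-49", sex="female",
                      state="Example State", zone="rural")
    print(quality_report(obs, other_source_value=62.0, current_year=2021,
                         plausible_range=(0.0, 2000.0)))
```

Returning one flag per dimension, rather than a single pass or fail result, keeps it visible which aspect of quality needs attention before the data is used for a decision.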

For the 17 SDGs and their indicators, information is produced by national statistical offices, and official statistics are used to report on progress toward the goals, which is why they must be of high quality. The United Nations National Quality Assurance Framework for Official Statistics [1] is a valuable tool for ensuring the trustworthiness of decisions based on data shared by national and international institutions working with the UN.

At the same time, the data revolution [2] presents an opportunity to work on the Sustainable Development Goals with statistics flowing in from multiple sources. Nevertheless, investigations based on those inputs should reflect the true story so that the resulting decisions can be trusted. Data quality is to decision-making what air quality is to health.

What is data for Cepei?

Considering the two views given above on what data is, Cepei firmly believes that data is indeed the fuel for sustainable development. It should be collected and published in line with quality checks so that policymakers can make better decisions that advance the implementation of the 2030 Agenda. As one of the leading organizations following up on the 2030 Agenda in the Latin America and the Caribbean (LAC) region, Cepei has engaged in many activities to promote the creation and dissemination of high-quality data that supports decision-making. Below are some of our contributions:

DataRepublica-Conecta | An online data catalog linked to the SDGs in seven Latin American and Caribbean countries, which includes each data source's description and metadata. It is available to users seeking relevant information to measure progress in the implementation of the 2030 Agenda.

Peer-to-peer exchange on water information systems: Mexico - Paraguay | A virtual exchange event to present Mexico’s experience in structuring and consolidating the country’s National Water Information System (SINA) through three elements: governance, indicator system, and architecture.

Progress and challenges in measuring the SDGs: GPSDD Latam partners meeting | A working meeting, organized by Cepei and the Global Partnership, which brought together participants from five Latin American countries, represented by their National Statistics Offices, to share their experiences and challenges in producing disaggregated and timely statistics and to highlight the partnerships and projects they are working on.

Data Reconciliation Process, Standards, and Lessons | A case study on the integration of subnational and private sector data in Colombia’s national statistics. Produced by Cepei and SDSN TReNDS, with the support of the National Administrative Department of Statistics (DANE) of Colombia and the Chamber of Commerce of Bogota.

Concluding remarks

Data is the oil of the 21st century. It is powerful enough to help us accomplish the SDGs, because progress is tracked with the data collected to measure their indicators. However, data is not useful for decision-making unless it undergoes quality checks. The SDG Indicators Global Database shows the significance of data in the context of sustainability: this is the information needed to help us work together towards the attainment of the SDGs and build a world where no one is left behind. Cepei believes that data is the fuel for sustainable development and that it needs to be collected and published along with quality checks, such as those provided by DataRepublica-Conecta, a hub of high-quality SDG data in prioritized LAC countries for sustainable development advocates eager to follow up on the implementation progress of the SDGs.

[1] https://unstats.un.org/unsd/methodology/dataquality/un-nqaf-manual/

[2] https://www.undatarevolution.org/report/