How to transform data into high-impact visualizations? Best practices in data visualization

September, 2021

Jaime Gallego 
Data Researcher
j.gallego@cepei.org

In any decision-making process, information is the main tool to identify what problems to address and the best strategies to implement, regardless of whether we are dealing with a company or a government institution in charge of developing public policies.

Data visualization is a valuable tool for presenting information through graphic formats and visual content to aid understanding. It also makes it easier to interpret data and convey complex ideas or concepts, because patterns, trends, and correlations often go unnoticed in raw numerical outputs, which leaves much of the data's value unused.

Although visualizations provide great benefits, creating them can be challenging. Here are five elements to keep in mind for better visualizations:

1. Define what you want to visualize

Understanding the type of data will help you define a clear goal for your visualization. Identify what you want to visualize, what is possible from the available information, and your target audience or main user for the visualization. This will also help you define the level of complexity of the graphs and the language with which you address the topics. For example, an audience made up of experts in data and statistics will not need a detailed explanation of the results produced by the use of graphs like box plots, while less experienced users might need a broader context on how the graph works and the usefulness of making outliers visible.
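For readers who are less familiar with box plots, the outlier rule behind them can be shown in a few lines. The sketch below assumes the common convention of flagging values that lie more than 1.5 times the interquartile range (IQR) beyond the quartiles; the sample values are hypothetical, and different tools may compute quartiles slightly differently.

```python
from statistics import quantiles


def box_plot_outliers(values):
    """Return the values a box plot would typically draw as outliers."""
    # Quartiles split the sorted data into four equal parts.
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1  # interquartile range: the height of the "box"
    lower = q1 - 1.5 * iqr  # lower fence
    upper = q3 + 1.5 * iqr  # upper fence
    return [v for v in values if v < lower or v > upper]


# Hypothetical sample values, for illustration only.
sample = [12, 14, 15, 15, 16, 17, 18, 19, 20, 48]
outliers = box_plot_outliers(sample)
print(outliers)
```

A short explanation like this, placed next to the graph, may be all a non-expert audience needs to read the points drawn outside the whiskers.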

2. Ensure data quality

To create a good data visualization, it is essential to start with clean information, that is, with data that have already undergone a validation and cleaning process in which possible errors, mainly inconsistent values, are corrected.

During the cleaning process, relevant aspects of quality assurance are reviewed: the source of the data, especially when it comes from unofficial sources; how current the data are; the identification of inaccurate, duplicate, or incomplete values; the correspondence of the values (that is, whether the values or units of measurement are coherent with the variables defined for the data set); and the consistency of the format, to ensure uniformity with the structure of the data.

It is highly advisable to use specialized tools (such as R or Python) that facilitate the cleaning process and save time, particularly if you are working with large volumes of data or combining different data sources.
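As an illustration, some of these checks can be sketched in plain Python. The field names, the expected unit, and the valid range below are hypothetical assumptions chosen for the example, not part of any real data set; in practice a library such as pandas would likely handle larger volumes.

```python
from collections import Counter

# Hypothetical schema and expectations, for illustration only.
REQUIRED_FIELDS = ("country", "year", "value", "unit")
EXPECTED_UNIT = "percent"


def quality_report(rows):
    """Count common quality problems in a list of dict records."""
    issues = Counter()
    seen = set()
    for row in rows:
        # Duplicate check: the same country and year should appear once.
        key = (row.get("country"), row.get("year"))
        if key in seen:
            issues["duplicate"] += 1
        seen.add(key)
        # Completeness check: every required field must have a value.
        if any(row.get(field) in (None, "") for field in REQUIRED_FIELDS):
            issues["incomplete"] += 1
            continue
        # Correspondence check: the unit must match the variable definition.
        if row["unit"] != EXPECTED_UNIT:
            issues["unit_mismatch"] += 1
        # Accuracy check: a percentage outside 0-100 is not plausible.
        if not 0 <= row["value"] <= 100:
            issues["out_of_range"] += 1
    return dict(issues)


rows = [
    {"country": "CO", "year": 2020, "value": 42.5, "unit": "percent"},
    {"country": "CO", "year": 2020, "value": 42.5, "unit": "percent"},
    {"country": "PE", "year": 2020, "value": None, "unit": "percent"},
    {"country": "CL", "year": 2020, "value": 0.31, "unit": "ratio"},
    {"country": "MX", "year": 2020, "value": 140.0, "unit": "percent"},
]
report = quality_report(rows)
print(report)
```

A report like this only counts problems. Deciding whether to correct, drop, or flag each record still depends on the source and purpose of the data.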

The relevance of data quality for the Sustainable Development Goals (SDGs)

The document “Contexto de la calidad de los datos en la medición de los ODS” (Context of Data Quality in the Measurement of SDGs, available in Spanish), created by Cepei (2019), includes recommendations aimed at guaranteeing minimum quality standards in the production, analysis, and communication of data for the measurement of the SDGs. One of the main challenges addressed by the research is the development of interoperability schemes in both official and non-official data-generating institutions to promote synergies in the processes of information transfer and exchange.

3. Analyze different types of charts

Not every chart type suits the data you want to visualize, so it is necessary to explore the different types of visualization available and choose the format that best fits your needs.

For example, for plotting time series data, bar or line graphs are a great choice. If you want to compare magnitudes, bar charts are usually the clearest option, and pie charts work when you are showing a few parts of a whole. If you want to analyze the relationship between two variables, scatter plots or bubble graphs work very well. If, on the other hand, you want to analyze data in a geographic context, maps would be ideal.
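These pairings can be kept as a simple lookup when many charts need to be planned. The sketch below is only an illustration: the goal names and the function are hypothetical, and listing bar charts under "comparison" is an assumption added here as a common alternative.

```python
# Analytical goal -> chart types suggested in the paragraph above.
# The goal names are hypothetical labels chosen for this example.
CHART_SUGGESTIONS = {
    "time_series": ("bar", "line"),
    "comparison": ("bar", "pie"),  # "bar" is an added assumption
    "relationship": ("scatter", "bubble"),
    "geographic": ("map",),
}


def suggest_charts(goal):
    """Return candidate chart types for an analytical goal."""
    if goal not in CHART_SUGGESTIONS:
        known = ", ".join(sorted(CHART_SUGGESTIONS))
        raise ValueError(f"Unknown goal {goal!r}; expected one of: {known}")
    return CHART_SUGGESTIONS[goal]


print(suggest_charts("relationship"))
```

The lookup is a starting point rather than a rule: the final choice still depends on the audience and on how many categories or data points the chart has to carry.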

4. Interactivity as a differentiating element of visualizations

A major challenge is creating visualizations that represent large volumes of data (variables, categories, or figures) without overloading them with visual elements that distort the analysis and make the data harder to understand. Interactivity plays an important role in avoiding this. It allows information to be segmented with tools such as filters, which are useful for customizing data sets and transforming them into graphs, tables, and indicators that display specific information.
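Behind the interface, a filter usually amounts to selecting the records that match the user's choices and recomputing an indicator from them. The following is a minimal sketch of that idea; the records, field names, and both functions are hypothetical and do not describe any particular dashboard tool.

```python
from statistics import mean


def apply_filters(records, **selected):
    """Keep the records whose fields match every selected filter value."""
    return [
        record
        for record in records
        if all(record.get(field) == value for field, value in selected.items())
    ]


def indicator(records, field):
    """Average of a numeric field, or None when no records remain."""
    values = [record[field] for record in records]
    return mean(values) if values else None


# Hypothetical records, for illustration only.
records = [
    {"region": "Andean", "year": 2020, "coverage": 60.0},
    {"region": "Andean", "year": 2021, "coverage": 70.0},
    {"region": "Caribbean", "year": 2021, "coverage": 50.0},
]

subset = apply_filters(records, region="Andean")
print(len(subset), indicator(subset, "coverage"))
```

Each additional filter narrows the subset, so the same data set can feed many specific views without adding visual elements to a single chart.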

Dashboards are one of the clearest examples of how the interactivity component works in visualizations. They can be used to compare multiple metrics and present the results as a summary, highlighting the most important data. Dashboards should above all be intuitive, so that users understand how they work and how to read the different graphical outputs they offer.

Figure 1. Dashboard on Global COVID-19 Tracker

Source: Coronavirus (COVID-19) Data Hub

5. Dissemination of visualizations

Finally, it is necessary to define the media or channels for disseminating visualizations. Digital media work very well for dynamic visualizations that can be embedded in websites, while static visualizations work well in printed reports or infographics.

Currently, DataRepública’s website serves as a collaborative space where different actors in Latin America and the Caribbean can build and publish data stories with different visual formats, such as dashboards, with the objective of generating new knowledge to strengthen decision-making processes in relation to the fulfillment of the SDGs.

Final considerations

The use of visualizations facilitates the presentation of data, especially when large volumes of information are available. However, a common mistake is to saturate visualizations with too much data, which confuses the user, hinders understanding, and causes misinterpretations. For this reason, it is advisable to prioritize the information that is relevant to the visualization exercise and decide what is worth visualizing and what is not.

Creating visualizations requires effort and planning, as well as tactics and skill to communicate effectively. This is achieved when those who build the visualizations put themselves in the end users’ shoes to understand their needs and limitations.