A slick chart, an interactive data-exploration interface, a KPI-based dashboard: all of these are data visualization products. They garner a lot of attention because they are finished products that also look nice. However, for many companies engaged in data visualization, those final deliverables aren’t the most important benefit. Instead, it’s the insight into the quality of their collected data that truly leads to success.
Data visualization provides 3 key insights into data:
- Is the data complete?
- Is the data valid?
- Is the data well-organized?
Without answers to those 3 questions, data collection and business intelligence projects become much more expensive and labor intensive, and may end up abandoned when the data doesn’t demonstrate what was intended. With the insights from data visualization, these projects have a much higher likelihood of completion and success.
Insight into Data #1: Is the data complete?
The most straightforward insight that visualization can give you about your data is its completeness. With a few quick charts, areas where data is missing show up as gaps or blanks on the report (called the “Swiss Cheese” effect).
In addition to learning which specific data elements are missing, visualizations can show trends of missing data. Those trends can tell a story about the data collection process and provide insight into changes necessary in the way data is gathered.
A Data Completeness Example: After creating a visualization of survey data on movie-going habits, it’s clear that there are a significant number of blanks after question 14 on the survey. The visualization helps the survey company recognize not only that those specific records need to be abandoned, but also that the survey should be shortened to account for “respondent fatigue”, the likely cause of the incomplete responses.
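The completeness check described above can be sketched in a few lines of Python. This is a minimal illustration, not the survey company’s actual method: the `responses` data is invented, the survey is shortened to six questions, and the helper names `missing_rate_per_question` and `text_bar_chart` are hypothetical. A plain-text bar chart stands in for a real charting tool.

```python
# Hypothetical survey responses: one list per respondent, None marks a blank.
# All values are invented for illustration.
responses = [
    [3, 1, 4, 2, 5, 1],
    [2, 2, 4, None, None, None],
    [5, 1, 3, 3, None, None],
    [1, 4, 2, 2, 4, None],
]


def missing_rate_per_question(rows):
    """Return the fraction of blank answers for each question position."""
    n_questions = len(rows[0])
    return [
        sum(1 for row in rows if row[q] is None) / len(rows)
        for q in range(n_questions)
    ]


def text_bar_chart(rates, width=20):
    """Render one bar per question; longer bars mean more missing answers."""
    lines = []
    for q, rate in enumerate(rates, start=1):
        bar = "#" * round(rate * width)
        lines.append(f"Q{q:<2} {bar:<{width}} {rate:.0%}")
    return "\n".join(lines)


rates = missing_rate_per_question(responses)
print(text_bar_chart(rates))
```

In this invented data the bars grow steadily from the fourth question onward, which is the kind of pattern that would suggest respondents dropping out partway through rather than skipping questions at random.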
Insight into Data #2: Is the data valid?
The importance of visualization among data validation techniques has been discussed before. It’s clear, then, that visualization can play a pivotal role in understanding data’s validity. By running a quick, preliminary visualization on collected data, you can spot trends that point to problems in the dataset as a whole.
A Data Validation Example: A collected dataset is designed to demonstrate the difference in male population statistics between Alaska and Florida. Examination of individual records and outliers suggests that the data is valid: there is a significantly higher percentage of males in Alaska than in Florida, which is expected. However, a visualization of the entire dataset shows that there are more males in total in Alaska than in Florida. This is a red flag because, even with the gender ratio differences, Florida’s much larger population means that it should have a higher total number of males.
A well-designed, preliminary visualization can give insight into the validity of collected data that is difficult, or even impossible, to gain with traditional methods.
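To make the Alaska and Florida example concrete, here is a small Python sketch of the two checks involved. The counts are invented (they are not census figures), and the function names are hypothetical. The point is only that a ratio check on the records can pass while a look at the aggregated totals exposes a problem.

```python
# Invented record-level counts for illustration only (not real census data).
records = [
    {"state": "Alaska", "males": 5200, "females": 4800},
    {"state": "Alaska", "males": 5300, "females": 4700},
    {"state": "Florida", "males": 2450, "females": 2550},
    {"state": "Florida", "males": 2400, "females": 2600},
]


def totals_by_state(rows):
    """Sum male and female counts for each state."""
    totals = {}
    for row in rows:
        state_totals = totals.setdefault(row["state"], {"males": 0, "females": 0})
        state_totals["males"] += row["males"]
        state_totals["females"] += row["females"]
    return totals


def male_share(state_totals):
    """Fraction of the counted population that is male."""
    return state_totals["males"] / (state_totals["males"] + state_totals["females"])


totals = totals_by_state(records)

# Check 1: the ratio behaves as expected (Alaska skews more male).
ratio_ok = male_share(totals["Alaska"]) > male_share(totals["Florida"])

# Check 2: the totals do not. Florida is far more populous, so a larger
# male total for Alaska suggests missing or mis-scaled Florida records.
totals_suspicious = totals["Alaska"]["males"] > totals["Florida"]["males"]

# A plain-text chart of the totals makes the red flag visible at a glance.
for state, state_totals in totals.items():
    print(f"{state:<8} {'#' * (state_totals['males'] // 500)} {state_totals['males']}")
```

With this sample data the first check passes and the second raises the flag, mirroring the situation in the example: each record looks plausible on its own, and only the aggregate view shows that something is off.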
Insight into Data #3: Is the data well-organized?
Poorly organized data can be the bane of the final step of a data collection or business intelligence process. Using data organization tools from the start can help streamline later steps of the process.
During collection, the data is often organized in a way that optimizes the gathering process. However, that same organizational scheme can be a problem when the time comes to act. The data visualization process serves to highlight the organizational challenges of your data and provides insights into how it might be done better.
A Data Organization Example: A client wishes to use their collected customer data to develop a customer profile that defines demographic breakouts of snack-food purchases indexed by time of day. Their data visualization partner asks where that data is stored, and it turns out that the transactional data is stored separately from the customer profile information and that the two can only be joined through yet another linking dataset. While all the data is technically available, it needs to be reorganized before it can support decision making.
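The reorganization this example calls for might look something like the following Python sketch. Everything here is an assumption made for illustration: the three datasets, their field names (`card_id`, `age_group`, `hour`), the time-of-day buckets, and the `build_profile` helper are all hypothetical.

```python
from collections import Counter

# Three separately stored, invented datasets.
customers = {"c1": {"age_group": "18-34"}, "c2": {"age_group": "35-54"}}
card_to_customer = {"card-A": "c1", "card-B": "c2"}  # the linking dataset
transactions = [
    {"card_id": "card-A", "hour": 9, "item": "chips"},
    {"card_id": "card-A", "hour": 15, "item": "pretzels"},
    {"card_id": "card-B", "hour": 15, "item": "chips"},
    {"card_id": "card-C", "hour": 21, "item": "popcorn"},  # no link record
]


def part_of_day(hour):
    """Bucket an hour (0-23) into a coarse time-of-day label."""
    if hour < 12:
        return "morning"
    if hour < 18:
        return "afternoon"
    return "evening"


def build_profile(transactions, links, customers):
    """Count purchases per (age group, time of day), tracking unmatched rows."""
    counts = Counter()
    unmatched = 0
    for tx in transactions:
        customer = customers.get(links.get(tx["card_id"]))
        if customer is None:
            unmatched += 1  # transaction cannot be tied to a profile
            continue
        counts[(customer["age_group"], part_of_day(tx["hour"]))] += 1
    return counts, unmatched


profile, unmatched = build_profile(transactions, card_to_customer, customers)
for (age_group, period), n in sorted(profile.items()):
    print(f"{age_group:<6} {period:<10} {'#' * n}")
print(f"unmatched transactions: {unmatched}")
```

Counting the unmatched transactions is a deliberate choice in this sketch: when data has to pass through a linking dataset, the rows that fail to join are often the first sign that the organization needs work.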
Data visualization isn’t just a data organization and analysis tool; it can play a crucial role in the entire data gathering and management process. By running a well-executed visualization early, and by taking time to understand what is to be learned from the data and how the information will be gathered, companies are able to cut costs and eliminate the waste that comes from having to re-gather or re-organize their data.