Data Quality Management: What it is and How to Do it


Data is at the core of every business decision. Whether you’re predicting customer expectations, planning marketing campaigns, or strategizing sales, data provides the competitive and historical insights behind effective business strategies. The success of data-led initiatives is tied directly to the quality of your data. Data quality management brings together the people, processes, and technologies necessary to ensure your data remains trusted and properly reflects your business requirements. In this post, we’ll dive into data quality management to uncover what it is, why it’s important, and how its phases tie together to create a complete data quality strategy.

What is Data Quality Management?

Data quality management (DQM) is a business principle that combines people, processes, and technologies to ensure high-quality data throughout your organization. The goal of DQM extends beyond maintaining and improving data quality: in the long term, well-managed data is the key to making educated business decisions and attaining your desired business outcomes. Teams must be able to trust that the data they’re accessing is correct, current, and consistent across the board.

Why is Data Quality Management Important?

The benefits of DQM ripple across every department in your organization. Reliable data reveals trends, informs proactive strategies, and boosts team performance and efficiency. Here are a few more benefits you can expect from data quality management:

  • Correct data creates more efficient data processes and more informed business decisions
  • Quality analytics offer a better view of customers, prospects, vendors, partners, etc.
  • Departmental alignment increases when analytical insights can be shared across teams
  • Businesses save money in the long term by avoiding initiatives that are unsupported by their data
  • Proper data quality helps streamline data governance procedures as well
  • Satisfaction and trust improve when a business can reference comprehensive customer data

The Phases of Data Quality Management

Data quality management is a marathon, not a sprint. To bring it all together, you not only need buy-in from employees and management, but also an ongoing plan.

Let’s discuss the phases of data quality management in the order they should be completed (and repeated).

Data Profiling

This is the very first step towards proper data quality, and one that should never be skipped. In this initial phase of DQM, businesses review their data in detail to uncover issues as they relate to their quality goals. Data profiling looks at whether the format and content of the data matches its metadata and whether it is accurate, complete, and valid. Are there blank values? Duplicate data? Strange patterns? These types of questions can be answered with proper data profiling.

To perform profiling, consider these three aspects:

  1. Data structure discovery: How is the data formatted? Is that consistent with the data standards the business is envisioning? Does the data pass mathematical checks such as the ability to accurately calculate sums?
  2. Data content discovery: Are there any fields or values that are incorrect or null? Do any specific rows in a table have issues? Are there any recurring patterns of concern? For example, are most phone numbers missing area codes?
  3. Data relationship discovery: How is the data interrelated? Does the metadata align?
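A minimal sketch of these content checks in Python (the records, field names, and phone format are hypothetical, purely for illustration):

```python
from collections import Counter

# Hypothetical customer records; field names are illustrative only.
records = [
    {"id": 1, "email": "a@example.com", "phone": "312-555-0101"},
    {"id": 2, "email": "", "phone": "555-0102"},                   # blank value
    {"id": 2, "email": "b@example.com", "phone": "312-555-0103"},  # duplicate id
]

def profile(rows):
    """Surface blank values, duplicate keys, and pattern issues (missing area codes)."""
    blanks = sum(1 for r in rows for v in r.values() if v == "")
    dup_ids = [k for k, n in Counter(r["id"] for r in rows).items() if n > 1]
    short_phones = [r["phone"] for r in rows if len(r["phone"].split("-")) < 3]
    return {"blank_values": blanks,
            "duplicate_ids": dup_ids,
            "phones_missing_area_code": short_phones}

print(profile(records))
# {'blank_values': 1, 'duplicate_ids': [2], 'phones_missing_area_code': ['555-0102']}
```

Real profiling tools run checks like these across every column automatically, but the idea is the same: turn "are there blank values? duplicates? strange patterns?" into concrete, repeatable tests.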

Data Rules

It’s no use going through the process of data profiling if you’re not going to maintain those standards going forward. That’s why businesses doing DQM must create a set of technical and business rules that standardize how data should be formatted and accessed. These rules can lay out protocols for properly formatting numbers and dates (e.g., DD/MM/YY vs. MM-DD-YY) or email addresses.

One key piece of advice on data rules: don’t overdo it. Too many rules can be just as bad as no rules at all. Get together with stakeholders from different departments to understand the data each department values most. From there, you can work together to build data rules that meet everyone’s standards.
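Once agreed, data rules are usually encoded as machine-checkable patterns. A sketch of what that might look like (the specific formats here are assumptions, not a standard):

```python
import re

# Illustrative rule set: each field maps to the pattern its values must match.
RULES = {
    "date":  re.compile(r"^\d{2}/\d{2}/\d{2}$"),         # e.g. DD/MM/YY
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),  # simple email shape
}

def violates(field, value):
    """Return True when a value breaks the agreed-upon rule for its field."""
    rule = RULES.get(field)
    return bool(rule) and not rule.match(value)

print(violates("date", "25/12/24"))    # False: matches DD/MM/YY
print(violates("date", "12-25-2024"))  # True: wrong format
print(violates("email", "no-at-sign")) # True
```

Keeping the rules in one shared table like this makes it easy for stakeholders to review them and for every pipeline to enforce the same standards.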

Data Monitoring

Once your data rules are in place, data monitoring ensures data is periodically checked against those standards to maintain its integrity. Any deviations from the data rules should be reported through an automated process.

Business intelligence software can often assist with this aspect of DQM by capturing the irregularities before they infiltrate your data sets and sending alerts to management. Monitoring the entire swath of data can run the risk of information overload, so instead, it’s best practice to only monitor the data that drives business decisions. Any information considered vital or sensitive should be monitored through your automated processes.
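In practice, monitoring usually means periodically scanning only the business-critical fields against the rules and collecting alerts for anything that deviates. A minimal sketch, with hypothetical rules and data:

```python
# Hypothetical rule set: each rule is a predicate the value must satisfy.
rules = {
    "email": lambda v: "@" in v,
    "total": lambda v: v >= 0,
}

rows = [
    {"email": "a@example.com", "total": 10, "notes": "ok"},
    {"email": "broken", "total": -5, "notes": ""},
]

def monitor(rows, rules, watched):
    """Scan only the watched (business-critical) fields and report deviations."""
    alerts = []
    for i, row in enumerate(rows):
        for field in watched:
            if not rules[field](row[field]):
                alerts.append((i, field, row[field]))
    return alerts

# Note: "notes" is deliberately unwatched, to avoid information overload.
print(monitor(rows, rules, ["email", "total"]))
# [(1, 'email', 'broken'), (1, 'total', -5)]
```

In a real deployment this check would run on a schedule, and the returned alerts would feed an email or dashboard notification rather than a print statement.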

Data Remediation

Monitoring reveals where your data deviates from the rules you’ve set. Data remediation is the process of resolving the data quality concerns that monitoring surfaces. One of the most important aspects of data remediation is examining the root cause of the defective data. Was it human error? Processing issues?

If the data is found to be corrupt or inaccurate, your organization should decide whether that “bad data” should be modified or, more simply, deleted. Who will be in charge of this? Should stakeholders and team members be notified before the modifications or deletions take place? An agreed-upon approach to data remediation is part of your long game for data quality management.

Review data quality rules again to determine if anything should be adjusted or updated. If you uncover any data processes that are affected by the bad data, you’ll need to re-initiate them and align them with the new adjustments you’ve set in place. 
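The modify-or-delete decision above can be sketched as a small remediation pass; the validity rule and the repair step here are assumptions chosen for illustration:

```python
def is_valid(row):
    """Hypothetical rule: email must contain '@' and carry no stray whitespace."""
    email = row["email"]
    return "@" in email and email == email.strip()

def fix(row):
    """Attempted repair: trim stray whitespace (a common data-entry error)."""
    return {"email": row["email"].strip()}

def remediate(rows):
    """Repair bad rows where possible; quarantine the rest for human review."""
    clean, quarantined = [], []
    for row in rows:
        if is_valid(row):
            clean.append(row)
            continue
        repaired = fix(row)
        (clean if is_valid(repaired) else quarantined).append(repaired)
    return clean, quarantined

clean, quarantined = remediate([
    {"email": " a@example.com "},  # fixable: whitespace only
    {"email": "not-an-email"},     # unfixable: quarantine for review
])
print(clean)        # [{'email': 'a@example.com'}]
print(quarantined)  # [{'email': 'not-an-email'}]
```

The quarantine list is where the governance questions kick in: who reviews it, and who signs off before anything is modified or deleted.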

Data Quality Reporting

Data quality reporting helps teams easily keep track of the information gathered during the data quality management process. One of the easiest ways to keep data quality top-of-mind is to create and share data quality reports with your team. CRM dashboards can be utilized to show this data across departments, or quarterly scorecards can be created and emailed to the entire organization to keep everyone looped in.

Whichever way you choose to approach it, creating a standardized method for cataloguing data quality reports is essential. It keeps everyone thinking about data quality and the role they play in maintaining it. It’s also a tangible way to reflect on your data quality progress (Or setbacks…but hopefully not!) and share that progress with stakeholders in the organization.
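A quarterly scorecard can be as simple as pass/fail counts per rule rendered into shareable text; this sketch uses invented metric names and numbers:

```python
def scorecard(metrics):
    """Render a simple data quality scorecard from {metric: (passed, total)}."""
    lines = ["Data Quality Scorecard"]
    for name, (passed, total) in metrics.items():
        pct = 100 * passed / total
        lines.append(f"{name:<20} {passed}/{total} ({pct:.0f}%)")
    return "\n".join(lines)

print(scorecard({
    "valid emails":       (940, 1000),
    "complete addresses": (870, 1000),
}))
```

The same numbers could just as easily feed a CRM dashboard widget; the point is a standardized, repeatable summary everyone sees each quarter.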

Data Discovery

The final (but often overlooked) step of the DQM lifecycle is the data discovery process. Data discovery deepens DQM by gathering, analyzing, and reporting on the metadata associated with your data.

As a simple definition, metadata is just data…about data. It adds context to your data to reveal deeper insights into where data is located, what it means, how it’s being used, and more. For example, let’s say you have a .png image on your computer. The metadata for that .png would include its file size, height and width, location on disk, and resolution. Just as the image file in our example has underlying data tied to it, so does your business information.
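Some of that metadata is machine-readable straight from the file. As an illustration, a PNG stores its width and height in its first (IHDR) chunk, and a few lines of Python can pull them out (the demo bytes below are a hand-built header, not a complete image):

```python
import struct

def png_dimensions(data: bytes):
    """Read width and height from a PNG's IHDR chunk (always the first chunk)."""
    assert data[:8] == b"\x89PNG\r\n\x1a\n", "not a PNG file"
    # Bytes 8-15 hold the chunk length and the b"IHDR" type; the next
    # eight bytes are the big-endian width and height.
    return struct.unpack(">II", data[16:24])

# A hand-built PNG header for demonstration (not a viewable image).
header = (b"\x89PNG\r\n\x1a\n"
          + struct.pack(">I", 13) + b"IHDR"
          + struct.pack(">II", 640, 480)
          + b"\x08\x02\x00\x00\x00")
print(png_dimensions(header))  # (640, 480)
```

Business data carries the same kind of embedded context: schemas, timestamps, lineage, ownership; data discovery is about capturing it deliberately instead of leaving it locked in the files.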

Metadata management improves data quality because it helps identify “bad data” and allows organizations to better categorize and regulate how information fits into their data quality management strategy. Toward this end, metadata must be defined, captured, and managed along with your other data. A metadata repository (aka a metadata database) and an ETL tool (like StarfishETL!) are two options that can help you manage your metadata.

A solid data discovery strategy requires:

  • Defining the metadata’s role in your organization
  • Setting metadata usage guidelines and update procedures
  • Defining who is responsible for monitoring and enforcing the guidelines
  • Drafting clearly written documentation on metadata for users to reference
  • Having a method to consolidate metadata from multiple sources
  • Having a designated place to store that consolidated information

There's a lot of information to digest here, but this is the basic gist of data quality management. If you're interested in ETL and/or data cleaning services for your data management, contact our team. We can help you on your data quality management journey.