Data Quality Rules

What are Data Quality Rules (Data Validation Rules)?

Data quality rules (also known as data validation rules) are, like automation rules, special forms of business rules. They clearly define the business requirements for specific data. Ideally, data validation rules should be "fit for use", i.e. appropriate for the intended purpose.

Since data quality rules can be used to check the quality of data and data records, these are the primary tools for determining data quality. By checking against the validation rules, it is possible to test whether the data meet the defined criteria and possess the required attributes. In this way, potential weak points (e.g., in processes) can be detected and recommendations for action can be derived.

Data quality rules allow for the measurement of different data quality dimensions, such as

  • The contextual accuracy of values (correctness, accuracy)
  • The consistency among values (consistency)
  • The allowed format of values (representational consistency, accuracy)
  • The completeness of values

About the author


Simon Schlosser
Head of Product

"Data quality rules are the hidden champions of data quality management because they enable both in-depth analyses of data quality as well as efficient data cleansing."


How does a data quality rule look?

A ready-to-use data quality rule is an algorithm that checks the structure, format, and arrangement of data and matches it with previously defined requirements. However, data validation rules are first generally conceptualized and described before developing an executable, machine-readable validation rule. The development process usually follows this scheme:

  1. Documentation of the technical requirements:
    • Technical description of the data quality rule
    • Determination of the importance/priority of the rule for company processes
    • Clear overview showing dependent data elements and data quality rules
  2. Translation into machine-readable format:
    • Development of a machine-readable validation rule
    • Definition of test cases
  3. Realization or implementation

The data rule should be documented in a technically oriented manner, i.e., written in the language of the specialist department so that the functionality and the requirements for the data can be understood by all those involved.

A concrete example of this would be a simple data rule that checks the postal codes in Germany within the company master data; they are only accepted as complete if their values meet the following criteria:

  • The postal code may only contain numbers and no letters or special characters (Representational consistency, format rule)
  • The postal code must be five integers long (Representational consistency, format rule)
  • The postal code must be consistent with the specified city (consistency)
  • The postal code must actually be assigned (accuracy)

Why do you need data validation rules?

Data quality rules can be used to analyze and evaluate the quality of specific data sets. The insights gained from this, such as problem areas and recommendations for action, then provide the basis for subsequent data cleansing or data enrichment, such as the removal of duplicates or the addition of missing data elements. For example, they can be used to supplement incomplete address data values.

The analysis and cleansing of data, such as business partner data, can be automated with the help of data quality rules; thus, saving on resources. Thanks to powerful cloud technologies, the processing of large data volumes is no longer a problem.


Ready to use data quality rules from CDQ Data Management Solutions

Save time and effort in development, documentation and review of quality rules.

Benefit from knowledge and good practices of global companies.

Get support for customized data quality rules implementation.

Community-based data quality rules save resources during development

By using ready-made data quality rules, the effort for implementation is reduced considerably because the resources for development, documentation and verification are largely eliminated.

Using AI/ML can even further optimize an existing set of data rules. For example, in so-called data quality rule mining, new data validation rules can be identified by pattern analysis of data sets, which ultimately leads to an improvement of existing rules.

  On average, our member companies save over 85 man-days in designing and creating data quality rules

On average, our member companies use 30% of the approximately 1,300 data quality rules. This means that business and data management professionals spend a total of 2,275 man-hours on research, documentation and testing only. These can be saved by using the already tested ready-to-use CDQ data quality rules. Another benefit is that IT saves on implementation costs for each individual rule. These often amount to several hundred euros per rule.


  Continuous savings in the maintenance of data validation rules

In addition, the companies save the annual maintenance costs for these rules, which amounts to about 85 man-days per year, in the following years.


Use of data quality rules within CDQ Data Management Services

Data quality rules form the basis of our data management services. Through their use in data validation or continuous data quality measurement, they also ensure ongoing data quality assurance for shared data among the data sharing community.

CDQ currently has more than 1,300 data quality rules, which are continually being improved upon through cooperation with the companies in our community. In this way, the effort for maintenance and further development is not only spread over several shoulders, but everyone can also benefit from the know-how of the fellow member companies.

Some rules are also checked against reference sources, such as European VAT numbers in special databases or, as in the example above, the postal codes in Germany. The community also maintains its own reference data for which there are no, or no trustworthy, external sources, such as official legal forms of businesses in a particular country.

If a company has special business requirements for a certain data format for which there is no explicit data quality rule yet, our software specialists work together with the customer to develop a fitting solution. In this way, we also enable customer-specific extensions, e.g., for individual data fields that are not otherwise used by any other member of the community.


Data Management Services

Whether data quality analytics, address validation, deduplication or bank account verification: Our innovative data management services help you to enhance the quality of your vendor and customer master data. Data Management Services

Data Sharing Community

Better quality, less manual efforts through Data Sharing: Our Data Sharing Community helps companies improve their business partner data together, share the burden of data maintenance, and reliably protect all participants against invoice fraud. Want to know more? CDQ Data Sharing Community

'Introducing DQR-1297 or The Good Data Quality Rule'

Meet our virtual employee DQR-1297 of our international AI team of data quality rules: Its main task is to identify invalid legal forms from customer and vendor organizations within all data sets. The real challenge is remembering all valid legal forms around the globe. Introducing DQR-1297
Go to top