Data Integration Quality Techniques
Data integration experts consider information quality the main attribute for business users. Users need information to be delivered on time, but they also need it to meet a certain level of quality. Thus, data integration quality criteria are required.
Technical quality criteria for data integration, together with their metrics and thresholds, should be defined first. These data quality criteria are business independent, so contemporary data integration technologies can evaluate them automatically. These criteria are:
- Data types
- Data domain compliance (a domain is a set of allowable values; for structured data, this can be a list of values, such as postal codes, or a range of values, such as 1 to 100)
- Statistical features of the data set (maximum value, minimum value, population distributions)
- Referential relationships
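To make the four technical criteria above more concrete, here is a minimal Python sketch of how each one could be checked. Everything in it is an assumption made for illustration: the sample records, the field names (`customer_id`, `postal_code`, `score`), the allowed postal codes, and the helper functions are hypothetical and do not come from any particular data integration product.

```python
from statistics import mean

# Hypothetical sample data: customers and the accounts that reference them.
customers = [
    {"customer_id": 1, "postal_code": "10115", "score": 42},
    {"customer_id": 2, "postal_code": "99999", "score": 150},
    {"customer_id": 3, "postal_code": "80331", "score": "n/a"},
]
accounts = [
    {"account_id": 100, "customer_id": 1},
    {"account_id": 101, "customer_id": 7},
]

# Assumed domains: a list of allowable values and a range of values.
ALLOWED_POSTAL_CODES = {"10115", "80331"}
SCORE_RANGE = (1, 100)


def check_types(rows, field, expected_type):
    """Data types: return rows whose field is not of the expected type."""
    return [r for r in rows if not isinstance(r[field], expected_type)]


def check_domain(rows, field, allowed):
    """Domain compliance (list of values): return rows outside the list."""
    return [r for r in rows if r[field] not in allowed]


def check_range(rows, field, low, high):
    """Domain compliance (range): return numeric rows outside [low, high]."""
    return [
        r for r in rows
        if isinstance(r[field], (int, float)) and not low <= r[field] <= high
    ]


def profile(rows, field):
    """Statistical features: min, max and mean of the numeric values."""
    values = [r[field] for r in rows if isinstance(r[field], (int, float))]
    return {"min": min(values), "max": max(values), "mean": mean(values)}


def check_references(children, parents, key):
    """Referential relationships: return child rows with no matching parent."""
    parent_keys = {p[key] for p in parents}
    return [c for c in children if c[key] not in parent_keys]


type_errors = check_types(customers, "score", int)
domain_errors = check_domain(customers, "postal_code", ALLOWED_POSTAL_CODES)
range_errors = check_range(customers, "score", *SCORE_RANGE)
score_profile = profile(customers, "score")
orphans = check_references(accounts, customers, "customer_id")
```

None of these checks needs any knowledge of the business itself, only of the expected types, domains and keys, which is why a tool can run them automatically.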
Other measures of data quality involve compliance with business rules. For example, a mobile operator may set a business rule on how many months an account may carry a negative credit balance: all accounts with more than two months of negative balance are considered invalid. A data integration system cannot evaluate such criteria out of the box, because they depend on the business; however, data integration technologies allow business rules to be programmed.
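As a rough illustration, the mobile operator rule could be programmed along the following lines. This is only a sketch under stated assumptions: the function name `is_valid_account`, the account identifiers and the balances are hypothetical, and "more than two months of negative balance" is read here as the total number of negative months, not necessarily consecutive ones.

```python
# Threshold taken from the rule: more than two negative months is invalid.
MAX_NEGATIVE_MONTHS = 2


def is_valid_account(monthly_balances, max_negative_months=MAX_NEGATIVE_MONTHS):
    """Return True if the account has at most the allowed number of
    months with a negative balance (assumption: months are counted in
    total, not as a consecutive run)."""
    negative_months = sum(1 for balance in monthly_balances if balance < 0)
    return negative_months <= max_negative_months


# Hypothetical month-end balances per account.
balances_by_account = {
    "A-1": [20.0, -5.0, 10.0, -1.0],   # two negative months
    "A-2": [-3.0, -8.0, -2.5, 4.0],    # three negative months
}

invalid_accounts = [
    account_id
    for account_id, balances in balances_by_account.items()
    if not is_valid_account(balances)
]
```

Keeping the threshold as a parameter means the rule can be adjusted when the business changes its policy, without touching the checking logic.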
Business rules can automate the decisions that the company makes in its day-to-day operations. Rules of this type can also be used to audit data for compliance with both external and internal regulations and policies.
As I have mentioned in my previous posts, before acquiring a data integration solution, an organization should establish a set of business rules that will later be implemented in the data integration tool. Now you can see why: business rules are an efficient means of improving data quality.