Available with Data Reviewer license.
Determine quality business needs
One of the challenges of defining data quality control methods is translating business needs into technical data quality requirements for your organization. Identify and understand the business requirements for your data before translating them into technical data quality requirements.
Every organization needs to understand the purpose and final outcome of its data, how the data will be used, and what maintenance processes will be applied. The following diagram illustrates an example of different sources that can be used to identify data quality requirements. Data quality requirements are essential for building Data Reviewer rules.
Reviewer rules are preconfigured checks that validate features against specific conditions and detect those that do not comply with the data quality requirements defined by your organization.
Reviewer rules are created for the purpose of detecting anomalies with features, attributes, and relationships in your database. Reviewer rule results are logged in a Reviewer session, which is used to manage the life cycle of the data analysis.
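As an illustration only, the idea of a rule that flags noncompliant features and logs its results to a session can be sketched in plain Python. This sketch does not use the Data Reviewer API; `Feature`, `ReviewSession`, `domain_rule`, and the sample road data are hypothetical names invented here.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Feature:
    """Hypothetical stand-in for a feature stored in a database."""
    feature_id: int
    attributes: dict


# A rule returns an error message for a noncompliant feature, or None if it passes.
Rule = Callable[[Feature], Optional[str]]


def domain_rule(field_name: str, allowed: set) -> Rule:
    """Build a rule that flags attribute values outside an allowed set."""
    def check(feature: Feature) -> Optional[str]:
        value = feature.attributes.get(field_name)
        if value not in allowed:
            return f"{field_name}={value!r} is not in the allowed values"
        return None
    return check


@dataclass
class ReviewSession:
    """Hypothetical container that collects rule results for later review."""
    name: str
    results: list = field(default_factory=list)

    def run(self, rule: Rule, features: list) -> None:
        # Log one (feature_id, message) pair per noncompliant feature.
        for feature in features:
            message = rule(feature)
            if message is not None:
                self.results.append((feature.feature_id, message))


roads = [
    Feature(1, {"surface": "paved"}),
    Feature(2, {"surface": "gravle"}),  # misspelled value in the sample data
    Feature(3, {}),                     # attribute missing entirely
]
session = ReviewSession("road_qc")
session.run(domain_rule("surface", {"paved", "gravel", "dirt"}), roads)
```

After the run, `session.results` holds one entry for feature 2 and one for feature 3, while the compliant feature 1 is not logged.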
Depending on your industry, data quality requirements can often be found in a project requirements document or a quality assurance plan document.
Quality assurance plan
A quality assurance (QA) plan is a document that identifies which quality standards are relevant to a project and determines how to satisfy them. A QA plan is a useful resource for identifying the technical data quality requirements that need to be converted to Reviewer rules.
The following are QA techniques or documents that are important for identifying data quality requirements:
- Requirements Traceability Matrix: A document that correlates any two baseline documents (one being the source requirements collected for the project and the other being the capabilities of a software product) through a many-to-many relationship, so that the completeness of the relationship can be checked. It is used to track requirements and to ensure that the current project requirements are met.
A traceability matrix is created by associating requirements with the software products that satisfy them.
- ISO/TC 211 Geographic information/Geomatics: The ISO technical committee responsible for the ISO series of geographic information standards, which define methods, tools, and services for data management, including acquiring, processing, analyzing, accessing, presenting, and transferring geographic data in digital form among users, systems, and locations.
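The many-to-many relationship behind a requirements traceability matrix, and the completeness check it enables, can be sketched as follows. This is a minimal illustration, not a prescribed format: the requirement and capability identifiers and the helper functions are hypothetical.

```python
# Hypothetical identifiers for the two baseline documents.
requirements = {"REQ-1", "REQ-2", "REQ-3"}
capabilities = {"CAP-A", "CAP-B", "CAP-C"}

# Many-to-many links: one requirement may map to several capabilities,
# and one capability may satisfy several requirements.
links = {
    ("REQ-1", "CAP-A"),
    ("REQ-1", "CAP-B"),
    ("REQ-2", "CAP-B"),
}


def build_matrix(requirements, capabilities, links):
    """Return {requirement: {capability: bool}} with True where a link exists."""
    return {
        req: {cap: (req, cap) in links for cap in sorted(capabilities)}
        for req in sorted(requirements)
    }


def unmet_requirements(matrix):
    """Requirements that no capability satisfies."""
    return [req for req, row in matrix.items() if not any(row.values())]


def unused_capabilities(matrix, capabilities):
    """Capabilities that are not traced to any requirement."""
    used = {cap for row in matrix.values() for cap, linked in row.items() if linked}
    return sorted(set(capabilities) - used)


matrix = build_matrix(requirements, capabilities, links)
unmet = unmet_requirements(matrix)
unused = unused_capabilities(matrix, capabilities)
```

With the sample links, `unmet` contains `REQ-3` and `unused` contains `CAP-C`, which are the gaps a completeness check is meant to surface.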
Data quality requirements
Data quality requirements describe the characteristics a dataset must have to be usable and accurate. Data quality requirements can be sorted into the following six basic categories:
- Completeness: The presence or absence of features, their attributes, and relationships in a data model.
- Logical consistency: The degree of adherence to preestablished rules of a data model's structure, attribution, and relationships as defined by an organization or industry. Many industries follow standards that are reflected in a geospatial data model as value domains, data formats, and the topological consistency of the stored data.
- Positional accuracy: The accuracy of the position of features in relation to Earth.
- Thematic accuracy: The accuracy of attributes within features and their appropriate relationships.
- Temporal quality: The quality of temporal attributes and temporal relationships of features.
- Usability: The degree to which a dataset meets the requirements of a specific application and its related functional requirements.
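Two of these categories lend themselves to a short illustration: the presence or absence of attributes, and the quality of temporal attributes. The sketch below is plain Python rather than a Data Reviewer rule, and the function names, field names, and sample parcel record are all hypothetical.

```python
from datetime import date


def missing_attributes(record: dict, required: list) -> list:
    """Return the required attribute names that are absent or empty."""
    return [name for name in required if record.get(name) in (None, "")]


def temporal_order_ok(record: dict, start: str, end: str) -> bool:
    """Return True when both dates exist and the start is not after the end."""
    first, last = record.get(start), record.get(end)
    if first is None or last is None:
        return False
    return first <= last


# Hypothetical record with an empty owner, no zoning value,
# and a retirement date that precedes the creation date.
parcel = {
    "parcel_id": "P-100",
    "owner": "",
    "created": date(2021, 5, 1),
    "retired": date(2020, 1, 1),
}

missing = missing_attributes(parcel, ["parcel_id", "owner", "zoning"])
in_order = temporal_order_ok(parcel, "created", "retired")
```

Here `missing` lists `owner` and `zoning`, and `in_order` is `False` because the record is retired before it is created.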