DQ Profile
Definition of DQ and data fitness for use in a given context.
This page gives a brief, practical introduction to tackling Data Quality (DQ) in light of the conceptual framework in the Biodiversity Informatics context, especially, but not exclusively, in the TDWG and GBIF community. Because the conceptual framework is comprehensive, different stakeholders can interpret and use it in different ways. To show how those stakeholders can take advantage of the conceptual framework, we selected four stakeholders to describe their role in the DQ context:
The original version of the framework, which is highly comprehensive and formal and is composed of 29 interrelated concepts, can be found in [Veiga 2017].
On this page we present a lightweight view of the framework from a practical perspective. In this view, the framework is approached through three main components: DQ Profile, DQ Solutions and DQ Report.
A DQ Profile defines a structure to describe the meaning of data fitness for use in a given context. It describes the DQ needs and requirements for a given context/scope. To implement and apply such requirements to data, it is necessary to use a set of DQ Solutions, which involve methods and mechanisms applied to meet the DQ Profile's requirements.
Definition of DQ and data fitness for use in a given context.
DQ Solutions define a structure to describe the methods (technical specifications) and mechanisms (tools that act on data) used to meet the DQ Profile requirements. DQ Solutions operate on Data Resources (both single records and multi-records) and generate DQ Assertions assigned to each Data Resource. A set of selected DQ Assertions represents a DQ Report.
Set of methods (technical specifications) and mechanisms (tools that act on data) used to meet the DQ Profile requirements.
A DQ Report defines a set of selected DQ Assertions assigned to a Data Resource according to the requirements of a DQ Profile. With a DQ Report assigned to a Data Resource, data users, holders, aggregators and custodians can assess and improve the quality of the Data Resource according to the related DQ Profile definition.
Set of selected DQ Assertions assigned to a Data Resource according to the requirements of a DQ Profile.
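The relationship between a method, a mechanism, a Data Resource and a DQ Assertion can be illustrated with a minimal Python sketch. It is not part of the framework in [Veiga 2017]: the class names, the mechanism name "example-checker" and the completeness rule are hypothetical, and the record fields simply borrow Darwin Core style names for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class DQAssertion:
    """Result of applying one DQ Solution to one Data Resource (hypothetical shape)."""
    resource_id: str
    specification: str
    mechanism: str
    result: object


@dataclass(frozen=True)
class DQSolution:
    """A method (specification text) paired with a mechanism (code that acts on data)."""
    specification: str
    mechanism: str
    run: Callable[[dict], object]

    def apply(self, record: dict) -> DQAssertion:
        # Run the mechanism on a single record and wrap the outcome as an assertion.
        return DQAssertion(record["id"], self.specification, self.mechanism, self.run(record))


def coordinates_complete(record: dict) -> bool:
    # Illustrative rule: both coordinate fields must be present.
    return (record.get("decimalLatitude") is not None
            and record.get("decimalLongitude") is not None)


solution = DQSolution(
    specification="Coordinates are complete when latitude and longitude are both present",
    mechanism="example-checker",  # hypothetical tool name
    run=coordinates_complete,
)

records = [
    {"id": "occ-1", "decimalLatitude": -23.5, "decimalLongitude": -46.6},
    {"id": "occ-2", "decimalLatitude": None, "decimalLongitude": -46.6},
]

# One DQ Assertion is generated for each Data Resource (here, single records).
assertions = [solution.apply(r) for r in records]
for a in assertions:
    print(a.resource_id, a.result)
```

Keeping the specification text separate from the function that implements it mirrors the distinction the framework draws between methods and mechanisms: in principle, the same specification could be implemented by different tools.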
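As a rough illustration, a DQ Report can be thought of as a filter over the available DQ Assertions. The Python sketch below assumes a simplified model in which a DQ Profile is reduced to a set of requirement names. The names `DQAssertion` and `build_report`, the requirement labels and the values are hypothetical and are not taken from [Veiga 2017].

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class DQAssertion:
    """A DQ Assertion assigned to a Data Resource (simplified, hypothetical shape)."""
    resource_id: str
    requirement: str  # the requirement this assertion responds to
    result: object


def build_report(assertions: Iterable[DQAssertion],
                 resource_id: str,
                 profile_requirements: Set[str]) -> List[DQAssertion]:
    # Keep only the assertions about this Data Resource that the profile asks for.
    return [a for a in assertions
            if a.resource_id == resource_id and a.requirement in profile_requirements]


# Made-up assertions, for illustration only.
all_assertions = [
    DQAssertion("dataset-1", "coordinates completeness", 0.92),
    DQAssertion("dataset-1", "event date completeness", 0.40),
    DQAssertion("dataset-1", "coordinates are valid", True),
    DQAssertion("dataset-2", "coordinates completeness", 0.75),
]

# Assumption: the profile only cares about coordinate-related requirements.
profile_requirements = {"coordinates completeness", "coordinates are valid"}

report = build_report(all_assertions, "dataset-1", profile_requirements)
for a in report:
    print(f"{a.requirement}: {a.result}")
```

Under this simplification, the same pool of assertions yields different reports for different profiles, which is how a single Data Resource can be judged fit for one use and unfit for another.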
Because the concept of “quality” is idiosyncratic, it is essential to understand what “data fitness for use” means from the data user's or handler's perspective in order to enable DQ assessment and management.
In this context, defining “data fitness for use” involves defining three elements: use, data and fitness. Accordingly, a DQ Profile encompasses these elements through five main components:
In this context we propose a method to define a DQ Profile composed of five steps: (1) Define a Use Case; (2) Define the valuable Information Elements (IE) in the context of the Use Case; (3) Define a DQ Measurement Policy in the Use Case context; (4) Define a DQ Validation Policy in the Use Case context; and (5) Define a DQ Improvement Policy in the Use Case context. Next, we present a brief description of each step.
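The five steps can be pictured as filling in the five fields of one record. The Python sketch below shows a possible, much simplified representation of a DQ Profile. The class name, field names, example use case and policy entries are all hypothetical; the formal definitions are those in [Veiga 2017].

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DQProfile:
    """Simplified, hypothetical container for the five components of a DQ Profile."""
    use_case: str                                   # step 1
    information_elements: List[str]                 # step 2
    measurement_policy: Dict[str, List[str]] = field(default_factory=dict)  # step 3
    validation_policy: Dict[str, List[str]] = field(default_factory=dict)   # step 4
    improvement_policy: Dict[str, List[str]] = field(default_factory=dict)  # step 5

    def uncovered_elements(self) -> List[str]:
        # Information Elements that no policy mentions yet.
        covered = (set(self.measurement_policy)
                   | set(self.validation_policy)
                   | set(self.improvement_policy))
        return [ie for ie in self.information_elements if ie not in covered]


# Made-up example profile, for illustration only.
profile = DQProfile(
    use_case="Species distribution modelling",
    information_elements=["coordinates", "scientific name", "event date"],
    measurement_policy={"coordinates": ["completeness", "precision"]},
    validation_policy={"coordinates": ["coordinates are complete"],
                       "scientific name": ["name is not empty"]},
    improvement_policy={"scientific name": ["recommend an accepted name"]},
)

print(profile.uncovered_elements())  # → ['event date']
```

A check such as `uncovered_elements` is one simple way to notice that an Information Element declared as valuable in step 2 has not yet been addressed by any policy in steps 3 to 5.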
This is an interactive tool for building a DQ Profile following 4 main steps.
Content under development. Subscribe on GitHub to be informed of updates.