Data is increasingly valued as a company asset, so ensuring that it is of high quality is imperative. Increasingly, in the world of IoT, we are seeing machines make decisions. These decisions either provide insights into what is most beneficial for customers or reduce their service-relationship anxiety. For machine learning models to generate actionable insights, diverse, high-quality data must be available in real time.
In the 1960s, data was supposedly managed in silos, often physically, and the skills needed to derive insights from it were limited. Even so, those who invested in curating high-quality information reaped better revenues.
Then came business intelligence, which might be considered a vintage capability today. Yet it remains an effective way of consuming data for reports and analytical models. Before data is loaded into a warehouse, its quality is assessed against contextual dimensions of quality, such as validity. Such approaches can be termed generation-1 data quality management models.
To formalize the management of quality, however, a dedicated function can be set up. Standardization of data quality can be emphasized through data governance. Data governance ensures that designated actors follow repeatable processes for data quality assessment, root cause analysis, and issue management, so that data issues are recovered from and resolved. Policies and guidelines that define roles, responsibilities, and processes for accountability and ownership of data make active management of data quality possible.
Some important dimensions for assessing data quality are:
Completeness: Does the data meet your expectations of what is complete? This covers column, row, or group completeness, and fill rate.
Consistency: Is the data structurally and semantically consistent, and does it conform to business policy?
Timeliness: Does the data arrive with a system or manual lag?
Validity: Is the data streamed in the designated format, and is it usable according to standards?
Uniqueness: Does the same information exist more than once within the data structure or ecosystem?
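As an illustration only, several of the dimensions above can be approximated as simple ratios over a set of records. The sketch below is a minimal Python example, not a production rule engine: the sample records, the field names (`id`, `email`, `updated`), the email pattern, and the one-day lag threshold are all hypothetical assumptions made for the example. Consistency is left out because it depends on business policies that would have to be defined case by case.

```python
import re
from datetime import datetime, timedelta

# Hypothetical sample records; the field names are illustrative assumptions.
records = [
    {"id": 1, "email": "a@example.com", "updated": datetime(2024, 1, 1, 12, 0)},
    {"id": 2, "email": None, "updated": datetime(2024, 1, 1, 11, 0)},
    {"id": 2, "email": "not-an-email", "updated": datetime(2023, 12, 30, 12, 0)},
    {"id": 4, "email": "d@example.com", "updated": datetime(2024, 1, 1, 11, 30)},
]

# Deliberately loose email pattern, assumed here only to demonstrate a format check.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def fill_rate(rows, column):
    """Completeness: share of rows in which the column is populated."""
    if not rows:
        return 0.0
    filled = sum(1 for r in rows if r.get(column) not in (None, ""))
    return filled / len(rows)


def uniqueness(rows, column):
    """Uniqueness: distinct values divided by populated values."""
    values = [r[column] for r in rows if r.get(column) is not None]
    return len(set(values)) / len(values) if values else 0.0


def validity(rows, column, pattern):
    """Validity: share of populated values that match the expected format."""
    values = [r[column] for r in rows if r.get(column) not in (None, "")]
    if not values:
        return 0.0
    return sum(1 for v in values if pattern.match(v)) / len(values)


def timeliness(rows, column, as_of, max_lag):
    """Timeliness: share of rows updated within max_lag of the as_of time."""
    if not rows:
        return 0.0
    fresh = sum(1 for r in rows if as_of - r[column] <= max_lag)
    return fresh / len(rows)


as_of = datetime(2024, 1, 1, 12, 0)
report = {
    "completeness(email)": fill_rate(records, "email"),
    "uniqueness(id)": uniqueness(records, "id"),
    "validity(email)": validity(records, "email", EMAIL_RE),
    "timeliness(updated)": timeliness(records, "updated", as_of, timedelta(days=1)),
}
for name, score in report.items():
    print(f"{name}: {score:.2f}")
```

Each score is a fraction between 0 and 1, so a governance function could, for example, set a minimum threshold per dimension and raise an issue when a dataset falls below it.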
This content is provided by an external author without editing by Finextra. It expresses the views and opinions of the author.