The Management of Data Quality Assessment in Big Data Presents a Complex Challenge, Accompanied by Various Issues Related to Data Quality

Authors

  • D. B. Shanmugam, Department of Computer Applications, SRM Institute of Science and Technology, Ramapuram Campus, Chennai, India.
  • J. Dhilipan, Department of Computer Applications, SRM Institute of Science and Technology, Ramapuram Campus, Chennai, India.
  • T. Prabhu, Department of Computer Applications, Dr. MGR Educational & Research Institute, Chennai, India.
  • A. Sivasankari, Department of Computer Science and Applications, Shanmuga Industries Arts and Science College, Tiruvannamalai, India.
  • A. Vignesh, Department of Computer Applications, SRM Institute of Science and Technology, Ramapuram Campus, Chennai, India.

DOI:

https://doi.org/10.9734/bpi/rhmcs/v8/18858D

Keywords:

Total data quality management, data quality metrics, data quality assessment

Abstract

High-quality data is a precondition for the effective use and analysis of big data. However, comprehensive research on effective techniques for estimating and appraising the quality of big data is still lacking. This paper provides an overview of data quality studies and addresses the challenges that arise in the context of big data. The proposed data quality framework comprises measurements, attributes, and files, with a focus on meeting the needs of data users. Based on this framework, a dynamic appraisal technique for data quality is developed; the technique is scalable and flexible enough to address the challenges of big data quality appraisal.

Maintaining data quality is crucial for effective decision-making in management. However, the sources of big data are vast and the data systems are complex, which often leads to errors, inconsistencies, and noise. Data cleansing is the process of identifying and removing such issues to improve data quality. There are four approaches to data cleansing: manual execution, use of specialized software, generic data cleaning, and domain-specific data cleaning. Among these, the third approach, generic data cleaning, holds great practical value and can be applied effectively.

Published

2023-04-08

How to Cite

D. B. Shanmugam, J. Dhilipan, T. Prabhu, A. Sivasankari, & A. Vignesh. (2023). The Management of Data Quality Assessment in Big Data Presents a Complex Challenge, Accompanied by Various Issues Related to Data Quality. Research Highlights in Mathematics and Computer Science Vol. 8, 78–91. https://doi.org/10.9734/bpi/rhmcs/v8/18858D