Catur Approach to Assess the Quality of Big Data Using Decision Tree and Multidimensional Model

dc.contributor.author: K., Pradheep Kumar
dc.date.accessioned: 2023-01-17T10:30:43Z
dc.date.available: 2023-01-17T10:30:43Z
dc.date.issued: 2015
dc.description.abstract: This paper designs and develops multidimensional and decision-tree-based frameworks for assessing the quality of big data. Since the datasets in a big data environment are both complex and multidimensional, the quality of big data is better viewed through multiple dimensions. Most enterprises face a number of challenges in managing the quality of big data during initial setup, during migration from a traditional database, or after the big data environment has been built. This paper uses a multidimensional model proposed for Knowledge Management Systems to design critical quality dimensions for big data. Based on an extensive literature review, this work proposes a classification of big data quality into quality factors such as accessibility, consistency, integrity, usability, relevance, completeness, compatibility, conformity and accuracy. Since very few appropriate data stewards or frameworks are available for confirming quality dimensions, this paper aims to develop hybrid approaches that use a multidimensional model and decision-tree-based methods for automatic quality checks. Using a decision tree, multiple if-then rules can be formed to decide on the quality of data based on specific constraints developed for big data. The paper also aims to provide a quality framework and measures that can serve as a data quality firewall, analogous to an internet firewall, which proactively finds quality issues and applies rules based on decision tree algorithms to prevent bad, inconsistent, or invalid data or access from entering the big data environment. [en_US]
dc.identifier.uri: http://www.ajbasweb.com/old/ajbas/2015/July/503-508.pdf
dc.identifier.uri: http://dspace.bits-pilani.ac.in:8080/xmlui/handle/123456789/8525
dc.language.iso: en [en_US]
dc.publisher: AENSI Publisher [en_US]
dc.subject: Computer Science [en_US]
dc.subject: Big Data [en_US]
dc.subject: Decision Tree [en_US]
dc.subject: Data Quality [en_US]
dc.subject: Assessment [en_US]
dc.subject: Meta data [en_US]
dc.title: Catur Approach to Assess the Quality of Big Data Using Decision Tree and Multidimensional Model [en_US]
dc.type: Article [en_US]
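
The abstract describes if-then rules, derived from a decision tree, that act as a data quality firewall in front of a big data environment. The following is a minimal sketch of that idea only, not the paper's framework: the record schema (`id`, `email`, `age`), the function names `assess_record` and `firewall`, the three outcomes, and the thresholds are all hypothetical, and the tree is written by hand rather than learned from data. Each branch tests one of the quality factors named in the abstract (completeness, conformity, accuracy).

```python
from typing import Any, Dict, Iterable, List

# Hypothetical schema: these field names are assumptions for illustration.
REQUIRED_FIELDS = ("id", "email", "age")


def assess_record(record: Dict[str, Any]) -> str:
    """Walk a small hand-written decision tree of if-then rules.

    Returns "accept", "quarantine", or "reject" (outcome names are assumed).
    """
    # Root node: completeness. Are all required fields present and non-empty?
    missing = [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]
    if missing:
        # A record without its key cannot be repaired later, so reject it;
        # other gaps are set aside for review.
        return "reject" if "id" in missing else "quarantine"

    # Second level: conformity. Does the email look like an address at all?
    if "@" not in str(record["email"]):
        return "quarantine"

    # Third level: accuracy. Is the age an integer in a plausible range?
    age = record["age"]
    if not isinstance(age, int) or not 0 <= age <= 120:
        return "reject"

    return "accept"


def firewall(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Let only records that pass every rule into the downstream store."""
    return [r for r in records if assess_record(r) == "accept"]


if __name__ == "__main__":
    incoming = [
        {"id": 1, "email": "a@example.org", "age": 30},
        {"id": 2, "email": "not-an-address", "age": 41},
        {"email": "b@example.org", "age": 25},
    ]
    for rec in incoming:
        print(rec, "->", assess_record(rec))
```

In a real deployment the rules would more plausibly come from a decision tree trained on labelled good and bad records, with one branch per quality dimension; the hand-written version above only shows the shape of the resulting if-then checks.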
