Please use this identifier to cite or link to this item: http://dspace.bits-pilani.ac.in:8080/jspui/handle/123456789/8456
Full metadata record
DC Field | Value | Language
dc.contributor.author | Challa, Jagat Sesh | -
dc.contributor.author | Sharma, Yashvardhan | -
dc.date.accessioned | 2023-01-11T10:45:44Z | -
dc.date.available | 2023-01-11T10:45:44Z | -
dc.date.issued | 2012 | -
dc.identifier.uri | https://www.academia.edu/24606146/HADCLEAN_A_Hybrid_Approach_to_Data_Cleaning_in_Data_Warehouses | -
dc.identifier.uri | http://dspace.bits-pilani.ac.in:8080/xmlui/handle/123456789/8456 | -
dc.description.abstract | Data cleaning is an essential step in populating and maintaining data warehouses. Owing to likely differences in conventions between the external sources and the target data warehouse, as well as a variety of errors, data from external sources may not conform to the standards and requirements of the data warehouse. Therefore, data has to be transformed and cleaned before it is loaded into the warehouse so that downstream data analysis is reliable and accurate. This is usually accomplished through an Extract-Transform-Load (ETL) process. Typical data cleaning tasks include record matching, de-duplication, and column segmentation, which often go beyond traditional relational operators. This has led to the development of a broad range of methods intended to enhance the accuracy, and thereby the usability, of existing data. Data cleansing is the first and most critical step in a Business Intelligence (BI) or Data Warehousing (DW) project, yet easily the most underestimated. T. Redman [1] suggests that the cost associated with poor quality data is about 8-12% of the revenue of a typical organization. Thus, data cleaning is essential when building any enterprise data warehouse. | en_US
dc.language.iso | en | en_US
dc.publisher | IEEE | en_US
dc.subject | Computer Science | en_US
dc.subject | PNRS | en_US
dc.subject | HADCLEAN | en_US
dc.subject | Transitive closure | en_US
dc.subject | Phonetic algorithm | en_US
dc.subject | Data Warehouse | en_US
dc.title | HADCLEAN: A hybrid approach to data cleaning in data warehouses | en_US
dc.type | Article | en_US
Appears in Collections: Department of Computer Science and Information Systems

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.