A Context-Based Performance Enhancement Algorithm for Columnar Storage in MapReduce with Hive

Sharma, Yashvardhan


dc.contributor.author	Sharma, Yashvardhan
dc.date.accessioned	2023-01-02T09:34:46Z
dc.date.available	2023-01-02T09:34:46Z
dc.date.issued	2013
dc.identifier.uri	https://www.igi-global.com/article/a-context-based-performance-enhancement-algorithm-for-columnar-storage-in-mapreduce-with-hive/105509
dc.identifier.uri	http://dspace.bits-pilani.ac.in:8080/xmlui/handle/123456789/8207
dc.description.abstract	To achieve high reliability and scalability, most large-scale data warehouse systems have adopted a cluster-based architecture. In this context, MapReduce has emerged as a promising architecture for large-scale data warehousing and data analytics on commodity clusters. The MapReduce framework offers several attractive features, such as high fault tolerance, scalability, and support for a wide range of hardware, from low-end to high-end. However, these benefits come at a substantial cost in performance. In this paper, we propose the design of a novel cluster-based data warehouse system, Daenyrys, for data processing on Hadoop, an open-source implementation of the MapReduce framework maintained under the Apache umbrella. Daenyrys is a data management system that can decide on the optimum partitioning scheme for Hadoop's distributed file system (DFS). The optimum partitioning scheme improves the performance of the complete framework, and the choice of scheme depends on the query context. In Daenyrys, columns are formed into optimized groups that provide the basis for partitioning tables vertically. Daenyrys includes an algorithm that monitors the context of current queries and, based on its observations, re-partitions the DFS for better performance and resource utilization. In the proposed system, Hive, a MapReduce-based SQL-like query engine, is supported on top of the DFS.	en_US
dc.language.iso	en	en_US
dc.publisher	IGI Global	en_US
dc.subject	Computer Science	en_US
dc.subject	Algorithm	en_US
dc.subject	Data Warehouse System	en_US
dc.title	A Context-Based Performance Enhancement Algorithm for Columnar Storage in MapReduce with Hive	en_US
dc.type	Article	en_US
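
The abstract describes forming columns into groups based on the context of current queries, with those groups serving as the basis for vertical partitioning. The record does not give the algorithm itself, so the following is only a minimal sketch of one plausible approach, not the method used in Daenyrys: it assumes the query context is available as a list of column sets (one per observed query) and merges columns that are read together by at least `min_support` queries. The function names, the `min_support` threshold, and the sample workload are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_access_counts(query_columns):
    """Count how often each pair of columns is read by the same query."""
    counts = Counter()
    for cols in query_columns:
        for a, b in combinations(sorted(set(cols)), 2):
            counts[(a, b)] += 1
    return counts


def group_columns(all_columns, query_columns, min_support=2):
    """Merge columns that are co-accessed by at least `min_support` queries.

    Hypothetical heuristic (union-find over frequent column pairs); the
    source does not specify how Daenyrys forms its column groups.
    """
    parent = {c: c for c in all_columns}

    def find(c):
        # Follow parent links to the group representative (path halving).
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for (a, b), n in co_access_counts(query_columns).items():
        if n >= min_support and a in parent and b in parent:
            parent[find(a)] = find(b)

    groups = {}
    for c in all_columns:
        groups.setdefault(find(c), []).append(c)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical workload: each inner list is the set of columns one query reads.
columns = ["id", "name", "price", "qty", "region"]
workload = [
    ["price", "qty"],
    ["price", "qty", "region"],
    ["id", "name"],
    ["id", "name"],
    ["region"],
]
groups = group_columns(columns, workload)
print(groups)  # [['id', 'name'], ['price', 'qty'], ['region']]
```

Under these assumptions, each resulting group would be stored as one vertical partition, and re-running the grouping on a newer window of queries would indicate whether a re-partitioning is worth considering.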

This item appears in the following Collection(s)

Department of Computer Science and Information Systems [1099]
