VPCH: a consistent hashing algorithm for better load balancing in a Hadoop environment

Liu, Q; Cai, W; Shen, J; Wang, B; Fu, Z; Linge, N

Abstract

MapReduce (MR) is a popular programming model for processing large data sets across data clusters or grids, e.g. a Hadoop environment. Load balancing, a key factor affecting the performance of map resource distribution, has recently attracted considerable attention as a target for optimization. Current MR implementations distribute tasks to clusters using hashing with random modulo operations, which can lead to uneven data distribution and skewed loads, thereby degrading the performance of the entire distributed system. In this paper, a virtual partition consistent hashing (VPCH) algorithm is proposed for the reduce stage of MR processes, in order to achieve a better trade-off in job allocation. According to the results, our method reduces task execution time whether or not the MJR (mapreduce.job.reduce.slowstart.completedmaps) parameter is set.
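
The contrast drawn in the abstract between modulo hashing and consistent hashing can be illustrated with a minimal sketch. This is not the authors' VPCH algorithm, whose details are not given in the abstract; it is a generic consistent-hashing ring with virtual nodes, written in Python for brevity rather than as a Hadoop partitioner. The names `stable_hash`, `modulo_partition`, `VirtualNodeRing`, and the `virtual_nodes` parameter are hypothetical and chosen only for illustration.

```python
import bisect
import hashlib
from collections import Counter


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def modulo_partition(key: str, num_reducers: int) -> int:
    """Baseline: assign a key to a reducer by hash modulo the reducer count."""
    return stable_hash(key) % num_reducers


class VirtualNodeRing:
    """Generic consistent-hashing ring (illustrative; not the paper's VPCH)."""

    def __init__(self, num_reducers: int, virtual_nodes: int = 100):
        # Each reducer is placed on the ring many times (its virtual nodes),
        # which spreads its share of the hash space more evenly.
        points = []
        for r in range(num_reducers):
            for v in range(virtual_nodes):
                points.append((stable_hash(f"reducer-{r}#vn-{v}"), r))
        points.sort()
        self._hashes = [h for h, _ in points]
        self._owners = [r for _, r in points]

    def partition(self, key: str) -> int:
        # A key belongs to the first virtual node clockwise from its hash;
        # the modulo wraps around past the last point on the ring.
        i = bisect.bisect_right(self._hashes, stable_hash(key)) % len(self._hashes)
        return self._owners[i]


if __name__ == "__main__":
    keys = [f"key-{i}" for i in range(10_000)]
    ring = VirtualNodeRing(num_reducers=4)
    print("modulo:", Counter(modulo_partition(k, 4) for k in keys))
    print("ring:  ", Counter(ring.partition(k) for k in keys))
```

One property worth noting in this sketch: when a reducer is added to the ring, a key either keeps its previous owner or moves to the new reducer, whereas changing the reducer count under modulo hashing can reassign most keys.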

Citation

Liu, Q., Cai, W., Shen, J., Wang, B., Fu, Z., & Linge, N. (2015). VPCH: a consistent hashing algorithm for better load balancing in a Hadoop environment. In 2015 Third International Conference on Advanced Cloud and Big Data (pp. 69-72). IEEE. https://doi.org/10.1109/CBD.2015.21

Publication Date: Jan 1, 2015
Deposit Date: Dec 15, 2016
Pages: 69-72
Book Title: 2015 Third International Conference on Advanced Cloud and Big Data
ISBN: 9781467385374
DOI: https://doi.org/10.1109/CBD.2015.21
Publisher URL: http://dx.doi.org/10.1109/CBD.2015.21
