The Use of Deep Data Locality towards a Hadoop Performance Analysis Framework

Authors

  • WICKRAMASINGHE K., Information Institute of Technology, Colombo, Sri Lanka

Keywords:

deep data locality, HDFS, data locality, Hadoop, MapReduce

Abstract

In big data systems, Hadoop has been one of the foundational models in recent years, built on the principle that moving computation is cheaper than moving data. Indeed, an increase in HDFS data locality translates into improved system performance, and a higher share of data-local work on each machine reduces network traffic between nodes. However, a mathematical performance framework for Hadoop data locality has yet to be established. This study proposes a data-locality-based framework for analyzing Hadoop performance that covers the entire MapReduce procedure. The objective was to determine the extent to which the framework could be applied to improve Hadoop system performance, particularly through deep data locality. Across three tests (a physical test, a cloud test, and a simulation-based test), the deep data locality approach yielded a significant improvement in Hadoop performance; in particular, it led to a 34 percent improvement in the Hadoop system. The study concludes that the superiority of the proposed approach arises from its ability to reduce HDFS data movement.

Published

2023-05-20

How to Cite

Wickramasinghe, K. (2023). The Use of Deep Data Locality towards a Hadoop Performance Analysis Framework. International Journal of Communication and Computer Technologies, 8(1), 5–8. Retrieved from https://ijccts.org/index.php/pub/article/view/120

Section

Research Article