MapReduce Performance Scaling Using Data Prefetching

Lee, Jung and Tae Kim, Kyung and Youn-Chen, Tso (2022) MapReduce Performance Scaling Using Data Prefetching. International Journal of Computer Techniques, 9 (3). pp. 26-31. ISSN 2394-2231

Text (MapReduce Performance Scaling Using Data Prefetching)
IJCT_TSOCHEN.pdf - Published Version

Recently, with the advent of social networks, bio-computing, and the Internet of Things, far more data is being generated than in earlier IT environments, and research on efficient large-scale data processing techniques is therefore being conducted. MapReduce is an effective programming model for data-intensive computational applications. A representative MapReduce implementation is Hadoop, which is developed and supported by the Apache Software Foundation. This paper proposes a data prefetching technique and a streaming technique to improve the performance of Hadoop MapReduce. One performance issue in Hadoop MapReduce is the delay caused by input data transmission during the MapReduce process. To minimize this data transfer time, a separate prefetching thread dedicated to data transfer was created, unlike in the existing MapReduce. As a result, data can be transferred while earlier data is still being processed by MapReduce, which reduces the overall data processing time. Even with prefetching, however, the job must still wait for the first data transmission because of the characteristics of Hadoop MapReduce. The streaming technique was applied to reduce this remaining waiting time. Mathematical modeling was performed to evaluate the proposed method, and the results confirmed that MapReduce with the streaming technique additionally applied outperformed both the existing Hadoop MapReduce and MapReduce with prefetching alone.
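
The overlap of data transfer and computation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and uses no Hadoop API: the functions `fetch_split`, `map_split`, `run_sequential`, and `run_prefetched`, and the `depth` parameter, are hypothetical names chosen for illustration, and `time.sleep` stands in for transfer and map time.

```python
import queue
import threading
import time


def fetch_split(split_id, transfer_s):
    """Hypothetical stand-in for transferring one input split."""
    time.sleep(transfer_s)  # simulated transfer delay
    return list(range(split_id * 4, split_id * 4 + 4))


def map_split(records, compute_s):
    """Hypothetical stand-in for the map computation on one split."""
    time.sleep(compute_s)  # simulated compute time
    return sum(records)


def run_sequential(n_splits, transfer_s, compute_s):
    """Baseline: each split is transferred, then processed, one after another."""
    results = []
    for i in range(n_splits):
        records = fetch_split(i, transfer_s)
        results.append(map_split(records, compute_s))
    return results


def run_prefetched(n_splits, transfer_s, compute_s, depth=2):
    """A separate thread transfers upcoming splits while the current one is processed."""
    buf = queue.Queue(maxsize=depth)  # bounded buffer limits how far ahead we fetch

    def prefetcher():
        for i in range(n_splits):
            buf.put(fetch_split(i, transfer_s))  # blocks when the buffer is full
        buf.put(None)  # sentinel: no more splits

    worker = threading.Thread(target=prefetcher, daemon=True)
    worker.start()

    results = []
    while True:
        records = buf.get()  # the first get still waits for the first transfer
        if records is None:
            break
        results.append(map_split(records, compute_s))
    worker.join()
    return results


if __name__ == "__main__":
    for name, runner in (("sequential", run_sequential), ("prefetched", run_prefetched)):
        start = time.perf_counter()
        runner(5, 0.05, 0.05)
        print(f"{name}: {time.perf_counter() - start:.2f} s")
```

In this idealized setting with equal transfer and compute times per split, the prefetched run should take roughly one transfer plus all compute steps, rather than the sum of all transfers and all compute steps. The first `buf.get()` still blocks until the first split arrives, which corresponds to the initial wait that the abstract says the streaming technique is meant to reduce.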

Item Type: Article
Keywords: MapReduce, Prefetching, Streaming
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
Divisions: Fakultas Matematika dan ilmu Pengetahuan Alam
Depositing User: Dr Tso Chen
Date Deposited: 13 Jul 2022 04:27
Last Modified: 13 Jul 2022 04:27