Date of Award
5-2017
Culminating Project Type
Starred Paper
Degree Name
Computer Science: M.S.
Department
Computer Science and Information Technology
College
School of Science and Engineering
First Advisor
Jie Hu Meichsner
Second Advisor
Andrew A. Anda
Third Advisor
Jim Q. Chen
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
Keywords and Subject Headings
Parallel processing, MapReduce, Hadoop, Big Data
Abstract
We are entering the era of big data. Datasets have grown from small to extremely large scale, which brings challenges as well as benefits: traditional data processing methods find such datasets increasingly difficult to handle. Because serial methods cannot feasibly handle big data problems, many companies have started to invest in parallel processing frameworks and systems for their own products. Parallel database systems, MapReduce, Hadoop, Pig, Hive, Spark, and Twister are some examples. Each of these frameworks and systems can handle certain kinds of big data problems, but none of them covers all big data issues. The biggest challenge is therefore how to use existing parallel frameworks and systems wisely to deal with large-scale data. We investigate and analyze the performance of parallel processing for big data. We review and analyze various parallel processing architectures and frameworks and their capabilities for large-scale data. We also present the potential challenges that these techniques face, organized by the characteristics of big data. Finally, we present possible solutions to those challenges.
Recommended Citation
Luo, Cheng, "Survey of Parallel Processing on Big Data" (2017). Culminating Projects in Computer Science and Information Technology. 18.
https://repository.stcloudstate.edu/csit_etds/18
Comments/Acknowledgements
I would like to thank my advisor, Dr. Meichsner, for offering a great deal of valuable help and many suggestions on my paper. Without her help, I could not have finished this paper smoothly. I would also like to thank the committee members, Dr. Anda and Dr. Chen, for sharing their valuable time and advice on my research.