Date of Award
12-2016
Culminating Project Type
Thesis
Degree Name
Information Assurance: M.S.
Department
Information Assurance and Information Systems
College
Herberger School of Business
First Advisor
Dennis Guster
Second Advisor
Jim Chen
Third Advisor
Mark Schmidt
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
Abstract
The goal of this thesis is to establish a benchmark comparison of the efficiency of custom Java-based code relative to equivalent MapReduce jobs. Four separate tasks were completed with both custom Java and MapReduce code to produce identical output. Network pcap data was analyzed with tshark, and the resulting text file was used as input for each program. Each code base was required to determine the following information from the tshark data: the number of port access attempts by source IP address, the total traffic volume by IP protocol, the average packet length by source IP address, and the percentage of traffic volume by source IP address. All tests were performed within an Amazon Web Services environment, and multiple test runs were executed to ensure that the measured efficiency was not affected by possible shared resources. A cost-benefit analysis was performed to determine the point at which MapReduce and Hadoop clusters are worth the extra cost of additional hardware, based upon the cost comparison of one AWS EC2 instance versus a four-cluster HDFS system.
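As an illustration only, the four aggregations named in the abstract could be sketched in plain Java along the following lines. This is not the thesis code. The input layout (one tab-separated line per packet holding source IP, IP protocol, destination port, and frame length, as `tshark -T fields -e ip.src -e ip.proto -e tcp.dstport -e frame.len` might emit) is an assumption, as are the names `PacketStats` and `Packet` and the choice to count each record as one port access attempt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; class, method, and field names are hypothetical.
final class PacketStats {

    // One parsed packet record. Assumed layout: srcIp \t protocol \t dstPort \t length
    static final class Packet {
        final String srcIp;
        final String protocol;
        final int dstPort;
        final long length;

        Packet(String srcIp, String protocol, int dstPort, long length) {
            this.srcIp = srcIp;
            this.protocol = protocol;
            this.dstPort = dstPort;
            this.length = length;
        }

        static Packet parse(String line) {
            String[] f = line.split("\t");
            return new Packet(f[0], f[1], Integer.parseInt(f[2]), Long.parseLong(f[3]));
        }
    }

    // Task 1: number of port access attempts (records) per source IP.
    static Map<String, Long> accessAttemptsBySource(List<Packet> packets) {
        Map<String, Long> counts = new TreeMap<>();
        for (Packet p : packets) counts.merge(p.srcIp, 1L, Long::sum);
        return counts;
    }

    // Task 2: total traffic volume (sum of frame lengths) per IP protocol.
    static Map<String, Long> volumeByProtocol(List<Packet> packets) {
        Map<String, Long> volume = new TreeMap<>();
        for (Packet p : packets) volume.merge(p.protocol, p.length, Long::sum);
        return volume;
    }

    // Helper: total traffic volume per source IP.
    static Map<String, Long> volumeBySource(List<Packet> packets) {
        Map<String, Long> volume = new TreeMap<>();
        for (Packet p : packets) volume.merge(p.srcIp, p.length, Long::sum);
        return volume;
    }

    // Task 3: average packet length per source IP.
    static Map<String, Double> averageLengthBySource(List<Packet> packets) {
        Map<String, Long> counts = accessAttemptsBySource(packets);
        Map<String, Double> averages = new TreeMap<>();
        volumeBySource(packets).forEach(
                (ip, bytes) -> averages.put(ip, (double) bytes / counts.get(ip)));
        return averages;
    }

    // Task 4: each source IP's share of total traffic volume, in percent.
    static Map<String, Double> volumePercentBySource(List<Packet> packets) {
        Map<String, Long> volume = volumeBySource(packets);
        long total = 0;
        for (long bytes : volume.values()) total += bytes;
        Map<String, Double> percent = new TreeMap<>();
        for (Map.Entry<String, Long> e : volume.entrySet()) {
            percent.put(e.getKey(), total == 0 ? 0.0 : 100.0 * e.getValue() / total);
        }
        return percent;
    }

    public static void main(String[] args) {
        // Made-up sample records, not data from the thesis.
        List<Packet> packets = new ArrayList<>();
        packets.add(Packet.parse("10.0.0.1\t6\t80\t100"));
        packets.add(Packet.parse("10.0.0.1\t6\t443\t300"));
        packets.add(Packet.parse("10.0.0.2\t17\t53\t100"));
        System.out.println(accessAttemptsBySource(packets));
        System.out.println(volumeByProtocol(packets));
        System.out.println(averageLengthBySource(packets));
        System.out.println(volumePercentBySource(packets));
    }
}
```

Each method is a single pass that groups by a key and reduces, which is the same shape a MapReduce job would give these tasks: the mapper emits the key (source IP or protocol) with a count or length, and the reducer sums or averages per key.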
Recommended Citation
Munsch, Jonathan C., "Network Log Analysis Performance Comparison - Java vs. MapReduce" (2016). Culminating Projects in Information Assurance. 15.
https://repository.stcloudstate.edu/msia_etds/15
Comments/Acknowledgements
I would like to thank my wife, Andrea Munsch, for all her support and guidance during the entire thesis process. I could not have completed this without her love, encouragement, and advice.