Date of Award

5-2020

Culminating Project Type

Thesis

Degree Name

Information Assurance: M.S.

Department

Information Assurance and Information Systems

College

Herberger School of Business

First Advisor

Akalanka Mailewa Dissanayaka

Second Advisor

Mark Schmidt

Third Advisor

David Robinson

Creative Commons License

Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Keywords and Subject Headings

Autoencoders, Deep Learning, Anomaly Detection, Artificial Intelligence, Machine Learning, Information Assurance, Intrusion Detection System

Abstract

With the recent advances in Internet-of-thing devices (IoT), cloud-based services, and diversity in the network data, there has been a growing need for sophisticated anomaly detection algorithms within the network intrusion detection system (NIDS) that can tackle advanced network threats. Advances in Deep and Machine learning (ML) has been garnering considerable interest among researchers since it has the capacity to provide a solution to advanced threats such as the zero-day attack. An Intrusion Detection System (IDS) is the first line of defense against network-based attacks compared to other traditional technologies, such as firewall systems. This report adds to the existing approaches by proposing a novel strategy to incorporate both supervised and unsupervised learning to Intrusion Detection Systems (IDS). Specifically, the study will utilize deep Autoencoder (DAE) as a dimensionality reduction tool and Support Vector Machine (SVM) as a classifier to perform anomaly-based classification. The study diverts from other similar studies by performing a thorough analysis of using deep autoencoders as a valid non-linear dimensionality tool by comparing it against Principal Component Analysis (PCA) and tuning hyperparameters that optimizes for 'F-1 Micro' score and 'Balanced Accuracy' since we are dealing with a dataset with imbalanced classes. The study employs robust analysis tools such as Precision-Recall Curves, Average-Precision score, Train-Test Times, t-SNE, Grid Search, and L1/L2 regularization. Our model will be trained and tested on a publicly available datasets KDDTrain+ and KDDTest+.

Share

COinS