DORA: Feature selection for network-based intrusion detection models

Date of Publication

2012

Document Type

Bachelor's Thesis

Degree Name

Bachelor of Science in Computer Science

College

College of Computer Studies

Department/Unit

Computer Science

Thesis Adviser

Arlyn Verina L. Ong

Abstract/Summary

Intrusion Detection Systems (IDS) use models as a basis for detecting intrusions. To ensure that these models are comprehensive enough, a large, high-dimensional data set must be fed to the system. In this study, the data set contains a large amount of normal traffic data and a sufficient number of network intrusion samples to ensure that the model can correctly classify intrusions. Data sets are often noisy, meaning that they contain redundant data and irrelevant features that can compromise the classification accuracy and performance of the generated model. To avoid this, the redundant data must be filtered and the irrelevant features must be dropped. The goal of this study is to determine the best features for an intrusion detection model, a result that depends heavily on the feature selection algorithms tested against the same data set. The findings of the study show that the combined packet headers and n-grams feature set can dramatically increase the classification accuracy of the model being built. The results also showed that selecting only the best features from the entire feature set can increase the classification accuracy of the intrusion detection model even further. Based on the test results, and given the data set used, the best-performing algorithms are Decision Trees, while the best feature selection algorithm is N-Gram Information Gain.
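
For readers unfamiliar with information-gain feature selection over n-grams, the idea can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the record does not describe how N-Gram Information Gain was computed, so the sketch assumes byte-level n-grams treated as binary "present or absent" features, and every function name in it (`byte_ngrams`, `entropy`, `information_gain`, `top_k_ngrams`) is hypothetical.

```python
import math
from collections import Counter


def byte_ngrams(payload: bytes, n: int = 2) -> set:
    """Return the set of distinct byte n-grams that occur in a payload."""
    return {payload[i:i + n] for i in range(len(payload) - n + 1)}


def entropy(labels) -> float:
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(samples, labels, gram) -> float:
    """Information gain of the binary feature 'gram occurs in the sample'.

    `samples` is a list of n-gram sets, one per payload (an assumption of
    this sketch); `labels` holds the matching class labels.
    """
    total = len(labels)
    if total == 0:
        return 0.0
    present = [y for x, y in zip(samples, labels) if gram in x]
    absent = [y for x, y in zip(samples, labels) if gram not in x]
    conditional = (
        (len(present) / total) * entropy(present)
        + (len(absent) / total) * entropy(absent)
    )
    return entropy(labels) - conditional


def top_k_ngrams(payloads, labels, n: int = 2, k: int = 5):
    """Rank every observed n-gram by information gain and keep the best k."""
    samples = [byte_ngrams(p, n) for p in payloads]
    vocabulary = set().union(*samples)
    scored = [(information_gain(samples, labels, g), g) for g in vocabulary]
    # Highest gain first; ties are broken by the n-gram bytes for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(gram, gain) for gain, gram in scored[:k]]
```

Calling `top_k_ngrams(payloads, labels)` on a list of raw payloads and their class labels returns the n-grams whose presence best separates the classes; only those would then be passed to the classifier. The binary-presence encoding is one design choice among several, and frequency-based encodings are an equally plausible reading of the abstract.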

Abstract Format

html

Language

English

Format

Print

Accession Number

TU16770

Shelf Location

Archives, The Learning Commons, 12F, Henry Sy Sr. Hall

Physical Description

1 v. (various foliations) ; 28 cm.
