Unsupervised Host Profiling Based on Traffic Behaviour
Ahde, Heini (2018-11-01)
This article/publication has not been deposited in UTUPub. However, the publication record may contain a link to a copy of the article/publication deposited elsewhere.
Turun yliopisto
Abstract
As a consequence of digitization, cyberattacks have become a more prevalent threat to
organizations. One option for detecting attack attempts is an intrusion detection system,
which is designed to monitor a system and raise an alert when it encounters abnormal behaviour.
Intrusion detection systems can be categorized by architecture as host-based (HIDS) or
network-based (NIDS) systems. These systems can either decide whether behaviour is
abnormal based on predetermined rules, or classify behaviour as
harmful by looking for traffic exceptions through profiling. The latter is a more
reliable approach to detecting intrusions, as it also detects novel attacks. However, the
amount of data stored worldwide is constantly growing, and an increasing number of datasets
consist of so-called unsupervised data, whose content or structure is unknown. Creating profiles
requires prior knowledge of which features differentiate profiles from each other.
Data mining may therefore be an effective addition to an intrusion detection
system, as it uncovers previously unknown knowledge in large amounts of data. In addition, data
mining can filter out irrelevant data so that intrusion attempts can be detected almost in real
time.
This master’s thesis deals with intrusion detection systems, with a particular focus on
unsupervised host behavioural profiling. The literature section examines various data
mining techniques for intrusion detection systems. It also describes the complications
of implementations in these sub-areas and possible solutions to them. The experimental study
section describes methods that can be used to implement host profiling and presents
network connection clustering on experimental unsupervised data collected using
F-Secure Rapid Detection Service.
The results of this master’s thesis indicate that data mining methods bring added value
to intrusion detection systems that observe unsupervised data. Despite this, multidimensional
datasets and the random behaviour of systems pose challenges to the reliability of
these intrusion detection systems and may manifest as false alarms. This thesis provides a
good basis for further development of profiling for Rapid Detection Service and other
unsupervised data processors.