Verification and Implementation of Call Sequence Analysing Algorithm
Toikka, Santeri (2016-05)
Turun yliopisto
The permanent address of the publication is:
https://urn.fi/URN:NBN:fi-fe2025031417980
Abstract
In this thesis, a Call Sequence Analysing algorithm (CSA algorithm) is analysed and verified. The goal of the algorithm is to label similar calls into groups for more accurate error analysis.
The thesis presents the main categories of machine learning and the common differences between studies. It also presents the related data types and the environments in which the CSA algorithm is designed to operate.
CSA reads network management data series as input and groups similar series together. The performance of the algorithm is evaluated with the help of 94 manually sorted sample series. The series data is collected from a live 3G Radio Network Controller. Two series are compared, and a describing figure, the distance, is calculated. This thesis focuses specifically on the Hamming distance.
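The Hamming distance between two series can be illustrated with a minimal sketch. The function name `hamming_distance` and the sample event names are hypothetical and are not taken from the thesis; the sketch assumes both series have the same length, as the Hamming distance requires.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length series differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs series of equal length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical call sequences; real series come from network management data.
series_a = ["setup", "auth", "bearer", "release"]
series_b = ["setup", "auth", "fail", "release"]

print(hamming_distance(series_a, series_b))  # 1
```

A distance of 0 means the two series are identical position by position; larger values mean more positions differ.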
Based on this distance, series are grouped together. A cluster is formed by adjusting the maximum allowed distance, which defines how dissimilar two series can be while still being considered to belong to the same group. Increasing the maximum allowed distance also increases the number of false positives.
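The effect of the maximum allowed distance can be sketched as follows. The abstract does not state which grouping procedure the thesis uses, so the greedy first-fit strategy below, the names `hamming` and `group_series`, and the sample strings are all assumptions made for illustration only.

```python
def hamming(a, b):
    """Number of differing positions between two equal-length series."""
    return sum(x != y for x, y in zip(a, b))


def group_series(series_list, max_distance):
    """Greedy grouping (an assumed strategy, not the thesis's method).

    Each series joins the first group whose first member is within
    max_distance; otherwise it starts a new group.
    """
    groups = []
    for s in series_list:
        for g in groups:
            # Hamming distance is only defined for equal-length series.
            if len(g[0]) == len(s) and hamming(g[0], s) <= max_distance:
                g.append(s)
                break
        else:
            groups.append([s])
    return groups


# Hypothetical series, encoded as strings of event symbols.
calls = ["AAAA", "AAAB", "ABBB", "BBBB"]

print(len(group_series(calls, 0)))  # 4
print(len(group_series(calls, 1)))  # 2
print(len(group_series(calls, 4)))  # 1
```

With a larger threshold, fewer and larger groups are formed, so series that do not truly belong together are more likely to share a group.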
The clustering composition is evaluated against a manually sorted reference clustering. Performance is presented as a function of the maximum allowed distance.
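One common way to compare a clustering with a reference is to count pairs of series. The abstract does not say which measure the thesis uses, so the pair-counting approach below, the function name `pair_counts`, and the sample labels are assumptions made for illustration.

```python
from itertools import combinations


def pair_counts(predicted, reference):
    """Compare two clusterings given as label lists in the same order.

    Returns (true positives, false positives, false negatives), counted
    over pairs of series: a false positive is a pair grouped together in
    the prediction but kept apart in the reference.
    """
    tp = fp = fn = 0
    for i, j in combinations(range(len(predicted)), 2):
        same_pred = predicted[i] == predicted[j]
        same_ref = reference[i] == reference[j]
        if same_pred and same_ref:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_ref:
            fn += 1
    return tp, fp, fn


# Hypothetical labels: the reference has two groups of two series each.
reference = [0, 0, 1, 1]
tight = [0, 1, 2, 3]  # very small maximum distance: every series alone
loose = [0, 0, 0, 0]  # very large maximum distance: one big group

print(pair_counts(tight, reference))  # (0, 0, 2)
print(pair_counts(loose, reference))  # (2, 4, 0)
```

Evaluating these counts for each candidate maximum distance gives performance as a function of the threshold: a tight threshold misses true pairs, while a loose one adds false positives.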
The thesis shows that the selected attribute, edit distance, does not perform adequately as a distance metric, and that the cluster composition does not reach an acceptable level. As further study, a toolset and working methodology are suggested for faster prototyping.