Change search
ReferencesLink to record
Permanent link

Direct link
Data Analysis of Minimally-Structured Heterogeneous Logs: An experimental study of log template extraction and anomaly detection based on Recurrent Neural Network and Naive Bayes.
KTH, School of Computer Science and Communication (CSC).
2016 (English)Independent thesis Advanced level (degree of Master (Two Years)), 80 credits / 120 HE creditsStudent thesis
Abstract [en]

Nowadays, the ideas of continuous integration and continuous delivery are under heavy usage in order to achieve rapid software development speed and quick product delivery to the customers with good quality. During the process ofmodern software development, the testing stage has always been with great significance so that the delivered software is meeting all the requirements and with high quality, maintainability, sustainability, scalability, etc. The key assignment of software testing is to find bugs from every test and solve them.

The developers and test engineers at Ericsson, who are working on a large scale software architecture, are mainly relying on the logs generated during the testing, which contains important information regarding the system behavior and software status, to debug the software. However, the volume of the data is too big and the variety is too complex and unpredictable, therefore, it is very time consuming and with great efforts for them to manually locate and resolve the bugs from such vast amount of log data.

The objective of this thesis project is to explore a way to conduct log analysis efficiently and effectively by applying relevant machine learning algorithms in order to help people quickly detect the test failure and its possible causalities. In this project, a method of preprocessing and clusering original logs is designed and implemented in order to obtain useful data which can be fed to machine learning algorithms. The comparable log analysis, based on two machine learning algorithms - Recurrent Neural Network and Naive Bayes, is conducted for detecting the place of system failures and anomalies. Finally, relevant experimental results are provided and analyzed.

Place, publisher, year, edition, pages
2016. , 88 p.
Keyword [en]
Data analysis, Log analysis, RNN, Naive Bayes
National Category
Engineering and Technology
URN: urn:nbn:se:kth:diva-191334OAI: diva2:956109
External cooperation
Educational program
Master of Science - Embedded Systems
2016-08-26, 12:24 (English)
Available from: 2016-08-30 Created: 2016-08-29 Last updated: 2016-08-30Bibliographically approved

Open Access in DiVA

fulltext(2803 kB)176 downloads
File information
File name FULLTEXT01.pdfFile size 2803 kBChecksum SHA-512
Type fulltextMimetype application/pdf

By organisation
School of Computer Science and Communication (CSC)
Engineering and Technology

Search outside of DiVA

GoogleGoogle Scholar
Total: 176 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

Total: 30 hits
ReferencesLink to record
Permanent link

Direct link