Log Classification using a Shallow-and-Wide Convolutional Neural Network and Log Keys
KTH, School of Electrical Engineering and Computer Science (EECS).
2018 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
Logklassificering med ett grunt-och-brett faltningsnätverk och loggnycklar (Swedish)
Abstract [en]

A dataset of logs describing test results from a single Build and Test process, used in a Continuous Integration setting, is used to automate the categorization of logs by failure type. Two features, words and log keys, are evaluated using unordered document matrices as document representations to determine the viability of log keys. Multinomial Naive Bayes (MNB) classifiers and multi-class Support Vector Machines (SVM) are used to establish the performance of each feature. The experiment indicates that log keys perform equivalently to words while greatly reducing the dictionary size. Three different multi-layer perceptrons are evaluated on the log key document matrices and achieve slightly higher cross-validation accuracies than the SVM. A shallow-and-wide Convolutional Neural Network (CNN) is then designed using temporal sequences of log keys as document representations. The top-performing model of each architecture is evaluated on a test set, except for the MNB classifiers, which performed poorly during cross-validation. The test set evaluation indicates that the CNN is superior to the other models.
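
The difference between the two features can be illustrated with a minimal sketch. The abstract does not say how log keys are extracted, so the masking rule below (any token containing a digit is treated as a variable parameter) is an assumption, as are the sample log lines and the helper names `to_log_key` and `count_matrix`. The sketch builds an unordered count matrix for each feature and compares the dictionary sizes.

```python
from collections import Counter


def to_log_key(line: str) -> str:
    """Reduce a log line to a log key by masking variable tokens.

    Assumption: any whitespace-separated token that contains a digit
    is a variable parameter and is replaced by the placeholder "<*>".
    """
    tokens = line.split()
    return " ".join("<*>" if any(c.isdigit() for c in t) else t for t in tokens)


def count_matrix(documents, vocabulary):
    """Build an unordered document matrix: one row of term counts per document."""
    index = {term: i for i, term in enumerate(vocabulary)}
    rows = []
    for doc in documents:
        row = [0] * len(vocabulary)
        for term, n in Counter(doc).items():
            row[index[term]] = n
        rows.append(row)
    return rows


# Hypothetical logs, one list of lines per log file.
logs = [
    ["test 17 started", "test 17 failed after 320 ms"],
    ["test 42 started", "test 42 passed after 95 ms"],
]

# Feature 1: individual words. Feature 2: one log key per line.
word_docs = [[w for line in log for w in line.split()] for log in logs]
key_docs = [[to_log_key(line) for line in log] for log in logs]

word_vocab = sorted({w for doc in word_docs for w in doc})
key_vocab = sorted({k for doc in key_docs for k in doc})

word_matrix = count_matrix(word_docs, word_vocab)
key_matrix = count_matrix(key_docs, key_vocab)

print("word dictionary size:", len(word_vocab))
print("log key dictionary size:", len(key_vocab))
```

Either matrix could then be passed to a classifier such as an MNB or an SVM. A sequence model such as the CNN would instead consume the ordered list of log keys in `key_docs`.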

Abstract [sv]

Ett dataset som består av loggar som beskriver resultat av test från en bygg- och testprocess, använt i en miljö med kontinuerlig integration, används för att automatiskt kategorisera loggar enligt olika feltyper. Två olika sorters indata evalueras, ord och loggnycklar, där icke- ordnade dokumentmatriser används som dokumentrepresentationer för att avgöra loggnycklars användbarhet. Experimentet använder multinomial naiv bayes, MNB, som klassificerare och multiklass-supportvektormaskiner, SVM, för att avgöra prestandan för de olika sorternas indata. Experimentet indikerar att loggnycklar är ekvivalenta med ord medan loggnycklar har mycket mindre ordboksstorlek. Tre olika multi-lager-perceptroner evalueras på loggnyckel-dokumentmatriser och får något högre exakthet i krossvalideringen jämfört med SVM. Ett grunt-och-brett faltningsnätverk, CNN, designas med tidsmässiga sekvenser av loggnycklar som dokumentrepresentationer. De topppresterande modellerna av varje modellarkitektur evalueras på ett testset, utom för MNB-klassificerarna då MNB har dålig prestanda under krossvalidering. Evalueringen av testsetet indikerar att CNN:en är bättre än de andra modellerna.

Place, publisher, year, edition, pages
2018.
Series
TRITA-EECS-EX ; 2018-635
Keywords [en]
log classification, log keys, natural language processing
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:kth:diva-239561
OAI: oai:DiVA.org:kth-239561
DiVA, id: diva2:1265893
External cooperation
Joakim Oscarsson, Cybercom
Educational program
Master of Science in Engineering - Information and Communication Technology; Master of Science - Machine Learning
Presentation
2018-09-11, 4523, Lindstedtsvägen 5, Stockholm, 11:00 (English)
Supervisors
Examiners
Available from: 2018-12-03 Created: 2018-11-26 Last updated: 2018-12-03
Bibliographically approved

Open Access in DiVA

fulltext (1080 kB), 5 downloads
File information
File name FULLTEXT01.pdf
File size 1080 kB
Checksum SHA-512
fb214856df038181cdfc8105a6eb49cb60e2f803cca3c8e2e86b00a0304fd6192740cf78f6c44df0149fe5ffe240a923ec68162fc54b2798d5354ae2efd513a9
Type fulltext
Mimetype application/pdf

Total: 5 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.
