Automating the Detection of Code Vulnerabilities by Analyzing GitHub Issues
Delft University of Technology. ORCID iD: 0009-0007-7989-6725
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. ORCID iD: 0009-0000-4604-1180
RISE Research Institutes of Sweden. ORCID iD: 0000-0002-9780-873X
Red Hat. ORCID iD: 0000-0002-0722-2656
2025 (English) In: Proceedings of IEEE/ACM International Workshop on Large Language Models for Code 2025, LLM4Code 2025, Institute of Electrical and Electronics Engineers (IEEE), 2025. Conference paper, Published paper (Refereed)
Abstract [en]

In today’s digital landscape, the importance of timely and accurate vulnerability detection has significantly increased. This paper presents a novel approach that leverages transformer-based models and machine learning techniques to automate the identification of software vulnerabilities by analyzing GitHub issues. We introduce a new dataset specifically designed for classifying GitHub issues relevant to vulnerability detection. We then examine various classification techniques to determine their effectiveness. The results demonstrate the potential of this approach for real-world application in early vulnerability detection, which could substantially reduce the window of exploitation for software vulnerabilities. This research makes a key contribution to the field by providing a scalable and computationally efficient framework for automated detection, enabling the prevention of compromised software usage before official notifications. This work has the potential to enhance the security of open-source software ecosystems.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025.
Keywords [en]
Vulnerability Detection, Transformer-based Models, Large Language Models, LLMs, Embedding Models
National Category
Computer Systems; Computer Sciences; Computer Vision and Learning Systems
Identifiers
URN: urn:nbn:se:kth:diva-374904
DOI: 10.1109/LLM4Code66737.2025.00010
ISI: 001554529600006
Scopus ID: 2-s2.0-105009082420
ISBN: 979-8-3315-2615-3 (print)
OAI: oai:DiVA.org:kth-374904
DiVA, id: diva2:2025736
Conference
2025 IEEE/ACM International Workshop on Large Language Models for Code, LLM4Code 2025, Ottawa, ON, Canada, May 3, 2025
Projects
Digital Futures
Funder
Knut and Alice Wallenberg Foundation
Vinnova, 2023-03003
Swedish Research Council, 2021-0421
Note

Part of ISBN 979-8-3315-2615-3

QC 20260108

Available from: 2026-01-07 Created: 2026-01-07 Last updated: 2026-01-08 Bibliographically approved

Open Access in DiVA

fulltext (1320 kB), 24 downloads
File information
File name: FULLTEXT01.pdf
File size: 1320 kB
Checksum SHA-512: c5b160ff17325e97f8b229e410b45040651d65be826cdd255a79b7f5454fb40176cb2261b358154a53f964591c6389768cb7f45e440d84638d666f7da729c78f
Type: fulltext
Mimetype: application/pdf

Authority records

Wang, Changjie; Scazzariello, Mariano; Kostic, Dejan; Chiesa, Marco

Search in DiVA

By author/editor
Cipollone, Daniele; Wang, Changjie; Scazzariello, Mariano; Ferlin, Simone; Izadi, Maliheh; Kostic, Dejan; Chiesa, Marco
By organisation
Software and Computer systems, SCS
Computer Systems; Computer Sciences; Computer Vision and Learning Systems
