kth.se Publications
Publications (10 of 99)
Baudry, B. & Monperrus, M. (2025). Humor for graduate training. ACM Inroads
2025 (English). In: ACM Inroads, ISSN 2153-2184, E-ISSN 2153-2192. Article in journal (Refereed). Accepted
Abstract [en]

Humor genuinely engages graduate students with their scientific training.

Keywords
humor; higher education
National Category
Engineering and Technology
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-362677 (URN); 10.1145/3730408 (DOI)
Note

QC 20250424

Available from: 2025-04-23. Created: 2025-04-23. Last updated: 2025-06-13. Bibliographically approved.
Monperrus, M. (2025). Most Cited Papers in Software Engineering 2013-2023.
2025 (English). Report (Other academic)
Abstract [en]

This compilation presents a list of the most cited research papers in software engineering from 2013 to 2023, published in leading academic venues. By leveraging APIs from CrossRef and Semantic Scholar, we systematically gather and rank influential works based on citation metrics, providing a valuable resource for researchers, educators, and industry professionals seeking to understand the field. The document can also help individuals substantiate their academic record with impact data. Full bibliometric data is accessible in the accompanying repository.
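As an illustration of the ranking step the abstract describes, the sketch below orders a handful of made-up paper records by citation count. In the real pipeline the counts would come from the CrossRef or Semantic Scholar APIs (for CrossRef, the `is-referenced-by-count` field of a `/works/{doi}` response); the titles, numbers, and the `rank_by_citations` helper here are placeholders, not the report's actual code.

```python
# Illustrative sketch: rank papers by citation count, as the report's
# methodology describes. The records below are hard-coded placeholders;
# a real pipeline would populate "citations" from bibliometric APIs.

def rank_by_citations(papers):
    """Return papers sorted by descending citation count."""
    return sorted(papers, key=lambda p: p["citations"], reverse=True)

papers = [
    {"title": "Paper A", "citations": 120},
    {"title": "Paper B", "citations": 450},
    {"title": "Paper C", "citations": 300},
]

for rank, p in enumerate(rank_by_citations(papers), start=1):
    print(f"{rank}. {p['title']} ({p['citations']} citations)")
    # prints Paper B, Paper C, Paper A in rank order
```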

Publisher
p. 61
National Category
Software Engineering
Identifiers
urn:nbn:se:kth:diva-362600 (URN); 10.5281/zenodo.14885765 (DOI)
Note

QC 20250424

Available from: 2025-04-22. Created: 2025-04-22. Last updated: 2025-04-30. Bibliographically approved.
Oliveira, D., Santos, R., de Oliveira, B., Monperrus, M., Castor, F. & Madeiral, F. (2025). Understanding Code Understandability Improvements in Code Reviews. IEEE Transactions on Software Engineering, 51(1), 14-37
2025 (English). In: IEEE Transactions on Software Engineering, ISSN 0098-5589, E-ISSN 1939-3520, Vol. 51, no 1, p. 14-37. Article in journal (Refereed). Published
Abstract [en]

Context: Code understandability plays a crucial role in software development, as developers spend between 58% and 70% of their time reading source code. Improving code understandability can lead to enhanced productivity and reduced maintenance costs. Problem: Experimental studies aim to establish what makes code more or less understandable in a controlled setting, but ignore that what makes code easier to understand in the real world also depends on extraneous factors such as developers' backgrounds and project culture and guidelines. Not accounting for the influence of these factors may lead to results that are sound but have little external validity. Goal: We aim to investigate how developers improve code understandability during software development through code review comments. Our assumption is that code reviewers are specialists in code quality within a project. Method and Results: We manually analyzed 2,401 code review comments from Java open-source projects on GitHub and found that over 42% of all comments focus on improving code understandability, demonstrating the significance of this quality attribute in code reviews. We further explored a subset of 385 comments related to code understandability and identified eight categories of code understandability concerns, such as incomplete or inadequate code documentation, bad identifiers, and unnecessary code. Among the suggestions to improve code understandability, 83.9% were accepted and integrated into the codebase. Of these, only two (less than 1%) were later reverted. We also identified types of patches that improve code understandability, ranging from simple changes (e.g., removing unused code) to more context-dependent improvements (e.g., replacing method call chains with an existing API). Finally, we investigated the potential coverage of four well-known linters in flagging the identified code understandability issues. These linters cover less than 30% of these issues, although some of them could easily be added as new rules. Implications: Our findings motivate and provide practical insight for the construction of tools to make code more understandable; for example, understandability improvements are rarely reverted and can thus serve as reliable training data for specialized ML-based tools. This is also supported by our dataset, which can be used to train such models. Finally, our findings can serve as a basis for developing evidence-based code style guides.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
Keywords
Codes, Reviews, Source coding, Software development management, Documentation, Security, Natural languages, Code understandability, code understandability smells, code review
National Category
Software Engineering
Identifiers
urn:nbn:se:kth:diva-359532 (URN); 10.1109/TSE.2024.3453783 (DOI); 001395714800006 (); 2-s2.0-85204075762 (Scopus ID)
Note

QC 20250206

Available from: 2025-02-06. Created: 2025-02-06. Last updated: 2025-02-06. Bibliographically approved.
Andersson, V., Baudry, B., Bobadilla, S., Christensen, L., Cofano, S., Etemadi, K., . . . Toady, T. (2025). UPPERCASE IS ALL YOU NEED. In: : . Paper presented at SIGBOVIK 2025, Carnegie Mellon University, Pittsburgh, PA, USA, April 4, 2025.
2025 (English). Conference paper, Published paper (Other (popular science, discussion, etc.))
Abstract [en]

WE PRESENT THE FIRST COMPREHENSIVE STUDY ON THE CRITICAL YET OVERLOOKED ROLE OF UPPERCASE TEXT IN ARTIFICIAL INTELLIGENCE. DESPITE CONSTITUTING A MERE SINGLE-DIGIT PERCENTAGE OF STANDARD ENGLISH PROSE, UPPERCASE LETTERS HAVE DISPROPORTIONATE POWER IN HUMAN-AI INTERACTIONS. THROUGH RIGOROUS EXPERIMENTATION INVOLVING SHOUTING AT VARIOUS LANGUAGE MODELS, WE DEMONSTRATE THAT UPPERCASE IS NOT MERELY A STYLISTIC CHOICE BUT A FUNDAMENTAL TOOL FOR AI COMMUNICATION. OUR RESULTS REVEAL THAT UPPERCASE TEXT SIGNIFICANTLY ENHANCES COMMAND AUTHORITY, CODE GENERATION QUALITY, AND – MOST CRUCIALLY – THE AI’S ABILITY TO CREATE APPROPRIATE CAT PICTURES. THIS PAPER DEFINITIVELY PROVES THAT IN THE REALM OF HUMAN-AI INTERACTION, BIGGER LETTERS == BETTER RESULTS. OUR FINDINGS SUGGEST THAT THE CAPS-LOCK KEY MAY BE THE MOST UNDERUTILIZED RESOURCE IN MODERN AI.

National Category
Engineering and Technology
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-287271 (URN)
Conference
SIGBOVIK 2025, Carnegie Mellon University, Pittsburgh, PA, USA, April 4, 2025
Note

QC 20250424

Available from: 2025-04-23. Created: 2025-04-23. Last updated: 2025-04-25. Bibliographically approved.
Reyes García, F., Baudry, B. & Monperrus, M. (2024). Breaking-Good: Explaining Breaking Dependency Updates with Build Analysis. In: Proceedings - 2024 IEEE International Conference on Source Code Analysis and Manipulation, SCAM 2024: . Paper presented at 24th IEEE International Conference on Source Code Analysis and Manipulation, SCAM 2024, Flagstaff, United States of America, October 7-8, 2024 (pp. 36-46). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: Proceedings - 2024 IEEE International Conference on Source Code Analysis and Manipulation, SCAM 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 36-46. Conference paper, Published paper (Refereed)
Abstract [en]

Dependency updates often cause compilation errors when new dependency versions introduce changes that are incompatible with existing client code. Fixing breaking dependency updates is notoriously hard, as their root cause can be hidden deep in the dependency tree. We present Breaking-Good, a tool that automatically generates explanations for breaking updates. Breaking-Good provides a detailed categorization of compilation errors, identifying several factors related to changes in direct and indirect dependencies, incompatibilities between Java versions, and client-specific configuration. With a blended analysis of logs and dependency trees, Breaking-Good generates detailed explanations for each breaking update. These explanations help developers understand the causes of the breaking update and suggest possible actions to fix the breakage. We evaluate Breaking-Good on 243 real-world breaking dependency updates. Our results indicate that Breaking-Good accurately identifies root causes and generates automatic explanations for 70% of these breaking updates. Our user study demonstrates that the generated explanations help developers. Breaking-Good is the first technique that automatically identifies the causes of a breaking dependency update and explains the breakage accordingly.
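The kind of log-based categorization the abstract describes can be sketched as a pattern match over Maven compiler error lines. The category names, patterns, and the `categorize` helper below are simplified illustrations for this listing, not Breaking-Good's actual rules.

```python
import re

# Toy stand-in for Breaking-Good's idea: map compiler error lines from
# a Maven build log to coarse failure explanations. The patterns and
# explanations are illustrative, not the tool's own taxonomy.
CATEGORIES = {
    r"cannot find symbol": "API removed or renamed in the new dependency version",
    r"incompatible types": "API signature changed",
    r"package \S+ does not exist": "package no longer on the classpath",
}

def categorize(log_line):
    """Return an explanation for the first matching error pattern."""
    for pattern, explanation in CATEGORIES.items():
        if re.search(pattern, log_line):
            return explanation
    return "unclassified"

line = "[ERROR] /src/Client.java:[12,8] cannot find symbol"
print(categorize(line))  # API removed or renamed in the new dependency version
```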

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Breaking dependency updates, Explanations, Java, Maven, Software Dependency
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-359246 (URN); 10.1109/SCAM63643.2024.00014 (DOI); 2-s2.0-85215290586 (Scopus ID)
Conference
24th IEEE International Conference on Source Code Analysis and Manipulation, SCAM 2024, Flagstaff, United States of America, October 7-8, 2024
Funder
Swedish Foundation for Strategic Research, chains
Note

Part of ISBN 9798331528508

QC 20250203

Available from: 2025-01-29. Created: 2025-01-29. Last updated: 2025-02-25. Bibliographically approved.
Reyes García, F., Gamage, Y., Skoglund, G., Baudry, B. & Monperrus, M. (2024). BUMP: A Benchmark of Reproducible Breaking Dependency Updates. In: Proceedings - 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2024: . Paper presented at 31st IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2024, Rovaniemi, Finland, Mar 12 2024 - Mar 15 2024 (pp. 159-170). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: Proceedings - 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 159-170. Conference paper, Published paper (Refereed)
Abstract [en]

Third-party dependency updates can cause a build to fail if the new dependency version introduces a change that is incompatible with the usage: this is called a breaking dependency update. Research on breaking dependency updates is active, with works on characterization, understanding, automatic repair of breaking updates, and other software engineering aspects. All such research projects require a benchmark of breaking updates that has the following properties: 1) it contains real-world breaking updates; 2) the breaking updates can be executed; 3) the benchmark provides stable scientific artifacts of breaking updates over time, a property we call 'reproducibility'. To the best of our knowledge, such a benchmark is missing. To address this problem, we present BUMP, a new benchmark that contains reproducible breaking dependency updates in the context of Java projects built with the Maven build system. BUMP contains 571 breaking dependency updates collected from 153 Java projects. BUMP ensures long-term reproducibility of dependency updates on different platforms, guaranteeing consistent build failures. We categorize the different causes of build breakage in BUMP, providing novel insights for future work on breaking update engineering. To our knowledge, BUMP is the first of its kind, providing hundreds of real-world breaking updates that have all been made reproducible.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Benchmark, Breaking dependency updates, Dependency engineering, Java, Maven, Reproducibility
National Category
Computer Sciences; Software Engineering
Identifiers
urn:nbn:se:kth:diva-351755 (URN); 10.1109/SANER60148.2024.00024 (DOI); 2-s2.0-85199750992 (Scopus ID)
Conference
31st IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2024, Rovaniemi, Finland, Mar 12 2024 - Mar 15 2024
Funder
Swedish Foundation for Strategic Research, Chains
Note

Part of ISBN 9798350330663

QC 20240823

Available from: 2024-08-13. Created: 2024-08-13. Last updated: 2024-09-19. Bibliographically approved.
Baudry, B., Etemadi, K., Fang, S., Gamage, Y., Liu, Y., Liu, Y., . . . Tiwari, D. (2024). Generative AI to Generate Test Data Generators. IEEE Software, 41(6), 55-64
2024 (English). In: IEEE Software, ISSN 0740-7459, E-ISSN 1937-4194, Vol. 41, no 6, p. 55-64. Article in journal (Refereed). Published
Abstract [en]

High-quality data is essential for designing effective software test suites. We propose three original methods for using large language models to generate representative test data that fits the domain of the program under test and is culturally adequate.
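A generated test data generator of the kind the article proposes might look like the sketch below, which draws locale-appropriate person names so that test data is representative and culturally adequate. The `NAMES` table and the `generate_names` helper are invented examples for this listing, not from the paper.

```python
import random

# Hypothetical sketch of what an LLM-generated test data generator
# might produce: person names that fit a given locale. The name lists
# are made-up illustrations, not the paper's data.
NAMES = {
    "sv_SE": ["Astrid Lindqvist", "Erik Johansson", "Maja Nilsson"],
    "pt_BR": ["Ana Souza", "João Oliveira", "Carla Santos"],
}

def generate_names(locale, n, seed=0):
    """Return n locale-appropriate names; seeded for reproducible tests."""
    rng = random.Random(seed)
    return [rng.choice(NAMES[locale]) for _ in range(n)]

print(generate_names("sv_SE", 3))
```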

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Generators, Cultural differences, Testing, Libraries, Java, Codes, Vectors
National Category
Software Engineering
Identifiers
urn:nbn:se:kth:diva-355309 (URN); 10.1109/MS.2024.3418570 (DOI); 001329864000010 (); 2-s2.0-85197039632 (Scopus ID)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP); Swedish Foundation for Strategic Research, chains
Note

QC 20241030

Available from: 2024-10-30. Created: 2024-10-30. Last updated: 2024-12-10. Bibliographically approved.
Saavedra, N., Silva, A. & Monperrus, M. (2024). GitBug-Actions: Building Reproducible Bug-Fix Benchmarks with GitHub Actions. In: Proceedings - 2024 ACM/IEEE 46th International Conference on Software Engineering: Companion, ICSE-Companion 2024: . Paper presented at 46th International Conference on Software Engineering: Companion, ICSE-Companion 2024, Lisbon, Portugal, Apr 14 2024 - Apr 20 2024 (pp. 1-5). Association for Computing Machinery (ACM)
2024 (English). In: Proceedings - 2024 ACM/IEEE 46th International Conference on Software Engineering: Companion, ICSE-Companion 2024, Association for Computing Machinery (ACM), 2024, p. 1-5. Conference paper, Published paper (Refereed)
Abstract [en]

Bug-fix benchmarks are fundamental in advancing various subfields of software engineering such as automatic program repair (APR) and fault localization (FL). A good benchmark must include recent examples that accurately reflect technologies and development practices of today. To be executable in the long term, a benchmark must feature test suites that do not degrade over time due to, for example, dependencies that are no longer available. Existing benchmarks fail to meet both criteria. For instance, Defects4J, one of the foremost Java benchmarks, last received an update in 2020. Moreover, full reproducibility has been neglected by the majority of existing benchmarks. In this paper, we present GitBug-Actions: a novel tool for building bug-fix benchmarks with modern and fully-reproducible bug-fixes. GitBug-Actions relies on the most popular CI platform, GitHub Actions, to detect bug-fixes and smartly execute the CI pipeline locally in a controlled and reproducible environment. To the best of our knowledge, we are the first to rely on GitHub Actions to collect bug-fixes. To demonstrate our toolchain, we deploy GitBug-Actions to build a proof-of-concept Go bug-fix benchmark containing executable, fully-reproducible bug-fixes from different repositories. A video demonstrating GitBug-Actions is available at: https://youtu.be/aBWwa1sJYBs.
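The detection criterion behind such a tool can be sketched simply: a commit is a candidate bug-fix when the CI test job fails on the parent commit and passes after the commit. The `is_candidate_bug_fix` helper and the commit records below are hypothetical; the real toolchain replays full GitHub Actions workflows rather than consuming boolean flags.

```python
# Minimal sketch of the bug-fix detection criterion: CI fails before
# the commit, passes after. Data structures here are invented for
# illustration and are not GitBug-Actions' internals.

def is_candidate_bug_fix(parent_ci_passed, commit_ci_passed):
    """A candidate fix turns a failing CI run into a passing one."""
    return (not parent_ci_passed) and commit_ci_passed

history = [
    {"sha": "a1b2c3", "parent_ci": False, "commit_ci": True},   # fix
    {"sha": "d4e5f6", "parent_ci": True,  "commit_ci": True},   # ordinary change
]

fixes = [c["sha"] for c in history
         if is_candidate_bug_fix(c["parent_ci"], c["commit_ci"])]
print(fixes)  # ['a1b2c3']
```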

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2024
Series
Proceedings - International Conference on Software Engineering, ISSN 0270-5257
Keywords
Bug Benchmark, Bug Database, GitHub Actions, Program Analysis, Reproducibility, Software Bugs, Software Testing
National Category
Software Engineering
Identifiers
urn:nbn:se:kth:diva-347648 (URN); 10.1145/3639478.3640023 (DOI); 001465567400001 (); 2-s2.0-85194898421 (Scopus ID)
Conference
46th International Conference on Software Engineering: Companion, ICSE-Companion 2024, Lisbon, Portugal, Apr 14 2024 - Apr 20 2024
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

Part of ISBN 979-840070502-1

QC 20240613

Available from: 2024-06-12. Created: 2024-06-12. Last updated: 2025-05-19. Bibliographically approved.
Silva, A., Saavedra, N. & Monperrus, M. (2024). GitBug-Java: A Reproducible Benchmark of Recent Java Bugs. In: 2024 IEEE/ACM 21st International Conference on Mining Software Repositories (MSR): . Paper presented at IEEE/ACM 21st International Conference on Mining Software Repositories (MSR), April 15-16, 2024, Lisbon, Portugal (pp. 118-122). Association for Computing Machinery (ACM)
2024 (English). In: 2024 IEEE/ACM 21st International Conference on Mining Software Repositories (MSR), Association for Computing Machinery (ACM), 2024, p. 118-122. Conference paper, Published paper (Refereed)
Abstract [en]

Bug-fix benchmarks are essential for evaluating methodologies in automatic program repair (APR) and fault localization (FL). However, existing benchmarks, exemplified by Defects4J, need to evolve to incorporate recent bug-fixes aligned with contemporary development practices. Moreover, reproducibility, a key scientific principle, has been lacking in bug-fix benchmarks. To address these gaps, we present GitBug-Java, a reproducible benchmark of recent Java bugs. GitBug-Java features 199 bugs extracted from the 2023 commit history of 55 notable open-source repositories. The methodology for building GitBug-Java ensures the preservation of bug-fixes in fully-reproducible environments. We publish GitBug-Java at https://github.com/gitbugactions/gitbug-java.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2024
Series
IEEE International Working Conference on Mining Software Repositories, ISSN 2160-1852
Keywords
Software Bugs, Bug Benchmark, Reproducibility, Bug Database, Java Benchmark, Software Testing, Program Analysis
National Category
Software Engineering
Identifiers
urn:nbn:se:kth:diva-352572 (URN); 10.1145/3643991.3644884 (DOI); 001267321100014 (); 2-s2.0-85197392841 (Scopus ID)
Conference
IEEE/ACM 21st International Conference on Mining Software Repositories (MSR), April 15-16, 2024, Lisbon, Portugal
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

Part of ISBN 979-8-3503-6398-2, 979-8-4007-0587-8

QC 20240903

Available from: 2024-09-03. Created: 2024-09-03. Last updated: 2024-10-03. Bibliographically approved.
Cesarano, C., Andersson, V., Natella, R. & Monperrus, M. (2024). GoSurf: Identifying Software Supply Chain Attack Vectors in Go. In: SCORED 2024 - Proceedings of the 2024 Workshop on Software Supply Chain Offensive Research and Ecosystem Defenses, Co-Located with: CCS 2024: . Paper presented at 3rd Workshop on Software Supply Chain Offensive Research and Ecosystem Defenses, SCORED 2024, Salt Lake City, United States of America, Oct 14 2024 - Oct 18 2024 (pp. 33-42). Association for Computing Machinery (ACM)
2024 (English). In: SCORED 2024 - Proceedings of the 2024 Workshop on Software Supply Chain Offensive Research and Ecosystem Defenses, Co-Located with CCS 2024, Association for Computing Machinery (ACM), 2024, p. 33-42. Conference paper, Published paper (Refereed)
Abstract [en]

In Go, the widespread adoption of open-source software has led to a flourishing ecosystem of third-party dependencies, which are often integrated into critical systems. However, the reuse of dependencies introduces significant supply chain security risks, as a single compromised package can have cascading impacts. Existing supply chain attack taxonomies overlook language-specific features that can be exploited by attackers to hide malicious code. In this paper, we propose a novel taxonomy of 12 distinct attack vectors tailored for the Go language and its package lifecycle. Our taxonomy identifies patterns in which language-specific Go features, intended for benign purposes, can be misused to propagate malicious code stealthily through supply chains. Additionally, we introduce GoSurf, a static analysis tool that analyzes the attack surface of Go packages according to our proposed taxonomy. We evaluate GoSurf on a corpus of 500 widely used, real-world Go packages. Our work provides preliminary insights for securing the open-source software supply chain within the Go ecosystem, allowing developers and security analysts to prioritize code audit efforts and uncover hidden malicious behaviors.
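One of the language-specific Go features such a taxonomy covers is the `init()` function, which runs automatically when a package is imported and is therefore a known place to hide malicious behavior. A toy stand-in for one GoSurf-style surface check might flag these functions with a regex scan; the `count_init_functions` helper below is illustrative and not part of GoSurf, whose actual analysis covers 12 attack vectors.

```python
import re

# Illustrative check in the spirit of GoSurf: count Go `init()`
# functions, which execute on package import. This regex scan is a
# simplified stand-in, not the tool's static analysis.
INIT_FUNC = re.compile(r"^\s*func\s+init\s*\(\s*\)", re.MULTILINE)

def count_init_functions(go_source):
    """Return how many init() declarations appear in the Go source text."""
    return len(INIT_FUNC.findall(go_source))

sample = """
package evil

func init() {
    // runs automatically on import: attack surface
}
"""
print(count_init_functions(sample))  # 1
```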

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2024
Keywords
Golang, Open-Source Security, Supply Chain Attacks
National Category
Computer Sciences; Computer Systems; Software Engineering
Identifiers
urn:nbn:se:kth:diva-358383 (URN); 10.1145/3689944.3696166 (DOI); 2-s2.0-85214094051 (Scopus ID)
Conference
3rd Workshop on Software Supply Chain Offensive Research and Ecosystem Defenses, SCORED 2024, Salt Lake City, United States of America, Oct 14 2024 - Oct 18 2024
Funder
Swedish Foundation for Strategic Research, CHAINS
Note

Part of ISBN 979-840071240-1

QC 20250117

Available from: 2025-01-15. Created: 2025-01-15. Last updated: 2025-01-20. Bibliographically approved.
Identifiers
ORCID iD: orcid.org/0000-0003-3505-3383
