Publications (10 of 77)
Liu, Y., Tiwari, D., Bogdan, C. M. & Baudry, B. (2025). Detecting and removing bloated dependencies in CommonJS packages. Journal of Systems and Software, 230, Article ID 112509.
2025 (English). In: Journal of Systems and Software, ISSN 0164-1212, E-ISSN 1873-1228, Vol. 230, article id 112509. Article in journal (Refereed), Published
Abstract [en]

JavaScript packages are notoriously prone to bloat, a factor that significantly impacts the performance and maintainability of web applications. While web bundlers and tree-shaking can mitigate this issue in client-side applications, state-of-the-art techniques have limitations on the detection and removal of bloat in server-side applications. In this paper, we present the first study to investigate bloated dependencies within server-side JavaScript applications, focusing on those built with the widely used and highly dynamic CommonJS module system. We propose a trace-based dynamic analysis that monitors the OS file system, to determine which dependencies are not accessed during runtime. To evaluate our approach, we curate an original dataset of 91 CommonJS packages with a total of 50,488 dependencies. Compared to the state-of-the-art dynamic and static approaches, our trace-based analysis demonstrates higher accuracy in detecting bloated dependencies. Our analysis identifies 50.6% of the 50,488 dependencies as bloated: 13.8% of direct dependencies and 51.3% of indirect dependencies. Furthermore, removing only the direct bloated dependencies by cleaning the dependency configuration file can remove a significant share of unnecessary bloated indirect dependencies while preserving function correctness.
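As a rough illustration of the trace-based idea in this abstract, the sketch below marks a dependency as bloated when none of its files appear in a list of paths accessed at runtime. It is a minimal sketch under assumptions: the access trace is taken as given (how the paper's tool collects it is not shown), the helper names `package_of` and `find_bloated` are hypothetical, and packages are matched by name only, ignoring versions.

```python
from pathlib import PurePosixPath


def package_of(path):
    """Return the npm package that owns a path under node_modules, or None."""
    parts = PurePosixPath(path).parts
    # Nested installs belong to the innermost node_modules directory.
    indexes = [i for i, part in enumerate(parts) if part == "node_modules"]
    if not indexes or indexes[-1] + 1 >= len(parts):
        return None
    i = indexes[-1] + 1
    name = parts[i]
    # Scoped packages span two path segments, e.g. @scope/util.
    if name.startswith("@") and i + 1 < len(parts):
        name = f"{name}/{parts[i + 1]}"
    return name


def find_bloated(installed, accessed_paths):
    """Return installed packages with no file accessed during the run."""
    used = {package_of(p) for p in accessed_paths}
    return sorted(set(installed) - used)


# Hypothetical inputs: installed package names and a runtime access trace.
installed = ["left-pad", "lodash", "@scope/util"]
accessed = [
    "/app/node_modules/lodash/index.js",
    "/app/node_modules/@scope/util/lib/a.js",
    "/app/src/main.js",
]
bloated = find_bloated(installed, accessed)
```

The set difference is only as good as the trace: a dependency that is loaded on a code path the workload never exercises would be reported as bloated here.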

Place, publisher, year, edition, pages
Elsevier BV, 2025
Keywords
CommonJS, Dependency bloat, Dependency management, Node.js, npm
National Category
Software Engineering
Identifiers
urn:nbn:se:kth:diva-366559 (URN); 10.1016/j.jss.2025.112509 (DOI); 001513620700002 (); 2-s2.0-105008213531 (Scopus ID)
Note

QC 20250710

Available from: 2025-07-10 Created: 2025-07-10 Last updated: 2025-07-10 Bibliographically approved
Baudry, B. & Monperrus, M. (2025). Humor for graduate training. ACM Inroads
2025 (English). In: ACM Inroads, ISSN 2153-2184, E-ISSN 2153-2192. Article in journal (Refereed), Accepted
Abstract [en]

Humor genuinely engages graduate students with their scientific training.

Keywords
humor; higher education
National Category
Engineering and Technology
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-362677 (URN); 10.1145/3730408 (DOI)
Note

QC 20250424

Available from: 2025-04-23 Created: 2025-04-23 Last updated: 2025-06-13 Bibliographically approved
Andersson, V., Baudry, B., Bobadilla, S., Christensen, L., Cofano, S., Etemadi, K., . . . Toady, T. (2025). UPPERCASE IS ALL YOU NEED. Paper presented at SIGBOVIK 2025, Carnegie Mellon University, Pittsburgh, PA, USA, April 4, 2025.
2025 (English). Conference paper, Published paper (Other (popular science, discussion, etc.))
Abstract [en]

WE PRESENT THE FIRST COMPREHENSIVE STUDY ON THE CRITICAL YET OVERLOOKED ROLE OF UPPERCASE TEXT IN ARTIFICIAL INTELLIGENCE. DESPITE CONSTITUTING A MERE SINGLE-DIGIT PERCENTAGE OF STANDARD ENGLISH PROSE, UPPERCASE LETTERS HAVE DISPROPORTIONATE POWER IN HUMAN-AI INTERACTIONS. THROUGH RIGOROUS EXPERIMENTATION INVOLVING SHOUTING AT VARIOUS LANGUAGE MODELS, WE DEMONSTRATE THAT UPPERCASE IS NOT MERELY A STYLISTIC CHOICE BUT A FUNDAMENTAL TOOL FOR AI COMMUNICATION. OUR RESULTS REVEAL THAT UPPERCASE TEXT SIGNIFICANTLY ENHANCES COMMAND AUTHORITY, CODE GENERATION QUALITY, AND – MOST CRUCIALLY – THE AI’S ABILITY TO CREATE APPROPRIATE CAT PICTURES. THIS PAPER DEFINITIVELY PROVES THAT IN THE REALM OF HUMAN-AI INTERACTION, BIGGER LETTERS == BETTER RESULTS. OUR FINDINGS SUGGEST THAT THE CAPS-LOCK KEY MAY BE THE MOST UNDERUTILIZED RESOURCE IN MODERN AI.

National Category
Engineering and Technology
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-287271 (URN)
Conference
SIGBOVIK 2025, Carnegie Mellon University, Pittsburgh, PA, USA, April 4, 2025
Note

QC 20250424

Available from: 2025-04-23 Created: 2025-04-23 Last updated: 2025-04-25 Bibliographically approved
Reyes García, F., Gamage, Y., Skoglund, G., Baudry, B. & Monperrus, M. (2024). BUMP: A Benchmark of Reproducible Breaking Dependency Updates. In: Proceedings - 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2024. Paper presented at 31st IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2024, Rovaniemi, Finland, Mar 12 2024 - Mar 15 2024 (pp. 159-170). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: Proceedings - 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 159-170. Conference paper, Published paper (Refereed)
Abstract [en]

Third-party dependency updates can cause a build to fail if the new dependency version introduces a change that is incompatible with the usage: this is called a breaking dependency update. Research on breaking dependency updates is active, with works on characterization, understanding, automatic repair of breaking updates, and other software engineering aspects. All such research projects require a benchmark of breaking updates that has the following properties: 1) it contains real-world breaking updates; 2) the breaking updates can be executed; 3) the benchmark provides stable scientific artifacts of breaking updates over time, a property we call 'reproducibility'. To the best of our knowledge, such a benchmark is missing. To address this problem, we present BUMP, a new benchmark that contains reproducible breaking dependency updates in the context of Java projects built with the Maven build system. BUMP contains 571 breaking dependency updates collected from 153 Java projects. BUMP ensures long-term reproducibility of dependency updates on different platforms, guaranteeing consistent build failures. We categorize the different causes of build breakage in BUMP, providing novel insights for future work on breaking update engineering. To our knowledge, BUMP is the first of its kind, providing hundreds of real-world breaking updates that have all been made reproducible.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Benchmark, Breaking dependency updates, Dependency engineering, Java, Maven, Reproducibility
National Category
Computer Sciences; Software Engineering
Identifiers
urn:nbn:se:kth:diva-351755 (URN); 10.1109/SANER60148.2024.00024 (DOI); 2-s2.0-85199750992 (Scopus ID)
Conference
31st IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2024, Rovaniemi, Finland, Mar 12 2024 - Mar 15 2024
Funder
Swedish Foundation for Strategic Research, Chains
Note

Part of ISBN 9798350330663

QC 20240823

Available from: 2024-08-13 Created: 2024-08-13 Last updated: 2024-09-19 Bibliographically approved
Ron Arteaga, J., Soto Valero, C., Zhang, L., Baudry, B. & Monperrus, M. (2024). Highly Available Blockchain Nodes With N-Version Design. IEEE Transactions on Dependable and Secure Computing, 21(4), 4084-4097
2024 (English). In: IEEE Transactions on Dependable and Secure Computing, ISSN 1545-5971, E-ISSN 1941-0018, Vol. 21, no 4, p. 4084-4097. Article in journal (Refereed), Published
Abstract [en]

Like all software, blockchain nodes are exposed to faults in their underlying execution stack. Unstable execution environments can disrupt the availability of blockchain nodes' interfaces, resulting in downtime for users. This paper introduces the concept of N-Version Blockchain nodes. This new type of node relies on simultaneous execution of different implementations of the same blockchain protocol, in line with Avizienis' N-Version programming vision. We design and implement an N-Version blockchain node prototype in the context of Ethereum, called N-ETH. We show that N-ETH is able to mitigate the effects of unstable execution environments and significantly enhance availability under environment faults. To simulate unstable execution environments, we perform fault injection at the system-call level. Our results show that existing Ethereum node implementations behave asymmetrically under identical instability scenarios. N-ETH leverages this asymmetric behavior available in the diverse implementations of Ethereum nodes to provide increased availability, even under our most aggressive fault-injection strategies. We are the first to validate the relevance of N-Version design in the domain of blockchain infrastructure. From an industrial perspective, our results are of utmost importance for businesses operating blockchain nodes, including Google, ConsenSys, and many other major blockchain companies.
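The N-Version idea in this abstract can be illustrated with a small sketch: the same request is sent to several independent implementations at once, and the caller gets an answer as long as at least one of them succeeds. This is only an illustration under assumptions, not N-ETH's design: `n_version_call`, `flaky_client`, and `healthy_client` are hypothetical names, and returning the first successful answer is one possible policy among several (voting would be another).

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def n_version_call(implementations, request):
    """Run every implementation concurrently and return the first success."""
    errors = []
    with ThreadPoolExecutor(max_workers=len(implementations)) as pool:
        futures = [pool.submit(impl, request) for impl in implementations]
        for future in as_completed(futures):
            try:
                return future.result()
            except Exception as exc:  # one faulty version must not stop the others
                errors.append(exc)
    raise RuntimeError(f"all {len(errors)} implementations failed")


def flaky_client(request):
    """Hypothetical implementation hit by an environment fault."""
    raise OSError("simulated environment fault")


def healthy_client(request):
    """Hypothetical implementation that is unaffected by the same fault."""
    return f"handled:{request}"


answer = n_version_call([flaky_client, healthy_client], "eth_blockNumber")
```

Availability improves only if the implementations fail differently under the same fault, which is the asymmetry the abstract reports.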

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
availability, blockchain, Blockchains, Computer architecture, N-Version design, Peer-to-peer computing, Programming, Prototypes, Software, Time factors
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-349884 (URN); 10.1109/TDSC.2023.3346195 (DOI); 001270317500010 (); 2-s2.0-85181578677 (Scopus ID)
Funder
Swedish Foundation for Strategic Research, Chains; Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 20240704

Available from: 2024-07-04 Created: 2024-07-04 Last updated: 2025-03-27 Bibliographically approved
Tiwari, D., Monperrus, M. & Baudry, B. (2024). Mimicking Production Behavior with Generated Mocks. IEEE Transactions on Software Engineering, 50(11), 2921-2946
2024 (English). In: IEEE Transactions on Software Engineering, ISSN 0098-5589, E-ISSN 1939-3520, Vol. 50, no 11, p. 2921-2946. Article in journal (Refereed), Published
Abstract [en]

Mocking allows testing program units in isolation. A developer who writes tests with mocks faces two challenges: design realistic interactions between a unit and its environment; and understand the expected impact of these interactions on the behavior of the unit. In this paper, we propose to monitor an application in production to generate tests that mimic realistic execution scenarios through mocks. Our approach operates in three phases. First, we instrument a set of target methods for which we want to generate tests, as well as the methods that they invoke, which we refer to as mockable method calls. Second, in production, we collect data about the context in which target methods are invoked, as well as the parameters and the returned value for each mockable method call. Third, offline, we analyze the production data to generate test cases with realistic inputs and mock interactions. The approach is automated and implemented in an open-source tool called RICK. We evaluate our approach with three real-world, open-source Java applications. RICK monitors the invocation of 128 methods in production across the three applications and captures their behavior. Based on this captured data, RICK generates test cases that include realistic initial states and test inputs, as well as mocks and stubs. All the generated test cases are executable, and 52.4% of them successfully mimic the complete execution context of the target methods observed in production. The mock-based oracles are also effective at detecting regressions within the target methods, complementing each other in their fault-finding ability. We interview 5 developers from the industry who confirm the relevance of using production observations to design mocks and stubs. Our experimental findings clearly demonstrate the feasibility and added value of generating mocks from production interactions.
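The third phase described above, turning recorded interactions into a mock, can be sketched as follows. This is a minimal illustration in Python using the standard `unittest.mock` module, not RICK itself (which targets Java); the recording format, `mock_from_recordings`, `convert`, and the sample values are all assumptions made for the example.

```python
from unittest.mock import Mock


def mock_from_recordings(recordings):
    """Build a mock that replays the return value recorded for each argument tuple."""
    table = {tuple(r["args"]): r["returned"] for r in recordings}
    return Mock(side_effect=lambda *args: table[args])


def convert(amount, src, dst, rate_service):
    """Hypothetical target method; rate_service is its mockable call."""
    return amount * rate_service(src, dst)


# Hypothetical data captured in production: arguments and returned value.
recorded = [{"args": ("SEK", "EUR"), "returned": 0.5}]

rate_mock = mock_from_recordings(recorded)
result = convert(10, "SEK", "EUR", rate_mock)

# The generated test checks both the output and the interaction with the mock.
assert result == 5.0
rate_mock.assert_called_once_with("SEK", "EUR")
```

Checking the interaction as well as the output is what lets such a test flag a regression that changes how the target method uses its environment.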

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
National Category
Software Engineering
Identifiers
urn:nbn:se:kth:diva-356173 (URN); 10.1109/tse.2024.3458448 (DOI); 001369099900010 (); 2-s2.0-85204006940 (Scopus ID)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 20250120

Available from: 2024-11-09 Created: 2024-11-09 Last updated: 2025-01-20 Bibliographically approved
Baudry, B. & Monperrus, M. (2024). Programming Art With Drawing Machines. Computer, 57(7), 104-108
2024 (English). In: Computer, ISSN 0018-9162, E-ISSN 1558-0814, Vol. 57, no 7, p. 104-108. Article in journal, Editorial material (Other academic), Published
Abstract [en]

Algorithmic artists master programming to create art. Specialized libraries and hardware devices such as pen plotters support their practice.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-350806 (URN); 10.1109/MC.2024.3385049 (DOI); 001260510200011 (); 2-s2.0-85197601330 (Scopus ID)
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 20240719

Available from: 2024-07-19 Created: 2024-07-19 Last updated: 2024-10-03 Bibliographically approved
Tiwari, D., Gamage, Y., Monperrus, M. & Baudry, B. (2024). PROZE: Generating Parameterized Unit Tests Informed by Runtime Data. In: Proceedings - 2024 IEEE International Conference on Source Code Analysis and Manipulation, SCAM 2024. Paper presented at 24th IEEE International Conference on Source Code Analysis and Manipulation, SCAM 2024, Flagstaff, United States of America, Oct 7 2024 - Oct 8 2024 (pp. 166-176). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: Proceedings - 2024 IEEE International Conference on Source Code Analysis and Manipulation, SCAM 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 166-176. Conference paper, Published paper (Refereed)
Abstract [en]

Typically, a conventional unit test (CUT) verifies the expected behavior of the unit under test through one specific input / output pair. In contrast, a parameterized unit test (PUT) receives a set of inputs as arguments, and contains assertions that are expected to hold true for all these inputs. PUTs increase test quality, as they assess correctness on a broad scope of inputs and behaviors. However, defining assertions over a set of inputs is a hard task for developers, which limits the adoption of PUTs in practice. In this paper, we address the problem of finding oracles for PUTs that hold over multiple inputs. We design a system called PROZE, that generates PUTs by identifying developer-written assertions that are valid for more than one test input. We implement our approach as a two-step methodology: first, at runtime, we collect inputs for a target method that is invoked within a CUT; next, we isolate the valid assertions of the CUT to be used within a PUT. We evaluate our approach against 5 real-world Java modules, and collect valid inputs for 128 target methods, from test and field executions. We generate 2,287 PUTs, which invoke the target methods with a significantly larger number of test inputs than the original CUTs. We execute the PUTs and find 217 that provably demonstrate that their oracles hold for a larger range of inputs than envisioned by the developers. From a testing theory perspective, our results show that developers express assertions within CUTs, which actually hold beyond one particular input.
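The second step described above, isolating the assertions that stay valid across many inputs, can be sketched in a few lines. This is an illustrative Python sketch, not PROZE (which targets Java); `slugify`, the assertion names, and the captured inputs are hypothetical.

```python
def slugify(text):
    """Hypothetical unit under test: lowercase the words and join them with dashes."""
    return "-".join(text.lower().split())


def generalizing_assertions(unit, assertions, inputs):
    """Keep the assertions that hold for the unit's result on every input."""
    kept = []
    for name, check in assertions.items():
        if all(check(unit(value)) for value in inputs):
            kept.append(name)
    return kept


# Assertions as they might appear in a conventional unit test for "Hello World".
assertions = {
    "equals_hello_world": lambda result: result == "hello-world",
    "no_spaces": lambda result: " " not in result,
    "is_lowercase": lambda result: result == result.lower(),
}

# Hypothetical inputs captured at runtime for the same target method.
captured_inputs = ["Hello World", "a b c", "  padded  ", "single"]

kept = generalizing_assertions(slugify, assertions, captured_inputs)
```

Here the equality check is tied to one input and is dropped, while the two property-style checks survive and could serve as the oracle of a parameterized test.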

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
National Category
Software Engineering
Identifiers
urn:nbn:se:kth:diva-356174 (URN); 10.1109/SCAM63643.2024.00025 (DOI); 2-s2.0-85215285513 (Scopus ID)
Conference
24th IEEE International Conference on Source Code Analysis and Manipulation, SCAM 2024, Flagstaff, United States of America, Oct 7 2024 - Oct 8 2024
Note

Part of ISBN 9798331528508

QC 20241111

Available from: 2024-11-09 Created: 2024-11-09 Last updated: 2025-03-12 Bibliographically approved
Cabrera-Arteaga, J., Fitzgerald, N., Monperrus, M. & Baudry, B. (2024). Wasm-Mutate: Fast and effective binary diversification for WebAssembly. Computers & security (Print), 139, 103731-103731, Article ID 103731.
2024 (English). In: Computers & security (Print), ISSN 0167-4048, E-ISSN 1872-6208, Vol. 139, p. 103731-103731, article id 103731. Article in journal (Refereed), Published
Abstract [en]

WebAssembly is the fourth officially endorsed Web language. It is recognized because of its efficiency and design, focused on security. Yet, its swiftly expanding ecosystem lacks robust software diversification systems. We introduce Wasm-Mutate, a diversification engine specifically designed for WebAssembly. Our engine meets several essential criteria: 1) To quickly generate functionally identical, yet behaviorally diverse, WebAssembly variants, 2) To be universally applicable to any WebAssembly program, irrespective of the source programming language, and 3) Generated variants should counter side-channels. By leveraging an e-graph data structure, Wasm-Mutate is implemented to meet both speed and efficacy. We evaluate Wasm-Mutate by conducting experiments on 404 programs, which include real-world applications. Our results highlight that Wasm-Mutate can produce tens of thousands of unique and efficient WebAssembly variants within minutes. Significantly, Wasm-Mutate can safeguard WebAssembly binaries against timing side-channel attacks, especially those of the Spectre type.

Place, publisher, year, edition, pages
Elsevier, 2024
Keywords
WebAssembly, Software Diversification
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-342750 (URN); 10.1016/j.cose.2024.103731 (DOI); 2-s2.0-85183204402 (Scopus ID)
Funder
Swedish Foundation for Strategic Research, 3066
Note

QC 20240131

Available from: 2024-01-30 Created: 2024-01-30 Last updated: 2024-08-28 Bibliographically approved
Tiwari, D., Toady, T., Monperrus, M. & Baudry, B. (2024). With Great Humor Comes Great Developer Engagement. In: Proceedings - 2024 ACM/IEEE 46th International Conference on Software Engineering: Software Engineering in Society, ICSE-SEIS 2024. Paper presented at 46th ACM/IEEE International Conference on Software Engineering: Software Engineering in Society, ICSE-SEIS 2024, Lisbon, Portugal, Apr 14 2024 - Apr 20 2024 (pp. 1-11). Association for Computing Machinery (ACM)
2024 (English). In: Proceedings - 2024 ACM/IEEE 46th International Conference on Software Engineering: Software Engineering in Society, ICSE-SEIS 2024, Association for Computing Machinery (ACM), 2024, p. 1-11. Conference paper, Published paper (Refereed)
Abstract [en]

The worldwide collaborative effort for the creation of software is technically and socially demanding. The more engaged developers are, the more value they impart to the software they create. Engaged developers, such as Margaret Hamilton programming Apollo 11, can succeed in tackling the most difficult engineering tasks. In this paper, we dive deep into an original vector of engagement - humor - and study how it fuels developer engagement. First, we collect qualitative and quantitative data about the humorous elements present within three significant, real-world software projects: faker, which helps developers introduce humor within their tests; lolcommits, which captures a photograph after each contribution made by a developer; and volkswagen, an exercise in satire, which accidentally led to the invention of an impactful software tool. Second, through a developer survey, we receive unique insights from 125 developers, who share their real-life experiences with humor in software. Our analysis of the three case studies highlights the prevalence of humor in software, and unveils the worldwide community of developers who are enthusiastic about both software and humor. We also learn about the caveats of humor in software through the valuable insights shared by our survey respondents. We report clear evidence that, when practiced responsibly, humor increases developer engagement and supports them in addressing hard engineering and cognitive tasks. The most actionable highlight of our work is that software tests and documentation are the best locations in code to practice humor.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2024
Keywords
Culture, Developer engagement, Faking, Humor, Responsibility
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-350550 (URN); 10.1145/3639475.3640099 (DOI); 001465576300001 (); 2-s2.0-85195164509 (Scopus ID)
Conference
46th ACM/IEEE International Conference on Software Engineering: Software Engineering in Society, ICSE-SEIS 2024, Lisbon, Portugal, Apr 14 2024 - Apr 20 2024
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

Part of ISBN 9798400704994

QC 20240716

Available from: 2024-07-16 Created: 2024-07-16 Last updated: 2025-05-19 Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-4015-4640
