PROZE: Generating Parameterized Unit Tests Informed by Runtime Data
Tiwari, Deepika. KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. ORCID iD: 0000-0003-0293-2592
Gamage, Yogya. Université de Montréal, Montréal, Canada. ORCID iD: 0009-0000-7537-4961
Monperrus, Martin. KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Theoretical Computer Science, TCS. ORCID iD: 0000-0003-3505-3383
Baudry, Benoit. Université de Montréal, Montréal, Canada. ORCID iD: 0000-0002-4015-4640
2024 (English). In: Proceedings - 2024 IEEE International Conference on Source Code Analysis and Manipulation, SCAM 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 166-176. Conference paper, Published paper (Refereed)
Abstract [en]

Typically, a conventional unit test (CUT) verifies the expected behavior of the unit under test through one specific input/output pair. In contrast, a parameterized unit test (PUT) receives a set of inputs as arguments, and contains assertions that are expected to hold true for all these inputs. PUTs increase test quality, as they assess correctness on a broad scope of inputs and behaviors. However, defining assertions over a set of inputs is a hard task for developers, which limits the adoption of PUTs in practice. In this paper, we address the problem of finding oracles for PUTs that hold over multiple inputs. We design a system called PROZE that generates PUTs by identifying developer-written assertions that are valid for more than one test input. We implement our approach as a two-step methodology: first, at runtime, we collect inputs for a target method that is invoked within a CUT; next, we isolate the valid assertions of the CUT to be used within a PUT. We evaluate our approach against 5 real-world Java modules, and collect valid inputs for 128 target methods, from test and field executions. We generate 2,287 PUTs, which invoke the target methods with a significantly larger number of test inputs than the original CUTs. We execute the PUTs and find 217 that provably demonstrate that their oracles hold for a larger range of inputs than envisioned by the developers. From a testing theory perspective, our results show that developers express assertions within CUTs that actually hold beyond one particular input.
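
To make the idea of isolating assertions that hold for more than one input concrete, the following is a minimal sketch in plain Java. It is not the PROZE implementation: the target method `normalize`, the three named assertions, and the captured inputs are all hypothetical, and a real PUT would typically be written with a test framework such as JUnit 5's `@ParameterizedTest` rather than a hand-written loop.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.BiPredicate;

public class PutSketch {
    // Hypothetical unit under test (an assumption, not taken from the paper).
    public static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }

    // Assertions as a developer might write them in a CUT for the single
    // input "  Hello ". Each predicate receives (input, output).
    public static Map<String, BiPredicate<String, String>> cutAssertions() {
        Map<String, BiPredicate<String, String>> assertions = new LinkedHashMap<>();
        assertions.put("equalsHello", (in, out) -> out.equals("hello"));      // input-specific
        assertions.put("isTrimmed", (in, out) -> out.equals(out.trim()));     // may generalize
        assertions.put("notLonger", (in, out) -> out.length() <= in.length()); // may generalize
        return assertions;
    }

    // Keep only the assertions that hold for every captured input.
    public static List<String> assertionsValidForAll(List<String> capturedInputs) {
        List<String> valid = new ArrayList<>();
        for (Map.Entry<String, BiPredicate<String, String>> e : cutAssertions().entrySet()) {
            boolean holds = true;
            for (String in : capturedInputs) {
                if (!e.getValue().test(in, normalize(in))) {
                    holds = false;
                    break;
                }
            }
            if (holds) {
                valid.add(e.getKey());
            }
        }
        return valid;
    }

    public static void main(String[] args) {
        // Hypothetical inputs, standing in for values captured at runtime.
        List<String> captured = List.of("  Hello ", "WORLD", " Mixed Case ");
        System.out.println(assertionsValidForAll(captured));
    }
}
```

With only the original input, all three assertions pass; once further inputs are added, the input-specific equality check drops out and the remaining assertions are candidates for a PUT.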

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024. p. 166-176
National Category
Software Engineering
Identifiers
URN: urn:nbn:se:kth:diva-356174
DOI: 10.1109/SCAM63643.2024.00025
Scopus ID: 2-s2.0-85215285513
OAI: oai:DiVA.org:kth-356174
DiVA, id: diva2:1911847
Conference
24th IEEE International Conference on Source Code Analysis and Manipulation, SCAM 2024, Flagstaff, United States of America, Oct 7 2024 - Oct 8 2024
Note

Part of ISBN 9798331528508

QC 20241111

Available from: 2024-11-09. Created: 2024-11-09. Last updated: 2025-03-12. Bibliographically approved.
In thesis
1. Augmenting Test Oracles with Production Observations
2024 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Software testing is the process of verifying that a software system behaves as it is intended to behave. Significant resources are invested in creating and maintaining strong test suites to ensure software quality. However, in-house tests seldom reflect all the scenarios that may occur as a software system executes in production environments. The literature on the automated generation of tests proposes valuable techniques that assist developers with their testing activities. Yet the gap between tested behaviors and field behaviors remains largely overlooked. Consequently, the behaviors relevant for end users are not reflected in the test suite, and the faults that may surface for end users in the field may remain undetected by developer-written or automatically generated tests.

This thesis proposes a novel framework for using production observations, made as a system executes in the field, in order to generate tests. The generated tests include test inputs that are sourced from the field, and oracles that verify behaviors exhibited by the system in response to these inputs. We instantiate our framework in three distinct ways.

First, for a target project, we focus on methods that are inadequately tested by the developer-written test suite. At runtime, we capture objects that are associated with the invocations of these methods. The captured objects are used to generate tests that recreate the observed production state and contain oracles that specify the expected behavior. Our evaluation demonstrates that this strategy results in improved test quality for the target project.

With the second instantiation of our framework, we observe the invocations of target methods at runtime, as well as the invocations of methods called within the target methods. Using the objects associated with these invocations, we generate tests that use mocks, stubs, and mock-based oracles. We find that the generated oracles verify distinct aspects of the behaviors observed in the field, and also detect regressions within the system.
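
As an illustration of what a test with a stub and a mock-based oracle can look like, here is a minimal hand-written sketch in Java. Everything in it is hypothetical (the `PriceService` interface, the `Cart` class, and the recorded values); it is not the thesis tooling, and real generated tests would more likely rely on a mocking library than on a hand-rolled recording stub.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class MockOracleSketch {
    // Hypothetical collaborator that the target method calls.
    interface PriceService {
        int priceOf(String item);
    }

    // Hypothetical class containing the target method total().
    static class Cart {
        private final PriceService prices;

        Cart(PriceService prices) {
            this.prices = prices;
        }

        int total(List<String> items) {
            int sum = 0;
            for (String item : items) {
                sum += prices.priceOf(item);
            }
            return sum;
        }
    }

    // Stub that replays recorded return values and records each call it receives.
    static class RecordingStub implements PriceService {
        private final Map<String, Integer> recorded;
        final List<String> calls = new ArrayList<>();

        RecordingStub(Map<String, Integer> recorded) {
            this.recorded = recorded;
        }

        @Override
        public int priceOf(String item) {
            calls.add(item);
            Integer value = recorded.get(item);
            if (value == null) {
                throw new IllegalStateException("no recorded value for " + item);
            }
            return value;
        }
    }

    // A sketch of a generated test: an output oracle plus a mock-based oracle.
    public static boolean generatedTestPasses() {
        // Assumed values, standing in for data observed during an invocation.
        RecordingStub stub = new RecordingStub(Map.of("pen", 3, "book", 12));
        int total = new Cart(stub).total(List.of("pen", "book"));

        boolean outputOracle = total == 15;
        // Mock-based oracle: the collaborator was called once per item, in order.
        boolean interactionOracle = stub.calls.equals(List.of("pen", "book"));
        return outputOracle && interactionOracle;
    }

    public static void main(String[] args) {
        System.out.println(generatedTestPasses());
    }
}
```

The two oracles check different things: the first verifies the returned value, while the second verifies how the target method interacted with its collaborator.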

Third, we adapt our framework to capture the arguments with which target methods are invoked, during the execution of the test suite and in the field. We generate a data provider using the union of captured arguments, which supplies values to a parameterized unit test that is derived from a developer-written unit test. Using this strategy, we discover developer-written oracles that are actually generalizable to a larger input space.
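
The data-provider idea can be sketched as follows in plain Java. This is an assumption-laden illustration rather than the thesis implementation: the target method `clamp`, the captured argument values, and the helper names are hypothetical, and in practice the provider would typically feed a framework-level parameterized test.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DataProviderSketch {
    // Hypothetical target method: restricts a value to the range [0, 100].
    public static int clamp(int v) {
        return Math.max(0, Math.min(100, v));
    }

    // Data provider: the union of arguments captured during test execution
    // and in the field, with duplicates removed and insertion order kept.
    public static Set<Integer> dataProvider(List<Integer> fromTests, List<Integer> fromField) {
        Set<Integer> union = new LinkedHashSet<>(fromTests);
        union.addAll(fromField);
        return union;
    }

    // Parameterized unit test derived from a developer-written test:
    // the range assertion is expected to hold for any input.
    public static void putClampStaysInRange(int input) {
        int result = clamp(input);
        if (result < 0 || result > 100) {
            throw new AssertionError("out of range for input " + input + ": " + result);
        }
    }

    // Runs the PUT once per provided input and returns the number of passing runs.
    public static int runPut(Set<Integer> inputs) {
        int passed = 0;
        for (int input : inputs) {
            putClampStaysInRange(input);
            passed++;
        }
        return passed;
    }

    public static void main(String[] args) {
        // Assumed captured arguments: one from the original test, three from the field.
        Set<Integer> provider = dataProvider(List.of(50), List.of(-3, 50, 250));
        System.out.println(runPut(provider) + " of " + provider.size() + " inputs passed");
    }
}
```

Using a set for the union avoids running the PUT twice on an argument that was seen both in the test suite and in the field.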

We evaluate the three instances of our proposed framework against real-world software projects exercised with production workloads. Our findings demonstrate that runtime observations can be harnessed to generate complete tests, with inputs and oracles. The generated tests are representative of real-world usage, and can augment developer-written test suites.

Abstract [sv]

Software testing is the process of verifying that a software system works as it is intended to work. Significant resources are invested in creating and maintaining strong test suites to ensure software quality. However, in-house tests seldom reflect all the scenarios that can arise when a software system runs in production environments. The literature on automated test generation proposes valuable techniques to help developers in their testing activities. Yet the gap between tested behaviors and behaviors in production environments is largely overlooked. Consequently, behaviors that are relevant to end users are not reflected in the test suite, and the faults that may appear for end users in real situations may remain undetected by developer-written or automatically generated tests.

This thesis proposes a new framework for using production observations, made as a system executes in a production environment, to generate tests. The generated tests include test inputs that come from real users and oracles that verify the behaviors exhibited by the system in response to these inputs. We instantiate our framework in three different ways.

First, for a target project, we focus on methods that are insufficiently tested by the developer-written test suite. At runtime, we record objects that are associated with the invocations of these methods. The recorded objects are used to generate tests that recreate the observed production state and contain oracles that specify the expected behavior. Our evaluation shows that this strategy results in improved test quality for the target project.

With the second instantiation of our framework, we observe invocations of target methods at runtime, as well as invocations of methods called within the target methods. Using the objects associated with these invocations, we generate tests that use mocks, stubs, and mock-based oracles. We find that the generated oracles verify distinct aspects of behaviors observed in production environments, and also detect regressions within the system.

Third, we adapt our framework to record the arguments with which target methods are invoked, during the execution of test suites and in production. We generate a data provider using the combination of recorded arguments, which supplies values to a parameterized unit test derived from a developer-written unit test. With this strategy, we discover developer-written oracles that are in fact generalizable to a larger input space.

We evaluate the three instances of our proposed framework against real software projects run with production workloads. Our results show that runtime observations can be harnessed to generate complete tests, with inputs and oracles. The generated tests are representative of real-world use and can extend developer-written test suites.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2024. p. ix, 71
Series
TRITA-EECS-AVL ; 2024:87
Keywords
Test generation, Test oracles, Production observations, Testgenerering, Testorakel, Produktionsobservationer
National Category
Software Engineering
Identifiers
urn:nbn:se:kth:diva-356183 (URN)
978-91-8106-109-3 (ISBN)
Public defence
2024-12-13, https://kth-se.zoom.us/j/64605922145, Kollegiesalen, Brinellvägen 6, Stockholm, 14:00 (English)
Opponent
Supervisors
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 20241112

Available from: 2024-11-12. Created: 2024-11-12. Last updated: 2024-11-18. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text; Scopus; Paper in conference program; Conference website; arXiv manuscript

Authority records

Tiwari, Deepika; Gamage, Yogya; Monperrus, Martin; Baudry, Benoit
