Software Diversification for WebAssembly
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. ORCID iD: 0000-0001-9399-8647
2024 (English). Doctoral thesis, comprehensive summary (Other academic)
Alternative title
Mjukvarudiversifiering för WebAssembly (Swedish)
Abstract [en]

WebAssembly, now the fourth officially recognized web language, enables web browsers to port native applications to the Web. Furthermore, WebAssembly has evolved into an essential element for backend scenarios such as cloud and edge computing. Therefore, WebAssembly finds use in a plethora of applications, including but not limited to, web browsers, blockchain, and cloud computing. Despite the emphasis on security since its design and specification, WebAssembly remains susceptible to various forms of attacks, including memory corruption and side-channels. Furthermore, WebAssembly has been manipulated to disseminate malware, particularly in cases of browser cryptojacking.

Web page resources, including those containing WebAssembly binaries, are predominantly served from centralized data centers in the modern digital landscape. In conjunction with browser clients, thousands of edge devices operate millions of identical WebAssembly instantiations every second. This phenomenon creates a highly predictable ecosystem, wherein potential attackers can anticipate behavior either in browsers or backend nodes. Such predictability escalates the potential impact of vulnerabilities within these ecosystems, paving the way for high-impact side-channel and memory attacks. For instance, a flaw in a web browser, triggered by a defective WebAssembly program, holds the potential to affect millions of users.

This work aims to harden the security within the WebAssembly ecosystem through the introduction of Software Diversification methods and tools. Software Diversification is a strategy designed to augment the costs of exploiting vulnerabilities by making software less predictable. The predictability within ecosystems can be diminished by automatically generating different, yet functionally equivalent, program variants. These variants strengthen observable properties that are typically used to launch attacks, and in many instances, can eliminate such vulnerabilities.

This work introduces three tools: CROW and MEWE as compiler-based approaches, and WASM-MUTATE as a binary-based approach. Each tool has been specifically designed to tackle a unique facet of Software Diversification. We present empirical evidence demonstrating the potential application of our Software Diversification methods to WebAssembly programs in two distinct ways: Offensive and Defensive Software Diversification. Our research into Offensive Software Diversification in WebAssembly unveils potential paths for enhancing the detection of WebAssembly malware. On the other hand, our experiments in Defensive Software Diversification show that WebAssembly programs can be hardened against side-channel attacks, specifically the Spectre attack.

Abstract [sv, translated to English]

WebAssembly, now the fourth officially recognized web language, makes it possible for web browsers to port native applications to the Web. In addition, WebAssembly has developed into an essential component for backend scenarios such as cloud and edge services. WebAssembly is therefore used in a wide range of applications, including web browsers, blockchain, and cloud services. Despite its focus on security from design through specification, WebAssembly remains susceptible to various forms of attack, such as memory corruption and side-channel attacks. Moreover, WebAssembly has been manipulated to spread malware, particularly illicit cryptomining in web browsers.

Web page resources, including those containing executable WebAssembly, are in a modern digital context served mainly from centralized data centers. Thousands of edge devices, together with browser clients, run millions of identical WebAssembly instantiations every second. This phenomenon creates a highly predictable ecosystem in which potential attackers can anticipate behavior in either browsers or backend nodes. Such predictability increases the potential of vulnerabilities within these ecosystems and opens the door to high-impact side-channel and memory attacks. For example, a flaw in a web browser, triggered by a defective WebAssembly program, has the potential to affect millions of users.

This thesis aims to strengthen security within the WebAssembly ecosystem by introducing methods and tools for software diversification. Software diversification is a strategy designed to increase the cost of exploiting vulnerabilities by making software unpredictable. Predictability within ecosystems can be reduced by automatically generating different software variants. These variants strengthen observable properties that are commonly used to launch attacks and can, in many cases, eliminate such vulnerabilities entirely.

This work introduces three tools: CROW, MEWE, and WASM-MUTATE. Each tool has been designed specifically to address a unique aspect of software diversification. We present empirical evidence showing the potential for applying our software diversification methods to WebAssembly programs in two distinct ways: offensive and defensive software diversification. Our research on offensive software diversification in WebAssembly reveals potential paths for improving the detection of WebAssembly malware. On the other hand, our experiments in defensive software diversification show that WebAssembly programs can be hardened against side-channel attacks, in particular the Spectre attack.

Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2024, p. v, 92
Series
TRITA-EECS-AVL ; 2024:10
Keywords [en]
WebAssembly, Software Diversification, Side-Channels
National Category
Computer Systems
Research subject
Information and Communication Technology
Identifiers
URN: urn:nbn:se:kth:diva-342751; ISBN: 978-91-8040-822-6 (print); OAI: oai:DiVA.org:kth-342751; DiVA, id: diva2:1832753
Public defence
2024-03-07, https://kth-se.zoom.us/j/64013906066, F3 (Flodis), Lindstedtsvägen 26 & 28, Stockholm, 16:00 (English)
Note

QC 20240131

Available from: 2024-01-31. Created: 2024-01-30. Last updated: 2024-02-13. Bibliographically approved
List of papers
1. WebAssembly diversification for malware evasion
2023 (English). In: Computers & security (Print), ISSN 0167-4048, E-ISSN 1872-6208, Vol. 131, article id 103296. Article in journal (Refereed), Published
Abstract [en]

WebAssembly has become a crucial part of the modern web, offering a faster alternative to JavaScript in browsers. While it enables rich applications in the browser, this technology is also well suited to developing cryptojacking malware. This has triggered the development of several methods to detect cryptojacking malware. However, these defenses have not considered the possibility of attackers using evasion techniques. This paper explores how automatic binary diversification can support the evasion of WebAssembly cryptojacking detectors. We experiment with a dataset of 33 WebAssembly cryptojacking binaries and evaluate our evasion technique against two malware detectors: VirusTotal, a general-purpose detector, and MINOS, a WebAssembly-specific detector. Our results demonstrate that our technique can automatically generate variants of WebAssembly cryptojacking that evade the detectors in 90% of cases for VirusTotal and 100% for MINOS. Our results emphasize the importance of meta-antiviruses and diverse detection techniques and provide new insights into which WebAssembly code transformations are best suited for malware evasion. We also show that the variants introduce limited performance overhead, making binary diversification an effective technique for evasion.

Place, publisher, year, edition, pages
Elsevier BV, 2023
Keywords
Cryptojacking, Malware evasion, Software diversification, WebAssembly
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-331551 (URN); 10.1016/j.cose.2023.103296 (DOI); 001052969800001 (); 2-s2.0-85159763426 (Scopus ID)
Funder
Swedish Foundation for Strategic Research, trustfull; Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 20230711

Available from: 2023-07-11. Created: 2023-07-11. Last updated: 2024-01-30. Bibliographically approved
2. Wasm-Mutate: Fast and effective binary diversification for WebAssembly
2024 (English). In: Computers & security (Print), ISSN 0167-4048, E-ISSN 1872-6208, Vol. 139, article id 103731. Article in journal (Refereed), Published
Abstract [en]

WebAssembly is the fourth officially endorsed Web language. It is recognized for its efficiency and its security-focused design. Yet, its swiftly expanding ecosystem lacks robust software diversification systems. We introduce Wasm-Mutate, a diversification engine specifically designed for WebAssembly. Our engine meets several essential criteria: 1) it quickly generates functionally identical, yet behaviorally diverse, WebAssembly variants; 2) it is universally applicable to any WebAssembly program, irrespective of the source programming language; and 3) its generated variants counter side-channels. By leveraging an e-graph data structure, Wasm-Mutate achieves both speed and efficacy. We evaluate Wasm-Mutate by conducting experiments on 404 programs, which include real-world applications. Our results highlight that Wasm-Mutate can produce tens of thousands of unique and efficient WebAssembly variants within minutes. Significantly, Wasm-Mutate can safeguard WebAssembly binaries against timing side-channel attacks, especially those of the Spectre type.
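
The idea of generating functionally identical variants through rewriting can be illustrated with a small sketch. This is not Wasm-Mutate's implementation and not a real e-graph: it is a simplified stand-in that explicitly enumerates every expression reachable through a handful of semantics-preserving rewrite rules, whereas an e-graph stores such equivalence classes compactly. The tuple-based expression format, the rule set, and all function names are assumptions made for illustration.

```python
# Expressions are nested tuples (hypothetical format, not WebAssembly bytecode):
#   ("const", n), ("var", name), ("add", a, b), ("mul", a, b), ("shl", a, b)

def rewrites(e):
    """Yield expressions obtained by applying one rewrite rule at the root of e."""
    if e[0] == "mul" and e[2] == ("const", 2):
        yield ("add", e[1], e[1])            # x * 2  ->  x + x
        yield ("shl", e[1], ("const", 1))    # x * 2  ->  x << 1
    if e[0] in ("add", "mul") and e[1] != e[2]:
        yield (e[0], e[2], e[1])             # commutativity


def neighbours(e):
    """Yield expressions that differ from e by one rewrite anywhere in the tree."""
    yield from rewrites(e)
    if e[0] in ("add", "mul", "shl"):
        for i in (1, 2):
            for sub in neighbours(e[i]):
                yield e[:i] + (sub,) + e[i + 1:]


def variants(e, limit=100):
    """Breadth-first enumeration of equivalent variants; stops expanding near limit."""
    seen = {e}
    frontier = [e]
    while frontier and len(seen) < limit:
        next_frontier = []
        for current in frontier:
            for candidate in neighbours(current):
                if candidate not in seen:
                    seen.add(candidate)
                    next_frontier.append(candidate)
        frontier = next_frontier
    return seen


def evaluate(e, env):
    """Evaluate an expression over Python integers (no fixed-width wrap-around)."""
    op = e[0]
    if op == "const":
        return e[1]
    if op == "var":
        return env[e[1]]
    a, b = evaluate(e[1], env), evaluate(e[2], env)
    if op == "add":
        return a + b
    if op == "mul":
        return a * b
    return a << b  # "shl"


if __name__ == "__main__":
    program = ("add", ("mul", ("var", "x"), ("const", 2)), ("var", "y"))
    for v in sorted(variants(program), key=repr):
        print(v, "=", evaluate(v, {"x": 3, "y": 4}))
```

Every printed variant evaluates to the same value for the same inputs while differing syntactically, which is the property a diversifier relies on.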

Place, publisher, year, edition, pages
Elsevier, 2024
Keywords
WebAssembly, Software Diversification
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-342750 (URN); 10.1016/j.cose.2024.103731 (DOI); 2-s2.0-85183204402 (Scopus ID)
Funder
Swedish Foundation for Strategic Research, 3066
Note

QC 20240131

Available from: 2024-01-30. Created: 2024-01-30. Last updated: 2024-08-28. Bibliographically approved
3. Multi-variant Execution at the Edge
2022 (English). In: MTD 2022: Proceedings of the 9th ACM Workshop on Moving Target Defense, co-located with CCS 2022, Association for Computing Machinery (ACM), 2022, p. 11-22. Conference paper, Published paper (Refereed)
Abstract [en]

Edge-Cloud computing offloads parts of the computations that traditionally occur in the cloud to edge nodes. The binary format WebAssembly is increasingly used to distribute and deploy services on such platforms. Edge-Cloud computing providers let their clients deploy stateless services in the form of WebAssembly binaries, which are then translated to machine code, sandboxed and executed at the edge. In this context, we propose a technique that (i) automatically diversifies WebAssembly binaries that are deployed to the edge and (ii) randomizes execution paths at runtime. Thus, an attacker cannot exploit all edge nodes with the same payload. Given a service, we automatically synthesize functionally equivalent variants for the functions providing the service. All the variants are then wrapped into a single multivariant WebAssembly binary. When the service endpoint is executed, every time a function is invoked, one of its variants is randomly selected. We implement this technique in the MEWE tool and we validate it with 7 services for which MEWE generates multivariant binaries that embed hundreds of function variants. We execute the multivariant binaries on the world-wide edge platform provided by Fastly, as part of a research collaboration. We show that multivariant binaries exhibit a real diversity of execution traces across the whole edge platform distributed around the globe.
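
The runtime randomization described in this abstract, where each invocation picks one of several functionally equivalent variants, can be sketched as follows. This is an illustrative model in Python, not MEWE's WebAssembly implementation; the class name `MultivariantDispatcher`, the three `double_*` variants, and the recorded `trace` list are all hypothetical.

```python
import random


# Three hypothetical, functionally equivalent variants of one service function.
# They are equivalent for integer inputs only.
def double_v1(x):
    return x * 2


def double_v2(x):
    return x + x


def double_v3(x):
    return x << 1


class MultivariantDispatcher:
    """Wraps several equivalent variants and selects one at random per call."""

    def __init__(self, variants, rng=None):
        if not variants:
            raise ValueError("at least one variant is required")
        self.variants = list(variants)
        self.rng = rng if rng is not None else random.Random()
        self.trace = []  # index of the variant chosen at each invocation

    def __call__(self, *args):
        index = self.rng.randrange(len(self.variants))
        self.trace.append(index)
        return self.variants[index](*args)


if __name__ == "__main__":
    double = MultivariantDispatcher([double_v1, double_v2, double_v3])
    print([double(n) for n in range(5)])  # results do not depend on the choice
    print(double.trace)                   # the execution path varies between runs
```

The caller always observes the same results, while the sequence of executed variants changes from run to run, so two deployments of the same service need not follow the same execution path.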

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2022
Keywords
diversification, edge-cloud computing, moving target defense, multivariant execution, webassembly
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-333509 (URN); 10.1145/3560828.3564007 (DOI); 2-s2.0-85144823461 (Scopus ID)
Conference
9th ACM Workshop on Moving Target Defense, MTD 2022 - Co-located with CCS 2022, Los Angeles, United States of America, Nov 7 2022
Funder
Swedish Foundation for Strategic Research, trustfull
Note

Part of ISBN 9781450398787

QC 20230802

Available from: 2023-08-02. Created: 2023-08-02. Last updated: 2024-01-30. Bibliographically approved
4. CROW: Code Diversification for WebAssembly
2021 (English). Conference paper, Published paper (Refereed)
Abstract [en]

The adoption of WebAssembly increases rapidly, as it provides a fast and safe model for program execution in the browser. However, WebAssembly is not exempt from vulnerabilities that can be exploited by malicious observers. Code diversification can mitigate some of these attacks. In this paper, we present the first fully automated workflow for the diversification of WebAssembly binaries. We present CROW, an open-source tool implementing this workflow through enumerative synthesis of diverse code snippets expressed in the LLVM intermediate representation. We evaluate CROW's capabilities on 303 C programs and study its use on a real-life security-sensitive program: libsodium, a modern cryptographic library. Overall, CROW is able to generate diverse variants for 239 out of 303 (79%) small programs. Furthermore, our experiments show that our approach and tool are able to successfully diversify off-the-shelf cryptographic software (libsodium).
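
Enumerative synthesis, as named in this abstract, can be illustrated with a toy sketch: enumerate small candidate expressions and keep those that behave exactly like a reference. The sketch below is an assumption-laden simplification and not CROW's workflow. It works on a hypothetical tuple format rather than LLVM IR, uses 8-bit arithmetic so that equivalence can be checked exhaustively over all 256 inputs, and only enumerates single-operation candidates.

```python
from itertools import product

MASK = 0xFF  # 8-bit arithmetic so equivalence can be checked exhaustively

# Hypothetical operation set; shift amounts are taken modulo the bit width.
OPS = {
    "add": lambda a, b: (a + b) & MASK,
    "sub": lambda a, b: (a - b) & MASK,
    "mul": lambda a, b: (a * b) & MASK,
    "shl": lambda a, b: (a << (b % 8)) & MASK,
}
LEAVES = ("x", 0, 1, 2)  # the single input variable and a few small constants


def run(expr, x):
    """Evaluate an expression: "x", an integer constant, or (op, lhs, rhs)."""
    if expr == "x":
        return x
    if isinstance(expr, int):
        return expr
    op, lhs, rhs = expr
    return OPS[op](run(lhs, x), run(rhs, x))


def equivalent(a, b):
    """Exhaustively compare two expressions on every 8-bit input."""
    return all(run(a, x) == run(b, x) for x in range(MASK + 1))


def synthesize(reference):
    """Enumerate single-operation candidates and keep those equivalent to reference."""
    candidates = ((op, lhs, rhs) for op, lhs, rhs in product(OPS, LEAVES, LEAVES))
    return [c for c in candidates if c != reference and equivalent(c, reference)]


if __name__ == "__main__":
    for variant in synthesize(("mul", "x", 2)):
        print(variant)
```

Exhaustive checking is only feasible here because the input space is tiny; for realistic bit widths an equivalence check would need a different mechanism, such as a solver.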

Place, publisher, year, edition, pages
USA, 2021
Keywords
WebAssembly, Web, Diversification
National Category
Computer Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-292642 (URN); 10.14722/madweb.2021.23004 (DOI)
Conference
MADWeb, NDSS 2021
Funder
Swedish Foundation for Strategic Research, 3066 Trustfull
Note

Part of proceedings: ISBN 1-891562-66-5, QC 20230117

Available from: 2021-04-11. Created: 2021-04-11. Last updated: 2024-01-30. Bibliographically approved
5. Superoptimization of WebAssembly bytecode
2020 (English). In: Conference Companion of the 4th International Conference on Art, Science, and Engineering of Programming, Portugal: Aakar Books, 2020. Conference paper, Published paper (Refereed)
Abstract [en]

Motivated by the fast adoption of WebAssembly, we propose the first functional pipeline to support the superoptimization of WebAssembly bytecode. Our pipeline works over LLVM and Souper. We evaluate our superoptimization pipeline with 12 programs from the Rosetta code project. Our pipeline improves the code section size of 8 out of 12 programs. We discuss the challenges faced in superoptimization of WebAssembly with two case studies.

Place, publisher, year, edition, pages
Portugal: Aakar Books, 2020
Keywords
WebAssembly, Web, Superoptimization, Optimization, LLVM
National Category
Computer Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-280046 (URN); 10.1145/3397537.3397567 (DOI); 2-s2.0-85090149388 (Scopus ID)
Conference
MoreVMs, Programming 2020
Funder
Swedish Foundation for Strategic Research, 3066 Trustfull
Note

QC 20201123

Available from: 2020-09-02. Created: 2020-09-02. Last updated: 2024-01-30. Bibliographically approved
6. Scalable comparison of JavaScript V8 bytecode traces
2019 (English). In: Proceedings of the 11th ACM SIGPLAN International Workshop on Virtual Machines and Intermediate Languages, VMIL@SPLASH, New York, NY, USA: ACM Publications, 2019, p. 22-31, article id 3361228. Conference paper, Published paper (Refereed)
Abstract [en]

The comparison and alignment of runtime traces are essential, e.g., for semantic analysis or debugging. However, naive sequence alignment algorithms cannot address the needs of the modern web: (i) the bytecode generation process of V8 is not deterministic; (ii) bytecode traces are large.

We present STRAC, a scalable and extensible tool tailored to compare bytecode traces generated by the V8 JavaScript engine. Given two V8 bytecode traces and a distance function between trace events, STRAC computes and provides the best alignment. The key insight is to split access between memory and disk. STRAC can identify semantically equivalent web pages and is capable of processing huge V8 bytecode traces on the scale of today's web, such as https://2019.splashcon.org, which generates approx. 150k V8 bytecode instructions.
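
As an illustration of aligning two traces under a pluggable distance function, the following sketch uses dynamic time warping. Choosing this particular algorithm is an assumption for the example; the abstract only states that the best alignment is computed. The sketch keeps the full cost matrix in memory, so it does not reflect the memory/disk split mentioned above. The function names and the sample bytecode names are illustrative.

```python
def align(trace_a, trace_b, distance):
    """Align two non-empty traces; return (total cost, list of index pairs)."""
    n, m = len(trace_a), len(trace_b)
    inf = float("inf")
    # cost[i][j] is the best cost of aligning the first i events with the first j.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = distance(trace_a[i - 1], trace_b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Walk back from the end to recover which events were matched together.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def mismatch(event_a, event_b):
    """A simple distance between trace events: 0 if identical, 1 otherwise."""
    return 0 if event_a == event_b else 1


if __name__ == "__main__":
    first = ["LdaZero", "Star", "Return"]
    second = ["LdaZero", "Star", "Star", "Return"]
    total, pairs = align(first, second, mismatch)
    print(total, pairs)
```

Because the distance function is passed in as a parameter, a caller can substitute a measure that treats related instructions as close rather than simply equal or different.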

Place, publisher, year, edition, pages
New York, NY, USA: ACM Publications, 2019
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-262911 (URN); 10.1145/3358504.3361228 (DOI); 000524238200003 (); 2-s2.0-85077200950 (Scopus ID)
Conference
11th ACM SIGPLAN International Workshop on Virtual Machines and Intermediate Languages, VMIL@SPLASH 2019, Athens, Greece, October 22, 2019
Funder
Swedish Foundation for Strategic Research, 3066; Swedish Foundation for Strategic Research, Trustfull
Note

Part of proceedings ISBN 978-1-4503-6987-9

QC 20191028

Available from: 2019-10-24. Created: 2019-10-24. Last updated: 2024-03-15. Bibliographically approved

Open Access in DiVA

Kappa/Summary (2914 kB), 577 downloads
File information
File name: FULLTEXT01.pdf; File size: 2914 kB; Checksum (SHA-512):
2bda0db5bfaac5ccc481a9599a4e15346f8e0f3fb6436219f47cc0f77ffcee244a4e367f3e8bf252de9b7465cddcb84aff84a42ce0e489aba5625555fe12950c
Type: summary; Mimetype: application/pdf

Authority records

Cabrera Arteaga, Javier
