kth.sePublications KTH
Change search
Link to record
Permanent link

Direct link
Castañeda Lozano, RobertoORCID iD iconorcid.org/0000-0002-2806-7333
Alternative names
Publications (10 of 15) Show all publications
Tsoupidi, R. M., Castañeda Lozano, R., Troubitsyna, E. & Papadimitratos, P. (2023). Securing Optimized Code Against Power Side Channels. In: : . Paper presented at 36th IEEE Computer Security Foundations Symposium.
Open this publication in new window or tab >>Securing Optimized Code Against Power Side Channels
2023 (English)Conference paper, Published paper (Refereed)
Abstract [en]

Side-channel attacks impose a serious threat to cryptographic algorithms, including widely employed ones, such as AES and RSA. These attacks take advantage of the algorithm implementation in hardware or software to extract secret information via side channels. Software masking is a mitigation approach against power side-channel attacks aiming at hiding the secret-revealing dependencies from the power footprint of a vulnerable implementation. However, this type of software mitigation often depends on general-purpose compilers, which do not preserve non-functional properties. Moreover, microarchitectural features, such as the memory bus and register reuse, may also leak secret information. These abstractions are not visible at the high-level implementation of the program. Instead, they are decided at compile time. To remedy these problems, security engineers often sacrifice code efficiency by turning off compiler optimization and/or performing local, post-compilation trans-formations. This paper proposes Secure by Construction Code Generation (SecCG), a constraint-based compiler approach that generates optimized yet protected against power side channels code. SecCG controls the quality of the mitigated program by efficiently searching the best possible low-level implementation according to a processor cost model. In our experiments with twelve masked cryptographic functions up to 100 lines of code on Mips32 and ARM Thumb, SecCG speeds up the generated code from 77% to 6.6 times compared to non-optimized secure code with an overhead of up to 13% compared to non-secure optimized code at the expense of a high compilation cost. For security and compiler researchers, this paper proposes a formal model to generate power side channel free low-level code. For software engineers, SecCG provides a practical approach to optimize performance critical and vulnerable cryptographic implementations that preserve security properties against power side channels.

Keywords
compilation, power side-channel attacks, code optimization, software masking, constraint programming
National Category
Computer Systems Embedded Systems
Identifiers
urn:nbn:se:kth:diva-326720 (URN)
Conference
36th IEEE Computer Security Foundations Symposium
Note

QC 20230522

Available from: 2023-05-09 Created: 2023-05-09 Last updated: 2025-01-17Bibliographically approved
Castañeda Lozano, R., Carlsson, M., Hjort Blindell, G. & Schulte, C. (2019). Combinatorial register allocation and instruction scheduling. ACM Transactions on Programming Languages and Systems, 41(3), Article ID 17.
Open this publication in new window or tab >>Combinatorial register allocation and instruction scheduling
2019 (English)In: ACM Transactions on Programming Languages and Systems, ISSN 0164-0925, E-ISSN 1558-4593, Vol. 41, no 3, article id 17Article in journal (Refereed) Published
Abstract [en]

This article introduces a combinatorial optimization approach to register allocation and instruction scheduling, two central compiler problems. Combinatorial optimization has the potential to solve these problems optimally and to exploit processor-specific features readily. Our approach is the first to leverage this potential in practice: it captures the complete set of program transformations used in state-of-the-art compilers, scales to medium-sized functions of up to 1,000 instructions, and generates executable code. This level of practicality is reached by using constraint programming, a particularly suitable combinatorial optimization technique. Unison, the implementation of our approach, is open source, used in industry, and integrated with the LLVM toolchain. An extensive evaluation confirms that Unison generates better code than LLVM while scaling to medium-sized functions. The evaluation uses systematically selected benchmarks from MediaBench and SPEC CPU2006 and different processor architectures (Hexagon, ARM, MIPS). Mean estimated speedup ranges from 1.1% to 10% and mean code size reduction ranges from 1.3% to 3.8% for the different architectures. A significant part of this improvement is due to the integrated nature of the approach. Executing the generated code on Hexagon confirms that the estimated speedup results in actual speedup. Given a fixed time limit, Unison solves optimally functions of up to 946 instructions, nearly an order of magnitude larger than previous approaches. The results show that our combinatorial approach can be applied in practice to trade compilation time for code quality beyond the usual compiler optimization levels, identify improvement opportunities in heuristic algorithms, and fully exploit processor-specific features.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2019
Keywords
Combinatorial optimization, Instruction scheduling, Register allocation
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-262613 (URN)10.1145/3332373 (DOI)000501479500004 ()2-s2.0-85068443848 (Scopus ID)
Note

QC 20191024

Not duplicate with DiVA 1232469

Available from: 2019-10-24 Created: 2019-10-24 Last updated: 2024-03-18Bibliographically approved
Castañeda Lozano, R. & Schulte, C. (2019). Survey on Combinatorial Register Allocation and Instruction Scheduling. ACM Computing Surveys, 52(3), Article ID 62.
Open this publication in new window or tab >>Survey on Combinatorial Register Allocation and Instruction Scheduling
2019 (English)In: ACM Computing Surveys, ISSN 0360-0300, E-ISSN 1557-7341, Vol. 52, no 3, article id 62Article in journal (Refereed) Published
Abstract [en]

Register allocation (mapping variables to processor registers or memory) and instruction scheduling (reordering instructions to increase instruction-level parallelism) are essential tasks for generating efficient assembly code in a compiler. In the past three decades, combinatorial optimization has emerged as an alternative to traditional, heuristic algorithms for these two tasks. Combinatorial optimization approaches can deliver optimal solutions according to a model, can precisely capture trade-offs between conflicting decisions, and are more flexible at the expense of increased compilation time. This article provides an exhaustive literature review and a classification of combinatorial optimization approaches to register allocation and instruction scheduling, with a focus on the techniques that are most applied in this context: integer programming, constraint programming, partitioned Boolean quadratic programming, and enumeration. Researchers in compilers and combinatorial optimization can benefit from identifying developments, trends, and challenges in the area; compiler practitioners may discern opportunities and grasp the potential benefit of applying combinatorial optimization.

Place, publisher, year, edition, pages
ASSOC COMPUTING MACHINERY, 2019
Keywords
Combinatorial optimization, register allocation, instruction scheduling
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-257823 (URN)10.1145/3200920 (DOI)000481570600019 ()2-s2.0-85068049964 (Scopus ID)
Note

Not duplicate with DiVA 758112.

Not duplicate with DiVA 1232921.

QC 20190906

Available from: 2019-09-06 Created: 2019-09-06 Last updated: 2022-06-26Bibliographically approved
Castañeda Lozano, R., Carlsson, M., Hjort Blindell, G. & Schulte, C. (2018). Combinatorial Register Allocation and Instruction Scheduling.
Open this publication in new window or tab >>Combinatorial Register Allocation and Instruction Scheduling
2018 (English)Report (Other academic)
Abstract [en]

This paper introduces a combinatorial optimization approach to register allocation and instruction scheduling, two central compiler problems. Combinatorial optimization has the potential to solve these problems optimally and to exploit processor-specific features readily. Our approach is the first to leverage this potential in practice: it captures the complete set of program transformations used in state-of-the-art compilers, scales to medium-sized functions of up to 1000 instructions, and generates executable code. This level of practicality is reached by using constraint programming, a particularly suitable combinatorial optimization technique. Unison, the implementation of our approach, is open source, used in industry, and integrated with the LLVM toolchain.

An extensive evaluation of estimated speed, code size, and scalability confirms that Unison generates better code than LLVM while scaling to medium-sized functions. The evaluation uses systematically selected benchmarks from MediaBench and SPEC CPU2006 and different processor architectures (Hexagon, ARM, MIPS). Mean estimated speedup ranges from 1% to 9.3% and mean code size reduction ranges from 0.8% to 3.9% for the different architectures. Executing the generated code on Hexagon confirms that the estimated speedup indeed results in actual speedup. Given a fixed time limit, Unison solves optimally functions of up to 647 instructions, delivers improved solutions for functions of up to 874 instructions, and achieves more than 85% of the potential speed for 90% of the functions on Hexagon.

The results in this paper show that our combinatorial approach can be used in practice to trade compilation time for code quality beyond the usual compiler optimization levels, fully exploit processor-specific features, and identify improvement opportunities in existing heuristic algorithms.

National Category
Computer Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-232118 (URN)
Note

QC 20180822

Available from: 2018-07-11 Created: 2018-07-11 Last updated: 2024-03-15Bibliographically approved
Castañeda Lozano, R. (2018). Constraint-Based Register Allocation and Instruction Scheduling. (Doctoral dissertation). Stockholm: KTH Royal Institute of Technology
Open this publication in new window or tab >>Constraint-Based Register Allocation and Instruction Scheduling
2018 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Register allocation (mapping variables to processor registers or memory) and instruction scheduling (reordering instructions to improve latency or throughput) are central compiler problems. This dissertation proposes a combinatorial optimization approach to these problems that delivers optimal solutions according to a model, captures trade-offs between conflicting decisions, accommodates processor-specific features, and handles different optimization criteria.

The use of constraint programming and a novel program representation enables a compact model of register allocation and instruction scheduling. The model captures the complete set of global register allocation subproblems (spilling, assignment, live range splitting, coalescing, load-store optimization, multi-allocation, register packing, and rematerialization) as well as additional subproblems that handle processor-specific features beyond the usual scope of conventional compilers.

The approach is implemented in Unison, an open-source tool used in industry and research that complements the state-of-the-art LLVM compiler. Unison applies general and problem-specific constraint solving methods to scale to medium-sized functions, solving functions of up to 647 instructions optimally and improving functions of up to 874 instructions. The approach is evaluated experimentally using different processors (Hexagon, ARM and MIPS), benchmark suites (MediaBench and SPEC CPU2006), and optimization criteria (speed and code size reduction). The results show that Unison generates code of slightly to significantly better quality than LLVM, depending on the characteristics of the targeted processor (1% to 9.3% mean estimated speedup; 0.8% to 3.9% mean code size reduction). Additional experiments for Hexagon show that its estimated speedup has a strong monotonic relationship to the actual execution speedup, resulting in a mean speedup of 5.4% across MediaBench applications.

The approach contributed by this dissertation is the first of its kind that is practical (it captures the complete set of subproblems, scales to medium-sized functions, and generates executable code) and effective (it generates better code than the LLVM compiler, fulfilling the promise of combinatorial optimization). It can be applied to trade compilation time for code quality beyond the usual optimization levels, explore and exploit processor-specific features, and identify improvement opportunities in conventional compilers.

Abstract [sv]

Registerallokering (tilldelning av programvariabler till processorregister eller minne) och instruktionsschemaläggning (omordning av instruktioner för att förbättra latens eller genomströmning) är centrala kompilatorproblem. Denna avhandling presenterar en kombinatorisk optimeringsmetod för dessa problem. Metoden, som är baserad på en formell modell, är kraftfull nog att ge optimala lösningar och göra avvägningar mellan motstridiga optimeringsval. Den kan till fullo uttnyttja processorspecifika funktioner och uttrycka olika optimeringsmål.

Användningen av villkorsprogrammering och en ny programrepresentation möjliggör en kompakt modell av registerallokering och instruktionsschemaläggning. Modellen omfattar samtliga delproblem som ingår i global registerallokering: spilling, tilldelning, live range splitting, coalescing, load-store-optimering, flertilldelning, registerpackning och rematerialisering. Förutom dessa, kan den också integrera processorspecifika egenskaper som går utanför vad konventionella kompilatorer hanterar.

Metoden implementeras i Unison, ett öppen-källkods-verktyg som används inom industri- och forskningsvärlden och utgör ett komplement till LLVM-kompilatorn. Unison tillämpar allmänna och problemspecifika villkorslösningstekniker för att skala till medelstora funktioner, lösa funktioner med upp till 647 instruktioner optimalt och förbättra funktioner på upp till 874 instruktioner. Metoden utvärderas experimentellt för olika målprocessorer (Hexagon, ARM och MIPS), benchmark-sviter (MediaBench och SPEC CPU2006) och optimeringsmål (hastighet och kodstorlek). Resultaten visar att Unison genererar kod av något till betydligt bättre kvalitet än LLVM. Den uppskattade hastighetsförbättringen varierar mellan 1% till 9.3% och kodstorleksreduktionen mellan 0.8% till~3.9%, beroende på målprocessor. Ytterligare experiment för Hexagon visar att dess uppskattade hastighetsförbättring har ett starkt monotoniskt förhållande till den faktiska exekveringstiden, vilket resulterar i en 5.4% genomsnittlig hastighetsförbättring för MediaBench-applikationer.

Denna avhandling beskriver den första praktiskt användbara kombinatoriska optimeringsmetoden för integrerad registerallokering och instruktionsschemaläggning. Metoden är praktiskt användbar då den hanterar samtliga ingående delproblem, genererar exekverbar maskinkod och skalar till medelstora funktioner. Den är också effektiv då den genererar bättre maskinkod än LLVM-kompilatorn. Metoden kan tillämpas för att byta kompileringstid mot kodkvalitet utöver de vanliga optimeringsnivåerna, utforska och utnyttja processorspecifika egenskaper samt identifiera förbättringsmöjligheter i konventionella kompilatorer.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2018. p. 60
Series
TRITA-EECS-AVL ; 2018:48
Series
SICS Dissertation Series, ISSN 1101-1335 ; 78
Keywords
constraint programming, combinatorial optimization, register allocation, instruction scheduling, compiler construction
National Category
Computer Systems Computer Sciences
Research subject
Information and Communication Technology
Identifiers
urn:nbn:se:kth:diva-232192 (URN)978-91-7729-853-3 (ISBN)
Public defence
2018-09-03, Sal Ka-208, Electrum, Kistagången 16, Kista, Stockholm, 13:15 (English)
Opponent
Supervisors
Note

QC 20180716

Available from: 2018-07-16 Created: 2018-07-13 Last updated: 2022-06-26Bibliographically approved
Castañeda Lozano, R. & Schulte, C. (2018). Survey on Combinatorial Register Allocation and Instruction Scheduling. ACM Computing Surveys
Open this publication in new window or tab >>Survey on Combinatorial Register Allocation and Instruction Scheduling
2018 (English)In: ACM Computing Surveys, ISSN 0360-0300, E-ISSN 1557-7341Article in journal (Refereed) In press
Abstract [en]

Register allocation (mapping variables to processor registers or memory) and instruction scheduling (reordering instructions to increase instruction-level parallelism) are essential tasks for generating efficient assembly code in a compiler. In the last three decades, combinatorial optimization has emerged as an alternative to traditional, heuristic algorithms for these two tasks. Combinatorial optimization approaches can deliver optimal solutions according to a model, can precisely capture trade-offs between conflicting decisions, and are more flexible at the expense of increased compilation time.

This paper provides an exhaustive literature review and a classification of combinatorial optimization approaches to register allocation and instruction scheduling, with a focus on the techniques that are most applied in this context: integer programming, constraint programming, partitioned Boolean quadratic programming, and enumeration. Researchers in compilers and combinatorial optimization can benefit from identifying developments, trends, and challenges in the area; compiler practitioners may discern opportunities and grasp the potential benefit of applying combinatorial optimization.

Place, publisher, year, edition, pages
ACM Press, 2018
National Category
Computer Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-232189 (URN)
Funder
Swedish Research Council, 621-2011-6229
Note

Not duplicate with DiVA 1348978.

Not duplicate with DiVA 758112.

QC 20180823

Available from: 2018-07-13 Created: 2018-07-13 Last updated: 2022-06-26Bibliographically approved
Hjort Blindell, G., Carlsson, M., Castañeda Lozano, R. & Schulte, C. (2017). Complete and Practical Universal Instruction Selection. Paper presented at International Conferences on Compilers, Architectures and Synthesis for Embedded Systems. ACM Transactions on Embedded Computing Systems
Open this publication in new window or tab >>Complete and Practical Universal Instruction Selection
2017 (English)In: ACM Transactions on Embedded Computing Systems, ISSN 1539-9087, E-ISSN 1558-3465Article in journal (Refereed) Published
Abstract [en]

In code generation, instruction selection chooses processor instructions to implement a program under compilation where code quality crucially depends on the choice of instructions. Using methods from combinatorial optimization, this paper proposes an expressive model that integrates global instruction selection with global code motion.The model introduces (1) handling of memory computations and function calls, (2) a method for inserting additional jump instructions where necessary, (3) a dependency-based technique to ensure correct combinations of instructions, (4) value reuse to improve code quality, and (5) an objective function that reduces compilation time and increases scalability by exploiting bounding techniques. The approach is demonstrated to be complete and practical, competitive with LLVM, and potentially optimal (w.r.t. the model) for medium-sized functions. The results show that combinatorial optimization for instruction selection is well-suited to exploit the potential of modern processors in embedded systems.

Place, publisher, year, edition, pages
ACM Press, 2017
Keywords
instruction selection, code generation, constraint programming, combinatorial optimization
National Category
Computer Systems Embedded Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-213968 (URN)10.1145/3126528 (DOI)000414353800002 ()2-s2.0-85030692980 (Scopus ID)
Conference
International Conferences on Compilers, Architectures and Synthesis for Embedded Systems
Funder
Swedish Research Council, 621-2011-6229
Note

QC 20170908

Available from: 2017-09-07 Created: 2017-09-07 Last updated: 2022-06-27Bibliographically approved
Castañeda Lozano, R., Carlsson, M., Hjort Blindell, G. & Schulte, C. (2016). Register allocation and instruction scheduling in Unison. In: Proceedings of CC 2016: The 25th International Conference on Compiler Construction. Paper presented at 25th International Conference on Compiler Construction, CC 2016, Barcelona, Spain, 17 March 2016 through 18 March 2016 (pp. 263-264). Association for Computing Machinery (ACM)
Open this publication in new window or tab >>Register allocation and instruction scheduling in Unison
2016 (English)In: Proceedings of CC 2016: The 25th International Conference on Compiler Construction, Association for Computing Machinery (ACM), 2016, p. 263-264Conference paper, Published paper (Refereed)
Abstract [en]

This paper describes Unison, a simple, flexible, and potentially optimal software tool that performs register allocation and instruction scheduling in integration using combinatorial optimization. The tool can be used as an alternative or as a complement to traditional approaches, which are fast but complex and suboptimal. Unison is most suitable whenever high-quality code is required and longer compilation times can be tolerated (such as in embedded systems or library releases), or the targeted processors are so irregular that traditional compilers fail to generate satisfactory code.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2016
Keywords
Combinatorial optimization, Instruction scheduling, Register allocation
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-183393 (URN)10.1145/2892208.2892237 (DOI)000389808800026 ()2-s2.0-84966560429 (Scopus ID)9781450342414 (ISBN)
Conference
25th International Conference on Compiler Construction, CC 2016, Barcelona, Spain, 17 March 2016 through 18 March 2016
Note

QC 20161122

Available from: 2016-03-09 Created: 2016-03-09 Last updated: 2024-03-15Bibliographically approved
Lozano, R. C., Carlsson, M., Hjort Blindell, G. & Schulte, C. (2014). Combinatorial Spill Code Optimization and Ultimate Coalescing. SIGPLAN notices, 49(5), 23-32
Open this publication in new window or tab >>Combinatorial Spill Code Optimization and Ultimate Coalescing
2014 (English)In: SIGPLAN notices, ISSN 0362-1340, E-ISSN 1558-1160, Vol. 49, no 5, p. 23-32Article in journal (Refereed) Published
Abstract [en]

This paper presents a novel combinatorial model that integrates global register allocation based on ultimate coalescing, spill code optimization, register packing, and multiple register banks with instruction scheduling (including VLIW). The model exploits alternative temporaries that hold the same value as a new concept for ultimate coalescing and spill code optimization. The paper presents Unison as a code generator based on the model and advanced solving techniques using constraint programming. Thorough experiments using MediaBench and a processor (Hexagon) that are typical for embedded systems demonstrate that Unison: is robust and scalable; generates faster code than LLVM (up to 4 1 % with a mean improvement of 7 %); possibly generates optimal code (for 2 9 % of the experiments); effortlessly supports different optimization criteria (code size on par with LLVM). Unison is significant as it addresses the same aspects as traditional code generation algorithms, yet is based on a simple integrated model and robustly can generate optimal code.

Keywords
spill code optimization, ultimate coalescing, combinatorial optimization, register allocation, instruction scheduling
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-154398 (URN)10.1145/2597809.2597815 (DOI)000341937800004 ()2-s2.0-84905660668 (Scopus ID)
Funder
Swedish Research Council, VR 621-2011-6229
Note

QC 20141021

Available from: 2014-10-21 Created: 2014-10-20 Last updated: 2022-06-23Bibliographically approved
Castañeda Lozano, R. (2014). Integrated Register Allocation and Instruction Scheduling with Constraint Programming. (Licentiate dissertation). Stockholm, Sweden: KTH Royal Institute of Technology
Open this publication in new window or tab >>Integrated Register Allocation and Instruction Scheduling with Constraint Programming
2014 (English)Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

This dissertation proposes a combinatorial model, program representations, and constraint solving techniques for integrated register allocation and instruction scheduling in compiler back-ends. In contrast to traditional compilers based on heuristics, the proposed approach generates potentially optimal code by considering all trade-offs between interdependent decisions as a single optimization problem.

The combinatorial model is the first to handle a wide array of global register allocation subtasks, including spill code optimization, ultimate coalescing, register packing, and register bank assignment, as well as instruction scheduling for Very Long Instruction Word (VLIW) processors. The model is based on three novel, complementary program representations: Linear Static Single Assignment for global register allocation; copy extension for spilling, basic coalescing, and register bank assignment; and alternative temporaries for spill code optimization and ultimate coalescing. Solving techniques are proposed that exploit the program representation properties for scalability.

The model, program representations, and solving techniques are implemented in Unison, a code generator that delivers potentially optimal code while scaling to medium-size functions. Thorough experiments show that Unison: generates faster code (up to 41% with a mean improvement of 7%) than LLVM (a state-of-the-art compiler) for Hexagon (a challenging VLIW processor), generates code that is competitive with LLVM for MIPS32 (a simpler RISC processor), is robust across different benchmarks such as MediaBench and SPECint 2006, scales up to medium-size functions of up to 1000 instructions, and adapts easily to different optimization criteria.

The contributions of this dissertation are significant. They lead to a combinatorial approach for integrated register allocation and instruction scheduling that is, for the first time, practical (it robustly scales to medium-size functions) and effective (it yields better code than traditional heuristic approaches).

Place, publisher, year, edition, pages
Stockholm, Sweden: KTH Royal Institute of Technology, 2014. p. 48
Series
TRITA-ICT-ECS AVH, ISSN 1653-6363 ; 14:13
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-154599 (URN)978-91-7595-311-3 (ISBN)
Presentation
2014-11-27, Sal A, Electrum, Kistagången 16, Kista, Stockholm, 14:00 (English)
Opponent
Supervisors
Note

QC 20141117

Available from: 2014-11-17 Created: 2014-10-24 Last updated: 2022-06-23Bibliographically approved
Organisations
Identifiers
ORCID iD: ORCID iD iconorcid.org/0000-0002-2806-7333

Search in DiVA

Show all publications