Modular Extraction of Control-Flow Graphs from Java Bytecode.
Independent thesis Advanced level (professional degree), 20 credits / 30 HE creditsStudent thesis
Student: Attilio Picoco Title: Modular Extraction of Control-Flow Graphs from Java Bytecode Abstract: In the current project we present a strategy to perform the extraction of Control-Flow Graphs (CFG) from Java Bytecode in a modular fashion. A modular approach allows to analyze incomplete systems, that is, software systems whose some of its components are not available. Our solution is based on the notion of interfaces. This technique requires the user to make some assumptions on the missing components. This information is necessary to resolve any of the inter-dependencies present in the system that involve the missing parts. We base our strategy on two previous results. First, we extend a formal definition of the Java Virtual Machine framework to represent incomplete Bytecode programs. This allows us to manage the presence of different configurations of a program whenever any component is either newly provided or modified or added. We define a refinement relation. We check if such a relation exists between two versions of the program, according to some formal rules. Next, we generalize a previous algorithm that performs the extraction of CFGs for closed programs only. We adapt it to cope with those methods not provided. The algorithm uses the information provided for the missing methods by the user to resolve any inter-dependence between an available method and a missing one. Inter-dependencies exist in terms of method invocations. In fact, there may be a call which can go either to missing method, or to a concrete one. Whenever there is a method invocation, we soundly over-approximate the possible receivers to the call, and the set of exceptions potentially propagated by them. We implemented the modular extraction framework in the ConFlEx tool. The tool checks if a given method is missing. If it is, the tool perform the over-approximations mentioned. Moreover, ConFlEx compares two versions of a program and checks if one refines the other, following the refinement rules we have defined. If it does, the tool performs the extraction of CFGs incrementally. Finally, the tool can also store and load partially the results obtained so that they can be reused in a later execution. Our technique over-approximate a lot to achieve soundness. Despite this, our test cases show that ConFlEx is efficient. Moreover, the extraction of the CFGs gets considerable speed-up by reusing results from previous analyses, whenever the task is performed after providing those components not available originally.
Place, publisher, year, edition, pages
Trita-CSC-E, ISSN 1653-5715 ; 2012:091
IdentifiersURN: urn:nbn:se:kth:diva-130977OAI: oai:DiVA.org:kth-130977DiVA: diva2:654423
Master of Science in Engineering - Computer Science and Technology
de Carvalho Gomes, Pedro