kth.se Publications
51 - 100 of 1224
  • 51.
    Araújo De Medeiros, Daniel
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST).
    Schieffer, Gabin
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST).
    Wahlgren, Jacob
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST).
    Peng, Ivy
    KTH.
    A GPU-Accelerated Molecular Docking Workflow with Kubernetes and Apache Airflow (2023). In: High Performance Computing: ISC High Performance 2023 International Workshops, Revised Selected Papers, Springer Nature, 2023, p. 193-206. Conference paper (Refereed)
    Abstract [en]

    Complex workflows play a critical role in accelerating scientific discovery. In many scientific domains, efficient workflow management can lead to faster scientific output and broader user groups. Workflows that can leverage resources across the boundary between cloud and HPC are a strong driver for the convergence of HPC and cloud. This study investigates the transition and deployment of a GPU-accelerated molecular docking workflow that was designed for HPC systems onto a cloud-native environment with Kubernetes and Apache Airflow. The case study focuses on state-of-the-art molecular docking software for drug discovery. We provide a DAG-based implementation in Apache Airflow and technical details for GPU-accelerated deployment. We evaluated the workflow using the SWEETLEAD bioinformatics dataset and executed it in a cloud environment with heterogeneous computing resources. Our workflow can effectively overlap different stages when mapped onto different computing resources.
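The DAG structure described in the abstract can be sketched without Airflow itself. A minimal sketch using Python's standard-library graphlib, with hypothetical stage names standing in for the paper's actual docking tasks:

```python
from graphlib import TopologicalSorter

# Hypothetical docking-pipeline stages and their upstream dependencies,
# mirroring how an Apache Airflow DAG orders tasks; the stage names are
# illustrative, not taken from the paper.
stages = {
    "prepare_receptor": set(),
    "prepare_ligands": set(),
    "dock_gpu": {"prepare_receptor", "prepare_ligands"},  # GPU-accelerated step
    "score": {"dock_gpu"},
    "aggregate_results": {"score"},
}

# A scheduler may run independent stages (the two prepare_* tasks) on
# different resources concurrently, which is how stages overlap when
# mapped onto heterogeneous resources; static_order() gives one valid
# sequential execution order.
order = list(TopologicalSorter(stages).static_order())
```

In Airflow the same shape would be expressed with task operators and `>>` dependencies; the topological constraints are identical.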

  • 52.
    Ardah, Khaled
    et al.
    Univ Fed Ceara, Wireless Telecom Res Grp, BR-60020181 Fortaleza, Ceara, Brazil.
    Fodor, Gabor
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Decision and Control Systems (Automatic Control). Ericsson Res, SE-16480 Stockholm, Sweden.
    Silva, Yuri C. B.
    Univ Fed Ceara, Wireless Telecom Res Grp, BR-60020181 Fortaleza, Ceara, Brazil.
    Freitas, Walter C., Jr.
    Univ Fed Ceara, Wireless Telecom Res Grp, BR-60020181 Fortaleza, Ceara, Brazil.
    Cavalcanti, Francisco R. P.
    Univ Fed Ceara, Wireless Telecom Res Grp, BR-60020181 Fortaleza, Ceara, Brazil.
    A Unifying Design of Hybrid Beamforming Architectures Employing Phase Shifters or Switches (2018). In: IEEE Transactions on Vehicular Technology, ISSN 0018-9545, E-ISSN 1939-9359, Vol. 67, no 11, p. 11243-11247. Article in journal (Refereed)
    Abstract [en]

    Hybrid beamforming (BF) architectures employing phase shifters or switches reduce the number of required radio frequency chains and the power consumption of base stations that employ a large number of antennas. Due to the inherent tradeoff between the number of radio frequency chains, the complexity of the employed analog and digital BF algorithms, and the achieved spectral and energy efficiency, designing hybrid BF architectures is a complex task. To deal with this complexity, we propose a unifying design that is applicable to architectures employing either phase shifters or switches. In our design, the analog part of the hybrid BF architecture maximizes the capacity of the equivalent channel, while the digital part is updated using the well-known block diagonalization approach. We then employ the proposed joint analog-digital beamforming algorithm on four recently proposed hybrid architectures and compare their performance in terms of spectral and energy efficiency, and find that the proposed analog-digital BF algorithm outperforms previously proposed schemes. We also find that phase shifter-based architectures achieve high spectral efficiency, whereas switch-based architectures can boost energy efficiency with an increasing number of base station antennas.
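The unit-modulus constraint that phase shifters impose can be illustrated with a toy single-RF-chain example (this is not the paper's joint analog-digital algorithm; antenna count and channel model are assumptions for illustration): each analog weight may only rotate the signal, and aligning those rotations with the channel phases maximizes the array gain.

```python
import cmath
import math
import random

random.seed(0)
N = 64  # number of base-station antennas (illustrative)

# Random narrowband channel vector h with i.i.d. complex Gaussian entries.
h = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]

# Phase-shifter analog beamformer: unit-modulus entries that undo the
# channel phase. A switch-based architecture would instead restrict
# entries to {0, 1}, trading array gain for simpler, cheaper hardware.
w = [cmath.exp(1j * cmath.phase(x)) / math.sqrt(N) for x in h]

# With perfect phase alignment every term adds coherently, so the
# array gain |w^H h| equals sum(|h_i|) / sqrt(N).
gain = abs(sum(wi.conjugate() * hi for wi, hi in zip(w, h)))
coherent = sum(abs(x) for x in h) / math.sqrt(N)
```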

  • 53.
    Ardelius, John
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    On the Performance Analysis of Large Scale, Dynamic, Distributed and Parallel Systems (2013). Doctoral thesis, monograph (Other academic)
    Abstract [en]

    Evaluating the performance of large distributed applications is an important and non-trivial task. With the onset of Internet-wide applications there is an increasing need to quantify the reliability, dependability and performance of these systems, both as a guide in system design and as a means to understand the fundamental properties of large-scale distributed systems. Previous research has mainly focused either on formalised models, where system properties can be deduced and verified using rigorous mathematics, or on measurements and experiments on deployed applications. Our aim in this thesis is to study models on an abstraction level lying between the two ends of this spectrum. We adopt a model of distributed systems inspired by methods used in the study of large-scale systems of particles in physics, and model the application nodes as a set of interacting particles, each with an internal state, whose actions are specified by the application program. We apply our modeling and performance evaluation methodology to four different distributed and parallel systems.

    The first system is the distributed hash table (DHT) Chord running in a dynamic environment. We study the system under two scenarios. First, we study how performance (in terms of lookup latency) is affected on a network with finite communication latency. We show that an average delay in conjunction with other parameters describing changes in the network (such as timescales for network repair and join and leave processes) induces fundamentally different system performance. We also verify our analytical predictions via simulations. In the second scenario we introduce network address translators (NATs) to the network model. This makes the overlay topology non-transitive, and we explore the implications of this fact for various performance metrics such as lookup latency, consistency and load balance. The latter analysis is mainly simulation based. Even though these two studies focus on a specific DHT, many of our results can easily be translated to other similar ring-based DHTs with long-range links, and the same methodology can be applied even to DHTs based on other geometries.

    The second type of system studied is an unstructured gossip protocol running a distributed version of the famous Bellman-Ford algorithm. The algorithm, called GAP, generates a spanning tree over the participating nodes, and the question we set out to study is how reliable this structure is (in terms of generating accurate aggregate values at the root) in the presence of node churn. All our analytical results are also verified using simulations.

    The third system studied is a content distribution network (CDN) of interconnected caches in an aggregation access network. In this model, content, which sits at the leaves of the cache hierarchy tree, is requested by end users. Requests can then either be served by the first cache level or sent further up the tree. We study the performance of the whole system under two cache eviction policies, namely LRU and LFU. We compare our analytical results with traces from related caching systems.

    The last system is a work stealing heuristic for task distribution on the TileraPro64 chip. This system has access to a shared memory and is therefore classified as a parallel system. We create a model for the dynamic generation of tasks as well as how they are executed and distributed among the participating nodes. We study how the heuristic scales when the number of nodes exceeds the number of processors on the chip, as well as how different work stealing policies compare with each other. The work on this model is mainly simulation-based.
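The ring geometry underlying Chord-style DHTs can be sketched in a few lines. This is a simplification of Chord (which additionally maintains O(log n) finger tables so a lookup reaches the successor in few hops); identifier size and node names are illustrative.

```python
import hashlib

M = 16  # identifier bits for the toy ring; real Chord uses 160 (SHA-1)

def ring_id(key: str) -> int:
    # Hash a node name or data key onto the identifier ring, as Chord
    # does with consistent hashing.
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % (2 ** M)

def successor(node_ids, ident):
    # The node responsible for `ident` is the first node clockwise from
    # it; the ring wraps around past the highest identifier.
    node_ids = sorted(node_ids)
    return next((n for n in node_ids if n >= ident), node_ids[0])

# A small toy ring: 16 nodes, then resolve which node owns a key.
nodes = sorted({ring_id(f"node-{i}") for i in range(16)})
owner = successor(nodes, ring_id("some-key"))
```

Churn (joins, leaves, failures) perturbs exactly this successor relation, which is why lookup latency and consistency degrade in the dynamic scenarios the thesis analyzes.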

    Download full text (pdf)
    On the Performance Analysis of Large Scale, Dynamic, Distributed and Parallel Systems.pdf
  • 54.
    Arman, Ala
    et al.
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Al-Shishtawy, Ahmad
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Vlassov, Vladimir
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Elasticity controller for Cloud-based key-value stores (2012). In: Parallel and Distributed Systems (ICPADS), 2012 IEEE 18th International Conference on, IEEE, 2012, p. 268-275. Conference paper (Refereed)
    Abstract [en]

    Clouds provide an illusion of an infinite amount of resources and enable elastic services and applications that are capable of scaling up and down (growing and shrinking by requesting and releasing resources) in response to changes in their environment, workload, and Quality of Service (QoS) requirements. Elasticity allows the required QoS to be achieved at minimal cost in a Cloud environment with its pay-as-you-go pricing model. In this paper, we present our experience in designing a feedback elasticity controller for a key-value store. The goal of our research is to investigate the feasibility of the control-theoretic approach to the automation of elasticity of Cloud-based key-value stores. We describe the design steps necessary to build a feedback controller for a real system, namely Voldemort, which we use as a case study in this work. The design steps include defining touchpoints (sensors and actuators), system identification, and controller design. We have designed, developed, and implemented a prototype of the feedback elasticity controller for Voldemort. Our initial evaluation results show the feasibility of using feedback control to automate the elasticity of distributed key-value stores.
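This is not the controller built in the paper, but a minimal sketch of the feedback idea it describes: measure a service-level signal, compare it to a target, and actuate by adding or removing store nodes. The gain, bounds, and the toy latency model are assumptions for illustration.

```python
def control_step(replicas, measured_latency_ms, target_latency_ms,
                 gain=0.1, lo=1, hi=64):
    # Proportional feedback: the actuator (sized in whole nodes) grows
    # when measured latency exceeds the SLO and shrinks when there is
    # headroom. Gain and saturation bounds are illustrative.
    error = measured_latency_ms - target_latency_ms
    return max(lo, min(hi, replicas + round(gain * error)))

# Closed loop against a toy workload model in which latency is
# inversely proportional to the replica count (an assumed "plant").
replicas, target = 4, 50.0
for _ in range(20):
    latency = 600.0 / replicas  # sensor reading from the toy plant
    replicas = control_step(replicas, latency, target)
```

In a real deployment the plant model comes from system identification against the store under a measured workload, which is precisely the design step the paper walks through for Voldemort.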

  • 55. Armengaud, Eric
    et al.
    Biehl, Matthias
    KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Mechatronics.
    Bourrouilh, Quentin
    Breunig, Michael
    Farfeleder, Stefan
    Hein, Christian
    Oertel, Markus
    Wallner, Alfred
    Zoier, Markus
    Integrated tool chain for improving traceability during the development of automotive systems (2012). In: ERTS2 2012 | Embedded Real Time Software and Systems, 2012. Conference paper (Refereed)
    Abstract [en]

    Tool integration is a key factor for improving development efficiency and product quality during the development of safety-relevant embedded systems. We present in this work a demonstrator based on the most recent outcomes of the CESAR project. The proposed integrated tool-chain aims at better linking development activities together, thus improving traceability during requirements engineering, system design, safety analysis and V&V activities using a model-based development approach. We analyze the proposed tool-chain from three different points of view: (1) tool integrator, (2) technology provider, and (3) end-user. These different points of view enable the description of the different technologies used at the different levels and the analysis of the benefits for the end-user.

  • 56.
    Arsalan, Muhammad
    et al.
    Tech Univ Carolo Wilhelmina Braunschweig, Braunschweig, Germany.
    Di Matteo, Davide
    KTH.
    Imtiaz, Sana
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. KRY Int AB, Stockholm, Sweden.
    Abbas, Zainab
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. KRY Int AB, Stockholm, Sweden.
    Vlassov, Vladimir
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    Issakov, Vadim
    Tech Univ Carolo Wilhelmina Braunschweig, Braunschweig, Germany.
    Energy-Efficient Privacy-Preserving Time-Series Forecasting on User Health Data Streams (2022). In: Proceedings - 2022 IEEE 21st International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2022, Institute of Electrical and Electronics Engineers (IEEE), 2022, p. 541-546. Conference paper (Refereed)
    Abstract [en]

    Health monitoring devices are gaining popularity both as wellness tools and as a source of information for healthcare decisions. In this work, we use Spiking Neural Networks (SNNs) for time-series forecasting due to their proven energy-saving capabilities. Thanks to their design that closely mimics the natural nervous system, SNNs are energy-efficient in contrast to classic Artificial Neural Networks (ANNs). We design and implement an energy-efficient privacy-preserving forecasting system on real-world health data streams using SNNs and compare it to a state-of-the-art system with a Long short-term memory (LSTM) based prediction model. Our evaluation shows that SNNs trade off accuracy (2.2x greater error) for a smaller model (19% fewer parameters and 77% less memory consumption) and 43% less training time. Our model is estimated to consume 3.36 µJ of energy, which is significantly less than traditional ANNs. Finally, we apply ε-differential privacy for enhanced privacy guarantees on our federated learning-based models. With differential privacy of ε = 0.1, our experiments report an increase in the measured average error (RMSE) of only 25%.
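The paper applies ε-differential privacy inside federated training; as a stand-alone illustration of the underlying guarantee, here is the classic Laplace mechanism on a bounded query (the heart-rate stream and clipping bounds are hypothetical):

```python
import math
import random

def laplace_noise(scale, rng):
    # Sample Laplace(0, scale) by inverting its CDF.
    u = rng.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1.0 - 2.0 * abs(u))

def dp_mean(values, lo, hi, epsilon, rng):
    # Laplace mechanism: the mean of n values clipped to [lo, hi] has
    # sensitivity (hi - lo) / n, so adding noise with scale
    # sensitivity / epsilon makes this single query
    # epsilon-differentially private.
    clipped = [min(max(v, lo), hi) for v in values]
    true_mean = sum(clipped) / len(clipped)
    scale = (hi - lo) / (len(clipped) * epsilon)
    return true_mean + laplace_noise(scale, rng)

rng = random.Random(42)
heart_rates = [70.0 + 10.0 * math.sin(i) for i in range(1000)]  # toy stream
noisy = dp_mean(heart_rates, lo=40.0, hi=200.0, epsilon=1.0, rng=rng)
```

Smaller ε means stronger privacy but larger noise, which is the accuracy/privacy tradeoff the paper's ε = 0.1 result quantifies.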

  • 57.
    Arslan, Bercis
    KTH, School of Industrial Engineering and Management (ITM), Industrial Economics and Management (Dept.).
    Ecological Sustainability in Software Development: The Case of a Technical Consultancy Firm (2021). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Sustainability in the software and Information Technology (IT) industry has previously been discussed by practitioners mostly with a focus on maintainability and extensibility. In turn, the ecological and environmental dimensions of sustainability have been neglected. Previous research has shown that there are obstacles in the industry in terms of knowledge, experience, and support. The lack of knowledge stems from a lack of tools to detect and determine factors that affect environmental sustainability in software development, such as energy consumption. Furthermore, examining employees’ motivations, attitudes, and discretionary behaviours is important to understand how implementation can be enabled and sustained. The purpose of this study is to find practices and tools for achieving environmental sustainability in software development, as well as to understand what factors are hindering software engineers from adopting sustainable practices and tools that already exist. A qualitative single case study was conducted with semi-structured interviews as the primary method for data collection. The interviews were performed with individuals with various roles within software engineering as well as their managers. The findings show that the focus on environmental sustainability in software development is at present insufficient. Practices such as reducing CPU cycles and deactivating idle programs are suggested as environmentally friendly. Additionally, the findings reveal hindrances in areas such as responsibility, requirements, and knowledge. Organizations and their stakeholders have to prioritize and work to overcome these hindrances in order to succeed with environmental efforts.

    Download full text (pdf)
    fulltext
  • 58.
    Artho, Cyrille
    et al.
    KTH.
    Ölveczky, P.C.
    Preface (2017). In: 5th International Workshop on Formal Techniques for Safety-Critical Systems, FTSCS 2016, Springer Verlag, 2017. Conference paper (Refereed)
  • 59. Asad, H. A.
    et al.
    Wouters, Erik Henricus
    KTH.
    Bhatti, N. A.
    Mottola, L.
    Voigt, T.
    On Securing Persistent State in Intermittent Computing (2020). In: ENSsys 2020 - Proceedings of the 8th International Workshop on Energy Harvesting and Energy-Neutral Sensing Systems, Association for Computing Machinery, Inc, 2020, p. 8-14. Conference paper (Refereed)
    Abstract [en]

    We present the experimental evaluation of different security mechanisms applied to persistent state in intermittent computing. Whenever executions become intermittent because of energy scarcity, systems employ persistent state on non-volatile memories (NVMs) to ensure forward progress of applications. Persistent state spans the operating system and network stack, as well as applications. While a device is off recharging energy buffers, persistent state on NVMs may be subject to security threats such as stealing sensitive information or tampering with configuration data, which may ultimately corrupt the device state and render the system unusable. Based on modern platforms of the Cortex-M series, we experimentally investigate the impact on typical intermittent computing workloads of different means to protect persistent state, including software and hardware implementations of staple encryption algorithms and the use of ARM TrustZone protection mechanisms. Our results indicate that i) software implementations bear a significant overhead in energy and time, sometimes harming forward progress, but also retaining the advantage of modularity and easier updates; ii) hardware implementations offer much lower overhead compared to their software counterparts, but require a deeper understanding of their internals to gauge their applicability in given application scenarios; and iii) TrustZone shows almost negligible overhead, yet it requires a different memory management and is only effective as long as attackers cannot directly access the NVMs.
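One of the software-side concerns evaluated above, integrity of NVM-resident state, can be sketched with an HMAC. This is a simplification (the paper also measures full encryption and TrustZone), and the hard-coded key is a placeholder for one kept in secure storage:

```python
import hashlib
import hmac
import json

KEY = b"device-secret"  # placeholder; a real device provisions this securely
TAG_LEN = 32            # SHA-256 digest size in bytes

def seal(state: dict) -> bytes:
    # Serialize the checkpointed state and append an authentication tag
    # before writing it to non-volatile memory.
    blob = json.dumps(state, sort_keys=True).encode()
    tag = hmac.new(KEY, blob, hashlib.sha256).digest()
    return blob + tag

def unseal(record: bytes) -> dict:
    # On wake-up, verify the tag before trusting the restored state; any
    # tampering while the device was off recharging is detected.
    blob, tag = record[:-TAG_LEN], record[-TAG_LEN:]
    expect = hmac.new(KEY, blob, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expect):
        raise ValueError("persistent state failed integrity check")
    return json.loads(blob)
```

Note that this detects tampering but does not hide sensitive data; confidentiality requires encryption on top, which is where the measured energy and time overheads come in.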

  • 60. Ashjaei, Mohammad
    et al.
    Moghaddami Khalilzad, Nima
    KTH.
    Mubeen, Saad
    Behnam, Moris
    Sander, Ingo
    Almeida, Luis
    Nolte, Thomas
    Designing end-to-end resource reservations in predictable distributed embedded systems (2017). In: Real-time systems, ISSN 0922-6443, E-ISSN 1573-1383, Vol. 53, no 6, p. 916-956. Article in journal (Refereed)
  • 61.
    Aslam Butt, Haseeb
    KTH, School of Electrical Engineering and Computer Science (EECS).
    Investigation into tools to increase Observability of 2oo2 OS based Generic Product (2018). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    The 2 out of 2 (2oo2) OS based generic product is a generic platform used by Bombardier Transportation to develop safety-critical, SIL 3 and SIL 4 level specialized railway products. The 2oo2 architecture is based on the composite fail-safety design technique. During the development and integration of a specialized product, debugging and optimization efforts are critical to bringing the new product to market on time. In the presence of tools that can increase the observability of the system, the process of debugging and optimization can be made more efficient. This thesis examines the availability of tools to enhance the observability of the 2 out of 2 OS based generic product. Tracing and profiling were identified as the techniques that would best fit our context for observability enhancement. Tools based on the identified techniques were investigated in depth to assess the possibility of building, customizing and porting them to the architecture of our 2oo2 system. Development efforts were made to successfully build the complete chain of tools for use in system lab settings. The complete observability infrastructure architecture was designed to extract the tracing data from the target machine to the analysis tools. Procedures were defined for extracting the tracing data and using it to debug and optimize the system effectively. Moreover, we investigate the impact of operating system upgrades on increasing the observability of the 2oo2 system.

  • 62.
    Asplund, Fredrik
    KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Mechatronics. KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Embedded Control Systems.
    Exploratory Testing: Do Contextual Factors Influence Software Fault Identification? (2018). In: Information and Software Technology, ISSN 0950-5849, E-ISSN 1873-6025. Article in journal (Refereed)
    Abstract [en]

    Context: Exploratory Testing (ET) is a manual approach to software testing in which learning, test design and test execution occur simultaneously. ET is a developing topic of interest to academia but is as yet insufficiently investigated; most studies focus on the skills and experience of the individual tester. However, contextual factors such as project processes, test scope and organisational boundaries are also likely to affect the approach.

    Objective: This study explores contextual differences between teams of testers at a MedTech firm developing safety-critical products, to ascertain whether contextual factors can influence the outcomes of ET and what associated implications can be drawn for test management.

    Method: A development project was studied in two iterations, each consisting of a quantitative phase testing hypotheses concerning when ET would identify faults in comparison to other testing approaches and a qualitative phase involving interviews.

    Results: Influence on ET is traced to how the scope of tests focuses learning on different types of knowledge and implies an asymmetry in the strength and number of information flows to test teams.

    Conclusions: While test specialisation can be attractive to software development organisations, the results suggest that changes to processes and organisational structures might be required to maintain test efficiency throughout projects: the responsibility for test cases might need to be rotated late in projects, and asymmetries in information flows might require management to actively strengthen the presence and connections of test teams throughout the firm. However, further research is needed to investigate whether these results also hold for non-safety-critical faults.

    Download full text (pdf)
    fulltext
  • 63.
    Asplund, Fredrik
    KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Mechatronics.
    Risks Related to the Use of Software Tools when Developing Cyber-Physical Systems: A Critical Perspective on the Future of Developing Complex, Safety-Critical Systems (2014). Doctoral thesis, monograph (Other academic)
    Abstract [en]

    The increasing complexity and size of modern Cyber-Physical Systems (CPS) has led to a sharp decline in productivity among CPS designers. Requirements on safety aggravate this problem further, both by being difficult to ensure and due to their high importance to the public.

    Tools, or rather efforts to facilitate the automation of development processes, are a central ingredient in many of the proposed innovations to mitigate this problem. Even though the safety-related implications of introducing automation in development processes have not been extensively studied, it is known that automation has already had a large impact on operational systems. If tools are to play a part in mitigating the increase in safety-critical CPS complexity, then their actual impact on CPS development, and thereby the safety of the corresponding end products, must be sufficiently understood.

    A survey of relevant research fields, such as system safety, software engineering and tool integration, is provided to facilitate the discussion on safety-related implications of tool usage. Based on the identification of industrial safety standards as an important source of information, and considering that the risks posed by separate tools have been given considerable attention in the transportation domain, several high-profile safety standards in this domain have been surveyed. According to the surveyed standards, automation should primarily be evaluated on its reliable execution of separate process steps independent of human operators. Automation that only supports the actions of operators during CPS development is viewed as relatively inconsequential.

    A conceptual model and a reference model have been created based on the surveyed research fields. The former defines the entities and relationships most relevant to safety-related risks associated with tool usage. The latter describes aspects of tool integration and how these relate to each other. By combining these models, a risk analysis could be performed and properties of tool chains which need to be ensured to mitigate risk identified. Ten such safety-related characteristics of tool chains are described.

    These safety-related characteristics provide a systematic way to narrow down what to look for with regard to tool usage and risk. The hypothesis that a large set of factors related to tool usage may introduce risk could thus be tested through an empirical study, which identified safety-related weaknesses in support environments tied both to high and low levels of automation. The conclusion is that a broader perspective, which includes more factors related to tool usage than those considered by the surveyed standards, will be needed.

    Three possible reasons to disregard such a broad perspective have been refuted, namely requirements on development processes enforced by the domain of CPS itself, certain characteristics of safety-critical CPS and the possibility to place trust in a proven, manual development process. After finding no strong reason to keep a narrow perspective on tool usage, arguments are put forward as to why the future evolution of support environments may actually increase the importance of such a broad perspective.

    Suggestions for how to update the mental models of the surveyed safety standards, and other standards like them, are put forward based on this identified need for a broader perspective.

    Download full text (pdf)
    Thesis
  • 64.
    Asplund, Fredrik
    et al.
    KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Mechatronics.
    Biehl, Matthias
    KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Mechatronics.
    El-Khoury, Jad
    KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Mechatronics.
    Törngren, Martin
    KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Mechatronics.
    Tool Integration Beyond Wasserman (2011). In: Advanced Information Systems Engineering Workshops / [ed] Camille Salinesi, Oscar Pastor, Berlin: Springer-Verlag, 2011, p. 270-281. Conference paper (Refereed)
    Abstract [en]

    The typical development environment today consists of many specialized development tools, which are only partially integrated, forming a complex tool landscape. Traditional approaches for reasoning about tool integration are insufficient to measure the degree of integration and integration optimality in today’s complex tool landscape. This paper presents a reference model that introduces dependencies between, and metrics for, integration aspects to overcome this problem. This model is used to conceive a method for reasoning about tool integration and to identify improvements in an industrial case study. Based on this we are able to conclude that our reference model does not detract value from the principles that it is based on; instead, it highlights improvements that were not clearly visible earlier. We conclude the paper by discussing open issues for our reference model, namely whether it is suitable to use during the creation of new systems, whether the used integration aspects can be subdivided further to support the analysis of secondary issues related to integration, difficulties related to the state dependency between the data and process aspects within the context of developing embedded systems, and the analysis of non-functional requirements to support tool integration.

    Download full text (pdf)
    fulltext
  • 65.
    Attarzadeh-Niaki, Seyed Hosein
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronics and Embedded Systems.
    Sander, Ingo
    KTH, School of Information and Communication Technology (ICT), Electronics and Embedded Systems.
    Integrating Functional Mock-up units into a formal heterogeneous system modeling framework (2015). In: 18th CSI International Symposium on Computer Architecture and Digital Systems, CADS 2015, Institute of Electrical and Electronics Engineers (IEEE), 2015. Conference paper (Refereed)
    Abstract [en]

    The Functional Mock-up Interface (FMI) standard defines a method for tool- and platform-independent model exchange and co-simulation of dynamic system models. In FMI, the master algorithm, which executes the imported components, is a timed differential equation solver. This is a limitation for heterogeneous embedded and cyber-physical systems, where models with different time abstractions co-exist and interact. This work integrates FMI into a heterogeneous system modeling and simulation framework as process constructors and co-simulation wrappers. Consequently, each external model communicates with the framework without unnecessary semantic adaptation while the framework provides necessary mechanisms for handling heterogeneity. The presented methods are implemented in the ForSyDe-SystemC modeling framework and tested using a case study.
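The co-simulation idea can be sketched with a toy fixed-step master in the spirit of FMI; the `get`/`set`/`do_step` methods stand in for the actual FMI C API (`fmi2GetReal`, `fmi2SetReal`, `fmi2DoStep`), and the two components are illustrative, not FMUs from the paper.

```python
class Source:
    # A component whose output is simply the current time, y(t) = t.
    def __init__(self):
        self.t = 0.0
    def get(self):
        return self.t
    def do_step(self, h):
        self.t += h

class Integrator:
    # A component integrating its input with explicit Euler.
    def __init__(self):
        self.state = 0.0
        self.u = 0.0
    def set(self, u):
        self.u = u
    def do_step(self, h):
        self.state += self.u * h

# Fixed-step master algorithm: exchange outputs, then advance both
# components by the same communication interval h. Handling models
# with different time abstractions is exactly what the framework's
# co-simulation wrappers add on top of such a timed solver.
src, integ = Source(), Integrator()
h, steps = 0.01, 100
for _ in range(steps):
    integ.set(src.get())
    src.do_step(h)
    integ.do_step(h)
# integ.state now approximates the integral of t dt over [0, 1] = 0.5
# via a left Riemann sum, hence slightly below 0.5.
```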

  • 66.
    Awan, Ahsan Javed
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Performance Characterization and Optimization of In-Memory Data Analytics on a Scale-up Server (2017). Doctoral thesis, monograph (Other academic)
    Abstract [en]

    The sheer increase in the volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark defines the state of the art in big data analytics platforms for (i) exploiting data-flow and in-memory computing and (ii) exhibiting superior scale-out performance on commodity machines, little effort has been devoted to understanding the performance of in-memory data analytics with Spark on modern scale-up servers. This thesis characterizes the performance of in-memory data analytics with Spark on scale-up servers.

    Through empirical evaluation of representative benchmark workloads on a dual socket server, we have found that in-memory data analytics with Spark exhibit poor multi-core scalability beyond 12 cores due to thread-level load imbalance and work-time inflation (the additional CPU time spent by threads in a multi-threaded computation beyond the CPU time required to perform the same work in a sequential computation). We have also found that workloads are bound by the latency of frequent data accesses to memory. By enlarging input data size, application performance degrades significantly due to a substantial increase in wait time during I/O operations and garbage collection, despite a 10% better instruction retirement rate (due to lower L1 cache misses and higher core utilization). For data accesses, we have found that simultaneous multi-threading is effective in hiding the data latencies. We have also observed that (i) data locality on NUMA nodes can improve the performance by 10% on average, and (ii) disabling next-line L1-D prefetchers can reduce the execution time by up to 14%.

    For garbage collection impact, we match memory behavior with the garbage collector to improve the performance of applications by 1.6x to 3x, and recommend using multiple small Spark executors that can provide up to a 36% reduction in execution time over a single large executor. Based on the characteristics of workloads, the thesis envisions near-memory and near-storage hardware acceleration to improve the single-node performance of scale-out frameworks like Apache Spark. Using modeling techniques, it estimates a speed-up of 4x for Apache Spark on scale-up servers augmented with near-data accelerators.

    Download full text (pdf)
    PhD_thesis_AJA
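    The executor-sizing recommendation in the abstract above (several small executors instead of one large one on a scale-up node) can be illustrated with a minimal configuration sketch. The node dimensions and executor shapes below are hypothetical illustration values, not the thesis's measured settings.

    ```python
    # Sketch: tiling one scale-up node with several small Spark executors
    # instead of a single large one. Node size and executor shapes are
    # hypothetical; the property names follow Spark's standard configuration.

    NODE_CORES = 24
    NODE_MEMORY_GB = 192

    def executor_plan(executor_cores, executor_memory_gb):
        """Return spark-submit style settings that tile the node with executors."""
        num_executors = min(NODE_CORES // executor_cores,
                            NODE_MEMORY_GB // executor_memory_gb)
        return {
            "spark.executor.instances": num_executors,
            "spark.executor.cores": executor_cores,
            "spark.executor.memory": f"{executor_memory_gb}g",
        }

    # One large executor: a single JVM heap, long GC pauses, poor NUMA locality.
    large = executor_plan(24, 192)
    # Multiple small executors: smaller heaps, each pinnable to a NUMA node.
    small = executor_plan(6, 48)

    print(large["spark.executor.instances"])  # 1
    print(small["spark.executor.instances"])  # 4
    ```

    Smaller heaps shorten garbage-collection pauses and let each executor stay local to one NUMA node, which is the mechanism behind the reported execution-time reduction.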
  • 67.
    Awan, Ahsan Javed
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Performance Characterization of In-Memory Data Analytics on a Scale-up Server (2016). Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    The sheer increase in the volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark defines the state of the art in big data analytics platforms for (i) exploiting data-flow and in-memory computing and (ii) exhibiting superior scale-out performance on commodity machines, little effort has been devoted to understanding the performance of in-memory data analytics with Spark on modern scale-up servers. This thesis characterizes the performance of in-memory data analytics with Spark on scale-up servers.

    Through empirical evaluation of representative benchmark workloads on a dual-socket server, we have found that in-memory data analytics with Spark exhibit poor multi-core scalability beyond 12 cores due to thread-level load imbalance and work-time inflation. We have also found that workloads are bound by the latency of frequent data accesses to DRAM. By enlarging input data size, application performance degrades significantly due to a substantial increase in wait time during I/O operations and garbage collection, despite a 10% better instruction retirement rate (due to lower L1 cache misses and higher core utilization).

    For data accesses, we have found that simultaneous multi-threading is effective in hiding the data latencies. We have also observed that (i) data locality on NUMA nodes can improve the performance by 10% on average, and (ii) disabling next-line L1-D prefetchers can reduce the execution time by up to 14%. For GC impact, we match memory behaviour with the garbage collector to improve the performance of applications by 1.6x to 3x, and recommend using multiple small executors, which can provide up to 36% speedup over a single large executor.

    Download full text (pdf)
    Licentiate_Thesis_AJA
  • 68.
    Awan, Ahsan Javed
    et al.
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Brorsson, Mats
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Vlassov, Vladimir
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Ayguade, Eduard
    Barcelona Super Computing Center and Technical University of Catalunya.
    Architectural Impact on Performance of In-memory Data Analytics: Apache Spark Case Study. Manuscript (preprint) (Other academic)
    Abstract [en]

    While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both batch and stream data processing. However, recent studies on micro-architectural characterization of in-memory data analytics are limited to only batch processing workloads. We compare the micro-architectural performance of batch processing and stream processing workloads in Apache Spark using hardware performance counters on a dual-socket server. In our evaluation experiments, we have found that batch processing and stream processing workloads have similar micro-architectural characteristics and are bounded by the latency of frequent data accesses to DRAM. For data accesses, we have found that simultaneous multi-threading is effective in hiding the data latencies. We have also observed that (i) data locality on NUMA nodes can improve the performance by 10% on average, (ii) disabling next-line L1-D prefetchers can reduce the execution time by up to 14%, and (iii) multiple small executors can provide up to 36% speedup over a single large executor.

    Download full text (pdf)
    paper
  • 69.
    Awan, Ahsan Javed
    et al.
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Brorsson, Mats
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Vlassov, Vladimir
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Ayguade, Eduard
    Technical University of Catalunya, Barcelona Super Computing Center.
    How Data Volume Affects Spark Based Data Analytics on a Scale-up Server (2015). In: Big Data Benchmarks, Performance Optimization, and Emerging Hardware: 6th Workshop, BPOE 2015, Kohala, HI, USA, August 31 - September 4, 2015. Revised Selected Papers, Springer, 2015, Vol. 9495, p. 81-92. Conference paper (Refereed)
    Abstract [en]

    The sheer increase in the volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark is gaining popularity for exhibiting superior scale-out performance on commodity machines, the impact of data volume on the performance of Spark-based data analytics in a scale-up configuration is not well understood. We present a deep-dive analysis of Spark-based applications on a large scale-up server machine. Our analysis reveals that Spark-based data analytics are DRAM-bound and do not benefit from using more than 12 cores for an executor. By enlarging input data size, application performance degrades significantly due to a substantial increase in wait time during I/O operations and garbage collection, despite a 10% better instruction retirement rate (due to lower L1 cache misses and higher core utilization). We match memory behaviour with the garbage collector to improve the performance of applications by 1.6x to 3x.

    Download full text (pdf)
    fulltext
  • 70.
    Awan, Ahsan Javed
    et al.
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Brorsson, Mats
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Vlassov, Vladimir
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Ayguade, Eduard
    Barcelona Super Computing Center and Technical University of Catalunya.
    Micro-architectural Characterization of Apache Spark on Batch and Stream Processing Workloads (2016). Conference paper (Refereed)
    Abstract [en]

    While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both batch and stream data processing. However, recent studies on micro-architectural characterization of in-memory data analytics are limited to only batch processing workloads. We compare the micro-architectural performance of batch processing and stream processing workloads in Apache Spark using hardware performance counters on a dual-socket server. In our evaluation experiments, we have found that batch processing and stream processing have the same micro-architectural behavior in Spark if the difference between the two implementations is micro-batching only. If the input data rates are small, stream processing workloads are front-end bound. However, the front-end bound stalls are reduced at larger input data rates and instruction retirement is improved. Moreover, Spark workloads using DataFrames have improved instruction retirement over workloads using RDDs.

  • 71.
    Awan, Ahsan Javed
    et al.
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Brorsson, Mats
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Vlassov, Vladimir
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Ayguade, Eduard
    Barcelona Super Computing Center and Technical University of Catalunya.
    Node architecture implications for in-memory data analytics on scale-in clusters (2016). Conference paper (Refereed)
    Abstract [en]

    While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics. Recent studies propose scale-in clusters with in-storage processing devices to process big data analytics with Spark. However, the proposal is based solely on the memory bandwidth characterization of in-memory data analytics and does not shed light on the specification of the host CPU and memory. Through empirical evaluation of in-memory data analytics with Apache Spark on an Ivy Bridge dual-socket server, we have found that (i) simultaneous multi-threading is effective up to 6 cores, (ii) data locality on NUMA nodes can improve the performance by 10% on average, (iii) disabling next-line L1-D prefetchers can reduce the execution time by up to 14%, (iv) DDR3 operating at 1333 MT/s is sufficient, and (v) multiple small executors can provide up to 36% speedup over a single large executor.

  • 72.
    Aybek, Mehmet Onur
    et al.
    Arcticus Syst AB, Järfälla, Sweden..
    Jordao, Rodolfo
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems.
    Lundbäck, John
    Arcticus Syst AB, Järfälla, Sweden..
    Lundbäck, Kurt-Lennart
    Arcticus Syst AB, Järfälla, Sweden..
    Becker, Matthias
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronic and embedded systems.
    From the Synchronous Data Flow Model of Computation to an Automotive Component Model (2021). In: Proceedings 26th IEEE International Conference on Emerging Technologies and Factory Automation, ETFA 2021, Institute of Electrical and Electronics Engineers (IEEE), 2021. Conference paper (Refereed)
    Abstract [en]

    The size and complexity of automotive software systems are steadily increasing. Software functions are subject to different requirements and belong to different functional domains of the car. Meanwhile, streaming applications have become increasingly relevant in emerging application areas such as Advanced Driving Assistance Systems. Among models for streaming applications, the Synchronous Data Flow model is well known for its analysable properties. This work presents transformation rules that allow transforming applications described by the Synchronous Data Flow model into an automotive component model. The proposed transformation rules are implemented in the form of a software plugin for an automotive tool suite that allows for timing analysis, code synthesis, and deployment to a Real-Time Operating System. To demonstrate the applicability of the proposed approach, a case study of a Kalman filter that is part of a simplified cruise control application is presented. An abstract Synchronous Data Flow model of the filter is transformed into a component that is deployed on an Electronic Control Unit with hard timing guarantees.
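    The "analysable properties" of Synchronous Data Flow mentioned above include the static repetition vector obtained from the balance equations, which is what makes SDF graphs schedulable at design time. The sketch below computes that vector for a toy chain of actors; the example graph is hypothetical and not taken from the paper.

    ```python
    from fractions import Fraction
    from math import gcd

    # Sketch: solving the SDF balance equations q[src] * prod == q[dst] * cons
    # for a connected, consistent graph. The integer solution q gives how many
    # times each actor fires per schedule iteration.

    def repetition_vector(edges, actors):
        """edges: list of (src, dst, produced, consumed) token rates per firing."""
        q = {actors[0]: Fraction(1)}
        changed = True
        while changed:  # propagate rates along edges until all actors are rated
            changed = False
            for src, dst, prod, cons in edges:
                if src in q and dst not in q:
                    q[dst] = q[src] * prod / cons
                    changed = True
                elif dst in q and src not in q:
                    q[src] = q[dst] * cons / prod
                    changed = True
        # Scale to the smallest integer solution.
        lcm_den = 1
        for v in q.values():
            lcm_den = lcm_den * v.denominator // gcd(lcm_den, v.denominator)
        return {a: int(v * lcm_den) for a, v in q.items()}

    # Actor A produces 2 tokens per firing; B consumes 3 per firing:
    # A must fire 3 times for every 2 firings of B.
    print(repetition_vector([("A", "B", 2, 3)], ["A", "B"]))  # {'A': 3, 'B': 2}
    ```

    A transformation like the paper's can use this vector to derive fixed activation periods for the generated components, which is what enables hard timing guarantees.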

  • 73.
    Ayguadé, Eduard
    et al.
    European Center for Parallelism of Barcelona (CEPBA), Technical University of Catalunya (UPC).
    Brorsson, Mats
    KTH, School of Information and Communication Technology (ICT), Microelectronics and Information Technology, IMIT.
    Brunst, H.
    Center for High Performance Computing (ZHR), TU Dresden.
    Hoppe, H. -C
    Pallas GmbH.
    Karlsson, S.
    KTH, School of Information and Communication Technology (ICT), Microelectronics and Information Technology, IMIT.
    Martorell, X.
    European Center for Parallelism of Barcelona (CEPBA), Technical University of Catalunya (UPC).
    Nagel, W. E.
    Center for High Performance Computing (ZHR), TU Dresden.
    Schlimbach, F.
    Pallas GmbH.
    Utrera, G.
    European Center for Parallelism of Barcelona (CEPBA), Technical University of Catalunya (UPC).
    Winkler, M.
    Center for High Performance Computing (ZHR), TU Dresden.
    OpenMP Performance Analysis in the INTONE Project (2001). Conference paper (Refereed)
  • 74.
    Azizpour, Hossein
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Visual Representations and Models: From Latent SVM to Deep Learning (2016). Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Two important components of a visual recognition system are the representation and the model. Both involve selecting and learning the features that are indicative for recognition and discarding those that are uninformative. This thesis, in its general form, proposes different techniques within the frameworks of two learning systems for representation and modeling, namely latent support vector machines (latent SVMs) and deep learning.

    First, we propose various approaches to group the positive samples into clusters of visually similar instances. Given a fixed representation, the sampled space of the positive distribution is usually structured. The proposed clustering techniques include a novel similarity measure based on exemplar learning, an approach for using additional annotation, and augmenting latent SVM to automatically find clusters whose members can be reliably distinguished from background class. 

    In another effort, a strongly supervised DPM is suggested to study how these models can benefit from privileged information. The extra information comes in the form of semantic part annotations (i.e., their presence and location), which are used to constrain the DPM's latent variables during or prior to the optimization of the latent SVM. Its effectiveness is demonstrated on the task of animal detection.

    Finally, we generalize the formulation of discriminative latent variable models, including DPMs, to incorporate a new set of latent variables representing the structure or properties of negative samples; we thus term them negative latent variables. We show this generalization affects state-of-the-art techniques and helps visual recognition by explicitly searching for counter-evidence of an object's presence.

    Following the resurgence of deep networks, in the last works of this thesis we have focused on deep learning in order to produce a generic representation for visual recognition. A Convolutional Network (ConvNet) is trained on a large annotated image classification dataset called ImageNet, with roughly 1.3 million images. The activations at each layer of the trained ConvNet can then be treated as the representation of an input image. We show that such a representation is surprisingly effective for various recognition tasks, making it clearly superior to all the handcrafted features previously used in visual recognition (such as HOG in our first works on DPM). We further investigate the ways that one can improve this representation for a task in mind. We propose various factors, applied before or after the training of the representation, that can improve the efficacy of the ConvNet representation. These factors are analyzed on 16 datasets from various subfields of visual recognition.

    Download full text (pdf)
    fulltext
  • 75.
    Azizpour, Hossein
    et al.
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Arefiyan, Mostafa
    Naderi Parizi, Sobhan
    Carlsson, Stefan
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Spotlight the Negatives: A Generalized Discriminative Latent Model (2015). Conference paper (Refereed)
    Abstract [en]

    Discriminative latent variable models (LVM) are frequently applied to various visual recognition tasks. In these systems the latent (hidden) variables provide a formalism for modeling structured variation of visual features. Conventionally, latent variables are defined on the variation of the foreground (positive) class. In this work we augment LVMs to include negative latent variables corresponding to the background class. We formalize the scoring function of such a generalized LVM (GLVM). Then we discuss a framework for learning a model based on the GLVM scoring function. We theoretically showcase how some of the current visual recognition methods can benefit from this generalization. Finally, we experiment on a generalized form of Deformable Part Models with negative latent variables and show significant improvements on two different detection tasks.

    Download full text (pdf)
    fulltext
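    The core idea of the abstract above, scoring with negative latent variables in addition to the conventional positive ones, can be illustrated with a toy sketch. The scoring form below (best positive evidence minus best counter-evidence) is our own simplified illustration, not the paper's exact GLVM formulation; all values are made up for the example.

    ```python
    import numpy as np

    # Toy illustration: a conventional LVM maximizes the score over positive
    # (foreground) latent configurations only; the generalized model also
    # searches for counter-evidence via negative (background) latents.

    def lvm_score(features, w_pos):
        # Conventional LVM: best-scoring positive latent configuration.
        return max(float(w_pos @ f) for f in features)

    def glvm_score(features, w_pos, w_neg):
        # Generalized: subtract the strongest piece of counter-evidence.
        return lvm_score(features, w_pos) - max(float(w_neg @ f) for f in features)

    # Candidate latent placements (e.g. part locations) as feature vectors.
    features = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
    w_pos = np.array([2.0, 1.0])  # foreground (object) model
    w_neg = np.array([0.5, 3.0])  # background (negative-latent) model

    print(lvm_score(features, w_pos))           # 3.0
    print(glvm_score(features, w_pos, w_neg))   # -0.5
    ```

    In this example the strong background response flips the decision: the generalized score goes negative even though the positive-only score was high, which is exactly how explicit counter-evidence can suppress false detections.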
  • 76.
    Azizpour, Hossein
    et al.
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Carlsson, Stefan
    KTH, School of Computer Science and Communication (CSC), Computer Vision and Active Perception, CVAP.
    Self-tuned Visual Subclass Learning with Shared Samples: An Incremental Approach (2013). Manuscript (preprint) (Other academic)
    Abstract [en]

    Computer vision tasks are traditionally defined and evaluated using semantic categories. However, it is known to the field that semantic classes do not necessarily correspond to a unique visual class (e.g. inside and outside of a car). Furthermore, many of the feasible learning techniques at hand cannot model a visual class which appears consistent to the human eye. These problems have motivated the use of 1) unsupervised or supervised clustering as a preprocessing step to identify the visual subclasses to be used in a mixture-of-experts learning regime; 2) the Felzenszwalb et al. part model and other works that model mixture assignment with latent variables optimized during learning; 3) highly non-linear classifiers which are inherently capable of modelling a multi-modal input space but are inefficient at test time. In this work, we promote an incremental view over the recognition of semantic classes with varied appearances. We propose an optimization technique which incrementally finds maximal visual subclasses in a regularized risk minimization framework. Our proposed approach unifies the clustering and classification steps in a single algorithm. The importance of this approach is its compliance with the classification via the fact that it does not need to know a priori the number of clusters, or the representation and similarity measures used in pre-processing clustering methods. Following this approach we show both qualitatively and quantitatively significant results. We show that the visual subclasses demonstrate a long-tail distribution. Finally, we show that state-of-the-art object detection methods (e.g. DPM) are unable to use the tails of this distribution, comprising 50% of the training samples. In fact, we show that DPM performance slightly increases on average by the removal of this half of the data.

    Download full text (pdf)
    fulltext
  • 77.
    Baharloo, Mohammad
    et al.
    University of Tehran, Tehran, Iran.
    Khonsari, Ahmad
    University of Tehran, Tehran, Iran.
    Dolati, Mahdi
    University of Tehran, Tehran, Iran.
    Shiri, Pouya
    University of Victoria, BC, Canada.
    Ebrahimi, Masoumeh
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronic and embedded systems.
    Rahmati, Dara
    University of Tehran, Tehran, Iran.
    Traffic-aware performance optimization in Real-time wireless network on chip (2020). In: Nano Communication Networks, ISSN 1878-7789, E-ISSN 1878-7797, Vol. 26, article id 100321. Article in journal (Refereed)
    Abstract [en]

    Network on Chip (NoC) is a prevailing communication platform for multi-core embedded systems. Wireless network on chip (WNoC) employs wired and wireless technologies simultaneously to improve the performance and power-efficiency of traditional NoCs. In this paper, we propose a deterministic and scalable arbitration mechanism for medium access control in the wireless plane and present its analytical worst-case delay model in a use-case scenario that considers both real-time (RT) and non-real-time (NRT) flows with different packet sizes. Furthermore, we design an optimization model that jointly considers the worst-case and the average-case performance parameters of the system. The optimization technique determines how NRT flows are allowed to use the wireless plane such that all RT flows meet their deadlines and the average-case delay of the WNoC is minimized. Results show that our proposed approach decreases the average latency of network flows by up to 17.9% and 11.5% in 5 × 5 and 6 × 6 mesh sizes, respectively.

    Download full text (pdf)
    fulltext
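    The flavor of the analytical worst-case delay model mentioned above can be conveyed with a much simpler bound. Under a deterministic, TDMA-like round-robin arbitration of a shared wireless channel, a flow that just missed its slot waits for every other node's slot before transmitting. This is a simplified illustration with made-up numbers, not the paper's actual model.

    ```python
    # Worst-case delay under round-robin TDMA arbitration of one shared
    # wireless channel: wait for all other slots, then transmit the packet.

    def worst_case_delay(num_nodes, slot_cycles, packet_flits, cycles_per_flit=1):
        wait = (num_nodes - 1) * slot_cycles       # all other slots pass first
        transmit = packet_flits * cycles_per_flit  # then the packet is sent
        return wait + transmit

    # Hypothetical example: 25 wireless-equipped routers (as in a 5x5 mesh),
    # 8-cycle slots, a 4-flit packet.
    print(worst_case_delay(25, 8, 4))  # 196 cycles
    ```

    Because the bound is deterministic, an optimizer can admit NRT traffic onto the channel only while every RT flow's bound stays within its deadline, which is the trade-off the paper's optimization model navigates.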
  • 78.
    Bahri, Leila
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    Girdzijauskas, Sarunas
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    Trust mends blockchains: Living up to expectations (2019). In: Proceedings - International Conference on Distributed Computing Systems, 2019, p. 1358-1368. Conference paper (Refereed)
    Abstract [en]

    At the heart of Blockchains is the trustless leader election mechanism for achieving consensus among pseudo-anonymous peers, without the need of oversight from any third party or authority whatsoever. So far, two main mechanisms are being discussed: proof-of-work (PoW) and proof-of-stake (PoS). PoW relies on demonstration of computational power, and comes with the markup of huge energy wastage in return for the stake in crypto-currency. PoS tries to address this by relying on owned stake (i.e., amount of crypto-currency) in the system. In both cases, Blockchains are limited to systems with a financial basis. This forces non-crypto-currency Blockchain applications to resort to a "permissioned" setting only, effectively centralizing the system. However, non-crypto-currency permissionless blockchains could enable secure and self-governed peer-to-peer structures for numerous emerging application domains, such as education and health, where some trust exists among peers. This creates a new possibility for valuing trust among peers and capitalizing on it as the basis (stake) for reaching consensus. In this paper we show that there is a viable way for permissionless non-financial Blockchains to operate in completely decentralized environments and achieve leader election through proof-of-trust (PoT). In our PoT construction, peer trust is extracted from a trust network that emerges in a decentralized manner and is used as a waiver for the effort to be spent on PoW, thus dramatically reducing the total energy expenditure of the system. Furthermore, our PoT construction is resilient to the risk of small cartels monopolizing the network (as happens with the mining-pool phenomenon in PoW) and is not vulnerable to sybils. We evaluate security guarantees, and perform experimental evaluation of our construction, demonstrating up to 10-fold energy savings compared to PoW without trading off any of the decentralization characteristics, with further guarantees against risks of monopolization.

  • 79.
    Bahri, Leila
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    Girdzijauskas, Sarunas
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    Trust Mends Blockchains: Living up to Expectations (2019). In: IEEE 39th International Conference on Distributed Computing Systems (ICDCS), Dallas, July 7-10 2019, 2019. Conference paper (Refereed)
    Abstract [en]

    At the heart of Blockchains is the trustless leader election mechanism for achieving consensus among pseudo-anonymous peers, without the need of oversight from any third party or authority whatsoever. So far, two main mechanisms are being discussed: proof-of-work (PoW) and proof-of-stake (PoS). PoW relies on demonstration of computational power, and comes with the markup of huge energy wastage in return for the stake in crypto-currency. PoS tries to address this by relying on owned stake (i.e., amount of crypto-currency) in the system. In both cases, Blockchains are limited to systems with a financial basis. This forces non-crypto-currency Blockchain applications to resort to a “permissioned” setting only, effectively centralizing the system. However, non-crypto-currency permissionless blockchains could enable secure and self-governed peer-to-peer structures for numerous emerging application domains, such as education and health, where some trust exists among peers. This creates a new possibility for valuing trust among peers and capitalizing on it as the basis (stake) for reaching consensus. In this paper we show that there is a viable way for permissionless non-financial Blockchains to operate in completely decentralized environments and achieve leader election through proof-of-trust (PoT). In our PoT construction, peer trust is extracted from a trust network that emerges in a decentralized manner and is used as a waiver for the effort to be spent on PoW, thus dramatically reducing the total energy expenditure of the system. Furthermore, our PoT construction is resilient to the risk of small cartels monopolizing the network (as happens with the mining-pool phenomenon in PoW) and is not vulnerable to sybils. We evaluate security guarantees, and perform experimental evaluation of our construction, demonstrating up to 10-fold energy savings compared to PoW without trading off any of the decentralization characteristics, with further guarantees against risks of monopolization.

    Download full text (pdf)
    fulltext
  • 80.
    Bahri, Leila
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    Girdzijauskas, Sarunas
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    When Trust Saves Energy - A Reference Framework for Proof-of-Trust (PoT) Blockchains (2018). In: WWW '18 Companion Proceedings of the The Web Conference 2018, ACM Digital Library, 2018, p. 1165-1169. Conference paper (Refereed)
    Abstract [en]

    Blockchains are attracting the attention of many technical, financial, and industrial parties, as a promising infrastructure for achieving secure peer-to-peer (P2P) transactional systems. At the heart of blockchains is proof-of-work (PoW), a trustless leader election mechanism based on demonstration of computational power. PoW provides blockchain security in trustless P2P environments, but comes at the expense of wasting huge amounts of energy. In this research work, we question this energy expenditure of PoW under blockchain use cases where some form of trust exists between the peers. We propose a Proof-of-Trust (PoT) blockchain where peer trust is valuated in the network based on a trust graph that emerges in a decentralized fashion and that is encoded in and managed by the blockchain itself. This trust is then used as a waiver for the difficulty of PoW; that is, the more trust you prove in the network, the less work you do.

    Download full text (pdf)
    fulltext
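    The "trust waives difficulty" idea described in the abstract above can be sketched with a toy hash puzzle: a peer's trust score reduces the number of leading zero bits its proof-of-work must exhibit. The linear trust-to-bits mapping and all constants here are our own illustrative assumptions, not the construction from the paper.

    ```python
    import hashlib

    # Toy proof-of-trust sketch: trust in [0, 1] waives part of the PoW
    # difficulty, so more-trusted peers search a larger target space.

    BASE_DIFFICULTY_BITS = 20  # leading zero bits required of an untrusted peer

    def required_bits(trust, max_waiver=16):
        """Full trust (1.0) waives max_waiver bits of difficulty (assumption)."""
        return BASE_DIFFICULTY_BITS - int(trust * max_waiver)

    def mine(block_data, trust):
        bits = required_bits(trust)
        target = 1 << (256 - bits)  # SHA-256 hashes below this value are valid
        nonce = 0
        while True:
            digest = hashlib.sha256(f"{block_data}:{nonce}".encode()).digest()
            if int.from_bytes(digest, "big") < target:
                return nonce, bits
            nonce += 1

    # A trusted peer (trust=0.75) needs only 8 leading zero bits instead of 20,
    # i.e. roughly 2**12 = 4096 times fewer hash attempts on average.
    nonce, bits = mine("block-42", trust=0.75)
    print(bits)  # 8
    ```

    Since expected work grows as 2^bits, even a modest waiver translates into the kind of order-of-magnitude energy savings the paper reports.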
  • 81.
    Baig, Roger
    et al.
    Fundacio Privada per la Xarxa Lliure, Oberta i Neural Guifi.net. Mas l’Esperanca, 08503 Gurb, Catalonia.
    Dowling, Jim
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Escrich, Pau
    Fundacio Privada per la Xarxa Lliure, Oberta i Neural Guifi.net. Mas l’Esperanca, 08503 Gurb, Catalonia.
    Freitag, Felix
    Department of Computer Architecture. Universitat Politecnica de Catalunya. Barcelona, Spain .
    Meseguer, Roc
    Department of Computer Architecture. Universitat Politecnica de Catalunya. Barcelona, Spain.
    Moll, Agusti
    Fundacio Privada per la Xarxa Lliure, Oberta i Neural Guifi.net. Mas l’Esperanca, 08503 Gurb, Catalonia.
    Navarro, Leandro
    Department of Computer Architecture. Universitat Politecnica de Catalunya. Barcelona, Spain.
    Pietrosemoli, Ermanno
    The Abdus Salam International Centre for Theoretical Physics (ICTP). Trieste, Italy.
    Pueyo, Roger
    Fundacio Privada per la Xarxa Lliure, Oberta i Neural Guifi.net. Mas l’Esperanca, 08503 Gurb, Catalonia.
    Vlassov, Vladimir
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Zennaro, Marco
    The Abdus Salam International Centre for Theoretical Physics (ICTP). Trieste, Italy.
    Deploying Clouds in the Guifi Community Network (2015). In: Proceedings of the 2015 IFIP/IEEE International Symposium on Integrated Network Management, IM 2015, IEEE, 2015, p. 1020-1025. Conference paper (Refereed)
    Abstract [en]

    This paper describes an operational, geographically distributed and heterogeneous cloud infrastructure with services and applications deployed in the Guifi community network. The presented cloud is a particular case of a community cloud, developed according to the specific needs and conditions of community networks. We describe the concept of this community cloud, explain our technical choices for building it, and our experience with the deployment of this cloud. We review our solutions and experience in offering the different service models of cloud computing (IaaS, PaaS and SaaS) in community networks. The deployed cloud infrastructure aims to provide stable and attractive cloud services in order to encourage community network users to use, keep and extend it with new services and applications.

  • 82.
    Baig, Roger
    et al.
    Fundacio Privada per la Xarxa Lliure, Oberta i Neural Guifi.net. Mas l’Esperanca, 08503 Gurb, Catalonia.
    Freitag, Felix
    Department of Computer Architecture. Universitat Politecnica de Catalunya. Barcelona, Spain .
    Khan, Amin M.
    Department of Computer Architecture. Universitat Politecnica de Catalunya. Barcelona, Spain.
    Moll, Agusti
    Fundacio Privada per la Xarxa Lliure, Oberta i Neural Guifi.net. Mas l’Esperanca, 08503 Gurb, Catalonia.
    Navarro, Leandro
    Department of Computer Architecture. Universitat Politecnica de Catalunya. Barcelona, Spain.
    Pueyo, Roger
    Fundacio Privada per la Xarxa Lliure, Oberta i Neural Guifi.net. Mas l’Esperanca, 08503 Gurb, Catalonia.
    Vlassov, Vladimir
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Community clouds at the edge deployed in Guifi.net (2015). Conference paper (Refereed)
    Abstract [en]

    Community clouds are a cloud deployment model in which the cloud infrastructure is built with specific features for a community of users with shared concerns, goals, and interests. Commercial community clouds already operate in several application areas such as finance, government and health, fulfilling community-specific requirements. In this demo, a community cloud for citizens is presented. It is formed by devices at the edge of the network, contributed by the members of a community network and brought together into a distributed community cloud system through the Cloudy distribution. The demonstration shows the audience, in live access, the deployed community cloud from the perspective of the user, by accessing a Cloudy node, inspecting the services available in the community cloud, and showing the usage of some of its services.

  • 83.
    Baig, Roger
    et al.
    Fundacio Privada per la Xarxa Lliure, Oberta i Neutral Guifi.net. Mas l’Esperanca, 08503 Gurb, Catalonia.
    Freitag, Felix
    Department of Computer Architecture. Universitat Politecnica de Catalunya. Barcelona, Spain .
    Moll, Agusti
    Fundacio Privada per la Xarxa Lliure, Oberta i Neutral Guifi.net. Mas l’Esperanca, 08503 Gurb, Catalonia.
    Navarro, Leandro
    Department of Computer Architecture. Universitat Politecnica de Catalunya. Barcelona, Spain.
    Pueyo, Roger
    Fundacio Privada per la Xarxa Lliure, Oberta i Neutral Guifi.net. Mas l’Esperanca, 08503 Gurb, Catalonia.
    Vlassov, Vladimir
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Cloud-based community services in community networks2016In: 2016 International Conference on Computing, Networking and Communications, ICNC 2016, IEEE conference proceedings, 2016, p. 1-5, article id 7440621Conference paper (Refereed)
    Abstract [en]

    Wireless networks have been shown to be a cost-effective solution for an IP-based communication infrastructure in under-served areas. Services and applications, if deployed within these wireless networks, add value for the users. This paper shows how cloud infrastructures have been made operational in a community wireless network, as a particular case of a community cloud, developed according to the specific requirements and conditions of the community. We describe the conditions and requirements of such a community cloud and explain our technical choices and experience in its deployment in the community network. The user take-up has started, and our case supports the tendency of cloud computing moving towards the network edge.

  • 84.
    Baig, Roger
    et al.
    Fundacio Privada per la Xarxa Lliure, Oberta i Neutral Guifi.net. Mas l’Esperanca, 08503 Gurb, Catalonia.
    Freitag, Felix
    Department of Computer Architecture. Universitat Politecnica de Catalunya. Barcelona, Spain .
    Moll, Agusti
    Fundacio Privada per la Xarxa Lliure, Oberta i Neutral Guifi.net. Mas l’Esperanca, 08503 Gurb, Catalonia.
    Navarro, Leandro
    Department of Computer Architecture. Universitat Politecnica de Catalunya. Barcelona, Spain.
    Pueyo, Roger
    Fundacio Privada per la Xarxa Lliure, Oberta i Neutral Guifi.net. Mas l’Esperanca, 08503 Gurb, Catalonia.
    Vlassov, Vladimir
    KTH, School of Information and Communication Technology (ICT), Software and Computer systems, SCS.
    Community network clouds as a case for the IEEE Intercloud standardization2015In: 2015 IEEE Conference on Standards for Communications and Networking, CSCN 2015, 2015, p. 269-274, article id 7390456Conference paper (Refereed)
    Abstract [en]

    The IEEE P2302 Intercloud WG has been working since 2011 on the project Standard for Intercloud Interoperability and Federation, with the goal of defining a standard architecture and building components for large-scale interoperability of independent cloud providers. While the standardization process has achieved fine-grained definitions of several Intercloud components, a deployment of the Intercloud to demonstrate the architectural feasibility is not yet operational. In this paper, we describe a deployed community network cloud and we show how it matches in several aspects the vision of the Intercloud. Similar to the Intercloud, the community network cloud consists of many small cloud providers, which use a set of common services for interoperability. In this sense, the community network cloud is a real use case for elements that the Intercloud standardization WG envisions, and can feed back into and even become part of the Intercloud. In fact, a study on commercial services provided by Small or Medium Enterprises (SMEs) in the community network cloud indicates the importance of the success of the Intercloud standardization initiative for SMEs.

  • 85. Bailis, P.
    et al.
    Fekete, A.
    Ghodsi, Ali
    KTH.
    Hellerstein, J. M.
    Stoica, I.
    HAT, not CAP: Towards highly available transactions2013In: 14th Workshop on Hot Topics in Operating Systems, HotOS 2013, USENIX Association , 2013Conference paper (Refereed)
    Abstract [en]

    While the CAP Theorem is often interpreted to preclude the availability of transactions in a partition-prone environment, we show that highly available systems can provide useful transactional semantics, often matching those of today's ACID databases. We propose Highly Available Transactions (HATs) that are available in the presence of partitions. HATs support many desirable ACID guarantees for arbitrary transactional sequences of read and write operations and permit low-latency operation.

  • 86.
    Balaam, M.
    et al.
    KTH, School of Computer Science and Communication (CSC), Media Technology and Interaction Design, MID.
    Hansen, L. K.
    Women’s health at CHI2018In: interactions, ISSN 1072-5520, E-ISSN 1558-3449, Vol. 25, no 1Article in journal (Refereed)
    Download full text (pdf)
    fulltext
  • 87.
    Balaam, Madeline
    et al.
    Newcastle Univ, Culture Lab, Newcastle NE1 7RU, England..
    Egglestone, Stefan Rennick
    Univ Nottingham, Mixed Real Lab, Nottingham NG5 1PB, England..
    Fitzpatrick, Geraldine
    Vienna Univ Technol, A-1040 Vienna, Austria..
    Rodden, Tom
    Univ Nottingham, Mixed Real Lab, Nottingham NG5 1PB, England..
    Hughes, Ann-Marie
    Univ Southampton, Sch Hlth Sci, Southampton SO17 1BJ, Hants, England..
    Wilkinson, Anna
    Sheffield Hallam Univ, Ctr Hlth & Social Care Res, Sheffield S10 2BP, S Yorkshire, England..
    Nind, Thomas
    Univ Dundee, Sch Comp, Dundee DD1 4HN, Scotland..
    Axelrod, Lesley
    Univ Sussex, HCT Grp, Brighton BN1 9QH, E Sussex, England..
    Harris, Eric
    Univ Sussex, HCT Grp, Brighton BN1 9QH, E Sussex, England..
    Ricketts, Ian
    Univ Dundee, Sch Comp, Dundee DD1 4HN, Scotland..
    Mawson, Susan
    Sheffield Hallam Univ, Ctr Hlth & Social Care Res, Sheffield S10 2BP, S Yorkshire, England..
    Burridge, Jane
    Univ Southampton, Sch Hlth Sci, Southampton SO17 1BJ, Hants, England..
    Motivating Mobility: Designing for Lived Motivation in Stroke Rehabilitation2011In: 29TH ANNUAL CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, ASSOC COMPUTING MACHINERY , 2011, p. 3073-3082Conference paper (Refereed)
    Abstract [en]

    How to motivate and support behaviour change through design is becoming of increasing interest to the CHI community. In this paper, we present our experiences of building systems that motivate people to engage in upper limb rehabilitation exercise after stroke. We report on participatory design work with four stroke survivors to develop a holistic understanding of their motivation and rehabilitation needs, and to construct and deploy engaging interactive systems that satisfy these. We reflect on the limits of motivational theories in trying to design for the lived experience of motivation and highlight lessons learnt around: helping people articulate what motivates them; balancing work, duty, fun; supporting motivation over time; and understanding the wider social context. From these we identify design guidelines that can inform a toolkit approach to support both scalability and personalisability.

    Download full text (pdf)
    fulltext
  • 88.
    Baldini, Gianmarco
    et al.
    Institute for the Protection and Security of the Citizen (IPSC), Italy.
    Kounelis, Ioannis
    Institute for the Protection and Security of the Citizen (IPSC), Italy.
    Nai Fovino, Igor
    Institute for the Protection and Security of the Citizen (IPSC), Italy.
    Neisse, Ricardo
    Institute for the Protection and Security of the Citizen (IPSC), Italy.
    A Framework for Privacy Protection and Usage Control of Personal Data in a Smart City Scenario2013In: Critical Information Infrastructures Security: 8th International Workshop, CRITIS 2013, Amsterdam, The Netherlands, September 16-18, 2013, Revised Selected Papers, Springer Publishing Company, 2013, p. 212-217Conference paper (Refereed)
    Abstract [en]

    In this paper we address trust and privacy protection issues related to identity and personal data provided by citizens in a smart city environment. Our proposed solution combines identity management, trust negotiation, and usage control. We demonstrate our solution in a case study of a smart city during a crisis situation.

  • 89.
    Balliu, Musard
    KTH, School of Computer Science and Communication (CSC), Theoretical Computer Science, TCS.
    Logics for Information Flow Security:From Specification to Verification2014Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Software is becoming increasingly ubiquitous and today we find software running everywhere: driving our favorite game application, inside the web portal we use to read the morning news, and behind the site where we book a vacation. Being so commonplace, software has become an easy target to compromise maliciously, or at best to get wrong. In fact, recent trends and highly-publicized attacks suggest that vulnerable software is at the root of many security attacks.

    Information flow security is the research field that studies methods and techniques to provide strong security guarantees against software security attacks and vulnerabilities. The goal of an information flow analysis is to rigorously check how sensitive information is used by the software application and to ensure that this information does not escape the boundaries of the application, unless it is properly granted permission to do so by the security policy at hand. This process can be challenging, as it first requires determining what the application's security policy is, and then providing a mechanism to enforce that policy against the software application. In this thesis we address the problem of (information flow) policy specification and policy enforcement by leveraging formal methods, in particular logics and language-based analysis and verification techniques.

    The thesis contributes to the state of the art of information flow security in several directions, both theoretical and practical. On the policy specification side, we provide a  framework to reason about  information flow security conditions using the notion of knowledge. This is accompanied  by logics that  can be used  to express the security policies precisely in a syntactical manner. Also, we study the interplay between confidentiality and integrity  to enforce security in  presence of active attacks.  On the verification side, we provide several symbolic algorithms to effectively check whether an application adheres to the associated security policy. To achieve this,  we propose techniques  based on symbolic execution and first-order reasoning (SMT solving) to first extract a model of the target application and then verify it against the policy.  On the practical side, we provide  tool support by automating our techniques and  thereby making it possible  to verify programs written in Java or ARM machine code.  Besides the expected limitations, our case studies show that the tools can be used to  verify the security of several realistic scenarios.

    More specifically, the thesis consists of two parts and six chapters. We start with an introduction giving an overview of the research problems and the results of the thesis. Then we move to the specification part which  relies on knowledge-based reasoning and epistemic logics to specify state-based and trace-based information flow conditions and on the weakest precondition calculus to certify security in  presence of active attacks.  The second part of the thesis addresses the problem of verification  of the security policies introduced in the first part.  We use symbolic execution  and  SMT solving techniques to enable   model checking of the security properties.  In particular, we implement a tool that verifies noninterference  and declassification policies for Java programs. Finally, we conclude with relational verification of low level code, which is also supported by a tool.
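    The security conditions summarized above are relational (2-safety) properties: they compare pairs of program runs rather than single runs. A hypothetical sketch of that idea, with made-up example programs and a brute-force check standing in for the thesis's symbolic execution and SMT solving:

```python
# Hypothetical sketch (not the thesis's tooling): noninterference is a
# 2-safety property, so it can be checked by "self-composition" -- compare
# two runs that agree on the public (low) input but differ in the secret
# (high) input, and require equal public outputs. A real verifier would
# discharge this with symbolic execution and an SMT solver; this sketch
# brute-forces a tiny input space instead.

def leaky(high: int, low: int) -> int:
    # The public result depends on the secret via an implicit flow.
    return low + (1 if high > 0 else 0)

def secure(high: int, low: int) -> int:
    # The public result depends only on the public input.
    return low * 2

def noninterferent(prog, highs, lows) -> bool:
    """For every low input, any two high inputs must give the same output."""
    return all(
        prog(h1, lo) == prog(h2, lo)
        for lo in lows
        for h1 in highs
        for h2 in highs
    )

print(noninterferent(secure, range(-2, 3), range(4)))  # True
print(noninterferent(leaky, range(-2, 3), range(4)))   # False
```

    The brute-force quantification over input pairs is exactly what the thesis replaces with first-order reasoning, so the check scales beyond toy domains.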

    Download full text (pdf)
    php-thesis-Musard-Balliu
  • 90.
    Balliu, Musard
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Theoretical Computer Science, TCS.
    Bastys, Iulia
    Chalmers Univ Technol, Dept Comp Sci & Engn, Gothenburg, Sweden..
    Sabelfeld, Andrei
    Chalmers Univ Technol, Dept Comp Sci & Engn, Gothenburg, Sweden..
    Securing IoT Apps2019In: IEEE Security and Privacy, ISSN 1540-7993, E-ISSN 1558-4046, Vol. 17, no 5, p. 22-29Article in journal (Refereed)
    Abstract [en]

    Users increasingly rely on Internet of Things (IoT) apps to manage their digital lives through the overwhelming diversity of IoT services and devices. Are the IoT app platforms doing enough to protect the privacy and security of their users? By securing IoT apps, how can we help users reclaim control over their data?

    Download full text (pdf)
    fulltext
  • 91.
    Balliu, Musard
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Theoretical Computer Science, TCS.
    Baudry, Benoit
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    Bobadilla, Sofia
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Theoretical Computer Science, TCS.
    Ekstedt, Mathias
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Network and Systems Engineering.
    Monperrus, Martin
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Theoretical Computer Science, TCS.
    Ron Arteaga, Javier
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Theoretical Computer Science, TCS.
    Sharma, Aman
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Theoretical Computer Science, TCS.
    Skoglund, Gabriel
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Theoretical Computer Science, TCS.
    Soto Valero, César
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    Wittlinger, Martin
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Theoretical Computer Science, TCS.
    Software Bill of Materials in Java2023In: SCORED 2023 - Proceedings of the 2023 Workshop on Software Supply Chain Offensive Research and Ecosystem Defenses, Association for Computing Machinery (ACM) , 2023, p. 75-76Conference paper (Refereed)
    Abstract [en]

    Modern software applications are virtually never built entirely in-house. As a matter of fact, they reuse many third-party dependencies, which form the core of their software supply chain [1]. The large number of dependencies in an application has turned into a major challenge for both security and reliability. For example, to compromise a high-value application, malicious actors can choose to attack a less well-guarded dependency of the project [2]. Even when there is no malicious intent, bugs can propagate through the software supply chain and cause breakages in applications. Gathering accurate, up-to-date information about all dependencies included in an application is, therefore, of vital importance.
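    A software bill of materials is essentially a machine-readable inventory of such dependencies. A toy sketch, with invented Maven coordinates and a simplified output whose field names only loosely imitate the CycloneDX style (this is not a compliant SBOM generator):

```python
# Illustrative sketch only: build a minimal SBOM-like inventory for a Java
# application from Maven coordinates. The field names loosely imitate the
# CycloneDX style, but this is a made-up, simplified format, and the
# coordinates below are examples.
import json

def make_sbom(app_name, coordinates):
    components = []
    for coord in coordinates:
        group, artifact, version = coord.split(":")
        components.append({
            "type": "library",
            "group": group,
            "name": artifact,
            "version": version,
            # A package URL (purl) is the usual way SBOMs identify components.
            "purl": f"pkg:maven/{group}/{artifact}@{version}",
        })
    return json.dumps({"app": app_name, "components": components}, indent=2)

print(make_sbom("demo-app", [
    "org.apache.commons:commons-lang3:3.12.0",
    "com.fasterxml.jackson.core:jackson-databind:2.15.2",
]))
```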

  • 92.
    Balliu, Musard
    et al.
    KTH, School of Computer Science and Communication (CSC), Theoretical Computer Science, TCS.
    Dam, Mads
    KTH, School of Computer Science and Communication (CSC), Theoretical Computer Science, TCS.
    Guanciale, Roberto
    KTH, School of Computer Science and Communication (CSC), Theoretical Computer Science, TCS.
    Automating Information Flow Analysis of Low Level Code2014In: Proceedings of CCS’14, November 3–7, 2014, Scottsdale, Arizona, USA, Association for Computing Machinery (ACM), 2014Conference paper (Refereed)
    Abstract [en]

    Low level code is challenging: It lacks structure, it uses jumps and symbolic addresses, the control flow is often highly optimized, and registers and memory locations may be reused in ways that make typing extremely challenging. Information flow properties create additional complications: They are hyperproperties relating multiple executions, and the possibility of interrupts and concurrency, and use of devices and features like memory-mapped I/O requires a departure from the usual initial-state final-state account of noninterference. In this work we propose a novel approach to relational verification for machine code. Verification goals are expressed as equivalence of traces decorated with observation points. Relational verification conditions are propagated between observation points using symbolic execution, and discharged using first-order reasoning. We have implemented an automated tool that integrates with SMT solvers to automate the verification task. The tool transforms ARMv7 binaries into an intermediate, architecture-independent format using the BAP toolset by means of a verified translator. We demonstrate the capabilities of the tool on a separation kernel system call handler, which mixes hand-written assembly with gcc-optimized output, a UART device driver and a crypto service modular exponentiation routine.

    Download full text (pdf)
    ccs14_bdg
  • 93.
    Balliu, Musard
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Theoretical Computer Science, TCS.
    Merro, Massimo
    University of Verona.
    Pasqua, Michele
    University of Verona.
    Securing Cross-App Interactions in IoT Platforms2019In: 2019 IEEE 32nd Computer Security Foundations Symposium (CSF), IEEE Computer Society, 2019, p. 319-334, article id 8823751Conference paper (Refereed)
    Abstract [en]

    IoT platforms enable users to connect various smart devices and online services via reactive apps running on the cloud. These apps, often developed by third-parties, perform simple computations on data triggered by external information sources and actuate the results of computation on external information sinks. Recent research shows that unintended or malicious interactions between the different (even benign) apps of a user can cause severe security and safety risks. These works leverage program analysis techniques to build tools for unveiling unexpected interference across apps for specific use cases. Despite these initial efforts, we are still lacking a semantic framework for understanding interactions between IoT apps. The question of what security policy cross-app interference embodies remains largely unexplored. This paper proposes a semantic framework capturing the essence of cross-app interactions in IoT platforms. The framework generalizes and connects syntactic enforcement mechanisms to bisimulation-based notions of security, thus providing a baseline for formulating soundness criteria of these enforcement mechanisms. Specifically, we present a calculus that models the behavioral semantics of a system of apps executing concurrently, and use it to define desirable semantic policies in the security and safety context of IoT apps. To demonstrate the usefulness of our framework, we define static mechanisms for enforcing cross-app security and safety, and prove them sound with respect to our semantic conditions. Finally, we leverage real-world apps to validate the practical benefits of our policy framework.

    Download full text (pdf)
    fulltext
  • 94.
    Balliu, Musard
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Theoretical Computer Science, TCS.
    Merro, Massimo
    University of Verona.
    Pasqua, Michele
    University of Verona.
    Shcherbakov, Mikhail
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Theoretical Computer Science, TCS.
    Friendly Fire: Cross-App Interactions in IoT Platforms2021In: ACM Transactions on Privacy and Security (TOPS), ISSN 2471-2566, Vol. 24, no 3, p. 1-40, article id 16Article in journal (Refereed)
    Abstract [en]

    IoT platforms enable users to connect various smart devices and online services via reactive apps running on the cloud. These apps, often developed by third-parties, perform simple computations on data triggered by external information sources and actuate the results of computations on external information sinks. Recent research shows that unintended or malicious interactions between the different (even benign) apps of a user can cause severe security and safety risks. These works leverage program analysis techniques to build tools for unveiling unexpected interference across apps for specific use cases. Despite these initial efforts, we are still lacking a semantic framework for understanding interactions between IoT apps. The question of what security policy cross-app interference embodies remains largely unexplored. This paper proposes a semantic framework capturing the essence of cross-app interactions in IoT platforms. The framework generalizes and connects syntactic enforcement mechanisms to bisimulation-based notions of security, thus providing a baseline for formulating soundness criteria of these enforcement mechanisms. Specifically, we present a calculus that models the behavioral semantics of a system of apps executing concurrently, and use it to define desirable semantic policies targeting the security and safety of IoT apps. To demonstrate the usefulness of our framework, we define and implement static analyses for enforcing cross-app security and safety, and prove them sound with respect to our semantic conditions. We also leverage real-world apps to validate the practical benefits of our tools based on the proposed enforcement mechanisms.
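    The kind of cross-app interference studied here can be pictured with a toy model. The app names, events and environment links below are invented for illustration; the paper's actual framework is a process calculus with bisimulation-based security notions, not this naive search:

```python
# Toy model of cross-app interference (all names are made up): each
# trigger-action app is a (trigger event, action event) pair, and we look
# for chains where one app's action implicitly fires another app's
# trigger through the physical environment.

APPS = {
    "away-mode":   ("user_leaves_home", "open_window_for_airing"),
    "rain-guard":  ("weather_rain", "close_window"),
    "burglar-sim": ("window_opened", "turn_lights_on"),
}

# Which action events become observable as which trigger events.
IMPLICIT_LINKS = {"open_window_for_airing": "window_opened"}

def interference_chains(apps, links):
    """Return (upstream, downstream) pairs where upstream's action feeds
    downstream's trigger via the environment."""
    chains = []
    for upstream, (_, action) in apps.items():
        fired = links.get(action)
        if fired is None:
            continue
        for downstream, (trigger, _) in apps.items():
            if upstream != downstream and trigger == fired:
                chains.append((upstream, downstream))
    return chains

print(interference_chains(APPS, IMPLICIT_LINKS))
# [('away-mode', 'burglar-sim')]
```

    Even though both apps are benign in isolation, their composition turns the lights on whenever the user leaves home, which is precisely the class of emergent behavior the semantic policies are meant to rule out.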

    Download full text (pdf)
    fulltext
  • 95.
    Bansal, Nikhil
    et al.
    Eindhoven Univ Technol, Eindhoven, Netherlands..
    Chalermsook, Parinya
    Aalto Univ, Helsinki, Finland..
    Laekhanukit, Bundit
    Shanghai Univ Finance & Econ, Shanghai, Peoples R China..
    Na Nongkai, Danupon
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Theoretical Computer Science, TCS.
    Nederlof, Jesper
    Eindhoven Univ Technol, Eindhoven, Netherlands..
    New Tools and Connections for Exponential-Time Approximation2019In: Algorithmica, ISSN 0178-4617, E-ISSN 1432-0541, Vol. 81, no 10, p. 3993-4009Article in journal (Refereed)
    Abstract [en]

    In this paper, we develop new tools and connections for exponential time approximation. In this setting, we are given a problem instance and an integer r > 1, and the goal is to design an approximation algorithm with the fastest possible running time. We give randomized algorithms that establish an approximation ratio of (1) r for maximum independent set in O*(exp(Õ(n/(r log² r) + r log² r))) time, (2) r for chromatic number in O*(exp(Õ(n/(r log r) + r log² r))) time, (3) (2 - 1/r) for minimum vertex cover in O*(exp(n/r^Ω(r))) time, and (4) (k - 1/r) for minimum k-hypergraph vertex cover in O*(exp(n/(kr)^Ω(kr))) time. (Throughout, Õ and O* omit polyloglog(r) factors and factors polynomial in the input size, respectively.) The best known time bounds for all these problems were O*(2^(n/r)) (Bourgeois et al. in Discret Appl Math 159(17):1954-1970, 2011; Cygan et al. in Exponential-time approximation of hard problems, 2008). For maximum independent set and chromatic number, these bounds were complemented by exp(n^(1-o(1))/r^(1+o(1))) lower bounds (under the Exponential Time Hypothesis (ETH)) (Chalermsook et al. in Foundations of Computer Science, FOCS, pp. 370-379, 2013; Laekhanukit in Inapproximability of combinatorial problems in subexponential-time, Ph.D. thesis, 2014). Our results show that the natural-looking O*(2^(n/r)) bounds are not tight for all these problems. The key to these results is a sparsification procedure that reduces a problem to a bounded-degree variant, allowing the use of approximation algorithms for bounded-degree graphs. To obtain the first two results, we introduce a new randomized branching rule. Finally, we show a connection between PCP parameters and exponential-time approximation algorithms. This connection together with our independent set algorithm refutes the possibility of overly reducing the size of Chan's PCP (Chan in J. ACM 63(3):27:1-27:32, 2016). It also implies that a (significant) improvement over our result would refute the Gap-ETH conjecture (Dinur in Electron Colloq Comput Complex (ECCC) 23:128, 2016; Manurangsi and Raghavendra in A birthday repetition theorem and complexity of approximating dense CSPs, 2016).

  • 96.
    Bao, Yan
    et al.
    KTH, School of Information and Communication Technology (ICT), Communication: Services and Infrastucture, Software and Computer Systems, SCS.
    Brorsson, Mats
    KTH, School of Information and Communication Technology (ICT), Communication: Services and Infrastucture, Software and Computer Systems, SCS.
    An Implementation of Cache-Coherence for the Nios II ™ Soft-core processor2009Conference paper (Refereed)
    Abstract [en]

    Soft-core programmable processors mapped onto field-programmable gate arrays (FPGAs) can be considered equivalents to a microcontroller. They combine central processing units (CPUs), caches, memories, and peripherals on a single chip. Soft-core processors represent an increasingly common embedded software implementation option. Modern FPGA soft-cores are parameterized to support application-specific customization. However, these soft-core processors are designed to be used in uniprocessor systems, not in multiprocessor systems. This project describes an implementation that solves the cache coherency problem in an ALTERA Nios II soft-core multiprocessor system.
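    The coherence problem the abstract refers to is classically solved with write-invalidate snooping on the shared bus. The sketch below shows an MSI-style protocol in miniature; it is illustrative only (a software simulation with invented class names), not the paper's Nios II hardware design:

```python
# Illustrative sketch: a write-invalidate snooping protocol with MSI
# (Modified/Shared/Invalid) states for a bus-based multiprocessor. Writes
# broadcast an invalidate so stale copies die; read misses force a dirty
# owner to write back before the line is shared.

M, S, I = "Modified", "Shared", "Invalid"

class Bus:
    def __init__(self):
        self.caches, self.memory = [], {}

    def bus_read(self, addr, owner):
        for c in self.caches:
            if c is not owner:
                c.snoop_read(addr)       # dirty owners flush to memory
        return self.memory.get(addr, 0)

    def bus_invalidate(self, addr, owner):
        for c in self.caches:
            if c is not owner:
                c.snoop_invalidate(addr)

class Cache:
    def __init__(self, bus):
        self.lines = {}                  # addr -> (state, value)
        self.bus = bus
        bus.caches.append(self)

    def read(self, addr):
        state, value = self.lines.get(addr, (I, None))
        if state == I:                   # miss: fetch via the bus, go Shared
            value = self.bus.bus_read(addr, owner=self)
            self.lines[addr] = (S, value)
        return self.lines[addr][1]

    def write(self, addr, value):
        self.bus.bus_invalidate(addr, owner=self)
        self.lines[addr] = (M, value)    # exclusive dirty copy

    def snoop_read(self, addr):
        state, value = self.lines.get(addr, (I, None))
        if state == M:                   # supply dirty data: write back, share
            self.bus.memory[addr] = value
            self.lines[addr] = (S, value)

    def snoop_invalidate(self, addr):
        state, value = self.lines.get(addr, (I, None))
        if state == M:                   # write back before losing the line
            self.bus.memory[addr] = value
        if state != I:
            self.lines[addr] = (I, None)

bus = Bus()
cpu0, cpu1 = Cache(bus), Cache(bus)
cpu0.write(0x10, 42)
print(cpu1.read(0x10))   # 42: cpu0's dirty line is flushed, then shared
cpu1.write(0x10, 7)      # invalidates cpu0's now-stale copy
print(cpu0.read(0x10))   # 7
```

    Without the snooping hooks, cpu1 would read a stale value from memory while cpu0 holds the dirty line, which is exactly the incoherence the paper's hardware addition prevents.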

    Download full text (pdf)
    fulltext
  • 97.
    Barbette, Tom
    University of Liege.
    Architecture for programmable network infrastructure2018Doctoral thesis, monograph (Other academic)
    Abstract [en]

    Software networking promises a more flexible network infrastructure, poised to leverage the computational power available in datacenters. Virtual Network Functions (VNFs) can now run on commodity hardware in datacenters instead of using specialized equipment deployed along the network path. VNF applications like stateful firewalls, carrier-grade NAT or deep packet inspection that are found "in-the-middle", and therefore often categorized as middleboxes, are now software functions that can be migrated to reduce costs, consolidate processing or scale easily. But if not carefully implemented, VNFs won't achieve high speed, will barely sustain the rates of even small networks, and will therefore fail to fulfil their promise. As of today, out-of-the-box solutions are far from efficient and cannot handle high rates, especially when combined in a single host, as multiple case studies will show in this thesis. We start by reviewing the current obstacles to high-speed software networking. We leverage current commodity hardware to achieve what seemed impossible to do in software not long ago, and what made software solutions believed unworthy and untrusted by network operators. Our work paves the way for building a proper software framework for a programmable network infrastructure that can be used to quickly implement network functions. We built FastClick, a faster version of the Click Modular Router, that allows fast packet processing thanks to a careful integration of fast I/O frameworks and a deep study of the interactions of their features. FastClick proposes a revised, easier-to-use execution model that hides multi-queueing and simplifies multithreading using a thread traversal analysis of the configuration. We propose tailored network-specific multi-threaded algorithms that enable parallel high-speed networking. We build a new backward-compatible batching implementation, and avoid system calls "left over" by previous work.

    We then build MiddleClick, an NFV dataplane built on top of FastClick. It combines VNFs along a service chain to use a common subsystem that implements shared features such as classification and session handling, but makes sure no feature is applied that isn't absolutely needed by one of the VNFs. For example, the classification is optimized to be minimal and only needs to be done once for all VNFs, and if no VNF needs TCP reconstruction, that reconstruction won't happen. We propose an algorithm to enable a per-session, per-VNF "scratchpad". Only the minimal amount of state is declared, and it is accessible in predictable locations using a per-VNF offset into the "scratchpad" for fast lookups across the chain. MiddleClick also offers new flow abstractions and ways to handle sessions that enable fast and easy development of new middlebox functions that can handle many flows in parallel. Cooperation, consolidation and using the hardware in an appropriate way may not always be enough. This thesis finally explores how to use classification hardware such as smart NICs and SDN switches to accelerate the processing of the combined service chain, removing the need for software classification. While this work mostly relies on known high-level NFV dataplane principles and proposes a few new ones, it is one of the most low-level works in the field, leading to precise implementation considerations yielding very high performance results. Both FastClick and MiddleClick are available as Open Source projects and constitute an important contribution to the state of the art. Multiple leading-edge use cases are built to show how the prototype can be used to build fast and efficient solutions quickly.
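    The per-session, per-VNF "scratchpad" can be sketched roughly as offset assignment over one buffer per session. The VNF names, state sizes and helper functions below are invented for illustration and are not MiddleClick's actual API:

```python
# Rough sketch of the scratchpad idea (all names and sizes are invented):
# each VNF in the chain declares its per-session state size once, gets a
# fixed offset, and then finds its state with a single base-plus-offset
# lookup instead of one hash-table access per VNF.

def assign_offsets(declared):
    """Map each VNF to (offset, size) in the shared scratchpad."""
    offsets, cursor = {}, 0
    for vnf, size in declared.items():
        offsets[vnf] = (cursor, size)
        cursor += size
    return offsets, cursor

OFFSETS, SCRATCH_SIZE = assign_offsets({
    "firewall": 4,   # e.g. a verdict and counters
    "nat": 8,        # e.g. the rewritten address/port pair
    "dpi": 16,       # e.g. protocol parsing state
})

sessions = {}        # session key -> one scratchpad for the whole chain

def scratch(session_key, vnf):
    """Return the given VNF's slice of this session's scratchpad."""
    pad = sessions.setdefault(session_key, bytearray(SCRATCH_SIZE))
    offset, size = OFFSETS[vnf]
    return memoryview(pad)[offset:offset + size]

flow = ("10.0.0.1", 1234, "10.0.0.2", 80)
scratch(flow, "nat")[:4] = (192).to_bytes(4, "big")  # NAT stores its state
print(bytes(scratch(flow, "nat")[:4]).hex())         # 000000c0
print(SCRATCH_SIZE)                                  # 28
```

    One session lookup then serves the whole chain: each VNF indexes the same buffer at its own precomputed offset, which is the "predictable locations" property the abstract describes.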

    Download full text (pdf)
    fulltext
  • 98.
    Barbette, Tom
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Chiesa, Marco
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    Stateless CPU-aware datacenter load-balancing2020In: Poster: Stateless CPU-aware datacenter load-balancing, Association for Computing Machinery (ACM) , 2020, p. 548-549Conference paper (Refereed)
    Abstract [en]

    Today, datacenter operators deploy load-balancers (LBs) to efficiently utilize server resources, but must over-provision server resources (by up to 30%) because of load imbalances and the desire to bound tail service latency. We posit that one of the reasons for these imbalances is the lack of per-core load statistics in existing LBs. As a first step, we designed CrossRSS, a CPU core-aware LB that dynamically assigns incoming connections to the least loaded cores in the server pool. CrossRSS leverages knowledge of the dispatching performed by each server's Network Interface Card (NIC) to specific cores to reduce imbalances by more than an order of magnitude compared to existing LBs in a proof-of-concept datacenter environment, processing 12% more packets with the same number of cores.
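    The placement policy can be sketched as a least-loaded-core picker. The interfaces and server names below are made up; note also that the real CrossRSS is stateless and exploits the NIC's own RSS dispatching, rather than keeping an explicit load heap as this toy does:

```python
# Toy sketch of the policy only (all names are invented): assign each new
# connection to the globally least loaded core across the server pool,
# rather than hashing blindly onto a server.
import heapq

class CpuAwareLB:
    def __init__(self, cores_per_server):
        # One heap entry per core: (current load, server, core id).
        self.heap = [
            (0, server, core)
            for server, n_cores in cores_per_server.items()
            for core in range(n_cores)
        ]
        heapq.heapify(self.heap)

    def assign(self, cost=1):
        """Place a new connection on the least loaded core."""
        load, server, core = heapq.heappop(self.heap)
        heapq.heappush(self.heap, (load + cost, server, core))
        return server, core

lb = CpuAwareLB({"srv-a": 2, "srv-b": 2})
placements = [lb.assign() for _ in range(4)]
# The first four connections land on four distinct cores.
print(sorted(placements))  # [('srv-a', 0), ('srv-a', 1), ('srv-b', 0), ('srv-b', 1)]
```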

    Download full text (pdf)
    fulltext
  • 99.
    Barbette, Tom
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Katsikas, Georgios P.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Radio Systems Laboratory (RS Lab).
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    RSS++: load and state-aware receive side scaling, 2019. In: Proceedings of the 15th International Conference on emerging Networking EXperiments and Technologies, Orlando, FL, USA: Association for Computing Machinery (ACM), 2019. Conference paper (Refereed)
    Abstract [en]

    While the current literature typically focuses on load-balancing among multiple servers, in this paper we demonstrate the importance of load-balancing within a single machine (potentially with hundreds of CPU cores). In this context, we propose a new load-balancing technique (RSS++) that dynamically modifies the receive side scaling (RSS) indirection table to spread the load across the CPU cores more evenly. RSS++ incurs up to 14x lower 95th-percentile tail latency and orders of magnitude fewer packet drops compared to RSS under high CPU utilization. RSS++ allows higher CPU utilization and dynamic scaling of the number of allocated CPU cores to accommodate the input load, while avoiding the typical 25% over-provisioning. RSS++ has been implemented for both (i) DPDK and (ii) the Linux kernel. Additionally, we implement a new state-migration technique, which facilitates sharding and reduces contention between CPU cores accessing per-flow data. RSS++ keeps flow state in groups that can be migrated at once, leading to 20% higher efficiency than a state-of-the-art shared flow table.
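    The mechanism the abstract names — rewriting the RSS indirection table so hash buckets move from overloaded to underloaded cores — can be illustrated with a toy greedy pass. This is a simplified stand-in for the paper's approach: the `rebalance` function, its per-bucket load estimates, and the greedy policy are assumptions for illustration, not RSS++'s actual solver.

    ```python
    # Hedged sketch: rebalance an RSS indirection table (bucket -> core) by
    # greedily moving buckets off cores that carry more than their fair share.

    def rebalance(table: list[int], bucket_load: list[float],
                  n_cores: int) -> list[int]:
        # Current load per core, summed over the buckets mapped to it.
        core_load = [0.0] * n_cores
        for b, c in enumerate(table):
            core_load[c] += bucket_load[b]
        target = sum(core_load) / n_cores   # fair share per core

        new_table = table[:]
        for b, c in enumerate(table):
            # Only move a bucket if its core stays at or above the fair
            # share without it, and the move actually reduces imbalance.
            if core_load[c] - bucket_load[b] >= target:
                dest = min(range(n_cores), key=core_load.__getitem__)
                if core_load[dest] + bucket_load[b] < core_load[c]:
                    new_table[b] = dest
                    core_load[c] -= bucket_load[b]
                    core_load[dest] += bucket_load[b]
        return new_table

    table = [0, 0, 0, 0, 1, 1]   # 6 hash buckets over 2 cores
    load  = [5, 5, 5, 5, 1, 1]   # core 0 carries 20 units, core 1 only 2
    new_table = rebalance(table, load, 2)
    ```

    In a real deployment the resulting table would be written back to the NIC (e.g. via `ethtool -X` on Linux); the sketch only shows the rebalancing decision itself.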

  • 100.
    Barbette, Tom
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Soldani, Cyril
    Université de Liège.
    Mathy, Laurent
    Université de Liège.
    Combined stateful classification and session splicing for high-speed NFV service chaining, 2021. In: IEEE/ACM Transactions on Networking, ISSN 1063-6692, E-ISSN 1558-2566. Article in journal (Refereed)
    Abstract [en]

    Network functions such as firewalls, NATs, DPI, content-aware optimizers, and load balancers are increasingly realized as software to reduce costs and enable outsourcing. To meet performance requirements, these virtual network functions (VNFs) often bypass the kernel and use their own user-space networking stack. A naïve realization of a chain of VNFs exchanges raw packets, leading to many redundant operations that waste resources. In this work, we design a system to execute a pipeline of VNFs. We provide the user with facilities to define (i) a traffic class of interest for the VNF, (ii) a session to group the packets (such as the TCP 4-tuple), and (iii) the amount of space per session. The system synthesizes a classifier and builds an efficient flow table that, when possible, is automatically partially offloaded to and accelerated by the network interface. We utilize an abstract view of flows to support seamless inspection and modification of the content of any flow (such as TCP or HTTP). By applying only surgical modifications to the protocol headers, we avoid the need for a complex, hard-to-maintain user-space TCP stack and can chain multiple VNFs without reconstructing the stream multiple times, allowing up to a 5x improvement over standard approaches.
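    The design the abstract describes — classify each packet once into a per-session context, keyed by something like the TCP 4-tuple, and let every VNF in the chain share that context — can be sketched minimally. All names here (`Session`, `classify`, `run_chain`, the dict-based "packets") are illustrative assumptions, not the system's actual API.

    ```python
    # Hedged sketch: one classification per packet feeds a whole VNF chain,
    # so downstream functions reuse the lookup instead of re-parsing packets.
    from dataclasses import dataclass, field

    @dataclass
    class Session:
        scratch: dict = field(default_factory=dict)  # per-session VNF state

    flow_table: dict[tuple, Session] = {}

    def classify(pkt: dict) -> Session:
        """Map a packet to its session via the TCP 4-tuple."""
        key = (pkt["src"], pkt["sport"], pkt["dst"], pkt["dport"])
        return flow_table.setdefault(key, Session())

    def vnf_counter(pkt: dict, sess: Session) -> None:
        """Example VNF: count packets per flow in the shared session state."""
        sess.scratch["count"] = sess.scratch.get("count", 0) + 1

    def run_chain(pkt: dict, chain) -> Session:
        sess = classify(pkt)          # single lookup serves the whole chain
        for vnf in chain:
            vnf(pkt, sess)
        return sess

    pkt = {"src": "10.0.0.1", "sport": 1234, "dst": "10.0.0.2", "dport": 80}
    sess = run_chain(pkt, [vnf_counter])
    sess = run_chain(pkt, [vnf_counter])
    ```

    The paper's system additionally offloads parts of the flow-table lookup to the NIC and splices sessions at the protocol-header level; the sketch only captures the shared-classification idea.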
