1 - 50 of 780
  • 1.
    Abdullah, Nazri
    et al.
    Universiti Tun Hussein Onn Malaysia, Malaysia.
    Kounelis, Ioannis
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Kommunikationssystem, CoS.
    Muftic, Sead
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Kommunikationssystem, CoS.
    Security Extensions for Mobile Commerce Objects, 2014. In: SECURWARE 2014, The Eighth International Conference on Emerging Security Information, Systems and Technologies, 2014. Conference paper (Refereed)
    Abstract [en]

    Electronic commerce and its variant, mobile commerce, have grown tremendously in popularity over the last several years. As mobile devices have become the most popular means of accessing and using the Internet, mobile commerce and its security are timely and very active topics. Yet there is still no consistent model of the various m-commerce applications and transactions, and even less a clear specification of their security. To address these issues, in this paper we first establish the concept of mobile commerce objects, an equivalent of virtual currencies, used for m-commerce transactions. We describe the functionalities and unique characteristics of these objects; we follow with security requirements, and then offer some solutions: security extensions of these objects. All solutions are treated within the complete lifecycle of creation and use of the m-commerce objects.

  • 2. Abouelhoda, Mohamed
    et al.
    Issa, Shady
    Center for Informatics Sciences, Nile University, Giza, Egypt.
    Ghanem, Moustafa
    Tavaxy: integrating Taverna and Galaxy workflows with cloud computing support, 2012. In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 13. Journal article (Refereed)
    Abstract [en]

    BACKGROUND: Over the past decade the workflow system paradigm has evolved as an efficient and user-friendly approach for developing complex bioinformatics applications. Two popular workflow systems that have gained acceptance by the bioinformatics community are Taverna and Galaxy. Each system has a large user-base and supports an ever-growing repository of application workflows. However, workflows developed for one system cannot be imported and executed easily on the other. The lack of interoperability is due to differences in the models of computation, workflow languages, and architectures of both systems. This lack of interoperability limits sharing of workflows between the user communities and leads to duplication of development efforts.

    RESULTS: In this paper, we present Tavaxy, a stand-alone system for creating and executing workflows based on using an extensible set of re-usable workflow patterns. Tavaxy offers a set of new features that simplify and enhance the development of sequence analysis applications: It allows the integration of existing Taverna and Galaxy workflows in a single environment, and supports the use of cloud computing capabilities. The integration of existing Taverna and Galaxy workflows is supported seamlessly at both run-time and design-time levels, based on the concepts of hierarchical workflows and workflow patterns. The use of cloud computing in Tavaxy is flexible, where the users can either instantiate the whole system on the cloud, or delegate the execution of certain sub-workflows to the cloud infrastructure.

    CONCLUSIONS: Tavaxy reduces the workflow development cycle by introducing the use of workflow patterns to simplify workflow creation. It enables the re-use and integration of existing (sub-) workflows from Taverna and Galaxy, and allows the creation of hybrid workflows. Its additional features exploit recent advances in high performance cloud computing to cope with the increasing data size and complexity of analysis. The system can be accessed either through a cloud-enabled web-interface or downloaded and installed to run within the user's local environment. All resources related to Tavaxy are available at http://www.tavaxy.org.
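The hierarchical-workflow idea described above can be sketched in a few lines: a sub-workflow is wrapped so that it becomes a single step of an enclosing workflow. This is an illustrative toy, not Tavaxy's actual API; all class names and steps here are invented.

```python
# Toy sketch of hierarchical workflows: embedding one workflow as a
# single node of another (the mechanism that lets a Galaxy-style
# sub-workflow run inside a Taverna-style workflow).

class Workflow:
    def __init__(self):
        self.steps = []                 # (name, callable) in dependency order

    def add(self, name, func):
        self.steps.append((name, func))
        return self

    def run(self, data):
        for _, func in self.steps:      # linear pipeline for simplicity
            data = func(data)
        return data

    def as_step(self):
        """Wrap this workflow so it can be embedded as one step of another."""
        return self.run

# A sub-workflow: trim the input sequence, then uppercase it.
sub = Workflow().add("trim", lambda s: s.strip()).add("upper", str.upper)

# The enclosing workflow embeds the whole sub-workflow as one node.
main = (Workflow()
        .add("fetch", lambda s: s)
        .add("subflow", sub.as_step())
        .add("tag", lambda s: s + "!"))
print(main.run("  acgt  "))            # the sub-workflow runs as one step
```

Design note: because the embedded workflow exposes the same call interface as a plain step, the enclosing engine needs no knowledge of its internals, which is what makes run-time integration of foreign workflows possible.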

  • 3.
    Abourraja, Mohamed Nezar
    KTH.
    Gestion multi-agents d'un terminal à conteneurs [Multi-agent management of a container terminal], 2018. Other (Other academic)
  • 4. Abourraja, Mohamed Nezar
    et al.
    Oudani, Mustapha
    Samiri, Mohamed Yassine
    Boudebous, Dalila
    El Fazziki, Abdelaziz
    Najib, Mehdi
    Bouain, Abdelhadi
    Rouky, Naoufal
    A multi-agent based simulation model for rail–rail transshipment: An engineering approach for gantry crane scheduling, 2017. In: IEEE Access, E-ISSN 2169-3536, Vol. 5, pp. 13142-13156. Journal article (Refereed)
    Abstract [en]

    Le Havre Port Authority is putting into service a multimodal hub terminal with massified hinterland links (trains and barges) in order to restrict the intensive use of roads, to achieve a more attractive massification share of hinterland transportation and to provide a river connection to its maritime terminals that do not currently have one. This paper focuses on the rail-rail transshipment yard of this new terminal. In the current organizational policy, this yard is divided into two equal operating areas, and, in each one, a crane is placed, and it is equipped with reach stackers to enable container moves across both operating areas. However, this policy causes poor scheduling of crane moves, because it gives rise to many crane interference situations. For the sake of minimizing the occurrence of these undesirable situations, this paper proposes a multi-agent simulation model including an improved strategy for crane scheduling. This strategy is inspired by the ant colony approach and it is governed by a new configuration for the rail yard's working area that eliminates the use of reach stackers. The proposed simulation model is based on two planner agents, to each of which a time-horizon planning is assigned. The simulation results show that the model developed here is very successful in significantly reducing unproductive times and moves (undesirable situations), and it outperforms other existing simulation models based on the current organizational policy.
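The interference problem can be illustrated with a toy model (not the paper's simulation): the yard is split into two equal operating areas with one crane each, and any container move whose origin and destination fall in different areas requires a cross-area handoff, one of the unproductive situations the improved strategy is designed to avoid. The yard dimensions and moves below are invented.

```python
# Toy model of the two-area rail yard: classify container moves into
# direct moves (handled by one crane) and cross-area handoffs.

YARD_LENGTH = 100
BOUNDARY = YARD_LENGTH // 2            # two equal operating areas

def crane_for(position):
    """Which crane's area a yard position falls in."""
    return 0 if position < BOUNDARY else 1

def classify_moves(moves):
    """Split (origin, destination) moves into direct ones and handoffs."""
    direct, handoffs = [], []
    for origin, dest in moves:
        bucket = direct if crane_for(origin) == crane_for(dest) else handoffs
        bucket.append((origin, dest))
    return direct, handoffs

moves = [(10, 40), (10, 80), (60, 90), (45, 55)]
direct, handoffs = classify_moves(moves)
print(len(direct), len(handoffs))
```

A scheduling strategy that minimizes the handoff count (for example by re-slotting containers near the boundary) directly reduces crane interference, which is the quantity the paper's planner agents optimize.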

  • 5. Abourraja, Mohamed Nezar
    et al.
    Oudani, Mustapha
    Samiri, Mohamed Yassine
    Boukachour, Jaouad
    Elfazziki, Abdelaziz
    Bouain, Abdelhadi
    Najib, Mehdi
    An improving agent-based engineering strategy for minimizing unproductive situations of cranes in a rail–rail transshipment yard, 2018. In: Simulation, Vol. 94, no. 8, pp. 681-705. Journal article (Refereed)
  • 6.
    Abtahi, Farhad
    et al.
    KTH, Skolan för teknik och hälsa (STH), Medicinsk teknik, Medicinska sensorer, signaler och system.
    Berndtsson, Andreas
    Abtahi, Shirin
    Seoane, Fernando
    KTH, Skolan för teknik och hälsa (STH), Medicinsk teknik, Medicinska sensorer, signaler och system.
    Lindecrantz, Kaj
    KTH, Skolan för teknik och hälsa (STH), Medicinsk teknik, Medicinska sensorer, signaler och system.
    Development and preliminary evaluation of an Android based heart rate variability biofeedback system, 2014. In: Engineering in Medicine and Biology Society (EMBC), 2014 36th Annual International Conference of the IEEE, IEEE conference proceedings, 2014, pp. 3382-3385. Conference paper (Refereed)
    Abstract [en]

    Reduced heart rate variability (HRV) is believed to be associated with several diseases, such as congestive heart failure, diabetes, and chronic kidney disease (CKD). In these cases, HRV biofeedback may be a potential intervention method to increase HRV, which in turn benefits these patients. In this work, a real-time Android biofeedback application based on a Bluetooth-enabled ECG and thoracic electrical bioimpedance (respiration) measurement device has been developed. The system's performance and usability have been evaluated in a brief study with eight healthy volunteers. The results demonstrate real-time performance of the system and positive effects of the biofeedback training session, namely increased HRV and reduced heart rate. Further development of the application and training protocol is ongoing to find an optimum length and interval of biofeedback sessions for use in potential interventions.
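The HRV such a biofeedback loop monitors can be computed directly from RR intervals (the times between successive heartbeats). Below is a minimal sketch of two standard time-domain measures, SDNN and RMSSD, plus mean heart rate; the paper does not specify which metrics its application uses, so this is illustrative only.

```python
import math

def hrv_metrics(rr_ms):
    """Compute simple time-domain HRV metrics from RR intervals in ms."""
    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n
    # SDNN: standard deviation of all RR intervals (sample std. dev.)
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))
    # RMSSD: root mean square of successive RR differences
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {"mean_hr_bpm": 60000.0 / mean_rr,
            "sdnn_ms": sdnn,
            "rmssd_ms": rmssd}

m = hrv_metrics([1000, 990, 1010, 1000])
print(round(m["mean_hr_bpm"]), round(m["rmssd_ms"], 1))
```

In a biofeedback session the application would recompute these metrics over a sliding window of beats and display them to the user in real time, rewarding breathing patterns that increase the variability measures.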

  • 7.
    Agelfors, Eva
    et al.
    KTH, Tidigare Institutioner, Talöverföring och musikakustik.
    Beskow, Jonas
    Dahlquist, M
    Granström, Björn
    Lundeberg, M
    Salvi, Giampiero
    Spens, K-E
    Öhman, Tobias
    Two methods for Visual Parameter Extraction in the Teleface Project, 1999. In: Proceedings of Fonetik, Gothenburg, Sweden, 1999. Conference paper (Other academic)
  • 8.
    Aguilar, Xavier
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Datavetenskap, Beräkningsvetenskap och beräkningsteknik (CST).
    Performance Monitoring, Analysis, and Real-Time Introspection on Large-Scale Parallel Systems, 2020. Doctoral thesis, monograph (Other academic)
    Abstract [en]

    High-Performance Computing (HPC) has become an important scientific driver. A wide variety of research, ranging for example from drug design to climate modelling, is nowadays performed on HPC systems. Furthermore, the tremendous computing power of such HPC systems allows scientists to simulate problems that were unimaginable a few years ago. However, the continuous increase in size and complexity of HPC systems is turning the development of efficient parallel software into a difficult task. Therefore, the use of performance monitoring and analysis is a must in order to unveil inefficiencies in parallel software. Nevertheless, performance tools also face challenges as a result of the size of HPC systems, for example, coping with the huge amounts of performance data generated.

    In this thesis, we propose a new model for performance characterisation of MPI applications that tackles the challenge of big performance data sets. Our approach uses Event Flow Graphs to balance the scalability of profiling techniques (generating performance reports with aggregated metrics) with the richness of information of tracing methods (generating files with sequences of time-stamped events). In other words, graphs allow us to encode ordered sequences of events without storing the whole sequence of such events, and therefore they need much less memory and disk space and are more scalable. We demonstrate in this thesis how our Event Flow Graph model can be used as a trace compression method. Furthermore, we propose a method to automatically detect the structure of MPI applications using our Event Flow Graphs. This knowledge can afterwards be used to collect performance data in a smarter way, reducing for example the amount of redundant data collected. Finally, we demonstrate that our graphs can be used beyond trace compression and automatic analysis of performance data. We propose a new methodology to use Event Flow Graphs in the task of visual performance data exploration.

    In addition to the Event Flow Graph model, we also explore in this thesis the design and use of performance data introspection frameworks. Future HPC systems will be very dynamic environments providing extreme levels of parallelism, but with energy constraints, considerable resource sharing, and heterogeneous hardware. Thus, the use of real-time performance data to orchestrate program execution in such a complex and dynamic environment will be a necessity. This thesis presents two different performance data introspection frameworks that we have implemented. These introspection frameworks are easy to use, and provide performance data in real time with very low overhead. We demonstrate, among other things, how our approach can be used to reduce in real time the energy consumed by the system.

    The approaches proposed in this thesis have been validated on different HPC systems using multiple scientific kernels as well as real scientific applications. The experiments show that our approaches to performance characterisation and performance data introspection are not intrusive at all, and can be a valuable contribution to help in the performance monitoring of future HPC systems.

  • 9.
    Aguilar, Xavier
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Towards Scalable Performance Analysis of MPI Parallel Applications, 2015. Licentiate thesis, comprising articles (Other academic)
    Abstract [en]

    A considerable fraction of scientific discovery nowadays relies on computer simulations. High Performance Computing (HPC) provides scientists with the means to simulate processes ranging from climate modelling to protein folding. However, achieving good application performance and making optimal use of HPC resources is a heroic task due to the complexity of parallel software. Therefore, performance tools and runtime systems that help users execute applications in the most optimal way are of utmost importance in the landscape of HPC. In this thesis, we explore different techniques to tackle the challenges of collecting, storing, and using fine-grained performance data. First, we investigate the automatic use of real-time performance data in order to run applications in an optimal way. To that end, we present a prototype of an adaptive task-based runtime system that uses real-time performance data for task scheduling. This runtime system has a performance monitoring component that provides real-time access to the performance behavior of an application while it runs. The implementation of this monitoring component is presented and evaluated within this thesis. Secondly, we explore lossless compression approaches for MPI monitoring. One of the main problems that performance tools face is the huge amount of fine-grained data that can be generated from an instrumented application. Collecting fine-grained data from a program is the best method to uncover the root causes of performance bottlenecks; however, it is infeasible with extremely parallel applications or applications with long execution times. On the other hand, collecting coarse-grained data is scalable but sometimes not enough to discern the root cause of a performance problem. Thus, we propose a new method for performance monitoring of MPI programs using event flow graphs. Event flow graphs provide very low overhead in terms of execution time and storage size, and can be used to reconstruct fine-grained trace files of application events ordered in time.

  • 10.
    Aguilar, Xavier
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Fürlinger, Karl
    Ludwig-Maximilians-Universität (LMU).
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    MPI Trace Compression Using Event Flow Graphs, 2014. Conference paper (Refereed)
    Abstract [en]

    Understanding how parallel applications behave is crucial for using high-performance computing (HPC) resources efficiently. However, the task of performance analysis is becoming increasingly difficult due to the growing complexity of scientific codes and the size of machines. Even though many tools have been developed over the past years to help in this task, current approaches either only offer an overview of the application discarding temporal information, or they generate huge trace files that are often difficult to handle.

    In this paper we propose the use of event flow graphs for monitoring MPI applications, a new and different approach that balances the low overhead of profiling tools with the abundance of information available from tracers. Event flow graphs are captured with very low overhead, require orders of magnitude less storage than standard trace files, and can still recover the full sequence of events in the application. We test this new approach with the NERSC-8/Trinity Benchmark suite and achieve compression ratios up to 119x.
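The core idea can be shown with a toy example: instead of storing every event, store the distinct events as nodes and the transitions between them as counted edges. For loop-dominated MPI traces the graph is orders of magnitude smaller than the event list. The sketch below only illustrates the size advantage; the real event flow graphs also record call attributes and support full trace reconstruction.

```python
from collections import Counter

def event_flow_graph(trace):
    """Build a toy event flow graph: nodes are distinct events,
    edges carry the number of observed transitions between them."""
    nodes = set(trace)
    edges = Counter(zip(trace, trace[1:]))
    return nodes, edges

# A loop-heavy MPI-like trace: Init, then 1000 Send/Recv iterations, Finalize.
trace = ["Init"] + ["Send", "Recv"] * 1000 + ["Finalize"]
nodes, edges = event_flow_graph(trace)

full_size = len(trace)                  # entries stored by a standard trace
graph_size = len(nodes) + len(edges)    # entries stored by the graph
print(full_size, graph_size)
```

Here 2,002 trace events collapse to 4 nodes and 4 counted edges, which is the mechanism behind the large compression ratios reported in the paper for iterative codes.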

  • 11.
    Aguilar, Xavier
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Fürlinger, Karl
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Visual MPI Performance Analysis using Event Flow Graphs, 2015. In: Procedia Computer Science, ISSN 1877-0509, E-ISSN 1877-0509, Vol. 51, pp. 1353-1362. Journal article (Refereed)
    Abstract [en]

    Event flow graphs used in the context of performance monitoring combine the scalability and low overhead of profiling methods with lossless information recording of tracing tools. In other words, they capture statistics on the performance behavior of parallel applications while preserving the temporal ordering of events. Event flow graphs require significantly less storage than regular event traces and can still be used to recover the full ordered sequence of events performed by the application. In this paper we explore the usage of event flow graphs in the context of visual performance analysis. We show that graphs can be used to quickly spot performance problems, helping to better understand the behavior of an application. We demonstrate our performance analysis approach with MiniFE, a mini-application that mimics the key performance aspects of finite-element applications in High Performance Computing (HPC).

  • 12.
    Aguilar, Xavier
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz).
    Fürlinger, Karl
    Ludwig-Maximilians-Universität München.
    Online Performance Data Introspection with IPM, 2014. In: Proceedings of the 15th IEEE International Conference on High Performance Computing and Communications (HPCC 2013), IEEE Computer Society, 2014, pp. 728-734. Conference paper (Refereed)
    Abstract [en]

    Exascale systems will be heterogeneous architectures with multiple levels of concurrency and energy constraints. In such a complex scenario, performance monitoring and runtime systems play a major role to obtain good application performance and scalability. Furthermore, online access to performance data becomes a necessity to decide how to schedule resources and orchestrate computational elements: processes, threads, tasks, etc. We present the Performance Introspection API, an extension of the IPM tool that provides online runtime access to performance data from an application while it runs. We describe its design and implementation and show its overhead on several test benchmarks. We also present a real test case using the Performance Introspection API in conjunction with processor frequency scaling to reduce power consumption.
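The closing use case, introspection driving frequency scaling, can be sketched as a simple control loop: poll a live metric (here, the fraction of time spent waiting in MPI) and pick a lower frequency when the process is communication-bound. The frequency levels, thresholds, and function names below are all invented for illustration; this is not IPM's Performance Introspection API.

```python
# Hypothetical control loop: lower CPU frequency when runtime data shows
# the process is mostly waiting for communication (saving power at little
# performance cost).

FREQ_LEVELS = [2.6, 2.0, 1.4]   # GHz; assumed available P-states

def choose_frequency(mpi_wait_fraction):
    """Pick a lower frequency when execution is communication-bound."""
    if mpi_wait_fraction > 0.6:
        return FREQ_LEVELS[2]
    if mpi_wait_fraction > 0.3:
        return FREQ_LEVELS[1]
    return FREQ_LEVELS[0]

# Online loop: each sample is a wait fraction reported by the monitor.
samples = [0.1, 0.45, 0.8]
print([choose_frequency(s) for s in samples])
```

In the paper's real test case the decision input comes from live IPM data and the action is processor frequency scaling; the sketch only shows the shape of such a feedback loop.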

  • 13.
    Aguilar, Xavier
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz). KTH, Centra, SeRC - Swedish e-Science Research Centre.
    Schliephake, Michael
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz). KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Parallelldatorcentrum, PDC. KTH, Centra, SeRC - Swedish e-Science Research Centre.
    Vahtras, Olav
    KTH, Skolan för bioteknologi (BIO), Teoretisk kemi och biologi. KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Parallelldatorcentrum, PDC. KTH, Centra, SeRC - Swedish e-Science Research Centre.
    Gimenez, Judit
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), High Performance Computing and Visualization (HPCViz). KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Parallelldatorcentrum, PDC. KTH, Centra, SeRC - Swedish e-Science Research Centre.
    Scalability analysis of Dalton, a molecular structure program, 2013. In: Future Generations Computer Systems, ISSN 0167-739X, E-ISSN 1872-7115, Vol. 29, no. 8, pp. 2197-2204. Journal article (Refereed)
    Abstract [en]

    Dalton is a molecular electronic structure program featuring common methods of computational chemistry that are based on pure quantum mechanics (QM) as well as hybrid quantum mechanics/molecular mechanics (QM/MM). It is specialized in, and has a leading position in, the calculation of molecular properties, with a large worldwide user community (over 2000 licenses issued). In this paper, we present a performance characterization and optimization of Dalton. We also propose a solution to prevent the master/worker design of Dalton from becoming a performance bottleneck at larger process counts. With these improvements we obtain speedups of 4x, increasing the parallel efficiency of the code and making it possible to run it on a much larger number of cores.

  • 14. Ahmed, J.
    et al.
    Johnsson, A.
    Yanggratoke, Rerngvit
    KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre. KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Ardelius, J.
    Flinta, C.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät. KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre.
    Predicting SLA conformance for cluster-based services using distributed analytics, 2016. In: Proceedings of the NOMS 2016 - 2016 IEEE/IFIP Network Operations and Management Symposium, IEEE conference proceedings, 2016, pp. 848-852. Conference paper (Refereed)
    Abstract [en]

    Service assurance for the telecom cloud is a challenging task and is continuously being addressed by academia and industry. One promising approach is to utilize machine learning to predict service quality in order to take early mitigation actions. In previous work we have shown how to predict service-level metrics, such as frame rate for a video application on the client side, from operational data gathered at the server side. This gives the service provider early indications of whether the platform can support the current load demand. This paper extends previous work by addressing scalability issues for cluster-based services. Operational data generated in large volumes, from several sources, and at high velocity puts strain on computational and communication resources. We propose and evaluate a distributed machine learning system based on the Winnow algorithm to tackle scalability issues, and then compare the new distributed solution with the previously proposed centralized solution. We show that network overhead and computational execution time are substantially reduced while maintaining high prediction accuracy, making it possible to achieve real-time service quality predictions in large systems.
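The Winnow algorithm underlying the distributed solution is a classic online linear classifier with multiplicative weight updates, which is what makes it cheap to run and to distribute. Below is a minimal single-node sketch of the standard promotion/demotion rule on binary features (not the paper's distributed variant; the training data is invented).

```python
# Classic Winnow (Littlestone): weights start at 1, the threshold equals
# the feature count, and mistakes trigger multiplicative updates on the
# weights of active (=1) features.

def winnow_train(samples, n_features, promotion=2.0):
    w = [1.0] * n_features
    threshold = n_features
    for x, y in samples:                 # x: 0/1 feature tuple, y: 0/1 label
        score = sum(wi * xi for wi, xi in zip(w, x))
        pred = 1 if score >= threshold else 0
        if pred == 0 and y == 1:         # false negative: promote
            w = [wi * promotion if xi else wi for wi, xi in zip(w, x)]
        elif pred == 1 and y == 0:       # false positive: demote
            w = [wi / promotion if xi else wi for wi, xi in zip(w, x)]
    return w

# Target concept: label is 1 iff feature 0 is set.
data = [((1, 0, 0, 1), 1), ((0, 1, 1, 0), 0),
        ((1, 1, 0, 0), 1), ((0, 0, 1, 1), 0)] * 5
w = winnow_train(data, 4)
pred = 1 if sum(wi * xi for wi, xi in zip(w, (1, 0, 1, 0))) >= 4 else 0
print(pred)
```

Because updates happen only on mistakes and touch only active features, per-sample cost is tiny, and weight vectors from several nodes can be combined cheaply, which is the property the distributed design exploits.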

  • 15.
    Ahmed, Tanvir Saif
    et al.
    KTH, Skolan för teknik och hälsa (STH), Medicinsk teknik, Data- och elektroteknik.
    Markovic, Bratislav
    KTH, Skolan för teknik och hälsa (STH), Medicinsk teknik, Data- och elektroteknik.
    Distribuerade datalagringssystem för tjänsteleverantörer: Undersökning av olika användningsfall för distribuerade datalagringssystem [Distributed data storage systems for service providers: A study of different use cases for distributed data storage systems], 2016. Independent thesis, Basic level (university diploma), 10 credits / 15 HE credits. Student thesis
    Abstract [sv]

    This degree project examines three different use cases in data storage: Cold Storage, High Performance Storage, and Virtual Machine Storage. The report aims to give an overview of commercial distributed file systems as well as a deeper study of open-source distributed file systems, and thereby find an optimal solution for these use cases. The study included analyzing and comparing previous work in which performance measurements, data protection, and costs were compared, and highlighting various features (snapshotting, multi-tenancy, data deduplication, data replication) that characterize modern distributed file systems. Both commercial and open distributed file systems were examined. A cost estimate for commercial and open distributed file systems was also made to determine the profitability of these two types of distributed file system. After comparison and analysis of previous work, the open-source distributed file system Ceph proved well suited as a solution given the requirements set as goals for High Performance Storage and Virtual Machine Storage. The cost estimate showed that implementing an open distributed file system was more profitable. This study can be used as guidance when choosing between different distributed file systems.

  • 16. Akhmetova, D.
    et al.
    Kestor, G.
    Gioiosa, R.
    Markidis, Stefano
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    On the application task granularity and the interplay with the scheduling overhead in many-core shared memory systems, 2015. In: Proceedings - IEEE International Conference on Cluster Computing, ICCC, IEEE, 2015, pp. 428-437. Conference paper (Refereed)
    Abstract [en]

    Task-based programming models are considered one of the most promising programming model approaches for exascale supercomputers because of their ability to dynamically react to changing conditions and reassign work to processing elements. One question, however, remains unsolved: what should the task granularity of task-based applications be? Fine-grained tasks offer more opportunities to balance the system and generally result in higher system utilization. However, they also induce large scheduling overhead. The impact of scheduling overhead on coarse-grained tasks is lower, but large systems may end up imbalanced and underutilized. In this work we propose a methodology to analyze the interplay between application task granularity and scheduling overhead. Our methodology is based on three main points: 1) a novel task aggregation algorithm that analyzes an application's directed acyclic graph (DAG) and aggregates tasks, 2) a fast and precise emulator to analyze the application behavior on systems with up to 1,024 cores, and 3) a comprehensive sensitivity analysis of application performance and scheduling overhead breakdown. Our results show that there is an optimal task granularity between 1.2x10^4 and 10x10^4 cycles for the representative schedulers. Moreover, our analysis indicates that a suitable scheduler for exascale task-based applications should employ a best-effort local scheduler and a sophisticated remote scheduler to move tasks across worker threads.
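The effect of task aggregation on scheduling overhead can be illustrated with a toy chain of tasks. The paper's algorithm operates on a full DAG; this sketch, with an assumed fixed per-task overhead, only shows why merging fine-grained tasks amortizes that cost.

```python
# Toy aggregation: greedily merge consecutive task costs (in cycles) until
# each merged task reaches a minimum granularity, so the fixed per-task
# scheduling overhead is paid fewer times.

def aggregate(costs, min_cycles):
    groups, current = [], 0
    for c in costs:
        current += c
        if current >= min_cycles:
            groups.append(current)
            current = 0
    if current:                       # leftover work joins the last group
        if groups:
            groups[-1] += current
        else:
            groups.append(current)
    return groups

SCHED_OVERHEAD = 1000                 # cycles per scheduled task (assumed)
tasks = [3000] * 8                    # eight fine-grained 3,000-cycle tasks
merged = aggregate(tasks, 12000)
before = len(tasks) * SCHED_OVERHEAD
after = len(merged) * SCHED_OVERHEAD
print(merged, before, after)
```

Raising the granularity threshold cuts total overhead but reduces the number of schedulable units, which is exactly the balance/overhead trade-off the paper's sensitivity analysis quantifies.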

  • 17.
    Akhmetova, Dana
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Iakymchuk, Roman
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Ekeberg, Örjan
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Laure, Erwin
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsvetenskap och beräkningsteknik (CST).
    Performance study of multithreaded MPI and OpenMP tasking in a large scientific code, 2017. In: Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017, Institute of Electrical and Electronics Engineers (IEEE), 2017, pp. 756-765, article id 7965119. Conference paper (Refereed)
    Abstract [en]

    With the large variety and complexity of existing HPC machines and uncertainty regarding exact future Exascale hardware, it is not clear whether existing parallel scientific codes will perform well on future Exascale systems: they may have to be largely modified or even completely rewritten from scratch. Therefore, it is important now to ensure that software is ready for Exascale computing and will utilize all Exascale resources well. Many parallel programming models try to take into account all possible hardware features and nuances. However, the HPC community does not yet have a precise answer as to whether, for Exascale computing, there should be a natural evolution of existing models interoperable with each other, or a disruptive approach. Here, we focus on the first option, particularly on a practical assessment of how some parallel programming models can coexist with each other. This work describes two API combination scenarios using the example of iPIC3D [26], an implicit Particle-in-Cell code for space weather applications written in C++ and MPI plus OpenMP. The first scenario is to enable multiple OpenMP threads to call MPI functions simultaneously, with no restrictions, using the MPI_THREAD_MULTIPLE thread safety level. The second scenario is to utilize the OpenMP tasking model on top of the first scenario. The paper reports a step-by-step methodology and experience with these API combinations in iPIC3D; provides scaling tests for these implementations with up to 2048 physical cores; discusses interoperability issues that occurred; and provides suggestions to programmers and scientists who may adopt these API combinations in their own codes.

  • 18.
    Alexandru, Iordan
    et al.
    Norwegian University of Science and Technology Trondheim.
    Podobas, Artur
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Kommunikation: Infrastruktur och tjänster, Programvaru- och datorsystem, SCS.
    Natvig, Lasse
    Norwegian University of Science and Technology Trondheim.
    Brorsson, Mats
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Kommunikation: Infrastruktur och tjänster, Programvaru- och datorsystem, SCS.
    Investigating the Potential of Energy-savings Using a Fine-grained Task Based Programming Model on Multi-cores, 2011. Conference paper (Refereed)
    Abstract [en]

    In this paper we study the relation between energy efficiency and parallel execution when implemented with a fine-grained, task-centric programming model. Using a simulation framework comprising an architectural simulator and a power and area estimation tool, we have investigated the potential energy savings when employing parallelism on multi-core systems. In our experiments with 2-8 core systems, we employed frequency and voltage scaling in order to keep the relative performance of the systems constant, and measured the energy efficiency using the energy-delay product. We also compared the energy consumption of the parallel execution against the serial one. Our results show that, through judicious choice of load balancing parameters, significant improvements of around 200% in energy consumption can be achieved.

  • 19.
    Al-Shishtawy, Ahmad
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Self-Management for Large-Scale Distributed Systems2012Doktoravhandling, med artikler (Annet vitenskapelig)
    Abstract [en]

    Autonomic computing aims at making computing systems self-managing by using autonomic managers in order to reduce obstacles caused by management complexity. This thesis presents results of research on self-management for large-scale distributed systems. This research was motivated by the increasing complexity of computing systems and their management.

    In the first part, we present our platform, called Niche, for programming self-managing component-based distributed applications. In our work on Niche, we have faced and addressed the following four challenges in achieving self-management in a dynamic environment characterized by volatile resources and high churn: resource discovery, robust and efficient sensing and actuation, management bottleneck, and scale. We present results of our research on addressing the above challenges. Niche implements the autonomic computing architecture, proposed by IBM, in a fully decentralized way. Niche supports a network-transparent view of the system architecture simplifying the design of distributed self-management. Niche provides a concise and expressive API for self-management. The implementation of the platform relies on the scalability and robustness of structured overlay networks. We proceed by presenting a methodology for designing the management part of a distributed self-managing application. We define design steps that include partitioning of management functions and orchestration of multiple autonomic managers.

    In the second part, we discuss robustness of management and data consistency, which are necessary in a distributed system. Dealing with the effect of churn on management increases the complexity of the management logic and thus makes its development time consuming and error prone. We propose the abstraction of Robust Management Elements, which are able to heal themselves under continuous churn. Our approach is based on replicating a management element using finite state machine replication with a reconfigurable replica set. Our algorithm automates the reconfiguration (migration) of the replica set in order to tolerate continuous churn. For data consistency, we propose a majority-based distributed key-value store, built on a peer-to-peer network, that supports multiple consistency levels. The store enables a tradeoff between high availability and data consistency. Using majorities avoids the potential drawbacks of master-based consistency control, namely a single point of failure and a potential performance bottleneck.
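The majority-based store mentioned above relies on quorum intersection: with N replicas, any two majorities overlap, so a read quorum always contains at least one replica that saw the latest write. A minimal single-writer sketch (class and field names invented for illustration; this is not the thesis's actual store):

```python
import random

class MajorityStore:
    """Toy timestamped register replicated over n nodes (single writer)."""

    def __init__(self, n=5):
        self.n = n
        self.quorum = n // 2 + 1                 # any majority
        self.replicas = [{"ts": 0, "value": None} for _ in range(n)]

    def write(self, value, ts):
        # Contact an arbitrary majority; each replica keeps the newest write.
        for r in random.sample(self.replicas, self.quorum):
            if ts > r["ts"]:
                r["ts"], r["value"] = ts, value

    def read(self):
        # Any majority intersects the last write's majority, so the
        # highest timestamp seen is the latest committed value.
        picked = random.sample(self.replicas, self.quorum)
        return max(picked, key=lambda r: r["ts"])["value"]

store = MajorityStore()
store.write("a", ts=1)
store.write("b", ts=2)
print(store.read())   # always "b": quorum intersection guarantees it
```

Weaker consistency levels can be sketched in the same frame by shrinking the read or write set below a majority, trading the intersection guarantee for availability.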

    In the third part, we investigate self-management for Cloud-based storage systems with the focus on elasticity control using elements of control theory and machine learning. We have conducted research on a number of different designs of an elasticity controller, including a State-Space feedback controller and a controller that combines feedback and feedforward control. We describe our experience in designing an elasticity controller for a Cloud-based key-value store using a state-space model that enables trading off performance for cost, and we outline the steps in designing such a controller. We continue by presenting the design and evaluation of ElastMan, an elasticity controller for Cloud-based elastic key-value stores that combines feedforward and feedback control.

  • 20.
    Al-Shishtawy, Ahmad
    et al.
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Kommunikation: Infrastruktur och tjänster, Programvaru- och datorsystem, SCS.
    Fayyaz, Muhammad Asif
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Kommunikation: Infrastruktur och tjänster, Programvaru- och datorsystem, SCS.
    Popov, Konstantin
    Swedish Institute of Computer Science (SICS), Kista, Sweden.
    Vlassov, Vladimir
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Kommunikation: Infrastruktur och tjänster, Programvaru- och datorsystem, SCS.
    Achieving Robust Self-Management for Large-Scale Distributed Applications2010Inngår i: Self-Adaptive and Self-Organizing Systems (SASO), 2010 4th IEEE International Conference on: SASO 2010, IEEE Computer Society, 2010, s. 31-40Konferansepaper (Fagfellevurdert)
    Abstract [en]

    Achieving self-management can be challenging, particularly in dynamic environments with resource churn (joins/leaves/failures). Dealing with the effect of churn on management increases the complexity of the management logic and thus makes its development time consuming and error prone. We propose the abstraction of robust management elements (RMEs), which are able to heal themselves under continuous churn. Using RMEs allows the developer to separate the issue of dealing with the effect of churn on management from the management logic. This facilitates the development of robust management by making the developer focus on managing the application while relying on the platform to provide the robustness of management. RMEs can be implemented as fault-tolerant long-living services. We present a generic approach and an associated algorithm to achieve fault-tolerant long-living services. Our approach is based on replicating a service using finite state machine replication with a reconfigurable replica set. Our algorithm automates the reconfiguration (migration) of the replica set in order to tolerate continuous churn. The algorithm uses P2P replica placement schemes to place replicas and uses the P2P overlay to monitor them. The replicated state machine is extended to analyze monitoring data in order to decide on when and where to migrate. We describe how to use our approach to achieve robust management elements. We present a simulation-based evaluation of our approach which shows its feasibility.
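The replicated-state-machine approach above can be illustrated with a toy replica set: every replica applies the same totally ordered command log, and a crashed member is replaced by a new replica bootstrapped via state transfer. The sketch below hides the hard parts the paper actually solves (agreeing on the command order, monitoring replicas over the P2P overlay, deciding when and where to migrate); all names are invented:

```python
class Replica:
    def __init__(self, state=0, log=None):
        self.state = state
        self.log = list(log or [])

    def apply(self, delta):          # deterministic state transition
        self.log.append(delta)
        self.state += delta

class ReplicaSet:
    """Finite-state-machine replication with naive migration on failure."""

    def __init__(self, n=3):
        self.replicas = [Replica() for _ in range(n)]

    def execute(self, delta):
        # A total order of commands is assumed (e.g. provided by consensus).
        for r in self.replicas:
            r.apply(delta)

    def migrate(self, failed):
        # Replace the failed replica with a fresh one, transferring state
        # and log from a surviving neighbour.
        donor = self.replicas[(failed + 1) % len(self.replicas)]
        self.replicas[failed] = Replica(donor.state, donor.log)

rs = ReplicaSet()
rs.execute(5)
rs.migrate(0)        # node 0 churned out and was replaced
rs.execute(2)
print({r.state for r in rs.replicas})   # all replicas agree: {7}
```

Because migration copies both state and log before new commands are applied, the replacement replica stays indistinguishable from the survivors, which is the property that lets management elements "heal" under churn.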

  • 21.
    Al-Shishtawy, Ahmad
    et al.
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Vlassov, Vladimir
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    ElastMan: Autonomic elasticity manager for cloud-based key-value stores2013Inngår i: HPDC 2013 - Proceedings of the 22nd ACM International Symposium on High-Performance Parallel and Distributed Computing, 2013, s. 115-116Konferansepaper (Fagfellevurdert)
    Abstract [en]

    The increasing spread of elastic Cloud services, together with the pay-as-you-go pricing model of Cloud computing, has led to the need for an elasticity controller. The controller automatically resizes an elastic service in response to changes in workload, in order to meet Service Level Objectives (SLOs) at a reduced cost. However, the variable performance of Cloud virtual machines and nonlinearities in Cloud services complicate the controller design. We present the design and evaluation of ElastMan, an elasticity controller for Cloud-based elastic key-value stores. ElastMan combines feedforward and feedback control. Feedforward control is used to respond to spikes in the workload by quickly resizing the service to meet SLOs at a minimal cost. Feedback control is used to correct modeling errors and to handle diurnal workload. We have implemented and evaluated ElastMan using the Voldemort key-value store running in a Cloud environment based on OpenStack. Our evaluation shows the feasibility and effectiveness of our approach to automation of Cloud service elasticity.
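The feedforward-plus-feedback split described in this and the neighbouring ElastMan entries can be sketched as two tiny functions. The linear capacity model and every numeric threshold below are invented for illustration; ElastMan learns a model of the service rather than assuming one:

```python
import math

def feedforward_size(request_rate, capacity_per_node=120.0):
    """Feedforward: size the store directly from the measured workload,
    reacting to spikes in one step (hypothetical linear capacity model)."""
    return max(1, math.ceil(request_rate / capacity_per_node))

def feedback_adjust(nodes, latency_ms, slo_ms=100.0):
    """Feedback: correct model errors by nudging the replica count
    whenever measured latency drifts away from the SLO."""
    if latency_ms > slo_ms:
        return nodes + 1
    if latency_ms < 0.5 * slo_ms and nodes > 1:
        return nodes - 1
    return nodes

nodes = feedforward_size(250)                    # steady load -> 3 nodes
nodes = feedforward_size(1000)                   # workload spike -> 9 nodes
nodes = feedback_adjust(nodes, latency_ms=140)   # SLO still violated -> 10
nodes = feedback_adjust(nodes, latency_ms=40)    # over-provisioned -> 9
print(nodes)
```

The design point is the division of labour: feedforward jumps straight to a plausible size when the workload shifts abruptly, while feedback slowly repairs whatever the capacity model got wrong.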

  • 22.
    Al-Shishtawy, Ahmad
    et al.
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Vlassov, Vladimir
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    ElastMan: Autonomic Elasticity Manager for Cloud-Based Key-Value Stores2012Rapport (Annet vitenskapelig)
    Abstract [en]

    The increasing spread of elastic Cloud services, together with the pay-as-you-go pricing model of Cloud computing, has led to the need for an elasticity controller. The controller automatically resizes an elastic service, in response to changes in workload, in order to meet Service Level Objectives (SLOs) at a reduced cost. However, the variable performance of Cloud virtual machines and nonlinearities in Cloud services, such as the diminishing reward of adding a service instance with increasing scale, complicate the controller design. We present the design and evaluation of ElastMan, an elasticity controller for Cloud-based elastic key-value stores. ElastMan combines feedforward and feedback control. Feedforward control is used to respond to spikes in the workload by quickly resizing the service to meet SLOs at a minimal cost. Feedback control is used to correct modeling errors and to handle diurnal workload. To address nonlinearities, our design of ElastMan leverages the near-linear scalability of elastic Cloud services in order to build a scale-independent model of the service. Combining feedforward and feedback control makes it possible to efficiently handle both diurnal and rapid changes in workload in order to meet SLOs at a minimal cost. Our evaluation shows the feasibility of our approach to automation of Cloud service elasticity.

  • 23.
    Al-Shishtawy, Ahmad
    et al.
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Vlassov, Vladimir
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    ElastMan: Elasticity manager for elastic key-value stores in the cloud2013Inngår i: Proceedings of the 2013 ACM Cloud and Autonomic Computing Conference, New York, NY, USA: Association for Computing Machinery (ACM), 2013, s. 7:1-7:10Konferansepaper (Fagfellevurdert)
    Abstract [en]

    The increasing spread of elastic Cloud services, together with the pay-as-you-go pricing model of Cloud computing, has led to the need for an elasticity controller. The controller automatically resizes an elastic service in response to changes in workload, in order to meet Service Level Objectives (SLOs) at a reduced cost. However, the variable performance of Cloud virtual machines and nonlinearities in Cloud services, such as the diminishing reward of adding a service instance with increasing scale, complicate the controller design. We present the design and evaluation of ElastMan, an elasticity controller for Cloud-based elastic key-value stores. ElastMan combines feedforward and feedback control. Feedforward control is used to respond to spikes in the workload by quickly resizing the service to meet SLOs at a minimal cost. Feedback control is used to correct modeling errors and to handle diurnal workload. To address nonlinearities, our design of ElastMan leverages the near-linear scalability of elastic Cloud services in order to build a scale-independent model of the service. We have implemented and evaluated ElastMan using the Voldemort key-value store running in an OpenStack Cloud environment. Our evaluation shows the feasibility and effectiveness of our approach to automation of Cloud service elasticity.

  • 24.
    Amighi, Afshin
    et al.
    University of Twente.
    de Carvalho Gomes, Pedro
    KTH, Skolan för datavetenskap och kommunikation (CSC), Teoretisk datalogi, TCS.
    Gurov, Dilian
    KTH, Skolan för datavetenskap och kommunikation (CSC), Teoretisk datalogi, TCS.
    Huisman, Marieke
    University of Twente.
    Provably Correct Control-Flow Graphs from Java Programs with Exceptions2012Rapport (Annet vitenskapelig)
    Abstract [en]

    We present an algorithm to extract flow graphs from Java bytecode, including exceptional control flows. We prove its correctness, meaning that the behavior of the extracted control-flow graph is a sound over-approximation of the behavior of the original program. Thus any safety property that holds for the extracted control-flow graph also holds for the original program. This makes control-flow graphs suitable for performing various static analyses, such as model checking. The extraction is performed in two phases. In the first phase the program is transformed into a BIR program, a stack-less intermediate representation of Java bytecode, from which the control-flow graph is extracted in the second phase. We use this intermediate format because it results in compact flow graphs, with provably correct exceptional control flow. To prove the correctness of the two-phase extraction, we also define an idealized extraction algorithm, whose correctness can be proven directly. Then we show that the behavior of the control-flow graph extracted via the intermediate representation is an over-approximation of the behavior of the directly extracted graphs, and thus of the original program. We implemented the indirect extraction as the CFGEx tool and performed several test cases to show the efficiency of the algorithm.
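The key property claimed above is that the extracted graph over-approximates program behaviour, so safety checks on the graph carry over to the program. A toy illustration with a hand-written graph for an invented method, including the exceptional edge a sound extraction must keep:

```python
# Toy control-flow graph: each node maps to its successors; "throw" models
# the exceptional edge that the extraction must include to stay sound.
cfg = {
    "entry":   ["call", "throw"],    # normal and exceptional successor
    "call":    ["exit"],
    "throw":   ["handler"],
    "handler": ["exit"],
    "exit":    [],
}

def reachable(graph, start):
    """All nodes reachable from `start` (iterative depth-first search)."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen

# Safety reading: if some "bad" node is NOT reachable in this
# over-approximation, it is unreachable in the original program as well.
print(sorted(reachable(cfg, "entry")))
```

Dropping the `"throw"` edge would make `"handler"` unreachable in the graph while it can still execute in the program, which is exactly the kind of unsoundness the two-phase extraction is proved to avoid.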

  • 25.
    Amighi, Afshin
    et al.
    University of Twente.
    de Carvalho Gomes, Pedro
    KTH, Skolan för datavetenskap och kommunikation (CSC), Teoretisk datalogi, TCS.
    Gurov, Dilian
    KTH, Skolan för datavetenskap och kommunikation (CSC), Teoretisk datalogi, TCS.
    Huisman, Marieke
    University of Twente.
    Sound Control-Flow Graph Extraction for Java Programs with Exceptions2012Inngår i: Software Engineering and Formal Methods: 10th International Conference, SEFM 2012, Thessaloniki, Greece, October 1-5, 2012. Proceedings, Springer Berlin/Heidelberg, 2012, s. 33-47Konferansepaper (Fagfellevurdert)
    Abstract [en]

    We present an algorithm to extract control-flow graphs from Java bytecode, considering exceptional flows. We then establish its correctness: the behavior of the extracted graphs is shown to be a sound over-approximation of the behavior of the original programs. Thus, any temporal safety property that holds for the extracted control-flow graph also holds for the original program. This makes the extracted graphs suitable for performing various static analyses, in particular model checking. The extraction proceeds in two phases. First, we translate Java bytecode into BIR, a stack-less intermediate representation. The BIR transformation is developed as a module of Sawja, a novel static analysis framework for Java bytecode. Besides Sawja’s efficiency, the resulting intermediate representation is more compact than the original bytecode and provides an explicit representation of exceptions. These features make BIR a natural starting point for sound control-flow graph extraction. Next, we formally define the transformation from BIR to control-flow graphs, which (among other features) considers the propagation of uncaught exceptions within method calls. We prove the correctness of the two-phase extraction by suitably combining the properties of the two transformations with those of an idealized control-flow graph extraction algorithm, whose correctness has been proved directly. The control-flow graph extraction algorithm is implemented in the ConFlEx tool. A number of test-cases show the efficiency and the utility of the implementation.

  • 26.
    Amor, Christian
    et al.
    Univ Politecn Madrid, Sch Aerosp Engn, E-28040 Madrid, Spain..
    Perez, Jose M.
    Univ Politecn Madrid, Sch Aerosp Engn, E-28040 Madrid, Spain..
    Schlatter, Philipp
    KTH, Skolan för teknikvetenskap (SCI), Mekanik. KTH, Skolan för teknikvetenskap (SCI), Centra, Linné Flow Center, FLOW.
    Vinuesa, Ricardo
    KTH, Skolan för teknikvetenskap (SCI), Mekanik. KTH, Skolan för teknikvetenskap (SCI), Centra, Linné Flow Center, FLOW.
    Le Clainche, Soledad
    Univ Politecn Madrid, Sch Aerosp Engn, E-28040 Madrid, Spain..
    Soft Computing Techniques to Analyze the Turbulent Wake of a Wall-Mounted Square Cylinder2020Inngår i: 14th International Conference on Soft Computing Models in Industrial and Environmental Applications, SOCO 2019 / [ed] Alvarez, FM Lora, AT Munoz, JAS Quintian, H Corchado, E, Springer, 2020, Vol. 950, s. 577-586Konferansepaper (Fagfellevurdert)
    Abstract [en]

    This paper introduces several methods, generally used in fluid dynamics, to provide low-rank approximations. The algorithms behind these methods are mainly based on singular value decomposition (SVD) and dynamic mode decomposition (DMD) techniques, and are suitable for analyzing turbulent flows. The application of these methods is illustrated in the analysis of the turbulent wake of a wall-mounted cylinder, a geometry modeling a skyscraper. A brief discussion of the large- and small-scale structures of the flow provides the key ideas for representing the general dynamics of the flow using low-rank approximations. Once the flow physics is understood, it is possible to adapt these techniques, or other strategies, to solve general complex problems at reduced computational cost. The main goal is to introduce these methods as machine learning strategies that could potentially be used in the field of fluid dynamics and extended to any other research field.
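The SVD-based low-rank approximation underlying both techniques named above can be demonstrated on synthetic snapshot data. The "flow field" here is fabricated (two coherent modes plus noise standing in for small-scale turbulence); the paper applies this to actual wall-mounted-cylinder wake data:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 100)

# Synthetic snapshot matrix: two coherent large-scale modes plus
# small-amplitude noise mimicking small-scale structures.
snapshots = (np.outer(np.sin(t), np.cos(t))
             + 0.5 * np.outer(np.sin(2 * t), np.sin(3 * t))
             + 0.01 * rng.standard_normal((100, 100)))

U, s, Vt = np.linalg.svd(snapshots, full_matrices=False)

rank = 2                                   # keep only the dominant modes
low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]

rel_err = np.linalg.norm(snapshots - low_rank) / np.linalg.norm(snapshots)
print(rel_err)    # small: two modes capture the large-scale dynamics
```

The same truncated factorization is the starting point for DMD, which additionally fits a linear operator advancing one snapshot to the next in the reduced basis.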

  • 27.
    Andersson, Birger
    et al.
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Data- och systemvetenskap, DSV.
    Bergholtz, Maria
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Data- och systemvetenskap, DSV.
    Edirisuriya, A.
    Ilayperuma, T.
    Jayaweera, P.
    Johannesson, Paul
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Data- och systemvetenskap, DSV.
    Zdravkovic, Jelena
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Data- och systemvetenskap, DSV.
    Enterprise sustainability through the alignment of goal models and business models2008Inngår i: CEUR Workshop Proc., 2008, s. 73-87Konferansepaper (Fagfellevurdert)
    Abstract [en]

    Business modelling can be used as a starting point for business analysis. The core of a business model is information about resources, events, agents, and their relations. The motivation of a business model can be found in the goals of an enterprise and those are made explicit in a goal model. This paper discusses the alignment of business models with goal models and proposes a method for constructing business models based on goal models. The method assists in the design of business models that conform to the explicit goals of an enterprise. Main benefits are clear and uniform goal formulations, well founded business model designs, and increased traceability between models.

  • 28.
    Andersson, Dan
    KTH, Skolan för teknik och hälsa (STH), Data- och elektroteknik.
    Implementation av prototyp för inomhuspositionering2013Independent thesis Basic level (university diploma), 10 poäng / 15 hpOppgave
    Abstract [en]

    Technological development constantly creates new opportunities, but it also entails major changes for companies and organizations. Mobile phones, tablets, laptops, mobile communication, and cloud technology today make it possible to work without being tied to a particular time, place, or device. This change means that a new type of flexible, space-efficient office with no fixed workstations is becoming increasingly common. The problem with these so-called flex offices is that knowing where or when a colleague is present in the office is no longer obvious, especially in a large office with several floors.

    The goal of this work is to design and implement an indoor positioning service, a so-called Location-Based Service, for the company Connecta AB. The service shall enable users to share their current workstation in an office environment using their mobile phones.

    The result of the work is a Location-Based Service that enables a user, using an Android phone supporting the short-range communication technology Near Field Communication, to share his or her current workstation. The cloud-based server solution Windows Azure is used to store registered workstations.

  • 29. Antaris, Stefanos
    Link injection for boosting information spread in social networks2014Inngår i: Social Network Analysis and Mining, ISSN 1869-5450, E-ISSN 1869-5469, Vol. 4, nr 1, artikkel-id 236Artikkel i tidsskrift (Fagfellevurdert)
  • 30.
    Antonova, Rika
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Centrum för autonoma system, CAS. KTH, Skolan för elektroteknik och datavetenskap (EECS), Robotik, perception och lärande, RPL.
    Kokic, Mia
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Centrum för autonoma system, CAS. KTH, Skolan för elektroteknik och datavetenskap (EECS), Robotik, perception och lärande, RPL.
    Stork, Johannes A.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Centrum för autonoma system, CAS. KTH, Skolan för elektroteknik och datavetenskap (EECS), Robotik, perception och lärande, RPL.
    Kragic, Danica
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, Centrum för autonoma system, CAS. KTH, Skolan för elektroteknik och datavetenskap (EECS), Robotik, perception och lärande, RPL.
    Global Search with Bernoulli Alternation Kernel for Task-oriented Grasping Informed by Simulation2018Inngår i: Proceedings of The 2nd Conference on Robot Learning, PMLR 87, 2018, s. 641-650Konferansepaper (Fagfellevurdert)
    Abstract [en]

    We develop an approach that benefits from large simulated datasets and takes full advantage of the limited online data that is most relevant. We propose a variant of Bayesian optimization that alternates between using informed and uninformed kernels. With this Bernoulli Alternation Kernel we ensure that discrepancies between simulation and reality do not hinder adapting robot control policies online. The proposed approach is applied to a challenging real-world problem of task-oriented grasping with novel objects. Our further contribution is a neural network architecture and training pipeline that use experience from grasping objects in simulation to learn grasp stability scores. We learn task scores from a labeled dataset with a convolutional network, which is used to construct an informed kernel for our variant of Bayesian optimization. Experiments on an ABB Yumi robot with real sensor data demonstrate success of our approach, despite the challenge of fulfilling task requirements and high uncertainty over physical properties of objects.
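The Bernoulli alternation idea above, flipping a coin each iteration to decide whether the next query comes from the simulation-informed model or from an uninformed explorer, can be sketched in a stripped-down search loop. Everything here is a simplification: a 1-D objective, a fixed informed guess standing in for the learned grasp-score kernel, and random search in place of a full Gaussian-process Bayesian optimizer:

```python
import random

def bernoulli_alternation_search(objective, informed_guess,
                                 iters=60, p_informed=0.5, seed=7):
    """Global search alternating between an 'informed' proposal (from
    simulation) and uninformed uniform exploration over [0, 1]."""
    random.seed(seed)
    best_x, best_y = None, float("-inf")
    for _ in range(iters):
        if random.random() < p_informed:
            x = informed_guess              # exploit the simulation prior
        else:
            x = random.uniform(0.0, 1.0)    # explore: simulation may be wrong
        y = objective(x)
        if y > best_y:
            best_x, best_y = x, y
    return best_x, best_y

# Reality disagrees with simulation: the true optimum sits at 0.8,
# while the simulation-informed guess points at 0.3.
true_objective = lambda x: -(x - 0.8) ** 2
best_x, best_y = bernoulli_alternation_search(true_objective, informed_guess=0.3)
print(best_x, best_y)
```

The point of the alternation is robustness: when the sim-to-real discrepancy is large, the uninformed arm still drives the search toward the true optimum instead of being anchored to the simulation's belief.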

  • 31. Anzanpour, A.
    et al.
    Rahmani, Amir-Mohammad
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Industriell och Medicinsk Elektronik. University of Turku, Finland.
    Liljeberg, P.
    Tenhunen, Hannu
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Industriell och Medicinsk Elektronik. University of Turku, Finland.
    Context-aware early warning system for in-home healthcare using internet-of-things2016Inngår i: 2nd International Summit on Internet of Things, IoT 360° 2015, Springer, 2016, s. 517-522Konferansepaper (Fagfellevurdert)
    Abstract [en]

    Early warning score (EWS) is a prediction method to notify caregivers at a hospital about the deterioration of a patient. Deterioration can be identified by detecting abnormalities in a patient's vital signs several hours before the patient's condition becomes life-threatening. In existing EWS systems, monitoring of the patient's vital signs and determining the score are mostly performed with pen and paper. Furthermore, this is currently done solely in a hospital environment. In this paper, we propose to bring this system into patients' homes, providing an automated platform which not only monitors patients' vital signs but also observes their activities and the surrounding environment. Thanks to Internet-of-Things technology, we present an intelligent early warning method to remotely monitor in-home patients and generate alerts in case of different medical emergencies or radical changes in the patient's condition. We also demonstrate an early warning score analysis system which continuously senses, transfers, and records vital signs, activity-related data, and environmental parameters.
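An early-warning score of the kind discussed above is, at its core, a sum of per-vital sub-scores looked up in threshold bands. The bands below are simplified, NEWS-style values invented for illustration; they are not taken from the paper:

```python
def band_score(value, bands):
    """Return the points of the first band whose upper bound covers value;
    values above every band score the maximum of 3."""
    for upper, points in bands:
        if value <= upper:
            return points
    return 3

# (upper_bound, points) pairs, in ascending order; hypothetical thresholds.
HEART_RATE  = [(40, 3), (50, 1), (90, 0), (110, 1), (130, 2)]
RESP_RATE   = [(8, 3), (11, 1), (20, 0), (24, 2)]
TEMPERATURE = [(35.0, 3), (36.0, 1), (38.0, 0), (39.0, 1)]

def early_warning_score(heart_rate, resp_rate, temp_c, alert_at=5):
    total = (band_score(heart_rate, HEART_RATE)
             + band_score(resp_rate, RESP_RATE)
             + band_score(temp_c, TEMPERATURE))
    return total, ("alert caregiver" if total >= alert_at else "ok")

print(early_warning_score(75, 16, 36.8))    # normal vitals
print(early_warning_score(135, 26, 34.5))   # deteriorating patient
```

The context-awareness argued for in the paper would enter as additional inputs (activity, ambient conditions) that adjust the bands or the alert threshold, e.g. tolerating an elevated heart rate while the patient is exercising.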

  • 32.
    Apelkrans, Mats
    et al.
    Dept of Informatics, Jönköping International Business School.
    Håkansson, Anne
    Uppsala University, Sweden.
    Information Coordination Using Meta-agents in Information Logistics Processes2008Inngår i: Proceedings of Knowledge-Based and Intelligent Information & Engineering Systems: KES2008 / [ed] Ignac Lovrek, Robert J. Howlett, Lakhmi C. Jain, Berlin Heidelberg: Springer Berlin/Heidelberg, 2008, s. 788-798Konferansepaper (Fagfellevurdert)
    Abstract [en]

    In order to coordinate and deliver information at the right time and to the right place, theories from multi-agent systems and information logistics are combined. We use agents to support supply chains by searching for company-specific information. Hence, a vast number of agents work on the Internet simultaneously, which requires supervising agents. In this paper, we suggest using meta-agents to control the behaviour of a number of intelligent agents, where the meta-agents coordinate the communication that takes place in a supply chain system. As an example, we look at a manufacturing company receiving orders from customers for items that need to be produced. The handling of this distributed information flow can be thought of as an Information Logistics Process, and the similarities between the functioning of such processes and the behaviour of intelligent agents are illuminated.

  • 33.
    Arad, Cosmin
    et al.
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Dowling, Jim
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Haridi, Seif
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Message-Passing Concurrency for Scalable, Stateful, Reconfigurable Middleware2012Inngår i: Middleware 2012: ACM/IFIP/USENIX 13th International Middleware Conference, Montreal, QC, Canada, December 3-7, 2012. Proceedings / [ed] Priya Narasimhan and Peter Triantafillou, Springer Berlin/Heidelberg, 2012, s. 208-228Konferansepaper (Fagfellevurdert)
    Abstract [en]

    Message-passing concurrency (MPC) is increasingly being used to build systems software that scales well on multi-core hardware. Functional programming implementations of MPC, such as Erlang, have also leveraged their stateless nature to build middleware that is not just scalable, but also dynamically reconfigurable. However, many middleware platforms lend themselves more naturally to a stateful programming model, supporting session and application state. A limitation of existing programming models and frameworks that support dynamic reconfiguration for stateful middleware, such as component frameworks, is that they are not designed for MPC.

    In this paper, we present Kompics, a component model and programming framework, that supports the construction and composition of dynamically reconfigurable middleware using stateful, concurrent, message-passing components. An added benefit of our approach is that by decoupling our component execution model, we can run the same code in both simulation and production environments. We present the architectural patterns and abstractions that Kompics facilitates and we evaluate them using a case study of a non-trivial key-value store that we built using Kompics. We show how our model enables the systematic development and testing of scalable, dynamically reconfigurable middleware.
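The message-passing component style described above can be miniaturized: each component owns private state and a mailbox, and interacts only through events. This Python sketch is far from the real Kompics API (which is a Java framework with ports, channels, and dynamic reconfiguration), but it shows the decoupling of components from their execution that makes it possible to run the same code under a simulator or a multi-core scheduler:

```python
from collections import deque

class Component:
    """A component reacts to events from its mailbox; state stays private."""

    def __init__(self):
        self.mailbox = deque()
        self.handlers = {}

    def subscribe(self, event_type, handler):
        self.handlers[event_type] = handler

    def trigger(self, event):
        self.mailbox.append(event)

    def step(self):
        # One scheduling step; a real runtime multiplexes components
        # over worker threads (or a deterministic simulation loop).
        if self.mailbox:
            event = self.mailbox.popleft()
            self.handlers[type(event)](event)

class Ping: pass
class Pong: pass

pinger, ponger = Component(), Component()
state = {"pongs": 0}
ponger.subscribe(Ping, lambda e: pinger.trigger(Pong()))
pinger.subscribe(Pong, lambda e: state.update(pongs=state["pongs"] + 1))

ponger.trigger(Ping())
ponger.step()      # ponger handles Ping, replies with Pong
pinger.step()      # pinger handles Pong
print(state["pongs"])   # 1
```

Because components touch only their own state and communicate by events, the scheduler that calls `step()` can be swapped, a thread pool in production, a deterministic single-threaded loop in simulation, without changing component code.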

  • 34.
    Arad, Cosmin Ionel
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Programming Model and Protocols for Reconfigurable Distributed Systems2013Doktoravhandling, monografi (Annet vitenskapelig)
    Abstract [en]

    Distributed systems are everywhere. From large datacenters to mobile devices, an ever richer assortment of applications and services relies on distributed systems, infrastructure, and protocols. Despite their ubiquity, testing and debugging distributed systems remains notoriously hard. Moreover, aside from inherent design challenges posed by partial failure, concurrency, or asynchrony, there remain significant challenges in the implementation of distributed systems. These programming challenges stem from the increasing complexity of the concurrent activities and reactive behaviors in a distributed system on the one hand, and the need to effectively leverage the parallelism offered by modern multi-core hardware, on the other hand.

    This thesis contributes Kompics, a programming model designed to alleviate some of these challenges. Kompics is a component model and programming framework for building distributed systems by composing message-passing concurrent components. Systems built with Kompics leverage multi-core machines out of the box, and they can be dynamically reconfigured to support hot software upgrades. A simulation framework enables deterministic execution replay for debugging, testing, and reproducible behavior evaluation for large-scale Kompics distributed systems. The same system code is used for both simulation and production deployment, greatly simplifying the system development, testing, and debugging cycle.

    We highlight the architectural patterns and abstractions facilitated by Kompics through a case study of a non-trivial distributed key-value storage system. CATS is a scalable, fault-tolerant, elastic, and self-managing key-value store which trades off service availability for guarantees of atomic data consistency and tolerance to network partitions. We present the composition architecture for the numerous protocols employed by the CATS system, as well as our methodology for testing the correctness of key CATS algorithms using the Kompics simulation framework.

    Results from a comprehensive performance evaluation attest that CATS achieves its claimed properties and delivers a level of performance competitive with similar systems which provide only weaker consistency guarantees. More importantly, this testifies that Kompics admits efficient system implementations. Its use as a teaching framework as well as its use for rapid prototyping, development, and evaluation of a myriad of scalable distributed systems, both within and outside our research group, confirm the practicality of Kompics.

  • 35.
    Ardah, Khaled
    et al.
    Univ Fed Ceara, Wireless Telecom Res Grp, BR-60020181 Fortaleza, Ceara, Brazil..
    Fodor, Gabor
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Reglerteknik. Ericsson Res, SE-16480 Stockholm, Sweden.
    Silva, Yuri C. B.
    Univ Fed Ceara, Wireless Telecom Res Grp, BR-60020181 Fortaleza, Ceara, Brazil..
    Freitas, Walter C., Jr.
    Univ Fed Ceara, Wireless Telecom Res Grp, BR-60020181 Fortaleza, Ceara, Brazil..
    Cavalcanti, Francisco R. P.
    Univ Fed Ceara, Wireless Telecom Res Grp, BR-60020181 Fortaleza, Ceara, Brazil..
    A Unifying Design of Hybrid Beamforming Architectures Employing Phase Shifters or Switches2018Inngår i: IEEE Transactions on Vehicular Technology, ISSN 0018-9545, E-ISSN 1939-9359, Vol. 67, nr 11, s. 11243-11247Artikkel i tidsskrift (Fagfellevurdert)
    Abstract [en]

    Hybrid beamforming (BF) architectures employing phase shifters or switches reduce the number of required radio frequency chains and the power consumption of base stations that employ a large number of antennas. Due to the inherent tradeoff between the number of radio frequency chains, the complexity of the employed analog and digital BF algorithms, and the achieved spectral and energy efficiency, designing hybrid BF architectures is a complex task. To deal with this complexity, we propose a unifying design that is applicable to architectures employing either phase shifters or switches. In our design, the analog part of the hybrid BF architecture maximizes the capacity of the equivalent channel, while the digital part is updated using the well-known block diagonalization approach. We then employ the proposed joint analog-digital beamforming algorithm on four recently proposed hybrid architectures and compare their performance in terms of spectral and energy efficiency, finding that the proposed analog-digital BF algorithm outperforms previously proposed schemes. We also find that phase shifter-based architectures achieve high spectral efficiency, whereas switching-based architectures can boost energy efficiency with an increasing number of base station antennas.
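The analog stage described above picks a beamformer under a hardware constraint: unit-modulus entries for phase shifters, 0/1 entries for switches. A single-stream numpy sketch of that contrast, using a common phase-extraction heuristic (keep the phases of the dominant singular vector) rather than the paper's exact algorithm, and an invented random channel:

```python
import numpy as np

rng = np.random.default_rng(3)
n_tx, n_rx = 32, 4

# Random Rayleigh channel from a 32-antenna transmitter to 4 receive antennas.
H = (rng.standard_normal((n_rx, n_tx))
     + 1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2)

# Phase-shifter beamformer: keep only the phases of the dominant right
# singular vector of H (entries constrained to constant modulus).
v1 = np.linalg.svd(H)[2][0].conj()
f_phase = np.exp(1j * np.angle(v1)) / np.sqrt(n_tx)

# Switch-based beamformer: select the single strongest antenna (0/1 entries).
f_switch = np.zeros(n_tx, dtype=complex)
f_switch[np.argmax(np.linalg.norm(H, axis=0))] = 1.0

gain_phase = np.linalg.norm(H @ f_phase) ** 2
gain_switch = np.linalg.norm(H @ f_switch) ** 2
print(gain_phase > gain_switch)   # phase shifters recover more array gain
```

Both vectors have unit power, so the gap in equivalent-channel gain reflects the hardware constraint alone: phase shifters coherently combine all antennas, while a single switch forfeits array gain but costs far less power, mirroring the spectral-versus-energy-efficiency tradeoff reported in the paper.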

  • 36.
    Ardelius, John
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    On the Performance Analysis of Large Scale, Dynamic, Distributed and Parallel Systems.2013Doktoravhandling, monografi (Annet vitenskapelig)
    Abstract [en]

    Evaluating the performance of large distributed applications is an important and non-trivial task. With the onset of Internet-wide applications there is an increasing need to quantify reliability, dependability and performance of these systems, both as a guide in system design and as a means to understand the fundamental properties of large-scale distributed systems. Previous research has mainly focused either on formalised models, where system properties can be deduced and verified using rigorous mathematics, or on measurements and experiments on deployed applications. Our aim in this thesis is to study models on an abstraction level lying between the two ends of this spectrum. We adopt a model of distributed systems inspired by methods used in the study of large-scale systems of particles in physics and model the application nodes as a set of interacting particles, each with an internal state, whose actions are specified by the application program. We apply our modeling and performance evaluation methodology to four different distributed and parallel systems.

    The first system is the distributed hash table (DHT) Chord running in a dynamic environment. We study the system under two scenarios. First we study how performance (in terms of lookup latency) is affected on a network with finite communication latency. We show that an average delay in conjunction with other parameters describing changes in the network (such as timescales for network repair and join and leave processes) induces fundamentally different system performance. We also verify our analytical predictions via simulations. In the second scenario we introduce network address translators (NATs) to the network model. This makes the overlay topology non-transitive, and we explore the implications of this fact for various performance metrics such as lookup latency, consistency and load balance. The latter analysis is mainly simulation based. Even though these two studies focus on a specific DHT, many of our results can easily be translated to other similar ring-based DHTs with long-range links, and the same methodology can be applied even to DHTs based on other geometries.

    The second type of system studied is an unstructured gossip protocol running a distributed version of the famous Bellman-Ford algorithm. The algorithm, called GAP, generates a spanning tree over the participating nodes, and the question we set out to study is how reliable this structure is (in terms of generating accurate aggregate values at the root) in the presence of node churn. All our analytical results are also verified using simulations.

    The third system studied is a content distribution network (CDN) of interconnected caches in an aggregation access network. In this model, content, which sits at the leaves of the cache hierarchy tree, is requested by end users. Requests can then either be served by the first cache level or sent further up the tree. We study the performance of the whole system under two cache eviction policies, namely LRU and LFU. We compare our analytical results with traces from related caching systems.

    The last system is a work stealing heuristic for task distribution in the TileraPro64 chip. This system has access to a shared memory and is therefore classified as a parallel system. We create a model for the dynamic generation of tasks as well as how they are executed and distributed among the participating nodes. We study how the heuristic scales when the number of nodes exceeds the number of processors on the chip, as well as how different work stealing policies compare with each other. The work on this model is mainly simulation based.
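    As an illustration of the two cache eviction policies compared in the CDN part of the thesis, the following minimal sketch contrasts LRU and LFU on a shared request trace. The class and method names are hypothetical; the thesis itself uses analytical models and traces, not this toy:

    ```python
    from collections import Counter, OrderedDict

    class LRUCache:
        """Evicts the least-recently-used key when the cache is full."""
        def __init__(self, capacity):
            self.capacity, self.store = capacity, OrderedDict()
        def request(self, key):
            hit = key in self.store
            if hit:
                self.store.move_to_end(key)          # refresh recency
            else:
                if len(self.store) >= self.capacity:
                    self.store.popitem(last=False)   # evict oldest entry
                self.store[key] = True
            return hit

    class LFUCache:
        """Evicts the least-frequently-used key when the cache is full."""
        def __init__(self, capacity):
            self.capacity, self.store, self.freq = capacity, set(), Counter()
        def request(self, key):
            self.freq[key] += 1
            hit = key in self.store
            if not hit:
                if len(self.store) >= self.capacity:
                    victim = min(self.store, key=lambda k: self.freq[k])
                    self.store.discard(victim)
                self.store.add(key)
            return hit
    ```

    On a trace with one popular item and several one-off requests, LFU protects the popular item while LRU may evict it — the kind of behavioural difference the thesis quantifies analytically.
    
    
    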

  • 37.
    Arman, Ala
    et al.
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Al-Shishtawy, Ahmad
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Vlassov, Vladimir
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Elasticity controller for Cloud-based key-value stores2012Inngår i: Parallel and Distributed Systems (ICPADS), 2012 IEEE 18th International Conference on, IEEE , 2012, s. 268-275Konferansepaper (Fagfellevurdert)
    Abstract [en]

    Clouds provide an illusion of an infinite amount of resources and enable elastic services and applications that are capable of scaling up and down (growing and shrinking by requesting and releasing resources) in response to changes in their environment, workload, and Quality of Service (QoS) requirements. Elasticity makes it possible to achieve the required QoS at a minimal cost in a Cloud environment with its pay-as-you-go pricing model. In this paper, we present our experience in designing a feedback elasticity controller for a key-value store. The goal of our research is to investigate the feasibility of the control-theoretic approach to the automation of elasticity of Cloud-based key-value stores. We describe the design steps necessary to build a feedback controller for a real system, namely Voldemort, which we use as a case study in this work. The design steps include defining touchpoints (sensors and actuators), system identification, and controller design. We have designed, developed, and implemented a prototype of the feedback elasticity controller for Voldemort. Our initial evaluation results show the feasibility of using feedback control to automate elasticity of distributed key-value stores.
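    A feedback elasticity controller of the kind described can be sketched, in much-simplified form, as a proportional controller that resizes the store based on the relative deviation of measured latency from a target; in the paper the controller structure and gain come from system identification. The function name, scalar-gain form, and parameters below are illustrative assumptions, not the paper's actual controller:

    ```python
    def elasticity_controller(current_nodes, measured_latency, target_latency,
                              gain=0.5, min_nodes=1, max_nodes=100):
        """One control step: map the latency error (sensor reading) to a
        new cluster size (actuator command), clamped to the allowed range."""
        error = (measured_latency - target_latency) / target_latency
        desired = current_nodes * (1.0 + gain * error)
        return max(min_nodes, min(max_nodes, int(round(desired))))
    ```

    When the measured latency exceeds the target the controller requests more nodes; when the system is over-provisioned it releases them, which is what keeps cost minimal under a pay-as-you-go model.
    
    
    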

  • 38. Armengaud, Eric
    et al.
    Biehl, Matthias
    KTH, Skolan för industriell teknik och management (ITM), Maskinkonstruktion (Inst.), Mekatronik.
    Bourrouilh, Quentin
    Breunig, Michael
    Farfeleder, Stefan
    Hein, Christian
    Oertel, Markus
    Wallner, Alfred
    Zoier, Markus
    Integrated tool chain for improving traceability during the development of automotive systems2012Inngår i: ERTS2 2012 | Embedded Real Time Software and Systems, 2012Konferansepaper (Fagfellevurdert)
    Abstract [en]

    Tool integration is a key factor for improving development efficiency and product quality during the development of safety-relevant embedded systems. We present in this work a demonstrator based on the most recent outcomes of the CESAR project. The proposed integrated tool-chain aims at better linking development activities together, thus improving traceability during requirements engineering, system design, safety analysis and V&V activities using a model-based development approach. We analyze the proposed tool-chain from three different points of view: (1) tool integrator, (2) technology provider, and (3) end-user. These different points of view enable the description of the different technologies used at the different levels and the analysis of the benefits for the end-user.

  • 39.
    Artho, Cyrille
    et al.
    KTH.
    Ölveczky, P.C.
    Preface2017Inngår i: 5th International Workshop on Formal Techniques for Safety-Critical Systems, FTSCS 2016, Springer Verlag , 2017Konferansepaper (Fagfellevurdert)
  • 40. Ashjaei, Mohammad
    et al.
    Moghaddami Khalilzad, Nima
    KTH.
    Mubeen, Saad
    Behnam, Moris
    Sander, Ingo
    Almeida, Luis
    Nolte, Thomas
    Designing end-to-end resource reservations in predictable distributed embedded systems2017Inngår i: Real-time systems, ISSN 0922-6443, E-ISSN 1573-1383, Vol. 53, nr 6, s. 916-956Artikkel i tidsskrift (Fagfellevurdert)
  • 41.
    Aslam Butt, Haseeb
    KTH, Skolan för elektroteknik och datavetenskap (EECS).
    Investigation into tools to increase Observability of 2oo2 OS based Generic Product2018Independent thesis Advanced level (degree of Master (Two Years)), 20 poäng / 30 hpOppgave
    Abstract [en]

    The 2 out of 2 (2oo2) OS based generic product is a generic platform used by Bombardier Transportation to develop safety-critical, SIL 3 and SIL 4 level specialized railway products. The 2oo2 architecture is based on the composite fail-safety design technique. During the development and integration of a specialized product, debugging and optimization efforts are critical to bringing the new product to market on time. In the presence of tools that can increase the observability of the system, the process of debugging and optimization can be made more efficient. This thesis examines the availability of tools to enhance the observability of the 2 out of 2 OS based generic product. Tracing and profiling were identified as the techniques that would best fit our context for observability enhancement. Tools based on the identified techniques were investigated in depth to see the possibility of building, customizing and porting them to the architecture of our 2oo2 system. Development efforts were made to successfully build the complete chain of tools for use in system lab settings. The complete observability infrastructure architecture was designed to extract the tracing data from the target machine to the analysis tools. Procedures were defined for extracting the tracing data and using it to debug and optimize the system effectively. Moreover, we investigate the impact of operating system upgrades on increasing the observability of the 2oo2 system.

  • 42.
    Asplund, Fredrik
    KTH, Skolan för industriell teknik och management (ITM), Maskinkonstruktion (Inst.), Mekatronik. KTH, Skolan för industriell teknik och management (ITM), Maskinkonstruktion (Inst.), Inbyggda styrsystem.
    Exploratory Testing: Do Contextual Factors Influence Software Fault Identification?2018Inngår i: Information and Software Technology, ISSN 0950-5849, E-ISSN 1873-6025Artikkel i tidsskrift (Fagfellevurdert)
    Abstract [en]

    Context: Exploratory Testing (ET) is a manual approach to software testing in which learning, test design and test execution occur simultaneously. Although ET is a developing topic of interest to academia, it is as yet insufficiently investigated, and most studies focus on the skills and experience of the individual tester. However, contextual factors such as project processes, test scope and organisational boundaries are also likely to affect the approach.

    Objective: This study explores contextual differences between teams of testers at a MedTech firm developing safety-critical products, to ascertain whether contextual factors can influence the outcomes of ET and what associated implications can be drawn for test management.

    Method: A development project was studied in two iterations, each consisting of a quantitative phase testing hypotheses concerning when ET would identify faults in comparison to other testing approaches and a qualitative phase involving interviews.

    Results: Influence on ET is traced to how the scope of tests focuses learning on different types of knowledge and implies an asymmetry in the strength and number of information flows to test teams.

    Conclusions: While test specialisation can be attractive to software development organisations, results suggest changes to processes and organisational structures might be required to maintain test efficiency throughout projects: the responsibility for test cases might need to be rotated late in projects, and asymmetries in information flows might require management to actively strengthen the presence and connections of test teams throughout the firm. However, further research is needed to investigate whether these results also hold for non-safety-critical faults.

  • 43.
    Asplund, Fredrik
    KTH, Skolan för industriell teknik och management (ITM), Maskinkonstruktion (Inst.), Mekatronik.
    Risks Related to the Use of Software Tools when Developing Cyber-Physical Systems: A Critical Perspective on the Future of Developing Complex, Safety-Critical Systems2014Doktoravhandling, monografi (Annet vitenskapelig)
    Abstract [en]

    The increasing complexity and size of modern Cyber-Physical Systems (CPS) has led to a sharp decline in productivity among CPS designers. Requirements on safety aggravate this problem further, both by being difficult to ensure and due to their high importance to the public.

    Tools, or rather efforts to facilitate the automation of development processes, are a central ingredient in many of the proposed innovations to mitigate this problem. Even though the safety-related implications of introducing automation in development processes have not been extensively studied, it is known that automation has already had a large impact on operational systems. If tools are to play a part in mitigating the increase in safety-critical CPS complexity, then their actual impact on CPS development, and thereby the safety of the corresponding end products, must be sufficiently understood.

    A survey of relevant research fields, such as system safety, software engineering and tool integration, is provided to facilitate the discussion on safety-related implications of tool usage. Based on the identification of industrial safety standards as an important source of information and considering that the risks posed by separate tools have been given considerable attention in the transportation domain, several high-profile safety standards in this domain have been surveyed. According to the surveyed standards, automation should primarily be evaluated on its reliable execution of separate process steps independent of human operators. Automation that only supports the actions of operators during CPS development is viewed as relatively inconsequential.

    A conceptual model and a reference model have been created based on the surveyed research fields. The former defines the entities and relationships most relevant to safety-related risks associated with tool usage. The latter describes aspects of tool integration and how these relate to each other. By combining these models, a risk analysis could be performed, and properties of tool chains that need to be ensured to mitigate risk could be identified. Ten such safety-related characteristics of tool chains are described.

    These safety-related characteristics provide a systematic way to narrow down what to look for with regard to tool usage and risk. The hypothesis that a large set of factors related to tool usage may introduce risk could thus be tested through an empirical study, which identified safety-related weaknesses in support environments tied to both high and low levels of automation. The conclusion is that a broader perspective, which includes more factors related to tool usage than those considered by the surveyed standards, will be needed.

    Three possible reasons to disregard such a broad perspective have been refuted, namely requirements on development processes enforced by the domain of CPS itself, certain characteristics of safety-critical CPS and the possibility to place trust in a proven, manual development process. After finding no strong reason to keep a narrow perspective on tool usage, arguments are put forward as to why the future evolution of support environments may actually increase the importance of such a broad perspective.

    Suggestions for how to update the mental models of the surveyed safety standards, and other standards like them, are put forward based on this identified need for a broader perspective.

  • 44.
    Asplund, Fredrik
    et al.
    KTH, Skolan för industriell teknik och management (ITM), Maskinkonstruktion (Inst.), Mekatronik.
    Biehl, Matthias
    KTH, Skolan för industriell teknik och management (ITM), Maskinkonstruktion (Inst.), Mekatronik.
    El-Khoury, Jad
    KTH, Skolan för industriell teknik och management (ITM), Maskinkonstruktion (Inst.), Mekatronik.
    Törngren, Martin
    KTH, Skolan för industriell teknik och management (ITM), Maskinkonstruktion (Inst.), Mekatronik.
    Tool Integration Beyond Wasserman2011Inngår i: Advanced Information Systems Engineering Workshops / [ed] Camille Salinesi, Oscar Pastor, Berlin: Springer-Verlag , 2011, s. 270-281Konferansepaper (Fagfellevurdert)
    Abstract [en]

    The typical development environment today consists of many specialized development tools, which are only partially integrated, forming a complex tool landscape. Traditional approaches for reasoning about tool integration are insufficient to measure the degree of integration and integration optimality in today's complex tool landscape. This paper presents a reference model that introduces dependencies between, and metrics for, integration aspects to overcome this problem. This model is used to conceive a method for reasoning about tool integration and to identify improvements in an industrial case study. Based on this, we are able to conclude that our reference model does not detract value from the principles on which it is based; instead, it highlights improvements that were not clearly visible earlier. We conclude the paper by discussing open issues for our reference model, namely whether it is suitable to use during the creation of new systems, whether the used integration aspects can be subdivided further to support the analysis of secondary issues related to integration, difficulties related to the state dependency between the data and process aspects within the context of developing embedded systems, and the analysis of non-functional requirements to support tool integration.

  • 45.
    Attarzadeh-Niaki, Seyed Hosein
    et al.
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Elektronik och Inbyggda System.
    Sander, Ingo
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Elektronik och Inbyggda System.
    Integrating Functional Mock-up units into a formal heterogeneous system modeling framework2015Inngår i: 18th CSI International Symposium on Computer Architecture and Digital Systems, CADS 2015, Institute of Electrical and Electronics Engineers (IEEE), 2015Konferansepaper (Fagfellevurdert)
    Abstract [en]

    The Functional Mock-up Interface (FMI) standard defines a method for tool- and platform-independent model exchange and co-simulation of dynamic system models. In FMI, the master algorithm, which executes the imported components, is a timed differential equation solver. This is a limitation for heterogeneous embedded and cyber-physical systems, where models with different time abstractions co-exist and interact. This work integrates FMI into a heterogeneous system modeling and simulation framework as process constructors and co-simulation wrappers. Consequently, each external model communicates with the framework without unnecessary semantic adaptation while the framework provides necessary mechanisms for handling heterogeneity. The presented methods are implemented in the ForSyDe-SystemC modeling framework and tested using a case study.
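    The role of the master algorithm mentioned in the abstract can be illustrated with a minimal fixed-step co-simulation loop over abstract components. The method names below are a simplified stand-in chosen for readability, not the actual FMI C API, and the toy components are hypothetical:

    ```python
    def run_master(fmus, connections, t_end, h):
        """Minimal fixed-step co-simulation master loop (illustrative only).

        fmus        -- dict: name -> component with set_input/get_output/do_step
        connections -- list of ((src_name, src_port), (dst_name, dst_port))
        """
        n_steps = round(t_end / h)
        for k in range(n_steps):
            t = k * h
            # Exchange data between components at each communication point...
            for (src, sp), (dst, dp) in connections:
                fmus[dst].set_input(dp, fmus[src].get_output(sp))
            # ...then advance every component by one communication step.
            for fmu in fmus.values():
                fmu.do_step(t, h)
        return n_steps * h

    class Source:
        """Toy component emitting a constant signal."""
        def __init__(self, value): self.value = value
        def get_output(self, port): return self.value
        def set_input(self, port, v): pass
        def do_step(self, t, h): pass

    class Integrator:
        """Toy component integrating its input with forward Euler."""
        def __init__(self): self.x, self.u = 0.0, 0.0
        def set_input(self, port, v): self.u = v
        def get_output(self, port): return self.x
        def do_step(self, t, h): self.x += self.u * h
    ```

    The limitation discussed in the paper arises because such a master advances all components on a single timed step size `h`, whereas heterogeneous models may live in untimed or synchronous time abstractions that do not fit this scheme.
    
    
    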

  • 46.
    Awan, Ahsan Javed
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Performance Characterization and Optimization of In-Memory Data Analytics on a Scale-up Server2017Doktoravhandling, monografi (Annet vitenskapelig)
    Abstract [en]

    The sheer increase in the volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark defines the state of the art in big data analytics platforms for (i) exploiting data-flow and in-memory computing and (ii) exhibiting superior scale-out performance on commodity machines, little effort has been devoted to understanding the performance of in-memory data analytics with Spark on modern scale-up servers. This thesis characterizes the performance of in-memory data analytics with Spark on scale-up servers.

    Through empirical evaluation of representative benchmark workloads on a dual socket server, we have found that in-memory data analytics with Spark exhibit poor multi-core scalability beyond 12 cores due to thread-level load imbalance and work-time inflation (the additional CPU time spent by threads in a multi-threaded computation beyond the CPU time required to perform the same work in a sequential computation). We have also found that workloads are bound by the latency of frequent data accesses to memory. By enlarging input data size, application performance degrades significantly due to the substantial increase in wait time during I/O operations and garbage collection, despite a 10% better instruction retirement rate (due to lower L1 cache misses and higher core utilization). For data accesses, we have found that simultaneous multi-threading is effective in hiding the data latencies. We have also observed that (i) data locality on NUMA nodes can improve the performance by 10% on average, and (ii) disabling next-line L1-D prefetchers can reduce the execution time by up to 14%.

    Regarding garbage collection impact, we match memory behavior with the garbage collector to improve the performance of applications by between 1.6x and 3x, and recommend using multiple small Spark executors, which can provide up to a 36% reduction in execution time over a single large executor. Based on the characteristics of workloads, the thesis envisions near-memory and near-storage hardware acceleration to improve the single-node performance of scale-out frameworks like Apache Spark. Using modeling techniques, it estimates a speed-up of 4x for Apache Spark on scale-up servers augmented with near-data accelerators.

  • 47.
    Awan, Ahsan Javed
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Performance Characterization of In-Memory Data Analytics on a Scale-up Server2016Licentiatavhandling, med artikler (Annet vitenskapelig)
    Abstract [en]

    The sheer increase in the volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark defines the state of the art in big data analytics platforms for (i) exploiting data-flow and in-memory computing and (ii) exhibiting superior scale-out performance on commodity machines, little effort has been devoted to understanding the performance of in-memory data analytics with Spark on modern scale-up servers. This thesis characterizes the performance of in-memory data analytics with Spark on scale-up servers.

    Through empirical evaluation of representative benchmark workloads on a dual socket server, we have found that in-memory data analytics with Spark exhibit poor multi-core scalability beyond 12 cores due to thread-level load imbalance and work-time inflation. We have also found that workloads are bound by the latency of frequent data accesses to DRAM. By enlarging input data size, application performance degrades significantly due to a substantial increase in wait time during I/O operations and garbage collection, despite a 10% better instruction retirement rate (due to lower L1 cache misses and higher core utilization).

    For data accesses, we have found that simultaneous multi-threading is effective in hiding the data latencies. We have also observed that (i) data locality on NUMA nodes can improve the performance by 10% on average, and (ii) disabling next-line L1-D prefetchers can reduce the execution time by up to 14%. For GC impact, we match memory behaviour with the garbage collector to improve the performance of applications by between 1.6x and 3x, and recommend using multiple small executors, which can provide up to a 36% speedup over a single large executor.

  • 48.
    Awan, Ahsan Javed
    et al.
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Brorsson, Mats
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Vlassov, Vladimir
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Ayguade, Eduard
    Barcelona Super Computing Center and Technical University of Catalunya.
    Architectural Impact on Performance of In-memory Data Analytics: Apache Spark Case StudyManuskript (preprint) (Annet vitenskapelig)
    Abstract [en]

    While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both batch and stream data processing. However, recent studies on micro-architectural characterization of in-memory data analytics are limited to only batch processing workloads. We compare the micro-architectural performance of batch processing and stream processing workloads in Apache Spark using hardware performance counters on a dual socket server. In our evaluation experiments, we have found that batch processing and stream processing workloads have similar micro-architectural characteristics and are bounded by the latency of frequent data accesses to DRAM. For data accesses, we have found that simultaneous multi-threading is effective in hiding the data latencies. We have also observed that (i) data locality on NUMA nodes can improve the performance by 10% on average, (ii) disabling next-line L1-D prefetchers can reduce the execution time by up to 14%, and (iii) multiple small executors can provide up to a 36% speedup over a single large executor.

  • 49.
    Awan, Ahsan Javed
    et al.
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Brorsson, Mats
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Vlassov, Vladimir
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Ayguade, Eduard
    Technical University of Catalunya, Barcelona Super Computing Center.
    How Data Volume Affects Spark Based Data Analytics on a Scale-up Server2015Inngår i: Big Data Benchmarks, Performance Optimization, and Emerging Hardware: 6th Workshop, BPOE 2015, Kohala, HI, USA, August 31 - September 4, 2015. Revised Selected Papers, Springer, 2015, Vol. 9495, s. 81-92Konferansepaper (Fagfellevurdert)
    Abstract [en]

    The sheer increase in the volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark is gaining popularity for exhibiting superior scale-out performance on commodity machines, the impact of data volume on the performance of Spark-based data analytics in a scale-up configuration is not well understood. We present a deep-dive analysis of Spark-based applications on a large scale-up server machine. Our analysis reveals that Spark-based data analytics are DRAM bound and do not benefit from using more than 12 cores for an executor. By enlarging input data size, application performance degrades significantly due to a substantial increase in wait time during I/O operations and garbage collection, despite a 10% better instruction retirement rate (due to lower L1 cache misses and higher core utilization). We match memory behaviour with the garbage collector to improve the performance of applications by between 1.6x and 3x.

  • 50.
    Awan, Ahsan Javed
    et al.
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Brorsson, Mats
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Vlassov, Vladimir
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Programvaruteknik och Datorsystem, SCS.
    Ayguade, Eduard
    Barcelona Super Computing Center and Technical University of Catalunya.
    Micro-architectural Characterization of Apache Spark on Batch and Stream Processing Workloads2016Konferansepaper (Fagfellevurdert)
    Abstract [en]

    While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both batch and stream data processing. However, recent studies on micro-architectural characterization of in-memory data analytics are limited to only batch processing workloads. We compare the micro-architectural performance of batch processing and stream processing workloads in Apache Spark using hardware performance counters on a dual socket server. In our evaluation experiments, we have found that batch processing and stream processing have the same micro-architectural behavior in Spark if the difference between the two implementations is only that of micro-batching. If the input data rates are small, stream processing workloads are front-end bound. However, the front-end bound stalls are reduced at larger input data rates and instruction retirement is improved. Moreover, Spark workloads using DataFrames have improved instruction retirement over workloads using RDDs.
