  • 1. Bachwani, Rekha
    et al.
    Crameri, Olivier
    Bianchini, Ricardo
    Kostic, Dejan
    EPFL.
    Zwaenepoel, Willy
    Sahara: Guiding the Debugging of Failed Software Upgrades. 2011. In: Proceedings of the 27th IEEE International Conference on Software Maintenance, IEEE conference proceedings, 2011, p. -272. Conference paper (Refereed)
    Abstract [en]

    Today, debugging failed software upgrades is a long and tedious activity, as developers may have to consider large sections of code to locate the bug. We argue that failed upgrade debugging can be simplified by exploiting the characteristics of upgrade problems to prioritize the set of routines to consider. In particular, previous work has shown that differences between the computing environment in the developer's and users' sites cause most upgrade problems. Based on this observation, we design and implement Sahara, a system that identifies the aspects of the environment that are most likely the culprits of the misbehavior, finds the subset of routines that relate to those aspects, and selects an even smaller subset of routines to debug first. We evaluate Sahara for three real upgrade problems with the OpenSSH suite, one synthetic problem with the SQLite database, and one synthetic problem with the uServer Web server. Our results show that the system produces accurate recommendations comprising only a small number of routines.
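    The prioritization step can be made concrete with a short sketch. This is a hypothetical illustration, not Sahara's actual analysis: the mapping from routines to environment aspects and the overlap-count scoring below are stand-ins for the paper's technique of relating routines to suspect aspects of the environment.

    ```python
    # Hypothetical sketch: rank routines by how many "suspect" environment
    # aspects (e.g., config files, kernel version) they depend on, so that
    # debugging starts with the most likely culprits.
    def prioritize(routines, suspects):
        """routines: {name: set of environment aspects it depends on}."""
        scored = [(len(deps & suspects), name) for name, deps in routines.items()]
        return [name for score, name in sorted(scored, reverse=True) if score > 0]

    routines = {
        "load_config": {"config_file", "locale"},
        "open_socket": {"kernel_version", "firewall"},
        "render_ui":   {"locale"},
    }
    # Suppose the environment comparison flagged these aspects as culprits:
    print(prioritize(routines, suspects={"config_file", "firewall"}))
    ```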

  • 2.
    Barbette, Tom
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Katsikas, Georgios P.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Radio Systems Laboratory (RS Lab).
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    RSS++: load and state-aware receive side scaling. 2019. In: Proceedings of the 15th International Conference on emerging Networking EXperiments and Technologies / [ed] ACM, Orlando, FL, USA: Association for Computing Machinery (ACM), 2019. Conference paper (Refereed)
    Abstract [en]

    While the current literature typically focuses on load-balancing among multiple servers, in this paper, we demonstrate the importance of load-balancing within a single machine (potentially with hundreds of CPU cores). In this context, we propose a new load-balancing technique (RSS++) that dynamically modifies the receive side scaling (RSS) indirection table to spread the load across the CPU cores more evenly. RSS++ incurs up to 14x lower 95th percentile tail latency and orders of magnitude fewer packet drops compared to RSS under high CPU utilization. RSS++ allows higher CPU utilization and dynamic scaling of the number of allocated CPU cores to accommodate the input load, while avoiding the typical 25% over-provisioning. RSS++ has been implemented for both (i) DPDK and (ii) the Linux kernel. Additionally, we implement a new state migration technique, which facilitates sharding and reduces contention between CPU cores accessing per-flow data. RSS++ keeps flow state in groups that can be migrated at once, leading to a 20% higher efficiency than a state-of-the-art shared flow table.
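    The core mechanism, rewriting an RSS indirection table so whole groups of flows migrate between cores at once, can be sketched in a few lines. The greedy policy, the 128-bucket table, and the synthetic load below are illustrative assumptions, not the authors' implementation.

    ```python
    import random
    from collections import defaultdict

    NUM_BUCKETS = 128  # assumed indirection-table size; hardware sizes vary

    def rebalance(table, bucket_load, num_cores):
        """Greedily move hot buckets off overloaded cores; a bucket's flows move together."""
        core_load = defaultdict(float)
        for bucket, core in enumerate(table):
            core_load[core] += bucket_load[bucket]
        target = sum(bucket_load) / num_cores
        for bucket in sorted(range(NUM_BUCKETS), key=lambda b: -bucket_load[b]):
            src = table[bucket]
            if core_load[src] <= target:
                continue  # source core is not overloaded
            dst = min(range(num_cores), key=lambda c: core_load[c])
            if core_load[dst] + bucket_load[bucket] < core_load[src]:
                table[bucket] = dst
                core_load[src] -= bucket_load[bucket]
                core_load[dst] += bucket_load[bucket]
        return table

    random.seed(1)
    table = [b % 4 for b in range(NUM_BUCKETS)]                   # 4 cores, round-robin start
    load = [random.expovariate(1.0) for _ in range(NUM_BUCKETS)]  # skewed per-bucket load
    rebalance(table, load, num_cores=4)
    ```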

  • 3.
    Bogdanov, Kirill
    et al.
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Peón-Quirós, Miguel
    Complutense University of Madrid.
    Maguire Jr., Gerald Q.
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Radio Systems Laboratory (RS Lab).
    Kostic, Dejan
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    The Nearest Replica Can Be Farther Than You Think. 2015. In: Proceedings of the ACM Symposium on Cloud Computing 2015, Association for Computing Machinery (ACM), 2015, p. 16-29. Conference paper (Refereed)
    Abstract [en]

    Modern distributed systems are geo-distributed for reasons of increased performance, reliability, and survivability. At the heart of many such systems, e.g., the widely used Cassandra and MongoDB data stores, is an algorithm for choosing a closest set of replicas to service a client request. Suboptimal replica choices due to dynamically changing network conditions result in reduced performance as a result of increased response latency. We present GeoPerf, a tool that tries to automate the process of systematically testing the performance of replica selection algorithms for geo-distributed storage systems. Our key idea is to combine symbolic execution and lightweight modeling to generate a set of inputs that can expose weaknesses in replica selection. As part of our evaluation, we analyzed network round trip times between geographically distributed Amazon EC2 regions, and showed a significant number of daily changes in nearest-K replica orders. We tested Cassandra and MongoDB using our tool, and found bugs in each of these systems. Finally, we use our collected Amazon EC2 latency traces to quantify the time lost due to these bugs. For example, due to the bug in Cassandra, the median wasted time for 10% of all requests is above 50 ms.
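    Why nearest-K orders churn is easy to see in miniature. The sketch below is a toy with made-up RTTs; GeoPerf itself uses symbolic execution and modeling rather than concrete measurements like these.

    ```python
    # Toy illustration: rank replicas by measured round-trip time (ms) and
    # watch the nearest-K set change between two measurement rounds.
    def nearest_k(rtts, k):
        return sorted(rtts, key=rtts.get)[:k]

    morning = {"us-east": 12.0, "eu-west": 31.0, "ap-south": 95.0}
    evening = {"us-east": 40.0, "eu-west": 28.0, "ap-south": 95.0}

    print(nearest_k(morning, k=2))  # ['us-east', 'eu-west']
    print(nearest_k(evening, k=2))  # ['eu-west', 'us-east'] -- the order flipped
    ```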

  • 4.
    Bogdanov, Kirill
    et al.
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Peón-Quirós, Miguel
    Complutense University of Madrid.
    Maguire Jr., Gerald Q.
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Radio Systems Laboratory (RS Lab).
    Kostić, Dejan
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Toward Automated Testing of Geo-Distributed Replica Selection Algorithms. 2015. In: Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, Association for Computing Machinery (ACM), 2015, p. 89-90. Conference paper (Refereed)
    Abstract [en]

    Many geo-distributed systems rely on replica selection algorithms to communicate with the closest set of replicas. Unfortunately, the bursty nature of Internet traffic and ever-changing network conditions make it difficult to identify the best choice of replicas. Suboptimal replica choices result in increased response latency and reduced system performance. In this work we present GeoPerf, a tool that tries to automate testing of geo-distributed replica selection algorithms. We used GeoPerf to test Cassandra and MongoDB, two popular data stores, and found bugs in each of these systems.

  • 5.
    Bogdanov, Kirill
    et al.
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Reda, Waleed
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab). Université catholique de Louvain.
    Kostic, Dejan
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Maguire Jr., Gerald Q.
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS.
    Canini, Marco
    KAUST.
    Kurma: Fast and Efficient Load Balancing for Geo-Distributed Storage Systems: Evaluation of Convergence and Scalability. 2018. Report (Other academic)
    Abstract [en]

    This report provides an extended evaluation of Kurma, a practical implementation of a geo-distributed load balancer for backend storage systems. In this report we demonstrate the ability of distributed Kurma instances to accurately converge to the same solutions within 1% of the total datacenter capacity and the ability of Kurma to scale up to 8 datacenters using a single CPU core at each datacenter.

  • 6.
    Bogdanov, Kirill
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Communication Systems, CoS.
    Reda, Waleed
    KTH, School of Electrical Engineering and Computer Science (EECS), Communication Systems, CoS.
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Communication Systems, CoS.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Communication Systems, CoS.
    Canini, M.
    Fast and accurate load balancing for geo-distributed storage systems. 2018. In: SoCC 2018 - Proceedings of the 2018 ACM Symposium on Cloud Computing, Association for Computing Machinery (ACM), 2018, p. 386-400. Conference paper (Refereed)
    Abstract [en]

    The increasing density of globally distributed datacenters reduces the network latency between neighboring datacenters and allows replicated services deployed across neighboring locations to share workload when necessary, without violating strict Service Level Objectives (SLOs). We present Kurma, a practical implementation of a fast and accurate load balancer for geo-distributed storage systems. At run-time, Kurma integrates network latency and service time distributions to accurately estimate the rate of SLO violations for requests redirected across geo-distributed datacenters. Using these estimates, Kurma solves a decentralized rate-based performance model enabling fast load balancing (on the order of seconds) while taming global SLO violations. We integrate Kurma with Cassandra, a popular storage system. Using real-world traces along with a geo-distributed deployment across Amazon EC2, we demonstrate Kurma’s ability to effectively share load among datacenters while reducing SLO violations by up to a factor of 3 in high load settings or reducing the cost of running the service by up to 17%.
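    The rate-estimation idea lends itself to a minimal sketch. This is not Kurma's decentralized performance model; it is a Monte-Carlo stand-in that combines sampled network latency with sampled service times to estimate how often a redirected request would miss its SLO.

    ```python
    import random

    def violation_rate(rtt_samples, service_samples, slo_ms, trials=100_000):
        """Monte-Carlo estimate of P(network RTT + service time > SLO)."""
        violations = 0
        for _ in range(trials):
            total = random.choice(rtt_samples) + random.choice(service_samples)
            if total > slo_ms:
                violations += 1
        return violations / trials

    random.seed(0)
    rtt = [random.gauss(20, 4) for _ in range(1000)]            # ms, cross-datacenter link
    service = [random.expovariate(1 / 5) for _ in range(1000)]  # ms, mean 5
    print(f"estimated SLO violations: {violation_rate(rtt, service, slo_ms=50):.2%}")
    ```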

  • 7. Braynard, Rebecca
    et al.
    Kostic, Dejan
    Duke.
    Rodriguez, Adolfo
    Chase, Jeff
    Vahdat, Amin
    Opus: an overlay peer utility service. 2002. In: Proceedings of the 5th International Conference on Open Architectures and Network Programming (OPENARCH), IEEE conference proceedings, 2002, p. -178. Conference paper (Refereed)
    Abstract [en]

    Today, an increasing number of important network services, such as content distribution, replicated services, and storage systems, are deploying overlays across multiple Internet sites to deliver better performance, reliability and adaptability. Currently, however, such network services must individually reimplement substantially similar functionality. For example, applications must configure the overlay to meet their specific demands for scale, service quality and reliability. Further, they must dynamically map data and functions onto network resources (including servers, storage, and network paths) to adapt to changes in load or network conditions. In this paper, we present Opus, a large-scale overlay utility service that provides a common platform and the necessary abstractions for simultaneously hosting multiple distributed applications. In our utility model, wide-area resource mapping is guided by an application’s specification of performance and availability targets. Opus then allocates available nodes to meet the requirements of competing applications based on dynamically changing system characteristics. Specifically, we describe issues and initial results associated with: i) developing a general architecture that enables a broad range of applications to push their functionality across the network, ii) constructing overlays that match both the performance and reliability characteristics of individual applications and scale to thousands of participating nodes, iii) using Service Level Agreements to dynamically allocate utility resources among competing applications, and iv) developing decentralized techniques for tracking global system characteristics through the use of hierarchy, aggregation, and approximation.

  • 8. Canini, Marco
    et al.
    Jovanovic, Vojin
    Venzano, Daniele
    Spasojevic, Boris
    Crameri, Olivier
    Kostic, Dejan
    EPFL.
    Toward Online Testing of Federated and Heterogeneous Distributed Systems. 2011. In: Proceedings of The 2011 USENIX Annual Technical Conference, 2011. Conference paper (Refereed)
    Abstract [en]

    Making distributed systems reliable is notoriously difficult. It is even more difficult to achieve high reliability for federated and heterogeneous systems, i.e., those that are operated by multiple administrative entities and have numerous inter-operable implementations. A prime example of such a system is the Internet’s inter-domain routing, today based on BGP.

    We argue that system reliability should be improved by proactively identifying potential faults using an online testing functionality. We propose DiCE, an approach that continuously and automatically explores the system behavior, to check whether the system deviates from its desired behavior. DiCE orchestrates the exploration of relevant system behaviors by subjecting system nodes to many possible inputs that exercise node actions. DiCE starts exploring from current, live system state, and operates in isolation from the deployed system. We describe our experience in integrating DiCE with an open-source BGP router. We evaluate the prototype’s ability to quickly detect origin misconfiguration, a recurring operator mistake that causes Internet-wide outages. We also quantify DiCE’s overhead and find it to have a marginal impact on system performance.

  • 9. Canini, Marco
    et al.
    Kostic, Dejan
    EPFL.
    Rexford, Jennifer
    Venzano, Daniele
    Automating the Testing of OpenFlow Applications. 2011. In: Proceedings of the 1st International Workshop on Rigorous Protocol Engineering (WRiPE), 2011. Conference paper (Refereed)
    Abstract [en]

    Software-defined networking, and the emergence of OpenFlow-capable switches, enables a wide range of new network functionality. However, enhanced programmability inevitably leads to more software faults (or bugs). We believe that tools for testing OpenFlow programs are critical to the success of the new technology. However, the way OpenFlow applications interact with the data plane raises several challenges. First, the space of possible inputs (e.g., packet headers and inter-packet timings) is huge. Second, the centralized controller has an indirect view of the traffic and experiences unavoidable delays in installing rules in the switches. Third, external factors like user behavior (e.g., mobility) and higher-layer protocols (e.g., the TCP state machine) affect the correctness of OpenFlow programs. In this work-in-progress paper, we extend techniques for symbolic execution to generate inputs that systematically explore the space of system executions. First, we analyze controller applications to identify equivalence classes of packets that exercise different parts of the code. Second, we propose several network models with increasing precision, ranging from simple traffic models to live testing on the target network. Initial experiences with our prototype, which symbolically executes OpenFlow applications written in Python, suggest that our techniques can help programmers identify bugs in their OpenFlow programs.

  • 10. Canini, Marco
    et al.
    Novakovic, Dejan
    Jovanovic, Vojin
    Kostic, Dejan
    EPFL.
    Fault Prediction in Distributed Systems Gone Wild. 2010. In: Proceedings of The 4th ACM SIGOPS/SIGACT Workshop on Large Scale Distributed Systems and Middleware, Association for Computing Machinery (ACM), 2010, p. -11. Conference paper (Refereed)
    Abstract [en]

    We consider the problem of predicting faults in deployed, large-scale distributed systems that are heterogeneous and federated. Motivated by the importance of ensuring reliability of the services these systems provide, we argue that the key step in making these systems reliable is automatically predicting faults. For example, doing so is vital for avoiding Internet-wide outages that occur due to programming errors or misconfigurations.

  • 11. Canini, Marco
    et al.
    Venzano, Daniele
    Peresini, Peter
    Kostic, Dejan
    EPFL.
    Rexford, Jennifer
    A NICE Way to Test OpenFlow Applications. 2012. In: Proceedings of the 9th USENIX Symposium on Networked Systems Design and Implementation (NSDI), Association for Computing Machinery (ACM), 2012. Conference paper (Refereed)
    Abstract [en]

    The emergence of OpenFlow-capable switches enables exciting new network functionality, at the risk of programming errors that make communication less reliable. The centralized programming model, where a single controller program manages the network, seems to reduce the likelihood of bugs. However, the system is inherently distributed and asynchronous, with events happening at different switches and end hosts, and inevitable delays affecting communication with the controller. In this paper, we present efficient, systematic techniques for testing unmodified controller programs. Our NICE tool applies model checking to explore the state space of the entire system—the controller, the switches, and the hosts. Scalability is the main challenge, given the diversity of data packets, the large system state, and the many possible event orderings. To address this, we propose a novel way to augment model checking with symbolic execution of event handlers (to identify representative packets that exercise code paths on the controller). We also present a simplified OpenFlow switch model (to reduce the state space), and effective strategies for generating event interleavings likely to uncover bugs. Our prototype tests Python applications on the popular NOX platform. In testing three real applications—a MAC-learning switch, in-network server load balancing, and energy-efficient traffic engineering—we uncover eleven bugs.
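    Stripped of the switch model and the symbolic-execution component, the heart of such a tool is a search over event orderings. The skeleton below is a generic illustration, not NICE's code; the two-switch demo and its invariant are contrived.

    ```python
    from collections import deque

    def explore(initial, enabled_events, apply_event, invariant):
        """BFS over event interleavings; return a violating event trace, or None."""
        seen, queue = {initial}, deque([(initial, [])])
        while queue:
            state, trace = queue.popleft()
            if not invariant(state):
                return trace
            for ev in enabled_events(state):
                nxt = apply_event(state, ev)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, trace + [ev]))
        return None

    # Contrived demo: two switches counting installed rules; the invariant
    # says their rule counts never diverge by more than one.
    trace = explore(
        initial=(0, 0),
        enabled_events=lambda s: ["install_s1", "install_s2"],
        apply_event=lambda s, e: (s[0] + 1, s[1]) if e == "install_s1" else (s[0], s[1] + 1),
        invariant=lambda s: abs(s[0] - s[1]) <= 1,
    )
    print(trace)  # ['install_s1', 'install_s1'] -- an ordering that breaks the invariant
    ```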

  • 12. Crameri, Olivier
    et al.
    Knezevic, Nikola
    Kostic, Dejan
    EPFL.
    Bianchini, Ricardo
    Zwaenepoel, Willy
    Staged Deployment in Mirage, an Integrated Software Upgrade Testing and Distribution System. 2007. In: Proceedings of the 21st ACM Symposium on Operating Systems Principles (SOSP), Association for Computing Machinery (ACM), 2007, p. -236. Conference paper (Refereed)
    Abstract [en]

    Despite major advances in the engineering of maintainable and robust software over the years, upgrading software remains a primitive and error-prone activity. In this paper, we argue that several problems with upgrading software are caused by a poor integration between upgrade deployment, user-machine testing, and problem reporting. To support this argument, we present a characterization of software upgrades resulting from a survey we conducted of 50 system administrators. Motivated by the survey results, we present Mirage, a distributed framework for integrating upgrade deployment, user-machine testing, and problem reporting into the overall upgrade development process. Our evaluation focuses on the most novel aspect of Mirage, namely its staged upgrade deployment based on the clustering of user machines according to their environments and configurations. Our results suggest that Mirage's staged deployment is effective for real upgrade problems.

  • 13. Dagand, Pierre-Evariste
    et al.
    Kostic, Dejan
    EPFL.
    Kuncak, Viktor
    Opis: Reliable Distributed Systems in OCaml. 2009. In: Proceedings of TLDI, Association for Computing Machinery (ACM), 2009, p. -78. Conference paper (Refereed)
    Abstract [en]

    Concurrency and distribution pose algorithmic and implementation challenges in developing reliable distributed systems, making the field an excellent testbed for evaluating programming language and verification paradigms. Several specialized domain-specific languages and extensions of memory-unsafe languages were proposed to aid distributed system development. We present an alternative to these approaches, showing that modern, higher-order, strongly typed, memory safe languages provide an excellent vehicle for developing and debugging distributed systems. We present Opis, a functional-reactive approach for developing distributed systems in Objective Caml. An Opis protocol description consists of a reactive function (called event function) describing the behavior of a distributed system node. The event functions in Opis are built from pure functions as building blocks, composed using the Arrow combinators. Such architecture aids reasoning about event functions both informally and using interactive theorem provers. For example, it facilitates simple termination arguments. Given a protocol description, a developer can use higher-order library functions of Opis to 1) deploy the distributed system, 2) run the distributed system in a network simulator with full-replay capabilities, 3) apply explicit-state model checking to the distributed system, detecting undesirable behaviors, and 4) do performance analysis on the system. We describe the design and implementation of Opis, and present our experience in using Opis to develop peer-to-peer overlay protocols, including the Chord distributed hash table and the Cyclon random gossip protocol. We found that using Opis results in high programmer productivity and leads to easily composable protocol descriptions. Opis tools were effective in helping identify and eliminate correctness and performance problems during distributed system development.

  • 14. Dunagan, John
    et al.
    Harvey, Nicholas J. A.
    Jones, Michael B.
    Kostic, Dejan
    Duke.
    Theimer, Marvin
    Wolman, Alec
    FUSE: Lightweight Guaranteed Distributed Failure Notification. 2004. In: Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI), Association for Computing Machinery (ACM), 2004. Conference paper (Refereed)
    Abstract [en]

    FUSE is a lightweight failure notification service for building distributed systems. Distributed systems built with FUSE are guaranteed that failure notifications never fail. Whenever a failure notification is triggered, all live members of the FUSE group will hear a notification within a bounded period of time, irrespective of node or communication failures. In contrast to previous work on failure detection, the responsibility for deciding that a failure has occurred is shared between the FUSE service and the distributed application. This allows applications to implement their own definitions of failure. Our experience building a scalable distributed event delivery system on an overlay network has convinced us of the usefulness of this service. Our results demonstrate that the network costs of each FUSE group can be small; in particular, our overlay network implementation requires no additional liveness-verifying ping traffic beyond that already needed to maintain the overlay, making the steady state network load independent of the number of active FUSE groups.
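    The contract is simple enough to mimic in a toy class. This in-process sketch only illustrates the guarantee's shape (one notification, delivered to every live member, triggerable by anyone); the real service spans machines and survives node and communication failures.

    ```python
    class FuseGroup:
        """Toy FUSE group: the first failure signal reaches all live members."""
        def __init__(self, members):
            self.members = set(members)
            self.failed = False

        def signal(self, reason):
            if self.failed:
                return            # the group fails exactly once
            self.failed = True
            for m in self.members:
                print(f"{m}: FUSE group failed ({reason})")

    g = FuseGroup({"a", "b", "c"})
    g.signal("liveness ping to 'b' timed out")
    g.signal("duplicate trigger")  # ignored
    ```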

  • 15. Facca, Federico
    et al.
    Karl, Holger
    Lopez, Diego
    Aranda, Pedro
    Kostic, Dejan
    IMDEA Networks Institute.
    Riggio, Roberto
    NetIDE: First steps towards an integrated development environment for portable network apps. 2013. In: Proceedings of the 2nd European Workshop on Software Defined Networks (EWSDN), IEEE conference proceedings, 2013, p. 105-110. Conference paper (Refereed)
    Abstract [en]

    Nowadays, while most of the programmable network apparatus vendors support OpenFlow, a number of fragmented control plane solutions exist for proprietary Software-Defined Networks. Thus, network application developers are forced to re-implement their solutions every time they encounter a new network controller. Moreover, different network developers adopt different solutions as control plane programming languages (e.g., Frenetic, Procera), severely limiting code sharing and reuse. Despite having OpenFlow as a candidate standard interface between the controller and the network infrastructure, interoperability between different controllers and network devices is hindered and closed ecosystems are emerging. In this paper we present the roadmap toward NetIDE, an integrated development environment which aims at supporting the whole development lifecycle of vendor-agnostic network applications.

  • 16.
    Farshin, Alireza
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Roozbeh, Amir
    Ericsson Research.
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Communication Systems, CoS.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Communication Systems, CoS.
    Make the Most out of Last Level Cache in Intel Processors. 2019. In: Proceedings of the Fourteenth EuroSys Conference (EuroSys'19), Dresden, Germany, 25-28 March 2019, ACM Digital Library, 2019. Conference paper (Refereed)
    Abstract [en]

    In modern (Intel) processors, Last Level Cache (LLC) is divided into multiple slices and an undocumented hashing algorithm (aka Complex Addressing) maps different parts of memory address space among these slices to increase the effective memory bandwidth. After a careful study of Intel’s Complex Addressing, we introduce a slice-aware memory management scheme, wherein frequently used data can be accessed faster via the LLC. Using our proposed scheme, we show that a key-value store can potentially improve its average performance ∼12.2% and ∼11.4% for 100% & 95% GET workloads, respectively. Furthermore, we propose CacheDirector, a network I/O solution which extends Direct Data I/O (DDIO) and places the packet’s header in the slice of the LLC that is closest to the relevant processing core. We implemented CacheDirector as an extension to DPDK and evaluated our proposed solution for latency-critical applications in Network Function Virtualization (NFV) systems. Evaluation results show that CacheDirector makes packet processing faster by reducing tail latencies (90-99th percentiles) by up to 119 µs (∼21.5%) for optimized NFV service chains that are running at 100 Gbps. Finally, we analyze the effectiveness of slice-aware memory management to realize cache isolation.
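    Since Intel's Complex Addressing hash is undocumented, the sketch below substitutes a toy XOR-folding hash to show the shape of slice-aware placement: scan candidate cache-line-aligned addresses until one maps to the desired slice. Only the structure, not the hash, reflects the paper.

    ```python
    def toy_slice_hash(phys_addr, num_slices=8):
        """Stand-in for Intel's undocumented hash: XOR-fold the line address."""
        h, addr = 0, phys_addr >> 6       # cache-line (64 B) granularity
        while addr:
            h ^= addr & (num_slices - 1)
            addr >>= 3
        return h

    def pick_buffer(base, target_slice, stride=64, attempts=1024):
        """Find an address in [base, base + attempts*stride) mapping to target_slice."""
        for i in range(attempts):
            addr = base + i * stride
            if toy_slice_hash(addr) == target_slice:
                return addr
        raise RuntimeError("no suitable buffer found")

    # Place a packet header in (toy) slice 3, e.g. the one closest to the core.
    print(hex(pick_buffer(base=0x10_0000, target_slice=3)))
    ```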

  • 17. Goma, Eduard
    et al.
    Canini, Marco
    Lopez, Alberto
    Laoutaris, Nikolaos
    Kostic, Dejan
    EPFL.
    Rodriguez, Pablo
    Stanojevic, Rade
    Yague, Pablo
    Insomnia in the Access (or How to Curb Access Network Related Energy Consumption). 2011. In: Proceedings of the ACM SIGCOMM 2011 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, Association for Computing Machinery (ACM), 2011, p. -349. Conference paper (Refereed)
    Abstract [en]

    Access networks include modems, home gateways, and DSL Access Multiplexers (DSLAMs), and are responsible for 70-80% of total network-based energy consumption. In this paper, we take an in-depth look at the problem of greening access networks, identify root problems, and propose practical solutions for their user- and ISP-parts. On the user side, the combination of continuous light traffic and lack of alternative paths condemns gateways to being powered most of the time despite having Sleep-on-Idle (SoI) capabilities. To address this, we introduce Broadband Hitch-Hiking (BH2), which takes advantage of the overlap of wireless networks to aggregate user traffic in as few gateways as possible. In current urban settings BH2 can power off 65-90% of gateways. Powering off gateways permits the remaining ones to synchronize at higher speeds due to reduced crosstalk from having fewer active lines. Our tests reveal speedups of up to 25%. On the ISP side, we propose introducing simple inexpensive switches at the distribution frame for batching active lines to a subset of cards letting the remaining ones sleep. Overall, our results show an 80% energy savings margin in access networks. The combination of BH2 and switching gets close to this margin, saving 66% on average.
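    Gateway aggregation is essentially a covering problem. The sketch below is a hypothetical greedy set-cover formulation of the BH2 idea, with a made-up coverage map; the paper's actual selection logic also weighs traffic and wireless capacity.

    ```python
    def choose_gateways(coverage, users):
        """Greedy set cover: fewest gateways whose Wi-Fi reaches all users."""
        uncovered, chosen = set(users), []
        while uncovered:
            gw = max(coverage, key=lambda g: len(coverage[g] & uncovered))
            if not coverage[gw] & uncovered:
                raise ValueError("some users are unreachable")
            chosen.append(gw)
            uncovered -= coverage[gw]
        return chosen  # every other gateway can sleep

    coverage = {"gw1": {"a", "b"}, "gw2": {"b", "c", "d"}, "gw3": {"d"}}
    print(choose_gateways(coverage, users={"a", "b", "c", "d"}))  # ['gw2', 'gw1']
    ```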

  • 18. Guerraoui, Rachid
    et al.
    Kostic, Dejan
    EPFL.
    Levy, Ron R.
    Quema, Vivien
    A High Throughput Atomic Storage Algorithm. 2007. In: Proceedings of the 27th IEEE International Conference on Distributed Computing Systems (ICDCS’07), IEEE conference proceedings, 2007. Conference paper (Refereed)
    Abstract [en]

    This paper presents an algorithm to ensure the atomicity of a distributed storage that can be read and written by any number of clients. In failure-free and synchronous situations, and even if there is contention, our algorithm has a high write throughput and a read throughput that grows linearly with the number of available servers. The algorithm is devised with a homogeneous cluster of servers in mind. It organizes servers around a ring and assumes point-to-point communication. It is resilient to the crash failure of any number of readers and writers as well as to the crash failure of all but one server. We evaluated our algorithm on a cluster of 24 nodes with dual Fast Ethernet network interfaces (100 Mbps). We achieve 81 Mbps of write throughput and 8×90 Mbps of read throughput (with up to 8 servers), which conveys linear scalability with the number of servers.

  • 19.
    Katsikas, Georgios P.
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Communication Systems, CoS, Network Systems Laboratory (NS Lab). RISE SICS.
    Barbette, Tom
    University of Liege.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Steinert, Rebecca
    RISE SICS.
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Communication Systems, CoS, Radio Systems Laboratory (RS Lab).
    Metron: NFV Service Chains at the True Speed of the Underlying Hardware. 2018. Conference paper (Refereed)
    Abstract [en]

    In this paper we present Metron, a Network Functions Virtualization (NFV) platform that achieves high resource utilization by jointly exploiting the underlying network and commodity servers’ resources. This synergy allows Metron to: (i) offload part of the packet processing logic to the network, (ii) use smart tagging to setup and exploit the affinity of traffic classes, and (iii) use tag-based hardware dispatching to carry out the remaining packet processing at the speed of the servers’ fastest cache(s), with zero inter-core communication. Metron also introduces a novel resource allocation scheme that minimizes the resource allocation overhead for large-scale NFV deployments. With commodity hardware assistance, Metron deeply inspects traffic at 40 Gbps and realizes stateful network functions at the speed of a 100 GbE network card on a single server. Metron has 2.75-6.5x better efficiency than OpenBox, a state-of-the-art NFV system, while ensuring key requirements such as elasticity, fine-grained load balancing, and flexible traffic steering.

  • 20.
    Katsikas, Georgios P.
    et al.
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Enguehard, Marcel
    Kuźniar, Maciej
    Maguire Jr., Gerald Q.
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Radio Systems Laboratory (RS Lab).
    Kostic, Dejan
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    SNF: synthesizing high performance NFV service chains. 2016. In: PeerJ Computer Science, ISSN 2376-5992, p. 1-30. Article in journal (Refereed)
    Abstract [en]

    In this paper we introduce SNF, a framework that synthesizes (S) network function (NF) service chains by eliminating redundant I/O and repeated elements, while consolidating stateful cross layer packet operations across the chain. SNF uses graph composition and set theory to determine traffic classes handled by a service chain composed of multiple elements. It then synthesizes each traffic class using a minimal set of new elements that apply single-read-single-write and early-discard operations. Our SNF prototype takes a baseline state-of-the-art network functions virtualization (NFV) framework to the level of performance required for practical NFV service deployments. Software-based SNF realizes long (up to 10 NFs) and stateful service chains that achieve line-rate 40 Gbps throughput (up to 8.5x greater than the baseline NFV framework). Hardware-assisted SNF, using a commodity OpenFlow switch, shows that our approach scales at 40 Gbps for Internet Service Provider-level NFV deployments.

  • 21.
    Katsikas, Georgios P.
    et al.
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Maguire Jr., Gerald Q.
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Radio Systems Laboratory (RS Lab).
    Kostic, Dejan
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Profiling and accelerating commodity NFV service chains with SCC. 2017. In: Journal of Systems and Software, ISSN 0164-1212, E-ISSN 1873-1228, Vol. 127, no C, p. 12-27. Article in journal (Refereed)
    Abstract [en]

    Recent approaches to network functions virtualization (NFV) have shown that commodity network stacks and drivers struggle to keep up with increasing hardware speed. Despite this, popular cloud networking services still rely on commodity operating systems (OSs) and device drivers.

    Taking into account the hardware underlying commodity servers, we built an NFV profiler that tracks the movement of packets across the system’s memory hierarchy by collecting key hardware and OS-level performance counters.

    Leveraging the profiler’s data, our Service Chain Coordinator’s (SCC) runtime accelerates user-space NFV service chains, based on commodity drivers. To do so, SCC combines multiplexing of system calls with scheduling strategies, taking time, priority, and processing load into account.

    By granting longer time quanta to chained network functions (NFs), combined with I/O multiplexing, SCC reduces unnecessary scheduling and I/O overheads, resulting in a three-fold latency reduction due to cache and main memory utilization improvements. More importantly, SCC reduces the latency variance of NFV service chains by up to 40x compared to standard FastClick chains by making the average case of an NFV chain perform as well as the best case. These improvements are possible because of our profiler’s accuracy.

  • 22. Klemm, Fabius
    et al.
    Le Boudec, Jean-Yves
    Kostic, Dejan
    EPFL.
    Aberer, Karl
    Handling Very Large Numbers of Messages in Distributed Hash Tables. 2009. In: Proceedings of The First International Conference on COMmunication Systems and NETworkS (COMSNETS), IEEE conference proceedings, 2009, p. -9. Conference paper (Refereed)
    Abstract [en]

    The principal service of distributed hash tables (DHTs) is route(id, data), which sends data to a peer responsible for id, using typically O(log(# of peers)) overlay hops. Certain applications like peer-to-peer information retrieval generate billions of small messages that are concurrently inserted into a DHT. These applications can generate messages faster than the DHT can process them. To support such demanding applications, a DHT needs a congestion control mechanism to efficiently handle high loads of messages. In this paper we provide an extended study on congestion control for DHTs: we present a theoretical analysis that demonstrates that congestion control for DHTs is absolutely necessary for applications that provide elastic traffic. We then present a new congestion control algorithm for DHTs. We provide extensive live evaluations in a ModelNet cluster and the PlanetLab test bed, which show that our algorithm is nearly loss-free, fair, and provides low lookup times and high throughput under cross-load.

  • 23. Knezevic, Nikola
    et al.
    Schubert, Simon
    Kostic, Dejan
    EPFL.
    Towards a Cost-Effective Networking Testbed. 2010. In: ACM SIGOPS Operating Systems Review, ISSN 0163-5980, Vol. 43, no 4, p. 66-71. Article in journal (Refereed)
    Abstract [en]

    The Internet is suffering from ossification. There has been substantial research on improving current protocols, but the vendors are reluctant to deploy new ones. We believe that this is in part due to the difficulty of evaluating protocols under realistic conditions. Recent wide-area testbeds can help alleviate this problem, but they require substantial resources (equipment, bandwidth) from each participant, and they have difficulty in providing repeatability and full control over the experiments. Existing in-house networking testbeds are capable of running controlled, repeatable experiments, but are typically small-scale (due to various overheads), limited in features, or expensive.

    The premise of our work is that it is possible to leverage the recent increases in computational power to improve the researchers' ability to experiment with new protocols in lab settings. We propose a cost-effective testbed, called MX, which emulates many programmable routers running over a realistic topology on multi-core commodity servers. We leverage open source implementations of programmable routers, such as Click, and modify them to allow coexistence of multiple instances in the same kernel in an effort to reduce packet forwarding overheads. Our initial results show that we outperform similar cost-effective solutions by a factor of 2. Next, we demonstrate that grouping and placing routers onto cores that share the L2 cache yields high performance.

  • 24.
    Kostic, Dejan
    et al.
    Duke.
    Braud, Ryan
    Killian, Charles
    Vandekieft, Eric
    Anderson, James W.
    Snoeren, Alex C.
    Vahdat, Amin
    Maintaining high bandwidth under dynamic network conditions. 2005. In: Proceedings of the USENIX Annual Technical Conference, USENIX - The Advanced Computing Systems Association, 2005. Conference paper (Refereed)
    Abstract [en]

    The need to distribute large files across multiple wide-area sites is becoming increasingly common, for instance, in support of scientific computing, configuring distributed systems, distributing software updates such as open-source ISOs or Windows patches, or disseminating multimedia content. Recently a number of techniques have been proposed for simultaneously retrieving portions of a file from multiple remote sites with the twin goals of filling the client’s pipe and overcoming any performance bottlenecks between the client and any individual server. While there are a number of interesting tradeoffs in locating appropriate download sites in the face of dynamically changing network conditions, to date there has been no systematic evaluation of the merits of different protocols. This paper explores the design space of file distribution protocols and conducts a detailed performance evaluation of a number of competing systems running in both controlled emulation environments and live across the Internet. Based on our experience with these systems under a variety of conditions, we propose, implement and evaluate Bullet’ (Bullet prime), a mesh-based high-bandwidth data dissemination system that outperforms previous techniques under both static and dynamic conditions.

  • 25.
    Kostic, Dejan
    et al.
    Duke.
    Rodriguez, Adolfo
    Albrecht, Jeannie
    Abhijeet, Bhirud
    Vahdat, Amin
    Using Random Subsets to Build Scalable Network Services. 2003. In: Proceedings of the 4th USENIX Symposium on Internet Technologies and Systems (USITS), USENIX - The Advanced Computing Systems Association, 2003, p. 19-. Conference paper (Refereed)
    Abstract [en]

    In this paper, we argue that a broad range of large-scale network services would benefit from a scalable mechanism for delivering state about a random subset of global participants. Key to this approach is ensuring that membership in the subset changes periodically and with uniform representation over all participants. Random subsets could help overcome inherent scaling limitations to services that maintain global state and perform global network probing. It could further improve the routing performance of peer-to-peer distributed hash tables by locating topologically-close nodes. This paper presents the design, implementation, and evaluation of RanSub, a scalable protocol for delivering such state. As a first demonstration of the RanSub utility, we construct SARO, a scalable and adaptive application-layer overlay tree. SARO uses RanSub state information to locate appropriate peers for meeting application-specific delay and bandwidth targets and to dynamically adapt to changing network conditions. A large-scale evaluation of 1000 overlay nodes participating in an emulated 20,000-node wide-area network topology demonstrates both the adaptivity and scalability (in terms of per-node state and network overhead) of both RanSub and SARO. Finally, we use an existing streaming media server to distribute content through SARO running on top of the PlanetLab Internet testbed.
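    The service RanSub provides, each node periodically learning a changing, uniformly drawn subset of all participants, is easy to state in code. The sketch below is a centralized stand-in; RanSub's contribution is computing such subsets scalably and decentrally over a tree.

    ```python
    import random

    def random_subset(participants, k, round_no):
        """Uniform k-subset that changes every round (reproducible per round)."""
        return random.Random(round_no).sample(participants, k)

    nodes = [f"node{i}" for i in range(1000)]
    for rnd in range(3):
        print(rnd, random_subset(nodes, k=5, round_no=rnd))
    ```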

  • 26.
    Kostic, Dejan
    et al.
    Duke.
    Rodriguez, Adolfo
    Albrecht, Jeannie
    Vahdat, Amin
    Bullet: high bandwidth data dissemination using an overlay mesh. 2003. In: Proceedings of the 19th ACM Symposium on Operating System Principles (SOSP), Association for Computing Machinery (ACM), 2003, p. -297. Conference paper (Refereed)
    Abstract [en]

    In recent years, overlay networks have become an effective alternative to IP multicast for efficient point to multipoint communication across the Internet. Typically, nodes self-organize with the goal of forming an efficient overlay tree, one that meets performance targets without placing undue burden on the underlying network. In this paper, we target high-bandwidth data distribution from a single source to a large number of receivers. Applications include large-file transfers and real-time multimedia streaming. For these applications, we argue that an overlay mesh, rather than a tree, can deliver fundamentally higher bandwidth and reliability relative to typical tree structures. This paper presents Bullet, a scalable and distributed algorithm that enables nodes spread across the Internet to self-organize into a high bandwidth overlay mesh. We construct Bullet around the insight that data should be distributed in a disjoint manner to strategic points in the network. Individual Bullet receivers are then responsible for locating and retrieving the data from multiple points in parallel. Key contributions of this work include: i) an algorithm that sends data to different points in the overlay such that any data object is equally likely to appear at any node, ii) a scalable and decentralized algorithm that allows nodes to locate and recover missing data items, and iii) a complete implementation and evaluation of Bullet running across the Internet and in a large-scale emulation environment reveals up to a factor of two bandwidth improvement under a variety of circumstances. In addition, we find that, relative to tree-based solutions, Bullet reduces the need to perform expensive bandwidth probing. In a tree, it is critical that a node’s parent delivers a high rate of application data to each child. In Bullet however, nodes simultaneously receive data from multiple sources in parallel, making it less important to locate any single source capable of sustaining a high transmission rate.
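    The "disjoint dissemination" insight can be shown in miniature. This toy splits blocks among children so that each child holds a different share, after which a receiver's missing set is what it pulls from peers; Bullet's real mechanisms for locating and recovering blocks are far more involved.

    ```python
    import random

    def disseminate(blocks, children, rng):
        """Hand each child a disjoint, randomly chosen share of the blocks."""
        shuffled = blocks[:]
        rng.shuffle(shuffled)
        share = {c: set() for c in children}
        for i, b in enumerate(shuffled):
            share[children[i % len(children)]].add(b)
        return share

    rng = random.Random(42)
    share = disseminate(list(range(12)), ["n1", "n2", "n3"], rng)
    missing_at_n1 = set(range(12)) - share["n1"]  # fetched from n2/n3 in parallel
    print(sorted(share["n1"]), sorted(missing_at_n1))
    ```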

  • 27.
    Kostic, Dejan
    et al.
    EPFL.
    Snoeren, Alex C.
    Vahdat, Amin
    Braud, Ryan
    Killian, Charles
    Anderson, James W.
    Albrecht, Jeannie
    Rodriguez, Adolfo
    Vandekieft, Erik
    High-bandwidth Data Dissemination for Large-scale Distributed Systems. 2008. In: ACM Transactions on Computer Systems, ISSN 0734-2071, E-ISSN 1557-7333, Vol. 26, no 1. Article in journal (Refereed)
    Abstract [en]

    This article focuses on the multireceiver data dissemination problem. Initially, IP multicast formed the basis for efficiently supporting such distribution. More recently, overlay networks have emerged to support point-to-multipoint communication. Both techniques focus on constructing trees rooted at the source to distribute content among all interested receivers. We argue, however, that trees have two fundamental limitations for data dissemination. First, since all data comes from a single parent, participants must often continuously probe in search of a parent with an acceptable level of bandwidth. Second, due to packet losses and failures, available bandwidth is monotonically decreasing down the tree.

    To address these limitations, we present Bullet, a data dissemination mesh that takes advantage of the computational and storage capabilities of end hosts to create a distribution structure where a node receives data in parallel from multiple peers. For the mesh to deliver improved bandwidth and reliability, we need to solve several key problems: (i) disseminating disjoint data over the mesh, (ii) locating missing content, (iii) finding who to peer with (peering strategy), (iv) retrieving data at the right rate from all peers (flow control), and (v) recovering from failures and adapting to dynamically changing network conditions. Additionally, the system should be self-adjusting and should have few user-adjustable parameter settings. We describe our approach to addressing all of these problems in a working, deployed system across the Internet. Bullet outperforms state-of-the-art systems, including BitTorrent, by 25-70% and exhibits strong performance and reliability in a range of deployment settings. In addition, we find that, relative to tree-based solutions, Bullet reduces the need to perform expensive bandwidth probing.

  • 28. Kuzniar, Maciej
    et al.
    Canini, Marco
    Kostic, Dejan
    EPFL.
    OFTEN Testing OpenFlow Networks. 2012. In: Proceedings of the 1st European Workshop on Software Defined Networks (EWSDN), IEEE conference proceedings, 2012, p. -60. Conference paper (Refereed)
    Abstract [en]

    Software-defined networking and OpenFlow in particular enable independent development of network devices and software that controls them. Such separation of concerns eases the introduction of new network functionality; however, it leads to distributed responsibility for bugs. Despite the common interface, separate development entails the need to test an integrated network before deployment. In this work-in-progress paper, we identify the challenges of creating an environment that simplifies and systematically conducts such tests. We discuss optimizations required for efficient and reliable OpenFlow switch black-box testing and present a possible approach to address other challenges. In our preliminary prototype, we combine systematic state-space exploration techniques with real switch execution to explore integrated network behavior. Our initial results show that such methods help detect previously unrevealed inconsistencies in the network.

  • 29. Kuzniar, Maciej
    et al.
    Peresini, Peter
    Canini, Marco
    Venzano, Daniele
    Kostic, Dejan
    IMDEA Networks Institute.
    A SOFT Way for OpenFlow Switch Interoperability Testing. 2012. In: Proceedings of the 8th International Conference on emerging Networking EXperiments and Technologies (ACM CoNEXT), Association for Computing Machinery (ACM), 2012. Conference paper (Refereed)
    Abstract [en]

    The increasing adoption of Software Defined Networking, and OpenFlow in particular, brings great hope for increasing extensibility and lowering costs of deploying new network functionality. A key component in these networks is the OpenFlow agent, a piece of software that a switch runs to enable remote programmatic access to its forwarding tables. While testing high-level network functionality, the correct behavior and interoperability of any OpenFlow agent are taken for granted. However, existing tools for testing agents are neither exhaustive nor systematic, and only check that the agent’s basic functionality works. In addition, the rapidly changing and sometimes vague OpenFlow specifications can result in multiple implementations that behave differently. This paper presents SOFT, an approach for testing the interoperability of OpenFlow switches. Our key insight is in automatically identifying the testing inputs that cause different OpenFlow agent implementations to behave inconsistently. To this end, we first symbolically execute each agent under test in isolation to derive which set of inputs causes which behavior. We then crosscheck all distinct behaviors across different agent implementations and evaluate whether a common input subset causes inconsistent behaviors. Our evaluation shows that our tool identified several inconsistencies between the publicly available Reference OpenFlow switch and Open vSwitch implementations.

  • 30. Kuzniar, Maciej
    et al.
    Peresini, Peter
    Kostic, Dejan
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Providing Reliable FIB Update Acknowledgments in SDN. 2014. In: The 10th International Conference on emerging Networking Experiments and Technologies (CoNEXT’14), December 2–5, 2014, Sydney, Australia, Association for Computing Machinery (ACM), 2014. Conference paper (Refereed)
    Abstract [en]

    In this paper, we first show that transient, but grave problems such as violations of security policies can occur with real switches even when using consistent updates to Software-Defined Networks. Next, we present techniques that are effective in ameliorating this problem. Our key insight is in creating a transparent layer that relies on control and data plane measurements to confirm rule updates only when the rule is visible in the data plane.
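    The transparent layer's behavior can be sketched as "acknowledge only after the data plane agrees." Everything switch-facing below is a placeholder invented for the demo (FakeSwitch, send_probe, rule_matches_probe); the paper's layer uses real control- and data-plane measurements.

    ```python
    import time

    def confirm_update(switch, rule, send_probe, rule_matches_probe,
                       timeout_s=1.0, interval_s=0.01):
        """Report a FIB update only once a data-plane probe observes the rule."""
        deadline = time.time() + timeout_s
        while time.time() < deadline:
            if rule_matches_probe(rule, send_probe(switch, rule)):
                return True   # rule visible in the data plane; safe to acknowledge
            time.sleep(interval_s)
        return False          # control plane said OK, data plane never showed it

    class FakeSwitch:
        """Demo switch whose rule becomes visible only after a delay."""
        def __init__(self, visible_after):
            self.t0, self.visible_after = time.time(), visible_after

    def send_probe(switch, rule):
        return time.time() - switch.t0 >= switch.visible_after

    def rule_matches_probe(rule, reply):
        return reply

    print(confirm_update(FakeSwitch(0.05), "dst=10.0.0.0/8 -> port1",
                         send_probe, rule_matches_probe))  # True after ~50 ms
    ```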

  • 31.
    Kuzniar, Maciej
    et al.
    EPFL.
    Peresini, Peter
    EPFL.
    Kostic, Dejan
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    What You Need to Know About SDN Flow Tables. 2015. In: Passive and Active Measurement (PAM 2015), Springer, 2015, p. 347-359. Conference paper (Refereed)
    Abstract [en]

    SDN deployments rely on switches that come from various vendors and differ in terms of performance and available features. Understanding these differences and performance characteristics is essential for ensuring successful deployments. In this paper we measure, report, and explain the performance characteristics of flow table updates in three hardware OpenFlow switches. Our results can help controller developers to make their programs efficient. Further, we also highlight differences between the OpenFlow specification and its implementations that, if ignored, pose a serious threat to network security and correctness.

  • 32. Kuzniar, Maciej
    et al.
    Peresini, Peter
    Kostic, Dejan
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab). KTH, School of Electrical Engineering and Computer Science (EECS).
    Canini, Marco
    KAUST.
    Methodology, Measurement and Analysis of Flow Table Update Characteristics in Hardware OpenFlow Switches. 2018. In: Computer Networks, ISSN 1389-1286, E-ISSN 1872-7069. Article in journal (Refereed)
    Abstract [en]

    Software-Defined Networking (SDN) and OpenFlow are actively being standardized and deployed. These deployments rely on switches that come from various vendors and differ in terms of performance and available features. Understanding these differences and performance characteristics is essential for ensuring successful and safe deployments.

    We propose a systematic methodology for SDN switch performance analysis and devise a series of experiments based on this methodology. The methodology sends a stream of rule updates while both observing the control plane view as reported by the switch and probing the data plane state, determining switch characteristics by comparing these two views. We measure, report and explain the performance characteristics of flow table updates in six hardware OpenFlow switches. Our results describing rule update rates can help SDN designers make their controllers efficient. Further, we also highlight differences between the OpenFlow specification and its implementations that, if ignored, pose a serious threat to network security and correctness.

  • 33. Kuzniar, Maciej
    et al.
    Peresini, Peter
    Vasic, Nedeljko
    Canini, Marco
    Kostic, Dejan
    IMDEA Networks Institute.
    Automatic Failure Recovery for Software-Defined Networks. 2013. In: Proceedings of the ACM SIGCOMM Workshop on Hot Topics in Software Defined Networking (HotSDN), Association for Computing Machinery (ACM), 2013, p. -160. Conference paper (Refereed)
    Abstract [en]

    Tolerating and recovering from link and switch failures are fundamental requirements of most networks, including Software-Defined Networks (SDNs). However, instead of traditional behaviors such as network-wide routing reconvergence, failure recovery in an SDN is determined by the specific software logic running at the controller. While this admits more freedom to respond to a failure event, it ultimately means that each controller application must include its own recovery logic, which makes the code more difficult to write and potentially more error-prone. In this paper, we propose a runtime system that automates failure recovery and enables network developers to write simpler, failure-agnostic code. To this end, upon detecting a failure, our approach first spawns a new controller instance that runs in an emulated environment consisting of the network topology excluding the failed elements. Then, it quickly replays inputs observed by the controller before the failure occurred, leading the emulated network into the forwarding state that accounts for the failed elements. Finally, it recovers the network by installing the difference ruleset between emulated and current forwarding states.

  • 34. Liu, S.
    et al.
    Steinert, R.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Control under Intermittent Network Partitions. 2018. In: 2018 IEEE International Conference on Communications (ICC), Institute of Electrical and Electronics Engineers (IEEE), 2018, article id 8422615. Conference paper (Refereed)
    Abstract [en]

    We propose a novel distributed leader election algorithm to deal with the controller and control service availability issues in programmable networks, such as Software Defined Networks (SDN) or programmable Radio Access Network (RAN). Our approach can deal with a wide range of network failures, especially intermittent network partitions, where splitting and merging of a network repeatedly occur. In contrast to traditional leader election algorithms that mainly focus on the (eventual) consensus on one leader, the proposed algorithm aims at optimizing control service availability and stability and reducing the controller state synchronization effort during intermittent network partitioning situations. To this end, we design a new framework that enables dynamic leader election based on real-time estimates acquired from statistical monitoring. With this framework, the proposed leader election algorithm has the capability of being flexibly configured to achieve different optimization objectives, while adapting to various failure patterns. Compared with two existing algorithms, our approach can significantly reduce the synchronization overhead (up to 12x) due to controller state updates, and maintain up to twice as many nodes under a controller.
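    The flavor of estimate-driven election can be sketched briefly. The scoring function below is invented for illustration (the paper optimizes availability, stability, and synchronization effort via its monitoring framework, not this formula): every node in a partition applies the same deterministic rule to the candidates it can reach.

    ```python
    def elect(metrics, reachable):
        """Pick the best reachable candidate; ties broken by node id."""
        def score(c):
            m = metrics[c]
            return (m["availability"] - m["sync_cost"], c)  # illustrative objective
        alive = [c for c in metrics if c in reachable]
        return max(alive, key=score) if alive else None

    metrics = {
        "n1": {"availability": 0.99, "sync_cost": 0.30},
        "n2": {"availability": 0.95, "sync_cost": 0.05},
        "n3": {"availability": 0.90, "sync_cost": 0.02},
    }
    print(elect(metrics, reachable={"n1", "n2", "n3"}))  # 'n2'
    print(elect(metrics, reachable={"n3"}))              # 'n3' after a partition
    ```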

  • 35. Liu, Shaoteng
    et al.
    Steinert, Rebecca
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Flexible distributed control plane deployment2018In: IEEE/IFIP Network Operations and Management Symposium: Cognitive Management in a Cyber World, NOMS 2018, Institute of Electrical and Electronics Engineers Inc., 2018, p. 1-7Conference paper (Refereed)
    Abstract [en]

    For large-scale programmable networks, flexible deployment of distributed control planes is essential for service availability and performance. However, existing approaches focus only on placing controllers, whereas the consequent control traffic is often ignored. In this paper, we propose a black-box optimization framework offering the additional steps for quantifying the effect of the consequent control traffic when deploying a distributed control plane. Evaluating different implementations of the framework over real-world topologies shows that close-to-optimal solutions can be achieved. Moreover, experiments indicate that running a controller placement method without considering the control traffic causes excessive bandwidth usage (worst cases ranging from 20.1% to 50.1% more) and congestion, compared to our approach.

  • 36. Novakovic, Dejan
    et al.
    Vasic, Nedeljko
    Novakovic, Stanko
    Kostic, Dejan
    IMDEA Networks Institute.
    Bianchini, Ricardo
    DeepDive: Transparently Identifying and Managing Performance Interference in Virtualized Environments2013In: Proceedings of The 2013 USENIX Annual Technical Conference, 2013Conference paper (Refereed)
    Abstract [en]

    We describe the design and implementation of DeepDive, a system for transparently identifying and managing performance interference between virtual machines (VMs) co-located on the same physical machine in Infrastructure-as-a-Service cloud environments. DeepDive successfully addresses several important challenges, including the lack of performance information from applications, and the large overhead of detailed interference analysis. We first show that it is possible to use easily-obtainable, low-level metrics to clearly discern when interference is occurring and what resource is causing it. Next, using realistic workloads, we show that DeepDive quickly learns about interference across co-located VMs. Finally, we show DeepDive’s ability to deal efficiently with interference when it is detected, by using a low-overhead approach to identifying a VM placement that alleviates interference.
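
    A minimal sketch of the detection idea, assuming a per-VM solo baseline of low-level metrics; the metric names and the 1.3x threshold are illustrative, not DeepDive's calibrated values.

    def interference_suspected(samples, baseline, threshold=1.3):
        """Flag interference when low-level metrics (e.g. cycles per
        instruction, cache misses per second) deviate from the VM's solo
        baseline. Both dicts map metric name -> value."""
        return any(samples[m] > threshold * baseline[m]
                   for m in baseline if m in samples)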

  • 37. Peresini, Peter
    et al.
    Kostic, Dejan
    IMDEA Networks Institute.
    Is the Network Capable of Computation?2013In: Proceedings of the 3rd International Workshop on Rigorous Protocol Engineering (WRiPE), IEEE conference proceedings, 2013, p. -6Conference paper (Refereed)
    Abstract [en]

    Ensuring correct network behavior is hard. Previous state of the art has demonstrated that analyzing a network containing middleboxes is hard. In this paper, we show that even using only statically configured switches, and asking the simplest possible question - “Will this concrete packet reach the destination?” - can make the problem intractable. Moreover, we demonstrate that this is a fundamental property because a network can perform arbitrary computations. Namely, we show how to emulate the Rule 110 cellular automaton using only basic network switches with simple features such as packet matching, header rewriting and round-robin load balancing. This ultimately means that analyzing dynamic network behavior can be as hard as analyzing an arbitrary program.
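
    For concreteness, this is the computation the network is shown to emulate: one synchronous step of Rule 110, where a cell's next state is bit (4l + 2c + r) of the constant 110. The paper realizes these steps with packet matching and header rewriting, not with software.

    def rule110_step(cells):
        """One synchronous update of Rule 110 on a circular row of 0/1
        cells: the next state is bit (4l + 2c + r) of the number 110."""
        n = len(cells)
        return [(110 >> (4 * cells[(i - 1) % n]
                         + 2 * cells[i]
                         + cells[(i + 1) % n])) & 1
                for i in range(n)]

    row = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1]
    for _ in range(3):
        row = rule110_step(row)  # the network emulates exactly these steps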

  • 38. Peresini, Peter
    et al.
    Kuzniar, Maciej
    Canini, Marco
    Kostic, Dejan
    IMDEA Networks Institute.
    ESPRES: Easy Scheduling and Prioritization for SDN2014In: Proceedings of the Open Networking Summit (ONS) Research Track, 2014Conference paper (Refereed)
  • 39.
    Peresini, Peter
    et al.
    EPFL.
    Kuzniar, Maciej
    EPFL.
    Canini, Marco
    UCLouvain.
    Venzano, Daniele
    EURECOM.
    Kostic, Dejan
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Rexford, Jennifer
    Princeton University.
    Systematically Testing OpenFlow Controller Applications2015In: Computer Networks, ISSN 1389-1286, E-ISSN 1872-7069, Vol. 92Article in journal (Refereed)
    Abstract [en]

    The emergence of OpenFlow-capable switches enables exciting new network functionality, at the risk of programming errors that make communication less reliable. The centralized programming model, where a single controller program manages the network, seems to reduce the likelihood of bugs. However, the system is inherently distributed and asynchronous, with events happening at different switches and end hosts, and inevitable delays affecting communication with the controller. In this paper, we present efficient, systematic techniques for testing unmodified controller programs. Our NICE tool applies model checking to explore the state space of the entire system—the controller, the switches, and the hosts. Scalability is the main challenge, given the diversity of data packets, the large system state, and the many possible event orderings. To address this, we propose a novel way to augment model checking with symbolic execution of event handlers (to identify representative packets that exercise code paths on the controller). We also present a simplified OpenFlow switch model (to reduce the state space), and effective strategies for generating event interleavings likely to uncover bugs. Our prototype tests Python applications on the popular NOX platform. In testing three real applications—a MAC-learning switch, in-network server load balancing, and energy-efficient traffic engineering—we uncover thirteen bugs.
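
    A brute-force sketch of the underlying check: explore every ordering of pending events against a safety invariant. NICE's contribution is taming exactly this blow-up with symbolic execution and a simplified switch model; all names below are illustrative.

    from itertools import permutations

    def explore(initial_state, events, apply_event, invariant):
        """Check a safety invariant over every ordering of pending events.
        This is the naive version of what NICE does far more cleverly."""
        violations = []
        for order in permutations(events):
            state = initial_state
            for ev in order:
                state = apply_event(state, ev)
                if not invariant(state):
                    violations.append((order, ev))
                    break
        return violations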

  • 40. Peresini, Peter
    et al.
    Kuzniar, Maciej
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Dynamic, Fine-Grained Data Plane Monitoring with Monocle2018In: IEEE/ACM Transactions on Networking, ISSN 1063-6692, E-ISSN 1558-2566, Vol. 26, no 1, p. 534-547Article in journal (Refereed)
    Abstract [en]

    Ensuring network reliability is important for satisfying service-level objectives. However, diagnosing network anomalies in a timely fashion is difficult due to the complex nature of network configurations. We present Monocle — a system that uncovers forwarding problems due to hardware or software failures in switches, by verifying that the data plane corresponds to the view that an SDN controller installs via the control plane. Monocle works by systematically probing the switch data plane; the probes are constructed by formulating the switch forwarding table logic as a Boolean satisfiability (SAT) problem. Our SAT formulation quickly generates probe packets targeting a particular rule, considering both existing and new rules. Monocle can monitor not only static flow tables (as is currently typically the case), but also dynamic networks with frequent flow table changes. Our evaluation shows that Monocle is capable of fine-grained monitoring for the majority of rules, and it can identify a rule suddenly missing from the data plane or misbehaving in a matter of seconds. In fact, Monocle uncovered problems with two of the hardware switches we used in our evaluation. Finally, during network updates Monocle helps controllers cope with switches that exhibit transient inconsistencies between their control and data plane states.
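
    A sketch of the probe-generation idea with the SAT solver replaced by a plain scan over a header space, to show what the formulation must find: a packet that hits the target rule and no higher-priority rule. The rule interface here is an assumption.

    def find_probe(rules, target, header_space):
        """Find a packet header that matches `target` and no higher-priority
        rule, so the switch's reaction to the probe reveals whether `target`
        is actually present in the data plane. `rules` is ordered by
        decreasing priority; a SAT solver replaces this exhaustive scan."""
        higher = rules[:rules.index(target)]
        for h in header_space:
            if target.matches(h) and not any(r.matches(h) for r in higher):
                return h
        return None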

  • 41.
    Peresini, Peter
    et al.
    EPFL.
    Kuzniar, Maciej
    EPFL.
    Kostic, Dejan
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Monocle: Dynamic, Fine-Grained Data Plane Monitoring2015In: Proceedings of the 11th International Conference on emerging Networking EXperiments and Technologies (ACM CoNEXT), Association for Computing Machinery (ACM), 2015Conference paper (Refereed)
    Abstract [en]

    Ensuring network reliability is important for satisfying service-level objectives. However, diagnosing network anomalies in a timely fashion is difficult due to the complex nature of network configurations. We present Monocle — a system that uncovers forwarding problems due to hardware or software failures in switches, by verifying that the data plane corresponds to the view that an SDN controller installs via the control plane. Monocle works by systematically probing the switch data plane; the probes are constructed by formulating the switch forwarding table logic as a Boolean satisfiability (SAT) problem. Our SAT formulation quickly generates probe packets targeting a particular rule, considering both existing and new rules. Monocle can monitor not only static flow tables (as is currently typically the case), but also dynamic networks with frequent flow table changes. Our evaluation shows that Monocle is capable of fine-grained monitoring for the majority of rules, and it can identify a rule suddenly missing from the data plane or misbehaving in a matter of seconds. Also, during network updates Monocle helps controllers cope with switches that exhibit transient inconsistencies between their control and data plane states.

  • 42. Peresini, Peter
    et al.
    Kuzniar, Maciej
    Kostic, Dejan
    IMDEA Networks Institute.
    OpenFlow Needs You! A Call for a Discussion About a Cleaner OpenFlow API2013In: Proceedings of the 2nd European Workshop on Software Defined Networks (EWSDN), 2013Conference paper (Refereed)
    Abstract [en]

    Software defined networks are poised to dramatically simplify deployment and management of networks. OpenFlow, in particular, is becoming popular and is starting to be deployed. While the definition of the “northbound” API that can be used by new services to interact with an OpenFlow controller is receiving considerable attention, the traditional “southbound” API that is used to program OpenFlow switches is far from perfect. In this paper, we analyze the current OpenFlow API and its usage in several controllers and show semantic differences between the intended and actual use. Thus, we argue for making the OpenFlow API clean and simple. In particular, we propose to mimic the process that exists in the Python community for deriving changes that result in, preferably, only one obvious way of performing a task. Toward this end, we propose three OpenFlow Enhancement Proposals: i) providing positive acknowledgment, ii) informing the controller about “silent” modifications, and iii) providing a partial order synchronization primitive.
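
    A sketch of what proposal i) would look like from a controller's perspective; today the closest approximation is a barrier request, which only orders control-plane processing and does not confirm that a rule took effect. The conn object and its methods are hypothetical.

    def install_with_ack(conn, flow_mod, xid):
        """Install a rule and wait for confirmation. A barrier reply only
        confirms control-plane processing order; proposal i) would turn
        this wait into a true positive acknowledgment that the rule is
        in effect."""
        conn.send_flow_mod(flow_mod, xid=xid)
        conn.send_barrier_request(xid=xid)
        reply = conn.wait_for(xid)  # barrier reply or, ideally, a real ACK
        return reply.ok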

  • 43. Peresini, Peter
    et al.
    Kuzniar, Maciej
    Kostic, Dejan
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Rule-Level Data Plane Monitoring With Monocle2015In: Computer communication review, ISSN 0146-4833, E-ISSN 1943-5819, Vol. 45, no 4, p. 595-596Article in journal (Refereed)
    Abstract [en]

    We present Monocle, a system that systematically monitors the network data plane, and verifies that it corresponds to the view that the SDN controller builds and tries to enforce in the switches. Our evaluation shows that Monocle is capable of fine-grained per-rule monitoring for the majority of rules. In addition, it can help controllers to cope with switches that exhibit transient inconsistencies between their control plane and data plane states.

  • 44. Peresini, Peter
    et al.
    Kuzniar, Maciej
    Vasic, Nedeljko
    Canini, Marco
    Kostic, Dejan
    EPFL.
    OF.CPP: Consistent Packet Processing for OpenFlow2013In: Proceedings of the ACM SIGCOMM Workshop on Hot Topics in Software Defined Networking (HotSDN), Association for Computing Machinery (ACM), 2013, p. -102Conference paper (Refereed)
    Abstract [en]

    This paper demonstrates a new class of bugs that is likely to occur in enterprise OpenFlow deployments. In particular, step-by-step, reactive establishment of paths can cause network-wide inconsistencies or performance- and space-related inefficiencies. The cause for this behavior is inconsistent packet processing: as the packets travel through the network they do not encounter consistent state at the OpenFlow controller. To mitigate this problem, we propose to use transactional semantics at the controller to achieve consistent packet processing. We detail the challenges in achieving this goal (including the inability to directly apply database techniques), as well as a potentially promising approach. In particular, we envision the use of multi-commit transactions that could provide the necessary serialization and isolation properties without excessively reducing network performance.
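
    A sketch of consistent packet processing via optimistic transactions; the transaction API is assumed for illustration and is simpler than the multi-commit transactions the paper envisions.

    def handle_packet_in(controller, pkt):
        """Run the application's packet-in logic inside a transaction, so
        concurrent packet-ins observe a consistent controller state; on a
        conflicting commit, re-execute against the fresh state."""
        while True:
            txn = controller.begin_transaction()      # snapshot of state
            actions = controller.app_logic(txn, pkt)  # reads/writes via txn
            if txn.commit():                          # fails on conflict
                return actions                        # then program switches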

  • 45.
    Peresini, Peter
    et al.
    EPFL.
    Kuzniar, Maciej
    EPFL.
    Canini, Marco
    Université catholique de Louvain.
    Kostic, Dejan
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS.
    ESPRES: Transparent SDN Update Scheduling2014In: Proceedings of the Workshop on Hot Topics in Software Defined Networking (HotSDN), Association for Computing Machinery (ACM), 2014Conference paper (Refereed)
    Abstract [en]

    Network forwarding state undergoes frequent changes, in batches of forwarding rule modifications at multiple switches. Installing or modifying a large number of rules is time-consuming given the performance limits of current programmable switches, limits that stem from economic as well as technological factors.

    In this paper, we observe that a large network-state update typically consists of a set of sub-updates that are independent of one another w.r.t. the traffic they affect, and hence sub-updates can be installed in parallel, in any order. Leveraging this observation, we treat update installation as a scheduling problem and design ESPRES, a runtime mechanism that rate-limits and reorders updates to fully utilize processing capacities of switches without overloading them. Our early results show that, compared to using no scheduler, our schemes complete the 20th percentile of sub-updates 2.17-3.88 times quicker and the 50th percentile 1.27-1.57 times quicker.
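
    A minimal sketch of the scheduling idea, assuming each sub-update is a deque of (switch, flow_mod) pairs and that send and wait_for_ack are I/O hooks supplied by the runtime; ESPRES's actual policies are richer than this round-robin.

    from collections import deque

    def schedule(sub_updates, capacity, send, wait_for_ack):
        """Interleave independent sub-updates round-robin, keeping at most
        capacity[sw] unacknowledged operations at each switch."""
        pending = {sw: 0 for sw in capacity}
        queue = deque(u for u in sub_updates if u)
        while queue:
            upd = queue.popleft()
            sw, mod = upd[0]
            if pending[sw] < capacity[sw]:
                upd.popleft()
                pending[sw] += 1
                send(sw, mod)
            else:
                pending[wait_for_ack()] -= 1  # block until a switch drains
            if upd:
                queue.append(upd)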

  • 46.
    Reda, Waleed
    et al.
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Canini, Marco
    KAUST.
    Suresh, Lalith
    VMware Research.
    Kostic, Dejan
    KTH, School of Information and Communication Technology (ICT), Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Braithwaite, Sean
    Soundcloud.
    Rein: Taming Tail Latency in Key-Value Stores via Multiget Scheduling2017Conference paper (Refereed)
    Abstract [en]

    We tackle the problem of reducing tail latencies in distributed key-value stores, such as the popular Cassandra database. We focus on workloads of multiget requests, which batch together access to several data elements and parallelize read operations across the data store machines. We first analyze a production trace of a real system and quantify the skew due to multiget sizes, key popularity, and other factors. We then proceed to identify opportunities for reduction of tail latencies by recognizing the composition of aggregate requests and by carefully scheduling bottleneck operations that can otherwise create excessive queues. We design and implement a system called Rein, which reduces latency via inter-multiget scheduling using low overhead techniques. We extensively evaluate Rein via experiments in Amazon Web Services (AWS) and simulations. Our scheduling algorithms reduce the median, 95th, and 99th percentile latencies by factors of 1.5, 1.5, and 1.9, respectively.
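
    One simple inter-multiget policy in this spirit, sketched below: always advance the multiget with the fewest keys remaining, so a large fan-out request cannot queue ahead of many near-complete ones. Rein's actual scheduler is more elaborate and bottleneck-aware.

    import heapq

    def order_reads(multigets):
        """Emit (request index, key) pairs, smallest remaining multiget
        first. multigets: list of key lists."""
        heap = [(len(keys), i, list(keys))
                for i, keys in enumerate(multigets) if keys]
        heapq.heapify(heap)
        order = []
        while heap:
            _, i, keys = heapq.heappop(heap)
            order.append((i, keys.pop(0)))  # issue one read of request i
            if keys:
                heapq.heappush(heap, (len(keys), i, keys))
        return order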

  • 47. Rodriguez, Adolfo
    et al.
    Killian, Charles
    Bhat, Sooraj
    Kostic, Dejan
    Duke.
    Vahdat, Amin
    MACEDON: methodology for automatically creating, evaluating, and designing overlay networks2004In: Proceedings of the First Symposium on Networked Systems Design and Implementation (NSDI ’04), 2004Conference paper (Refereed)
    Abstract [en]

    Currently, researchers designing and implementing large-scale overlay services employ disparate techniques at each stage in the production cycle: design, implementation, experimentation, and evaluation. As a result, complex and tedious tasks are often duplicated, leading to ineffective resource use and difficulty in fairly comparing competing algorithms. In this paper, we present MACEDON, an infrastructure that provides facilities to: i) specify distributed algorithms in a concise domain-specific language; ii) generate code that executes in popular evaluation infrastructures and in live networks; iii) leverage an overlay-generic API to simplify the interoperability of algorithm implementations and applications; and iv) enable consistent experimental evaluation. We have used MACEDON to implement and evaluate a number of algorithms, including AMMO, Bullet, Chord, NICE, Overcast, Pastry, Scribe, and SplitStream, typically with only a few hundred lines of MACEDON code. Using our infrastructure, we are able to accurately reproduce or exceed published results and behavior demonstrated by current publicly available implementations.

  • 48. Rodriguez, Adolfo
    et al.
    Kostic, Dejan
    Duke.
    Vahdat, Amin
    Scalability in adaptive multi-metric overlays2004In: Proceedings of the 2nd IEEE International Conference on Distributed Computing Systems, IEEE conference proceedings, 2004, p. -121Conference paper (Refereed)
    Abstract [en]

    Increasing application requirements have placed heavy emphasis on building overlay networks to efficiently deliver data to multiple receivers. A key performance challenge is simultaneously achieving adaptivity to changing network conditions and scalability to large numbers of users. In addition, most current algorithms focus on a single performance metric, such as delay or bandwidth, particular to individual application requirements. We introduce a two-fold approach for creating robust, high-performance overlays called adaptive multi-metric overlays (AMMO). First, AMMO uses an adaptive, highly-parallel, and metric-independent protocol, TreeMaint, to build and maintain overlay trees. Second, AMMO provides a mechanism for comparing overlay edges along specified application performance goals to guide TreeMaint transformations. We have used AMMO to implement and evaluate a single-metric (bandwidth-optimized) tree similar to Overcast and a two-metric (delay-constrained, cost-optimized) overlay.
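
    A minimal sketch of the metric-independence idea: tree maintenance only needs to compare candidate overlay edges, so the metric becomes a pluggable predicate. The function and its arguments are illustrative, not TreeMaint's interface.

    def best_parent(node, candidates, better):
        """Choose a parent for `node` using only an edge comparison
        predicate better(edge_a, edge_b), so the same maintenance logic
        serves bandwidth, delay, or cost metrics."""
        choice = None
        for parent in candidates:
            edge = (parent, node)
            if choice is None or better(edge, choice):
                choice = edge
        return choice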

  • 49.
    Roozbeh, Amir
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Communication Systems, CoS. Ericsson AB, Ericsson Res, S-16483 Stockholm, Sweden..
    Soares, Joao
    Ericsson AB, Ericsson Res, S-16483 Stockholm, Sweden..
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Communication Systems, CoS.
    Wuhib, Fetahi
    Ericsson AB, Ericsson Res, S-16483 Stockholm, Sweden..
    Padala, Chakri
    Ericsson AB, Ericsson Res, S-16483 Stockholm, Sweden..
    Mahloo, Mozhgan
    Ericsson AB, Ericsson Res, S-16483 Stockholm, Sweden..
    Turull, Daniel
    Ericsson AB, Ericsson Res, S-16483 Stockholm, Sweden..
    Yadhav, Vinay
    Ericsson AB, Ericsson Res, S-16483 Stockholm, Sweden..
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Communication Systems, CoS.
    Software-Defined "Hardware" Infrastructures: A Survey on Enabling Technologies and Open Research Directions2018In: IEEE Communications Surveys and Tutorials, ISSN 1553-877X, E-ISSN 1553-877X, Vol. 20, no 3, p. 2454-2485Article in journal (Refereed)
    Abstract [en]

    This paper provides an overview of software-defined "hardware" infrastructures (SDHI). SDHI builds upon the concept of hardware (HW) resource disaggregation. HW resource disaggregation breaks today's physical server-oriented model where the use of a physical resource (e.g., processor or memory) is constrained to a physical server's chassis. SDHI extends the definition of software-defined infrastructures (SDI) and brings greater modularity, flexibility, and extensibility to cloud infrastructures, thus allowing cloud operators to employ resources more efficiently and allowing applications not to be bound by the physical infrastructure's layout. This paper aims to be an initial introduction to SDHI and its associated technological advancements. This paper starts with an overview of the cloud domain and puts into perspective some of the most prominent efforts in the area. Then, it presents a set of differentiating use-cases that SDHI enables. Next, we state the fundamentals behind SDI and SDHI, and elaborate why SDHI is of great interest today. Moreover, it provides an overview of the functional architecture of a cloud built on SDHI, exploring how this transformation reaches far beyond the cloud infrastructure level, affecting platforms, execution environments, and applications. Finally, an in-depth assessment is made of the technologies behind SDHI, the impact of these technologies, and the associated challenges and potential future directions of SDHI.

  • 50. Schubert, Simon
    et al.
    Kostic, Dejan
    EPFL.
    Zwaenepoel, Willy
    Shin, Kang
    Profiling Software for Energy Consumption2012In: Proceedings of the IEEE International Conference on Green Computing and Communications (GreenCom), IEEE conference proceedings, 2012, p. -522Conference paper (Refereed)
    Abstract [en]

    The amount of energy consumed by computer systems can be lowered through the use of more efficient algorithms and software. Unfortunately, software developers lack the tools to pinpoint energy-hungry sections in their code and therefore have to rely on their intuition when trying to optimize their code for energy consumption. We have developed eprof, a profiler that relates energy consumption to code locations. It attributes both the synchronously consumed energy in the CPU and the asynchronously consumed energy in peripheral devices such as hard drives and network cards. Eprof requires minimal changes to the kernel (tens of lines of code) and does not require special hardware to energy-profile software. Therefore, eprof can be widely used to help developers make energy-aware decisions.
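
    A minimal sketch of the attribution idea, assuming power samples already split into synchronous (CPU) and asynchronous (device) components and tagged with a code location; eprof's actual accounting of in-flight device activity is more involved.

    def attribute_energy(samples):
        """Sum per-sample energy into a per-code-location total. Each
        sample is (location, cpu_watts, device_watts, seconds): CPU power
        is charged to the running code, device power to the code that
        initiated the I/O."""
        energy = {}
        for loc, cpu_w, dev_w, dt in samples:
            energy[loc] = energy.get(loc, 0.0) + (cpu_w + dev_w) * dt
        return energy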
