kth.se Publications
1 - 37 of 37
  • 1.
    Anderson, Thomas
    et al.
    University of Washington.
    Canini, Marco
    KAUST.
    Kim, Jongyul
    KAIST.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Kwon, Youngjin
    KAIST.
    Peter, Simon
    The University of Texas at Austin.
    Reda, Waleed
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Schuh, Henry
    University of Washington.
    Witchel, Emmett
    The University of Texas at Austin.
    Assise: Performance and Availability via Client-local NVM in a Distributed File System. 2020. In: / [ed] USENIX Association, USENIX - The Advanced Computing Systems Association, 2020, p. 1011-1027. Conference paper (Refereed)
    Abstract [en]

    The adoption of low latency persistent memory modules (PMMs) upends the long-established model of remote storage for distributed file systems. Instead, by colocating computation with PMM storage, we can provide applications with much higher IO performance, sub-second application failover, and strong consistency. To demonstrate this, we built the Assise distributed file system, based on a persistent, replicated coherence protocol that manages client-local PMM as a linearizable and crash-recoverable cache between applications and slower (and possibly remote) storage. Assise maximizes locality for all file IO by carrying out IO on process-local, socket-local, and client-local PMM whenever possible. Assise minimizes coherence overhead by maintaining consistency at IO operation granularity, rather than at fixed block sizes.

    We compare Assise to Ceph/BlueStore, NFS, and Octopus on a cluster with Intel Optane DC PMMs and SSDs for common cloud applications and benchmarks, such as LevelDB, Postfix, and FileBench. We find that Assise improves write latency up to 22x, throughput up to 56x, fail-over time up to 103x, and scales up to 6x better than its counterparts, while providing stronger consistency semantics.

  • 2.
    Antichi, Gianni
    et al.
    Castro, Ignacio
    Chiesa, Marco
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab). Université catholique de Louvain.
    Fernandes, Eder L.
    Lapeyrade, Remy
    Kopp, Daniel
    Han, Jong Hun
    Bruyere, Marc
    Dietzel, Christoph
    Gusat, Mitchell
    Moore, Andrew W.
    Owezarski, Philippe
    Uhlig, Steve
    Canini, Marco
    ENDEAVOUR: A Scalable SDN Architecture For Real-World IXPs. 2017. In: IEEE Journal on Selected Areas in Communications, ISSN 0733-8716, E-ISSN 1558-0008, Vol. 35, no 11, p. 2553-2562. Article in journal (Refereed)
    Abstract [en]

    Innovation in interdomain routing has remained stagnant for over a decade. Recently, Internet eXchange Points (IXPs) have emerged as economically advantageous interconnection points for reducing path latencies and exchanging ever-increasing traffic volumes among, possibly, hundreds of networks. Given their far-reaching implications on interdomain routing, IXPs are the ideal place to foster network innovation and extend the benefits of software defined networking (SDN) to the interdomain level. In this paper, we present, evaluate, and demonstrate ENDEAVOUR, an SDN platform for IXPs. ENDEAVOUR can be deployed on a multi-hop IXP fabric, supports a large number of use cases, and is highly scalable, while avoiding broadcast storms. Our evaluation with real data from one of the largest IXPs demonstrates the benefits and scalability of our solution: ENDEAVOUR requires around 70% fewer rules than alternative SDN solutions thanks to our rule partitioning mechanism. In addition, by providing an open source solution, we invite everyone from the community to experiment with (and improve) our implementation as well as adapt it to new use cases.

  • 3.
    Barbette, Tom
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Chiesa, Marco
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    Stateless CPU-aware datacenter load-balancing. 2020. In: Poster: Stateless CPU-aware datacenter load-balancing, Association for Computing Machinery (ACM), 2020, p. 548-549. Conference paper (Refereed)
    Abstract [en]

    Today, datacenter operators deploy Load-balancers (LBs) to efficiently utilize server resources, but must over-provision server resources (by up to 30%) because of load imbalances and the desire to bound tail service latency. We posit one of the reasons for these imbalances is the lack of per-core load statistics in existing LBs. As a first step, we designed CrossRSS, a CPU core-aware LB that dynamically assigns incoming connections to the least loaded cores in the server pool. CrossRSS leverages knowledge of the dispatching by each server's Network Interface Card (NIC) to specific cores to reduce imbalances by more than an order of magnitude compared to existing LBs in a proof-of-concept datacenter environment, processing 12% more packets with the same number of cores.

  • 4.
    Barbette, Tom
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Katsikas, Georgios P.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Radio Systems Laboratory (RS Lab).
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    RSS++: load and state-aware receive side scaling. 2019. In: Proceedings of the 15th International Conference on emerging Networking EXperiments and Technologies / [ed] ACM, Orlando, FL, USA: Association for Computing Machinery (ACM), 2019. Conference paper (Refereed)
    Abstract [en]

    While the current literature typically focuses on load-balancing among multiple servers, in this paper, we demonstrate the importance of load-balancing within a single machine (potentially with hundreds of CPU cores). In this context, we propose a new load-balancing technique (RSS++) that dynamically modifies the receive side scaling (RSS) indirection table to spread the load across the CPU cores in a more optimal way. RSS++ incurs up to 14x lower 95th percentile tail latency and orders of magnitude fewer packet drops compared to RSS under high CPU utilization. RSS++ allows higher CPU utilization and dynamic scaling of the number of allocated CPU cores to accommodate the input load, while avoiding the typical 25% over-provisioning. RSS++ has been implemented for both (i) DPDK and (ii) the Linux kernel. Additionally, we implement a new state migration technique, which facilitates sharding and reduces contention between CPU cores accessing per-flow data. RSS++ keeps the flow-state by groups that can be migrated at once, leading to a 20% higher efficiency than a state of the art shared flow table.
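
    The idea of rewriting the RSS indirection table to even out per-core load can be sketched as follows. This is a simple greedy heuristic chosen for illustration, not the algorithm from the paper; the function name `rebalance` and the per-bucket load counters are assumptions.

```python
from typing import List


def rebalance(table: List[int], bucket_load: List[float], n_cores: int) -> List[int]:
    """Greedily reassign indirection-table buckets from the busiest core to the idlest."""
    table = list(table)  # work on a copy, leave the caller's table untouched
    while True:
        core_load = [0.0] * n_cores
        for bucket, core in enumerate(table):
            core_load[core] += bucket_load[bucket]
        hot = max(range(n_cores), key=core_load.__getitem__)
        cold = min(range(n_cores), key=core_load.__getitem__)
        gap = core_load[hot] - core_load[cold]
        # Only buckets lighter than the gap can narrow it when moved.
        movable = [b for b, c in enumerate(table)
                   if c == hot and 0 < bucket_load[b] < gap]
        if not movable:
            return table
        # Pick the bucket that leaves the two cores closest to each other.
        best = min(movable, key=lambda b: abs(gap - 2 * bucket_load[b]))
        table[best] = cold


# Eight buckets over two cores (hypothetical load counters):
# core 0 starts with 20 units of load, core 1 with 4.
old_table = [0, 1, 0, 1, 0, 1, 0, 1]
load = [8, 1, 6, 1, 4, 1, 2, 1]
new_table = rebalance(old_table, load, n_cores=2)
```

    In this toy input a single move (bucket 0 to core 1) leaves both cores with 12 units of load. A real implementation would also have to migrate the flow state that belongs to the moved bucket.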

  • 5.
    Barbette, Tom
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Soldani, Cyril
    Université de Liège.
    Mathy, Laurent
    Université de Liège.
    Combined stateful classification and session splicing for high-speed NFV service chaining. 2021. In: IEEE/ACM Transactions on Networking, ISSN 1063-6692, E-ISSN 1558-2566. Article in journal (Refereed)
    Abstract [en]

    Network functions such as firewalls, NAT, DPI, content-aware optimizers, and load-balancers are increasingly realized as software to reduce costs and enable outsourcing. To meet performance requirements these virtual network functions (VNFs) often bypass the kernel and use their own user-space networking stack. A naïve realization of a chain of VNFs will exchange raw packets, leading to many redundant operations, wasting resources. In this work, we design a system to execute a pipeline of VNFs. We provide the user facilities to define (i) a traffic class of interest for the VNF, (ii) a session to group the packets (such as the TCP 4-tuple), and (iii) the amount of space per session. The system synthesizes a classifier and builds an efficient flow table that when possible will automatically be partially offloaded and accelerated by the network interface. We utilize an abstract view of flows to support seamless inspection and modification of the content of any flow (such as TCP or HTTP). By applying only surgical modifications to the protocol headers, we avoid the need for a complex, hard-to-maintain user-space TCP stack and can chain multiple VNFs without re-constructing the stream multiple times, allowing up to 5x improvement over standard approaches.

  • 6.
    Barbette, Tom
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Tang, Chen
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    Yao, Haoran
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    Papadimitratos, Panagiotis
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    Chiesa, Marco
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    A High-Speed Load-Balancer Design with Guaranteed Per-Connection-Consistency. 2020. In: Proceedings of the 17th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2020 / [ed] USENIX Association, Santa Clara, CA, USA: USENIX Association, 2020, p. 667-683. Conference paper (Refereed)
    Abstract [en]

    Large service providers use load balancers to dispatch millions of incoming connections per second towards thousands of servers. There are two basic yet critical requirements for a load balancer: uniform load distribution of the incoming connections across the servers and per-connection-consistency (PCC), i.e., the ability to map packets belonging to the same connection to the same server even in the presence of changes in the number of active servers and load balancers. Yet, meeting both these requirements at the same time has been an elusive goal. Today's load balancers minimize PCC violations at the price of non-uniform load distribution.

    This paper presents Cheetah, a load balancer that supports uniform load distribution and PCC while being scalable, memory efficient, resilient to clogging attacks, and fast at processing packets. The Cheetah LB design guarantees PCC for any realizable server selection load balancing mechanism and can be deployed in both a stateless and stateful manner, depending on the operational needs. We implemented Cheetah on both a software and a Tofino-based hardware switch. Our evaluation shows that a stateless version of Cheetah guarantees PCC, has negligible packet processing overheads, and can support load balancing mechanisms that reduce the flow completion time by a factor of 2–3×.
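
    To see why per-connection-consistency is hard to keep when the server pool changes, the sketch below counts how many connections a plain hash-mod-N dispatcher remaps after one server is added. It illustrates the problem only and is not Cheetah's mechanism; the connection identifiers and pool sizes are made up.

```python
import hashlib


def pick_server(conn: str, n_servers: int) -> int:
    """Map a connection identifier onto one of n_servers with hash-mod-N."""
    digest = hashlib.sha256(conn.encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_servers


# Hypothetical connection identifiers.
connections = [f"client-{i}:{40000 + i}->service:443" for i in range(1000)]

before = {c: pick_server(c, 10) for c in connections}
after = {c: pick_server(c, 11) for c in connections}  # one server added

# Every connection whose server changed is a PCC violation.
violations = sum(before[c] != after[c] for c in connections)
print(f"{violations} of {len(connections)} connections changed server")
```

    With hash-mod-N, most existing connections land on a different server after the pool grows, which is the kind of violation a PCC-preserving load balancer has to rule out.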

  • 7.
    Bogdanov, Kirill
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Enabling Fast and Accurate Run-Time Decisions in Geo-Distributed Systems: Better Achieving Service Level Objectives. 2018. Doctoral thesis, monograph (Other academic)
    Abstract [en]

    Computing services are highly integrated into modern society and used by millions of people daily. To meet these high demands, many popular services are implemented and deployed as geo-distributed applications on top of third-party virtualized cloud providers. However, the nature of such a deployment leads to variable performance. To deliver high quality of service, these systems strive to adapt to ever-changing conditions by monitoring changes in state and making informed run-time decisions, such as choosing server peering, replica placement, and redirection of requests. In this dissertation, we seek to improve the quality of run-time decisions made by geo-distributed systems. We attempt to achieve this through: (1) a better understanding of the underlying deployment conditions, (2) systematic and thorough testing of the decision logic implemented in these systems, and (3) by providing a clear view of the network and system states allowing services to make better-informed decisions.

    First, we validate an application’s decision logic used in popular storage systems by examining replica selection algorithms. We do this by introducing GeoPerf, a tool that uses symbolic execution and modeling to perform systematic testing of replica selection algorithms. GeoPerf was used to test two popular storage systems and found one bug in each.

    Then, using measurements across EC2, we observed persistent correlation between network paths and network latency. Based on these observations, we introduce EdgeVar, a tool that decouples routing and congestion based changes in network latency. This additional information improves estimation of latency, as well as increases the stability of network path selection.

    Next, we introduce Tectonic, a tool that tracks an application’s requests and responses both at the user and kernel levels. In combination with EdgeVar, it decouples end-to-end request completion time into three components of network routing, network congestion, and service time.

    Finally, we demonstrate how this decoupling of request completion time components can be leveraged in practice by developing Kurma, a fast and accurate load balancer for geo-distributed storage systems. At runtime, Kurma integrates network latency and service time distributions to accurately estimate the rate of Service Level Objective (SLO) violations, for requests redirected between geo-distributed datacenters. Using real-world data, we demonstrate Kurma’s ability to effectively share load among datacenters while reducing SLO violations by a factor of up to 3 in high load settings or reducing the cost of running the service by up to 17%. The techniques described in this dissertation are important for current and future geo-distributed services that strive to provide the best quality of service to customers while minimizing the cost of operating the service.
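
    Combining a network latency distribution with a service time distribution to estimate an SLO violation rate can be sketched as below. The sketch assumes the two quantities are independent and represented by empirical samples; the function name, the sample values, and the 40 ms SLO are hypothetical and not taken from the thesis.

```python
from itertools import product
from typing import Sequence


def slo_violation_rate(net_ms: Sequence[float],
                       svc_ms: Sequence[float],
                       slo_ms: float) -> float:
    """Fraction of (network, service) sample pairs whose sum exceeds the SLO."""
    # Pairing every network sample with every service sample is the
    # empirical convolution of the two distributions (independence assumed).
    pairs = list(product(net_ms, svc_ms))
    return sum(n + s > slo_ms for n, s in pairs) / len(pairs)


# Hypothetical samples in milliseconds.
rate = slo_violation_rate([10, 20, 30, 40], [5, 15], slo_ms=40)
```

    Here three of the eight pairs exceed 40 ms, so the estimated violation rate is 0.375. A load balancer could compare such estimates across candidate datacenters before redirecting requests.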

  • 8.
    Chiesa, Marco
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab). Université Catholique de Louvain, Belgium.
    Demmler, D.
    Canini, M.
    Schapira, M.
    Schneider, T.
    SIXPACK: Securing internet eXchange points against curious onlookers. 2017. In: CoNEXT 2017 - Proceedings of the 2017 13th International Conference on emerging Networking EXperiments and Technologies, Association for Computing Machinery (ACM), 2017, p. 120-133. Conference paper (Refereed)
    Abstract [en]

    Internet eXchange Points (IXPs) play an ever-growing role in Internet inter-connection. To facilitate the exchange of routes amongst their members, IXPs provide Route Server (RS) services to dispatch the routes according to each member's peering policies. Nowadays, to make use of RSes, these policies must be disclosed to the IXP. This poses fundamental questions regarding the privacy guarantees of route-computation on confidential business information. Indeed, as evidenced by interaction with IXP administrators and a survey of network operators, this state of affairs raises privacy concerns among network administrators and even deters some networks from subscribing to RS services. We design sixpack, an RS service that leverages Secure Multi-Party Computation (SMPC) to keep peering policies confidential, while extending the functionalities of today's RSes. As SMPC is notoriously heavy in terms of communication and computation, our design and implementation of sixpack aim at moving computation outside of the SMPC without compromising the privacy guarantees. We assess the effectiveness and scalability of our system by evaluating a prototype implementation using traces of data from one of the largest IXPs in the world. Our evaluation results indicate that sixpack can scale to support privacy-preserving route-computation, even at IXPs with many hundreds of member networks.

  • 9.
    Chiesa, Marco
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Retvari, Gabor
    MTA BME Informat Syst Res Grp, H-1521 Budapest, Hungary.
    Schapira, Michael
    Hebrew Univ Jerusalem, IL-9190401 Jerusalem, Israel.
    Oblivious Routing in IP Networks. 2018. In: IEEE/ACM Transactions on Networking, ISSN 1063-6692, E-ISSN 1558-2566, Vol. 26, no 3, p. 1292-1305. Article in journal (Refereed)
    Abstract [en]

    To optimize the flow of traffic in IP networks, operators do traffic engineering (TE), i.e., tune routing-protocol parameters in response to traffic demands. TE in IP networks typically involves configuring static link weights and splitting traffic between the resulting shortest-paths via the equal-cost-multipath (ECMP) mechanism. Unfortunately, ECMP is a notoriously cumbersome and indirect means for optimizing traffic flow, often leading to poor network performance. Also, obtaining accurate knowledge of traffic demands as the input to TE is a non-trivial task that may require additional monitoring infrastructure, and traffic conditions can be highly variable, further complicating TE. We leverage recently proposed schemes for increasing ECMP's expressiveness via carefully disseminated bogus information (lies) to design COYOTE, a readily deployable TE scheme for robust and efficient network utilization. COYOTE leverages new algorithmic ideas to configure (static) traffic splitting ratios that are optimized with respect to all (even adversarial) traffic scenarios within the operator's "uncertainty bounds". Our experimental analyses show that COYOTE significantly outperforms today's prevalent TE schemes in a manner that is robust to traffic uncertainty and variation. We discuss experiments with a prototype implementation of COYOTE.

  • 10.
    Chiesa, Marco
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Sedar, Roshan
    Universitat Politècnica de Catalunya.
    Antichi, Gianni
    Queen Mary, University of London.
    Borokhovich, Michael
    Independent Researcher.
    Kamisiński, Andrzej
    AGH University of Science and Technology in Kraków.
    Nikolaidis, Georgios
    Barefoot Networks.
    Schmid, Stefan
    University of Vienna.
    PURR: A Primitive for Reconfigurable Fast Reroute (hope for the best and program for the worst). 2019. In: International Conference on emerging Networking EXperiments and Technologies, 2019 / [ed] ACM, 2019. Conference paper (Refereed)
    Abstract [en]

    Highly dependable communication networks usually rely on some kind of Fast Re-Route (FRR) mechanism, which allows traffic to be re-routed quickly upon failures, entirely in the data plane. This paper studies the design of FRR mechanisms for emerging reconfigurable switches.

    Our main contribution is an FRR primitive for programmable data planes, PURR, which provides low failover latency and high switch throughput, by avoiding packet recirculation. PURR tolerates multiple concurrent failures and comes with minimal memory requirements, ensuring compact forwarding tables, by unveiling an intriguing connection to classic "string theory" (i.e., stringology), and in particular, the shortest common supersequence problem. PURR is well-suited for high-speed match/action forwarding architectures (e.g., PISA) and supports the implementation of arbitrary network-wide FRR mechanisms. Our simulations and prototype implementation (on an FPGA and Tofino) show that PURR improves TCAM memory occupancy by a factor of 1.5x to 10.8x compared to a naïve encoding when implementing state-of-the-art FRR mechanisms. PURR also improves the latency and throughput of datacenter traffic up to a factor of 2.8x to 5.5x and 1.2x to 2x, respectively, compared to approaches based on recirculating packets.
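
    For readers unfamiliar with the shortest common supersequence (SCS) problem mentioned above, the textbook dynamic program for two sequences is sketched below. It shows the underlying string problem only, not PURR's encoding; treating each character as a port label is an assumption made for illustration.

```python
def shortest_common_supersequence(a: str, b: str) -> str:
    """Return one shortest string that contains both a and b as subsequences."""
    m, n = len(a), len(b)
    # dp[i][j] = length of the SCS of the suffixes a[i:] and b[j:].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m, -1, -1):
        for j in range(n, -1, -1):
            if i == m:
                dp[i][j] = n - j
            elif j == n:
                dp[i][j] = m - i
            elif a[i] == b[j]:
                dp[i][j] = 1 + dp[i + 1][j + 1]
            else:
                dp[i][j] = 1 + min(dp[i + 1][j], dp[i][j + 1])
    # Walk the table from the front to rebuild one optimal supersequence.
    out = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] <= dp[i][j + 1]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return "".join(out)


# Two hypothetical backup-port orderings, one character per port.
merged = shortest_common_supersequence("ABC", "BCD")
```

    Both orderings "ABC" and "BCD" are subsequences of the four-symbol result "ABCD", so one shared sequence can stand in for two separate ones.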

  • 11.
    Farshin, Alireza
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Realizing Low-Latency Internet Services via Low-Level Optimization of NFV Service Chains: Every nanosecond counts! 2019. Licentiate thesis, monograph (Other academic)
    Abstract [en]

    By virtue of the recent technological developments in cloud computing, more applications are deployed in a cloud. Among these modern cloud-based applications, some require bounded and predictable low-latency responses. However, the current cloud infrastructure is unsuitable as it cannot satisfy these requirements, due to many limitations in both hardware and software.

    This licentiate thesis describes attempts to reduce the latency of Internet services by carefully studying the currently available infrastructure, optimizing it, and improving its performance. The focus is to optimize the performance of network functions deployed on commodity hardware, known as network function virtualization (NFV). The performance of NFV is one of the major sources of latency for Internet services.

    The first contribution is related to optimizing the software. This project began by investigating the possibility of superoptimizing virtualized network functions (VNFs). It started with a literature review of available superoptimization techniques; then one of the state-of-the-art superoptimization tools was selected to analyze the crucial metrics affecting application performance. The result of our analysis demonstrated that having better cache metrics could potentially improve the performance of all applications.

    The second contribution of this thesis employs the results of the first part by taking a step toward optimizing cache performance of time-critical NFV service chains. By doing so, we reduced the tail latencies of such systems running at 100Gbps. This is an important achievement as it increases the probability of realizing bounded and predictable latency for Internet services.

  • 12.
    Farshin, Alireza
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab). KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    Barbette, Tom
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab). KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    Roozbeh, Amir
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab). KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. Ericsson Research.
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab). KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    PacketMill: Toward Per-Core 100-Gbps Networking. 2021. In: Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), ACM Digital Library, 2021. Conference paper (Refereed)
    Abstract [en]

    We present PacketMill, a system for optimizing software packet processing, which (i) introduces a new model to efficiently manage packet metadata and (ii) employs code-optimization techniques to better utilize commodity hardware. PacketMill grinds the whole packet processing stack, from the high-level network function configuration file to the low-level userspace network (specifically DPDK) drivers, to mitigate inefficiencies and produce a customized binary for a given network function. Our evaluation results show that PacketMill increases throughput (by up to 36.4 Gbps, i.e., 70%), reduces latency (by up to 101 µs, i.e., 28%), and enables nontrivial packet processing (e.g., router) at ~100 Gbps, when new packets arrive >10× faster than main memory access times, while using only one processing core.
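
    The comparison between packet arrival times and memory access times can be made concrete with a short calculation. The sketch assumes minimum-size 64-byte Ethernet frames and a DRAM access latency of 80 ns; neither figure is stated in the abstract, and the latency value in particular is a placeholder.

```python
# Per-frame overhead on the wire: preamble (7) + start-of-frame delimiter (1)
# + inter-frame gap (12) bytes.
ETHERNET_OVERHEAD_BYTES = 20


def interarrival_ns(link_gbps: float, frame_bytes: int) -> float:
    """Time between back-to-back frames on a fully loaded link, in nanoseconds."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return bits_on_wire / link_gbps  # bits / (Gbit/s) gives nanoseconds


ASSUMED_DRAM_ACCESS_NS = 80.0  # hypothetical value, not from the paper

gap_ns = interarrival_ns(100, 64)        # 672 bits at 100 Gbit/s
ratio = ASSUMED_DRAM_ACCESS_NS / gap_ns  # memory accesses are this many times slower
```

    Under these assumptions a new 64-byte frame arrives every 6.72 ns, so a single main memory access spans more than ten packet arrival slots.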

  • 13.
    Farshin, Alireza
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS, Network Systems Laboratory (NS Lab).
    Rizzo, Luigi
    Google, Mountain View, CA, United States.
    Elmeleegy, Khaled
    Google, Mountain View, CA, United States.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab). KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    Overcoming the IOTLB wall for multi-100-Gbps Linux-based networking. 2023. In: PeerJ Computer Science, E-ISSN 2376-5992, Vol. 9, p. e1385-, article id cs-1385. Article in journal (Refereed)
    Abstract [en]

    This article explores opportunities to mitigate the performance impact of IOMMU on high-speed network traffic, as used in the Linux kernel. We first characterize IOTLB behavior and its effects on recent Intel Xeon Scalable & AMD EPYC processors at 200 Gbps, by analyzing the impact of different factors contributing to IOTLB misses and causing throughput drop (up to 20% compared to the no-IOMMU case in our experiments). Secondly, we discuss and analyze possible mitigations, including proposals and evaluation of a practical hugepage-aware memory allocator for the network device drivers to employ hugepage IOTLB entries in the Linux kernel. Our evaluation shows that using hugepage-backed buffers can completely recover the throughput drop introduced by IOMMU. Moreover, we formulate a set of guidelines that enable network developers to tune their systems to avoid the “IOTLB wall”, i.e., the point where excessive IOTLB misses cause throughput drop. Our takeaways signify the importance of having a call to arms to rethink Linux-based I/O management at higher data rates.
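
    Why hugepage-backed buffers relieve pressure on the IOTLB can be seen from a simple count of the translations needed to cover a buffer pool. The 64 MiB pool size below is a hypothetical figure; 4 KiB and 2 MiB are the common base and huge page sizes on x86-64.

```python
KIB = 1024
MIB = 1024 * KIB


def translations_needed(buffer_bytes: int, page_bytes: int) -> int:
    """Number of page translations required to map the whole buffer pool."""
    return -(-buffer_bytes // page_bytes)  # integer ceiling division


pool = 64 * MIB  # hypothetical size of the NIC buffer pool

small_pages = translations_needed(pool, 4 * KIB)
huge_pages = translations_needed(pool, 2 * MIB)
```

    The same pool needs 16384 translations with 4 KiB pages but only 32 with 2 MiB pages, so far fewer distinct entries compete for the limited IOTLB capacity.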

  • 14.
    Farshin, Alireza
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Roozbeh, Amir
    Ericsson Research.
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    Make the Most out of Last Level Cache in Intel Processors. 2019. In: Proceedings of the Fourteenth EuroSys Conference (EuroSys'19), Dresden, Germany, 25-28 March 2019, ACM Digital Library, 2019. Conference paper (Refereed)
    Abstract [en]

    In modern (Intel) processors, Last Level Cache (LLC) is divided into multiple slices and an undocumented hashing algorithm (aka Complex Addressing) maps different parts of memory address space among these slices to increase the effective memory bandwidth. After a careful study of Intel’s Complex Addressing, we introduce a slice-aware memory management scheme, wherein frequently used data can be accessed faster via the LLC. Using our proposed scheme, we show that a key-value store can potentially improve its average performance by ∼12.2% and ∼11.4% for 100% and 95% GET workloads, respectively. Furthermore, we propose CacheDirector, a network I/O solution which extends Direct Data I/O (DDIO) and places the packet’s header in the slice of the LLC that is closest to the relevant processing core. We implemented CacheDirector as an extension to DPDK and evaluated our proposed solution for latency-critical applications in Network Function Virtualization (NFV) systems. Evaluation results show that CacheDirector makes packet processing faster by reducing tail latencies (90-99th percentiles) by up to 119 µs (∼21.5%) for optimized NFV service chains that are running at 100 Gbps. Finally, we analyze the effectiveness of slice-aware memory management to realize cache isolation.

  • 15.
    Farshin, Alireza
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Roozbeh, Amir
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab). Ericsson Research.
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Optimizing Intel Data Direct I/O Technology for Multi-hundred-gigabit Networks. 2020. In: Proceedings of the Fifteenth EuroSys Conference (EuroSys'20), Heraklion, Crete, Greece, April 27-30, 2020, 2020. Conference paper (Refereed)
    Abstract [en]

    Digitalization across society is expected to produce a massive amount of data, leading to the introduction of faster network interconnects. In addition, many Internet services require high throughput and low latency. However, having only faster links does not guarantee throughput or low latency. Therefore, it is essential to perform holistic system optimization to fully take advantage of the faster links to provide high-performance services. Intel Data Direct I/O (DDIO) is a recent technology that was introduced to facilitate the deployment of high-performance services based on fast interconnects. We evaluated the effectiveness of DDIO for multi-hundred-gigabit networks. This paper briefly discusses our findings on DDIO, which show the necessity of optimizing/adapting it to address the challenges of multi-hundred-gigabit-per-second links.

  • 16.
    Farshin, Alireza
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Roozbeh, Amir
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab). Ericsson Research.
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Reexamining Direct Cache Access to Optimize I/O Intensive Applications for Multi-hundred-gigabit Networks. 2020. In: 2020 USENIX Annual Technical Conference (USENIX ATC 20), 2020, p. 673-689. Conference paper (Refereed)
    Abstract [en]

    Memory access is the major bottleneck in realizing multi-hundred-gigabit networks with commodity hardware; hence, it is essential to make good use of cache memory, which is a faster but smaller memory closer to the processor. Our goal is to study the impact of cache management on the performance of I/O intensive applications. Specifically, this paper looks at one of the bottlenecks in packet processing, i.e., direct cache access (DCA). We systematically studied the current implementation of DCA in Intel processors, particularly Data Direct I/O technology (DDIO), which directly transfers data between I/O devices and the processor's cache. Our empirical study enables system designers/developers to optimize DDIO-enabled systems for I/O intensive applications. We demonstrate that optimizing DDIO could reduce the latency of I/O intensive network functions running at 100 Gbps by up to ~30%. Moreover, we show that DDIO causes a 30% increase in tail latencies when processing packets at 200 Gbps; hence, it is crucial to selectively inject data into the cache or to explicitly bypass it.

  • 17.
    Farshin, Alireza
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab). KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    Roozbeh, Amir
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab). KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. Ericsson Research.
    Schulte, Christian
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab). KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    Scheduling - A Secret Sauce For Resource Disaggregation. Manuscript (preprint) (Other academic)
    Abstract [en]

    This technical report describes the design and implementation of a constraint-based framework for scheduling and resource allocation in a disaggregated data center (DDC) where we build logical servers from disaggregated resources. We show that a Service Level Objective (SLO)-aware constraint-based solver could improve a data center’s resource utilization by finding better solutions based on provided workload characteristics.

  • 18.
    Foerster, Klaus-Tycho
    et al.
    Univ Vienna, Vienna, Austria.
    Parham, Mahmoud
    Univ Vienna, Vienna, Austria.
    Chiesa, Marco
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Schmid, Stefan
    Univ Vienna, Vienna, Austria.
    TI-MFA: Keep Calm and Reroute Segments Fast. 2018. In: IEEE INFOCOM 2018 - IEEE CONFERENCE ON COMPUTER COMMUNICATIONS WORKSHOPS (INFOCOM WKSHPS), IEEE, 2018, p. 415-420. Conference paper (Refereed)
    Abstract [en]

    Segment Routing (SR) promises to provide scalable and fine-grained traffic engineering. However, little is known today on how to implement resilient routing in SR, i.e., routes which tolerate one or even multiple failures. This paper initiates the theoretical study of static fast failover mechanisms which do not depend on reconvergence and hence support a very fast reaction to failures. We introduce formal models and identify fundamental tradeoffs on what can and cannot be achieved in terms of static resilient routing. In particular, we identify an inherent price in terms of performance if routing paths need to be resilient, even in the absence of failures. Our main contribution is a first algorithm which is resilient even to multiple failures and which comes with provable resiliency and performance guarantees. We complement our formal analysis with simulations on real topologies, which show the benefits of our approach over existing algorithms.

  • 19.
    Ghasemirahni, Hamid
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Barbette, Tom
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Katsikas, Georgios P.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Farshin, Alireza
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Roozbeh, Amir
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab). Ericsson Res, Stockholm, Sweden.
    Girondi, Massimo
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab). KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    Chiesa, Marco
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    Packet Order Matters! Improving Application Performance by Deliberately Delaying Packets. 2022. In: Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2022, USENIX - The Advanced Computing Systems Association, 2022, p. 807-827. Conference paper (Refereed)
    Abstract [en]

    Data centers increasingly deploy commodity servers with high-speed network interfaces to enable low-latency communication. However, achieving low latency at high data rates crucially depends on how the incoming traffic interacts with the system's caches. When packets that need to be processed in the same way are consecutive, i.e., exhibit high temporal and spatial locality, caches deliver great benefits.

    In this paper, we systematically study the impact of temporal and spatial traffic locality on the performance of commodity servers equipped with high-speed network interfaces. Our results show that (i) the performance of a variety of widely deployed applications degrades substantially with even the slightest lack of traffic locality, and (ii) a traffic trace from our organization reveals poor traffic locality as networking protocols, drivers, and the underlying switching/routing fabric spread packets out in time (reducing locality). To address these issues, we built Reframer, a software solution that deliberately delays packets and reorders them to increase traffic locality. Despite introducing μs-scale delays of some packets, we show that Reframer increases the throughput of a network service chain by up to 84% and reduces the flow completion time of a web server by 11% while improving its throughput by 20%.

  • 20.
    Girondi, Massimo
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab). KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    Chiesa, Marco
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Barbette, Tom
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    High-speed Connection Tracking in Modern Servers. 2021. In: 2021 IEEE 22nd International Conference on High Performance Switching and Routing (HPSR) (IEEE HPSR'21), 2021. Conference paper (Refereed)
    Abstract [en]

    The rise of commodity servers equipped with high-speed network interface cards poses increasing demands on the efficient implementation of connection tracking, i.e., the task of associating the connection identifier of an incoming packet to the state stored for that connection. In this work, we thoroughly investigate and compare the performance obtainable by different implementations of connection tracking using high-speed real traffic traces. Based on a load balancer use case, our results show that connection tracking is an expensive operation, achieving at most 24 Gbps on a single core. Core-sharding and lock-free hash tables emerge as the only suitable multi-thread approaches for enabling 100 Gbps packet processing. In contrast to recent beliefs, we observe that newly proposed techniques to "lazily" delete connection states are not more effective than properly tuned traditional deletion techniques based on timer wheels.

  • 21.
    Katsikas, Georgios P.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab). RISE SICS.
    NFV Service Chains at the Speed of the Underlying Commodity Hardware. 2018. Doctoral thesis, monograph (Other academic)
    Abstract [en]

    Link speeds in networks will in the near future reach and exceed 100 Gbps. While available specialized hardware can accommodate these speeds, modern networks have adopted a new networking paradigm, also known as Network Functions Virtualization (NFV), that replaces expensive specialized hardware with open-source software running on commodity hardware. However, achieving high performance using commodity hardware is a hard problem mainly because of the processor-memory gap. This gap suggests that only the fastest memories of today’s commodity servers can achieve the desirable access latencies for high speed networks. Existing NFV systems realize chained network functions (also known as service chains) mostly using slower memories; this implies a need for multiple additional CPU cores or even multiple servers to achieve high speed packet processing. In contrast, this thesis combines four contributions to realize NFV service chains with dramatically higher performance and better efficiency than the state of the art.

    The first contribution is a framework that profiles NFV service chains to uncover reasons for performance degradation, while the second contribution leverages the profiler’s data to accelerate these service chains by combining multiplexing of system calls with scheduling strategies. The third contribution synthesizes input/output and processing service chain operations to increase the spatial locality of network traffic with respect to a system’s caches. The fourth contribution combines the profiler’s insights from the first contribution and the synthesis approach of the third contribution to realize NFV service chains at the speed of the underlying commodity hardware. To do so, stateless traffic classification operations are offloaded into available hardware (i.e., programmable switches and/or network cards) and a tag is associated with each traffic class. At the server side, input traffic classes are classified by the hardware based upon the values of these tags, which indicate the CPU core that should undertake their stateful processing, while ensuring zero inter-core communication.

    With commodity hardware, this thesis realizes Internet Service Provider-level service chains and deep packet inspection at a line-rate 40 Gbps and stateful service chains at the speed of a 100 GbE network card on a 16 core single server. This results in up to (i) 4.7x lower latency, (ii) 8.5x higher throughput, and (iii) 6.5x better efficiency than the state of the art. The techniques described in this thesis are crucial for realizing future high speed NFV deployments.

  • 22.
    Katsikas, Georgios P.
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Barbette, Tom
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Chiesa, Marco
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.
    What you need to know about (Smart) Network Interface Cards. 2021. In: Proceedings Passive and Active Measurement - 22nd International Conference, PAM 2021 / [ed] Springer International Publishing, Springer Nature, 2021. Conference paper (Refereed)
    Abstract [en]

    Network interface cards (NICs) are fundamental components of modern high-speed networked systems, supporting multi-100 Gbps speeds and increasing programmability. Offloading computation from a server’s CPU to a NIC frees a substantial amount of the server’s CPU resources, making NICs key to offer competitive cloud services.

    Therefore, understanding the performance benefits and limitations of offloading a networking application to a NIC is of paramount importance. In this paper, we measure the performance of four different NICs from one of the largest NIC vendors worldwide, supporting 100 Gbps and 200 Gbps. We show that while today’s NICs can easily support multi-hundred-gigabit throughputs, performing frequent update operations of a NIC’s packet classifier — as network address translators (NATs) and load balancers would do for each incoming connection — results in a dramatic throughput reduction of up to 70 Gbps or complete denial of service. Our conclusion is that all tested NICs cannot support high-speed networking applications that require keeping track of a large number of frequently arriving incoming connections. Furthermore, we show a variety of counter-intuitive performance artefacts including the performance impact of using multiple tables to classify flows of packets.

  • 23.
    Katsikas, Georgios P.
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Barbette, Tom
    Kostic, Dejan
    Steinert, R.
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    Metron: NFV service chains at the true speed of the underlying hardware. 2019. Conference paper (Refereed)
    Abstract [en]

    In this paper we present Metron, a Network Functions Virtualization (NFV) platform that achieves high resource utilization by jointly exploiting the underlying network and commodity servers’ resources. This synergy allows Metron to: (i) offload part of the packet processing logic to the network, (ii) use smart tagging to setup and exploit the affinity of traffic classes, and (iii) use tag-based hardware dispatching to carry out the remaining packet processing at the speed of the servers’ fastest cache(s), with zero inter-core communication. Metron also introduces a novel resource allocation scheme that minimizes the resource allocation overhead for large-scale NFV deployments. With commodity hardware assistance, Metron deeply inspects traffic at 40 Gbps and realizes stateful network functions at the speed of a 100 GbE network card on a single server. Metron has 2.75-6.5x better efficiency than OpenBox, a state of the art NFV system, while ensuring key requirements such as elasticity, fine-grained load balancing, and flexible traffic steering.

  • 24.
    Katsikas, Georgios P.
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab). RISE SICS.
    Barbette, Tom
    University of Liege.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Steinert, Rebecca
    RISE SICS.
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Radio Systems Laboratory (RS Lab).
    Metron: NFV Service Chains at the True Speed of the Underlying Hardware. 2018. Conference paper (Refereed)
    Abstract [en]

    In this paper we present Metron, a Network Functions Virtualization (NFV) platform that achieves high resource utilization by jointly exploiting the underlying network and commodity servers’ resources. This synergy allows Metron to: (i) offload part of the packet processing logic to the network, (ii) use smart tagging to setup and exploit the affinity of traffic classes, and (iii) use tag-based hardware dispatching to carry out the remaining packet processing at the speed of the servers’ fastest cache(s), with zero inter-core communication. Metron also introduces a novel resource allocation scheme that minimizes the resource allocation overhead for large-scale NFV deployments. With commodity hardware assistance, Metron deeply inspects traffic at 40 Gbps and realizes stateful network functions at the speed of a 100 GbE network card on a single server. Metron has 2.75-6.5x better efficiency than OpenBox, a state of the art NFV system, while ensuring key requirements such as elasticity, fine-grained load balancing, and flexible traffic steering.

  • 25.
    Khodaei, Mohammad
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Jin, Hongyu
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Papadimitratos, Panagiotis
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    SECMACE: Scalable and Robust Identity and Credential Management Infrastructure in Vehicular Communication Systems. 2018. In: IEEE transactions on intelligent transportation systems (Print), ISSN 1524-9050, E-ISSN 1558-0016, Vol. 19, no 5, p. 1430-1444. Article in journal (Refereed)
    Abstract [en]

    Several years of academic and industrial research efforts have converged to a common understanding on fundamental security building blocks for the upcoming vehicular communication (VC) systems. There is a growing consensus toward deploying a special-purpose identity and credential management infrastructure, i.e., a vehicular public-key infrastructure (VPKI), enabling pseudonymous authentication, with standardization efforts toward that direction. In spite of the progress made by standardization bodies (IEEE 1609.2 and ETSI) and harmonization efforts [Car2Car Communication Consortium (C2C-CC)], significant questions remain unanswered toward deploying a VPKI. Deep understanding of the VPKI, a central building block of secure and privacy-preserving VC systems, is still lacking. This paper contributes to the closing of this gap. We present SECMACE, a VPKI system, which is compatible with the IEEE 1609.2 and ETSI standards specifications. We provide a detailed description of our state-of-the-art VPKI that improves upon existing proposals in terms of security and privacy protection, and efficiency. SECMACE facilitates multi-domain operations in the VC systems and enhances user privacy, notably preventing linking pseudonyms based on timing information and offering increased protection even against honest-but-curious VPKI entities. We propose multiple policies for the vehicle-VPKI interactions and two large-scale mobility trace data sets, based on which we evaluate the full-blown implementation of SECMACE. With very little attention on the VPKI performance thus far, our results reveal that modest computing resources can support a large area of vehicles with very few delays and the most promising policy in terms of privacy protection can be supported with moderate overhead.

  • 26.
    Khodaei, Mohammad
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Network and Systems Engineering.
    Papadimitratos, Panos
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Poster: Mix-Zones Everywhere: A Dynamic Cooperative Location Privacy Protection Scheme. 2018. In: 2018 IEEE Vehicular Networking Conference (VNC) / [ed] Altintas, O; Tsai, HM; Lin, K; Boban, M; Wang, CY; Sahin, T, IEEE, 2018, article id 8628340. Conference paper (Refereed)
    Abstract [en]

    Inter-vehicle communications disclose rich information about vehicle whereabouts. Pseudonymous authentication secures communication while enhancing user privacy. To enhance location privacy, cryptographic mix-zones are proposed where vehicles can covertly update their credentials. However, the resilience of such schemes against linking attacks highly depends on the geometry of the mix-zones, mobility patterns, vehicle density, and arrival rates. In this poster, we propose "mix-zones everywhere", a cooperative location privacy protection scheme to mitigate linking attacks during pseudonym transition. Time-aligned pseudonyms are issued for all vehicles to facilitate synchronous pseudonym updates. Our scheme thwarts Sybil-based misbehavior, strongly maintains user privacy in the presence of honest-but-curious system entities, and is resilient against misbehaving insiders.

  • 27.
    Kim, Jongyul
    et al.
    Jang, Insu
    Reda, Waleed
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Im, Jaeseong
    Canini, Marco
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Kwon, Youngjin
    Peter, Simon
    Witchel, Emmett
    LineFS: Efficient SmartNIC Offload of a Distributed File System with Pipeline Parallelism. 2021. In: ACM SIGOPS 28th Symposium on Operating Systems Principles, 2021. Conference paper (Refereed)
    Abstract [en]

    In multi-tenant systems, the CPU overhead of distributed file systems (DFSes) is increasingly a burden to application performance. CPU and memory interference cause degraded and unstable application and storage performance, in particular for operation latency. Recent client-local DFSes for persistent memory (PM) accelerate this trend. DFS offload to SmartNICs is a promising solution to these problems, but it is challenging to fit the complex demands of a DFS onto simple SmartNIC processors located across PCIe.

    We present LineFS, a SmartNIC-offloaded, high-performance DFS with support for client-local PM. To fully leverage the SmartNIC architecture, we decompose DFS operations into execution stages that can be offloaded to a parallel datapath execution pipeline on the SmartNIC. LineFS offloads CPU-intensive DFS tasks, like replication, compression, data publication, index and consistency management to a SmartNIC. We implement LineFS on the Mellanox BlueField SmartNIC and compare it to Assise, a state-of-the-art PM DFS. LineFS improves latency in LevelDB up to 80% and throughput in Filebench up to 79%, while providing extended DFS availability during host system failures.

  • 28.
    Liu, S.
    et al.
    Steinert, R.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Control under Intermittent Network Partitions. 2018. In: 2018 IEEE International Conference on Communications (ICC), Institute of Electrical and Electronics Engineers (IEEE), 2018, article id 8422615. Conference paper (Refereed)
    Abstract [en]

    We propose a novel distributed leader election algorithm to deal with the controller and control service availability issues in programmable networks, such as Software Defined Networks (SDN) or programmable Radio Access Network (RAN). Our approach can deal with a wide range of network failures, especially intermittent network partitions, where splitting and merging of a network repeatedly occur. In contrast to traditional leader election algorithms that mainly focus on the (eventual) consensus on one leader, the proposed algorithm aims at optimizing control service availability and stability and at reducing the controller state synchronization effort during intermittent network partitioning situations. To this end, we design a new framework that enables dynamic leader election based on real-time estimates acquired from statistical monitoring. With this framework, the proposed leader election algorithm has the capability of being flexibly configured to achieve different optimization objectives, while adapting to various failure patterns. Compared with two existing algorithms, our approach can significantly reduce the synchronization overhead (up to 12x) due to controller state updates, and maintain up to twice as many nodes under a controller.

  • 29.
    Liu, Shaoteng
    et al.
    RISE SICS.
    Steinert, Rebecca
    RISE SICS.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Flexible distributed control plane deployment. 2018. In: Proceedings 2018 IEEE/IFIP Network Operations and Management Symposium, NOMS 2018: Cognitive Management in a Cyber World, NOMS 2018, Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 1-7. Conference paper (Refereed)
    Abstract [en]

    For large-scale programmable networks, flexible deployment of distributed control planes is essential for service availability and performance. However, existing approaches only focus on placing controllers whereas the consequent control traffic is often ignored. In this paper, we propose a black-box optimization framework offering the additional steps for quantifying the effect of the consequent control traffic when deploying a distributed control plane. Evaluating different implementations of the framework over real-world topologies shows that close to optimal solutions can be achieved. Moreover, experiments indicate that running a method for controller placement without considering the control traffic causes excessive bandwidth usage (worst cases varying between 20.1%-50.1% more) and congestion, compared to our approach.

  • 30.
    Marcos, Pedro
    et al.
    UFRGS and FURG.
    Chiesa, Marco
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Dietzel, Christoph
    DE-CIX/MPI for Informatics.
    Canini, Marco
    KAUST.
    Barcellos, Marinho
    UFRGS.
    A Survey on the Current Internet Interconnection Practices. 2020. In: Computer communication review, ISSN 0146-4833, E-ISSN 1943-5819, Vol. 50, no 1, p. 10-17. Article in journal (Refereed)
    Abstract [en]

    The Internet topology has significantly changed in the past years. Today, it is richly connected and flattened. Such a change has been driven mostly by the fast growth of peering infrastructures and the expansion of Content Delivery Networks as alternatives to reduce interconnection costs and improve traffic delivery performance. While the topology evolution is perceptible, it is unclear whether or not the interconnection process has evolved or if it continues to be an ad-hoc and lengthy process. To shed light on the current practices of the Internet interconnection ecosystem and how these could impact the Internet, we surveyed more than 100 network operators and peering coordinators. We divide our results into two parts. In part (i), we report the current interconnection practices, including the steps of the process and the reasons to establish new interconnection agreements or to renegotiate existing ones, and the parameters discussed by network operators. In part (ii), we report the existing limitations and how the interconnection ecosystem can evolve in the future. We show that despite the changes in the topology, interconnecting continues to be a cumbersome process that usually takes days, weeks, or even months to complete, which is in stark contrast with the desire of most operators to reduce the interconnection setup time. We also identify that, despite being primary candidates to evolve the interconnection process, emerging on-demand connectivity companies are only fulfilling part of the existing gap between the current interconnection practices and the network operators' desires.

  • 31.
    Marcos, Pedro
    et al.
    Univ Fed Rio Grande do Sul, Porto Alegre, RS, Brazil.;Fundacao Univ Fed Rio Grande, Rio Grande, Brazil..
    Chiesa, Marco
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Muller, Lucas
    Univ Fed Rio Grande do Sul, Porto Alegre, RS, Brazil..
    Kathiravelu, Pradeeban
    INESC ID, Lisbon, Portugal.;UCLouvain, Ottignies, Belgium..
    Dietzel, Christoph
    TU Berlin, Berlin, Germany.;DE CIX, Cologne, Germany..
    Canini, Marco
    KAUST, Thuwal, Saudi Arabia..
    Barcellos, Marinho
    Univ Fed Rio Grande do Sul, Porto Alegre, RS, Brazil..
    Dynam-IX: a Dynamic Interconnection eXchange
    2018. In: Proceedings of the 2018 Applied Networking Research Workshop (ANRW '18), Association for Computing Machinery (ACM), 2018, p. 94-94
    Conference paper (Refereed)
  • 32.
    Noroozi, Hamid
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Khodaei, Mohammad
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Network and Systems Engineering.
    Papadimitratos, Panos
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    VPKIaaS: Towards Scaling Pseudonymous Authentication for Large Mobile Systems
    2019
    Report (Other academic)
  • 33.
    Németh, Felicián
    et al.
    MTA-BME Network Softwarization Research Group.
    Chiesa, Marco
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Rétvári, Gábor
    MTA-BME Information Systems Research Group.
    Normal Forms for Match-Action Programs
    2019. In: Proceedings CoNEXT 2019 - The 15th International Conference on emerging Networking EXperiments and Technologies / [ed] ACM, ACM Digital Library, 2019
    Conference paper (Refereed)
    Abstract [en]

    Packet processing programs may have multiple semantically equivalent representations in terms of the match-action abstraction exposed by the underlying data plane. Some representations may encode the entire packet processing program into one large table allowing packets to be matched in a single lookup, while others may encode the same functionality decomposed into a pipeline of smaller match-action tables, maximizing modularity at the cost of increased lookup latency. In this paper, we provide the first systematic study of match-action program representations in order to assist network programmers in navigating this vast design space. Borrowing from relational database and formal language theory, we define a framework for the equivalent transformation of match-action programs to obtain certain irredundant representations that we call "normal forms". We find that normalization generally improves the capacity of the control plane to program the data plane and to observe its state, while having a negligible, or even positive, performance impact.

  • 34.
    Omer Mahgoub Saied, Khalid
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Network Latency Estimation Leveraging Network Path Classification
    2018
    Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
    Student thesis
    Abstract [en]

    With the development of the Internet, new network services with strict network latency requirements have been made possible. These services are implemented as distributed systems deployed across multiple geographical locations. To provide low response time, these services require knowledge about the current network latency. Unfortunately, network latency among geo-distributed sites often changes, so distributed services rely on continuous network latency measurements. One goal of such measurements is to differentiate momentary latency spikes from relatively long-term latency changes. The differentiation is achieved through statistical processing of the collected samples. This approach of high-frequency network latency measurements has high overhead, is slow to identify network latency changes, and lacks accuracy.

    We propose a novel approach for network latency estimation by correlating network paths to network latency. We demonstrate that network latency can be accurately estimated by first measuring and identifying the network path used and then fetching the expected latency for that network path based on a previous set of measurements. Based on these principles, we introduce Sudan traceroute, a network latency estimation tool. Sudan traceroute can be used both to reduce the latency estimation time and to reduce the overhead of network path measurements. Sudan traceroute uses an improved path detection mechanism that sends only a few carefully selected probes in order to identify the current network path.

    We have developed and evaluated Sudan traceroute in a test environment and evaluated the feasibility of Sudan traceroute on real-world networks using Amazon EC2. Using Sudan traceroute we have shortened the time it takes for hosts to identify network latency level changes compared to existing approaches.

  • 35.
    Peresini, Peter
    et al.
    Kuzniar, Maciej
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Dynamic, Fine-Grained Data Plane Monitoring with Monocle
    2018. In: IEEE/ACM Transactions on Networking, ISSN 1063-6692, E-ISSN 1558-2566, Vol. 26, no 1, p. 534-547
    Article in journal (Refereed)
    Abstract [en]

    Ensuring network reliability is important for satisfying service-level objectives. However, diagnosing network anomalies in a timely fashion is difficult due to the complex nature of network configurations. We present Monocle — a system that uncovers forwarding problems due to hardware or software failures in switches, by verifying that the data plane corresponds to the view that an SDN controller installs via the control plane. Monocle works by systematically probing the switch data plane; the probes are constructed by formulating the switch forwarding table logic as a Boolean satisfiability (SAT) problem. Our SAT formulation quickly generates probe packets targeting a particular rule considering both existing and new rules. Monocle can monitor not only static flow tables (as is currently typically the case), but also dynamic networks with frequent flow table changes. Our evaluation shows that Monocle is capable of fine-grained monitoring for the majority of rules, and it can identify a rule suddenly missing from the data plane or misbehaving in a matter of seconds. In fact, during our evaluation Monocle uncovered problems with two hardware switches that we were using in our evaluation. Finally, during network updates Monocle helps controllers cope with switches that exhibit transient inconsistencies between their control and data plane states.

  • 36.
    Reda, Waleed
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Bogdanov, Kirill
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Milolidakis, Alexandros
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Ghasemirahni, Hamid
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Chiesa, Marco
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Maguire Jr., Gerald Q.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    Kostic, Dejan
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS.
    Path Persistence in the Cloud: A Study of the Effects of Inter-Region Traffic Engineering in a Large Cloud Provider's Network
    2020. In: Computer communication review, ISSN 0146-4833, E-ISSN 1943-5819, Vol. 50, no 2, p. 11-23
    Article in journal (Refereed)
    Abstract [en]

    A commonly held belief is that traffic engineering and routing changes are infrequent. However, based on our measurements over a number of years of traffic between data centers in one of the largest cloud provider's networks, we found that it is common for flows to change paths at ten-second intervals or even faster. These frequent path and, consequently, latency variations can negatively impact the performance of cloud applications, specifically, latency-sensitive and geo-distributed applications.

    Our recent measurements and analysis focused on observing path changes and latency variations between different Amazon AWS regions. To this end, we devised a path change detector that we validated using both ad hoc experiments and feedback from cloud networking experts. The results provide three main insights: (1) Traffic Engineering (TE) frequently moves (TCP and UDP) flows among network paths of different latency, (2) flows experience unfair performance, where a subset of flows between two machines can suffer large latency penalties (up to 32% at the 95th percentile) or an excessive number of latency changes, and (3) tenants may have incentives to selfishly move traffic to low latency classes (to boost the performance of their applications). We showcase this third insight with an example using rsync synchronization.

    To the best of our knowledge, this is the first paper to reveal the high frequency of TE activity within a large cloud provider's network. Based on these observations, we expect our paper to spur discussions and future research on how cloud providers and their tenants can ultimately reconcile their independent and possibly conflicting objectives. Our data is publicly available for reproducibility and further analysis at http://goo.gl/25BKte.

  • 37.
    Tanyingyong, Voravit
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Olsson, Robert
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Hidell, Markus
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Sjödin, Peter
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Communication Systems, CoS, Network Systems Laboratory (NS Lab).
    Ahlgren, Bengt
    RISE SICS.
    Implementation and Deployment of an Outdoor IoT-based Air Quality Monitoring Testbed
    2018. In: 2018 IEEE Global Communications Conference, GLOBECOM 2018 - Proceedings, Institute of Electrical and Electronics Engineers (IEEE), 2018, article id 8647287
    Conference paper (Refereed)
    Abstract [en]

    This paper presents an outdoor IoT-based air quality monitoring testbed deployed in the city of Uppsala, Sweden. Our IoT sensing unit is designed and developed using low-cost hardware components and open source software, which makes it easy to replicate. We demonstrate that it can serve as an affordable solution for real-time measurements and has the potential to complement traditional monitoring to cover larger areas. We use low-power communication based on IEEE 802.15.4, RPL, and MQTT, and achieve a high end-to-end delivery ratio (>98%) in an outdoor setting. Moreover, we carry out a network analysis of our testbed and provide detailed insights into its characteristics.
