kth.sePublications KTH
Change search
Link to record
Permanent link

Direct link
Publications (3 of 3) Show all publications
Foerster, K. T., Marette, T., Neumann, S., Plant, C., Sadikaj, Y., Schmid, S. & Velaj, Y. (2023). Analyzing the Communication Clusters in Datacenters. In: ACM Web Conference 2023: Proceedings of the World Wide Web Conference, WWW 2023. Paper presented at 2023 World Wide Web Conference, WWW 2023, Austin, United States of America, Apr 30 2023 - May 4 2023 (pp. 3022-3032). Association for Computing Machinery (ACM)
Open this publication in new window or tab >>Analyzing the Communication Clusters in Datacenters
Show others...
2023 (English)In: ACM Web Conference 2023: Proceedings of the World Wide Web Conference, WWW 2023, Association for Computing Machinery (ACM) , 2023, p. 3022-3032Conference paper, Published paper (Refereed)
Abstract [en]

Datacenter networks have become a critical infrastructure of our digital society and over the last years, great efforts have been made to better understand the communication patterns inside datacenters. In particular, existing empirical studies showed that datacenter traffic typically features much temporal and spatial structure, and that at any given time, some communication pairs interact much more frequently than others. This paper generalizes this study to communication groups and analyzes how clustered the datacenter traffic is, and how stable these clusters are over time. To this end, we propose a methodology which revolves around a biclustering approach, allowing us to identify groups of racks and servers which communicate frequently over the network. In particular, we consider communication patterns occurring in three different Facebook datacenters: a Web cluster consisting of web servers serving web traffic, a Database cluster which mainly consists of MySQL servers, and a Hadoop cluster. Interestingly, we find that in all three clusters, small groups of racks and servers can produce a large fraction of the network traffic, and we can determine these groups even when considering short snapshots of network traffic. We also show empirically that these clusters are fairly stable across time. Our insights on the size and stability of communication clusters hence uncover an interesting potential for resource optimizations in datacenter infrastructures.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2023
Keywords
Clustering, Data Center, Network Traffic
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-333332 (URN)10.1145/3543507.3583410 (DOI)2-s2.0-85159326653 (Scopus ID)
Conference
2023 World Wide Web Conference, WWW 2023, Austin, United States of America, Apr 30 2023 - May 4 2023
Note

Part of ISBN 9781450394161

QC 20230801

Available from: 2023-08-01 Created: 2023-08-01 Last updated: 2023-08-01Bibliographically approved
Marette, T., Miettinen, P. & Neumann, S. (2023). Visualizing Overlapping Biclusterings and Boolean Matrix Factorizations. In: Machine Learning and Knowledge Discovery in Databases: Research Track - European Conference, ECML PKDD 2023, Turin, Italy, September 18-22, 2023, Proceedings, Part I. Paper presented at European Conference, ECML PKDD 2023, Turin, Italy, September 18-22, 2023 (pp. 743-758). Springer Nature, 14169
Open this publication in new window or tab >>Visualizing Overlapping Biclusterings and Boolean Matrix Factorizations
2023 (English)In: Machine Learning and Knowledge Discovery in Databases: Research Track - European Conference, ECML PKDD 2023, Turin, Italy, September 18-22, 2023, Proceedings, Part I, Springer Nature , 2023, Vol. 14169, p. 743-758Conference paper, Published paper (Refereed)
Abstract [en]

Finding (bi-)clusters in bipartite graphs is a popular data analysis approach. Analysts typically want to visualize the clusters, which is simple as long as the clusters are disjoint. However, many modern algorithms find overlapping clusters, making visualization more complicated. In this paper, we study the problem of visualizing a given clustering of overlapping clusters in bipartite graphs and the related problem of visualizing Boolean Matrix Factorizations. We conceptualize three different objectives that any good visualization should satisfy: (1) proximity of cluster elements, (2) large consecutive areas of elements from the same cluster, and (3) large uninterrupted areas in the visualization, regardless of the cluster membership. We provide objective functions that capture these goals and algorithms that optimize these objective functions. Interestingly, in experiments on real-world datasets, we find that the best trade-off between these competing goals is achieved by a novel heuristic, which locally aims to place rows and columns with similar cluster membership next to each other.

Place, publisher, year, edition, pages
Springer Nature, 2023
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 14169
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-338942 (URN)10.1007/978-3-031-43412-9_44 (DOI)001156137100044 ()2-s2.0-85174449746 (Scopus ID)
Conference
European Conference, ECML PKDD 2023, Turin, Italy, September 18-22, 2023
Note

Part of book ISBN 9783031434112

QC 20231101

Available from: 2023-10-31 Created: 2023-10-31 Last updated: 2024-03-05Bibliographically approved
Marchal, L., Marette, T., Pichon, G. & Vivien, F. (2022). Trading performance for memory in sparse direct solvers using low-rank compression. Future Generation Computer Systems, 130, 307-320
Open this publication in new window or tab >>Trading performance for memory in sparse direct solvers using low-rank compression
2022 (English)In: Future Generation Computer Systems, ISSN 0167739X, Vol. 130, p. 307-320Article in journal (Refereed) Published
Abstract [en]

Sparse direct solvers using Block Low-Rank compression have been proven efficient to solve problems arising in many real-life applications. Improving those solvers is crucial for being able to (1) solve larger problems and (2) speed up computations. A main characteristic of a sparse direct solver using low-rank compression is at what point in the algorithm the compression is performed. There are two distinct approaches: (1) all blocks are compressed before starting the factorization, which reduces the memory as much as possible, or (2) each block is compressed as late as possible, which usually leads to better speedup. Approach 1 reaches a very small memory footprint generally at the expense of a greater execution time. Approach 2 achieves a smaller execution time but requires more memory. The objective of this paper is to design a composite approach, to speedup computations while staying under a given memory limit. This should allow to solve large problems that cannot be solved with Approach 2 while reducing the execution time compared to Approach 1. We propose a memory-aware strategy where each block can be compressed either at the beginning or as late as possible. We first consider the problem of choosing when to compress each block, under the assumption that all information on blocks is perfectly known, i.e., memory requirement and execution time of a block when compressed or not. We show that this problem is a variant of the NP-complete Knapsack problem, and adapt an existing approximation algorithm for our problem. Unfortunately, the required information on blocks depends on numerical properties and in practice cannot be known in advance. We thus introduce models to estimate those values. Experiments on the PaStiX solver demonstrate that our new approach can achieve an excellent trade-off between memory consumption and computational cost. For instance on matrix Geo1438, Approach 2 uses three times as much memory as Approach 1 while being three times faster. Our new approach leads to an execution time only 30% larger than Approach 2 when given a memory 30% larger than the one needed by Approach 1.

Place, publisher, year, edition, pages
Elsevier BV, 2022
Keywords
Sparse direct solvers, Low-rank compression, Scheduling, Memory constraints
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-338943 (URN)10.1016/J.FUTURE.2021.12.018 (DOI)000819692500024 ()2-s2.0-85122923974 (Scopus ID)
Note

QC 20231204

Available from: 2023-10-31 Created: 2023-10-31 Last updated: 2023-12-04Bibliographically approved
Organisations
Identifiers
ORCID iD: ORCID iD iconorcid.org/0000-0003-4090-7283

Search in DiVA

Show all publications