kth.se Publications
  • 1.
    Ciaperoni, Martino
    Department of Computer Science, Aalto University, Espoo, Finland.
    Xiao, Han
    Department of Computer Science, Aalto University, Espoo, Finland.
    Gionis, Aristides
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Theoretical Computer Science, TCS.
    Concise and interpretable multi-label rule sets. 2023. In: Knowledge and Information Systems, ISSN 0219-1377, E-ISSN 0219-3116, Vol. 65, no 12, p. 5657-5694. Article in journal (Refereed)
    Abstract [en]

    Multi-label classification is becoming increasingly ubiquitous, but not much attention has been paid to interpretability. In this paper, we develop a multi-label classifier that can be represented as a concise set of simple “if-then” rules, and thus, it offers better interpretability compared to black-box models. Notably, our method is able to find a small set of relevant patterns that lead to accurate multi-label classification, while existing rule-based classifiers are myopic and wasteful in searching rules, requiring a large number of rules to achieve high accuracy. In particular, we formulate the problem of choosing multi-label rules to maximize a target function, which considers not only discrimination ability with respect to labels, but also diversity. Accounting for diversity helps to avoid redundancy, and thus, to control the number of rules in the solution set. To tackle the said maximization problem, we propose a 2-approximation algorithm, which circumvents the exponential-size search space of rules using a novel technique to sample highly discriminative and diverse rules. In addition to our theoretical analysis, we provide a thorough experimental evaluation and a case study, which indicate that our approach offers a trade-off between predictive performance and interpretability that is unmatched in previous work.
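The idea of scoring rules on both discrimination and diversity can be illustrated with a small greedy selection loop. This is only a sketch under stated assumptions, not the 2-approximation algorithm or the sampling technique from the paper: the function name `greedy_select_rules`, the per-rule `quality` scores, the `coverage` sets, and the trade-off weight `lam` are all hypothetical, and diversity is approximated here by the number of examples a rule covers that no previously chosen rule covers.

```python
from typing import Dict, List, Set


def greedy_select_rules(
    quality: Dict[str, float],
    coverage: Dict[str, Set[int]],
    k: int,
    lam: float = 1.0,
) -> List[str]:
    """Greedily pick up to k rules (illustrative, hypothetical objective).

    quality[r]  : assumed discrimination score of rule r
    coverage[r] : assumed set of example ids matched by rule r
    lam         : assumed weight on the diversity term
    """
    selected: List[str] = []
    covered: Set[int] = set()
    candidates = set(quality)

    while candidates and len(selected) < k:
        # Marginal gain: rule quality plus a bonus for newly covered examples.
        def gain(rule: str) -> float:
            return quality[rule] + lam * len(coverage[rule] - covered)

        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates), key=gain)
        selected.append(best)
        covered |= coverage[best]
        candidates.remove(best)

    return selected


if __name__ == "__main__":
    quality = {"r1": 0.9, "r2": 0.85, "r3": 0.5}
    coverage = {"r1": {1, 2, 3}, "r2": {1, 2, 3}, "r3": {4, 5, 6}}
    print(greedy_select_rules(quality, coverage, k=2, lam=0.2))
    print(greedy_select_rules(quality, coverage, k=2, lam=0.0))
```

With a positive `lam`, the redundant rule `r2` (same coverage as `r1`) loses to the lower-quality but non-overlapping `r3`; with `lam=0.0` the selection falls back to quality alone, which mirrors the redundancy problem the abstract describes.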

  • 2.
    Görnerup, Olof
    RISE.
    Gillblad, Daniel
    RISE.
    Vasiloudis, Theodore
    RISE.
    Domain-agnostic discovery of similarities and concepts at scale. 2017. In: Knowledge and Information Systems, ISSN 0219-1377, E-ISSN 0219-3116, Vol. 51, no 2, p. 531-560. Article in journal (Refereed)
    Abstract [en]

    Appropriately defining and efficiently calculating similarities from large data sets are often essential in data mining, both for gaining understanding of data and generating processes and for building tractable representations. Given a set of objects and their correlations, we here rely on the premise that each object is characterized by its context, i.e., its correlations to the other objects. The similarity between two objects can then be expressed in terms of the similarity between their contexts. In this way, similarity pertains to the general notion that objects are similar if they are exchangeable in the data. We propose a scalable approach for calculating all relevant similarities among objects by relating them in a correlation graph that is transformed to a similarity graph. These graphs can express rich structural properties among objects. Specifically, we show that concepts—abstractions of objects—are constituted by groups of similar objects that can be discovered by clustering the objects in the similarity graph. These principles and methods are applicable in a wide range of fields and will be demonstrated here in three domains: computational linguistics, music, and molecular biology, where the numbers of objects and correlations range from small to very large.
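The premise that two objects are similar when their contexts are similar can be sketched in a few lines. The abstract does not state which similarity measure the authors use, so the example below swaps in plain cosine similarity between context vectors as an assumption; the function name `context_similarity` and the nested-dictionary representation of the correlation graph are likewise hypothetical.

```python
import math
from typing import Dict

# Correlation graph as adjacency maps: corr[a][b] = correlation weight (assumed layout).
CorrelationGraph = Dict[str, Dict[str, float]]


def context_similarity(corr: CorrelationGraph, a: str, b: str) -> float:
    """Cosine similarity between the contexts of a and b (assumed measure).

    The context of an object is its correlations to all *other* objects,
    so direct a-b links are excluded before comparing.
    """
    ctx_a = {k: w for k, w in corr.get(a, {}).items() if k not in (a, b)}
    ctx_b = {k: w for k, w in corr.get(b, {}).items() if k not in (a, b)}

    dot = sum(w * ctx_b.get(k, 0.0) for k, w in ctx_a.items())
    norm_a = math.sqrt(sum(w * w for w in ctx_a.values()))
    norm_b = math.sqrt(sum(w * w for w in ctx_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty context carries no evidence of similarity
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    corr = {
        "cat": {"pet": 1.0, "fur": 0.5},
        "dog": {"pet": 1.0, "fur": 0.5},
        "car": {"road": 1.0},
    }
    print(context_similarity(corr, "cat", "dog"))  # identical contexts
    print(context_similarity(corr, "cat", "car"))  # disjoint contexts
```

Evaluating this function for every relevant pair yields the edges of a similarity graph, which could then be clustered to group exchangeable objects into concepts.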

  • 3.
    Matakos, Antonis
    Department of Computer Science, Aalto University, Espoo, Finland.
    Tu, Sijing
    Department of Computer Science, Aalto University, Espoo, Finland.
    Gionis, Aristides
    KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Theoretical Computer Science, TCS. Department of Computer Science, Aalto University, Espoo, Finland.
    Tell me something my friends do not know: diversity maximization in social networks. 2020. In: Knowledge and Information Systems, ISSN 0219-1377, E-ISSN 0219-3116. Article in journal (Refereed)
    Abstract [en]

    Social media have a great potential to improve information dissemination in our society, yet they have been held accountable for a number of undesirable effects, such as polarization and filter bubbles. It is thus important to understand these negative phenomena and develop methods to combat them. In this paper, we propose a novel approach to address the problem of breaking filter bubbles in social media. We do so by aiming to maximize the diversity of the information exposed to connected social-media users. We formulate the problem of maximizing the diversity of exposure as a quadratic-knapsack problem. We show that the proposed diversity-maximization problem is inapproximable, and thus, we resort to polynomial-time algorithms without approximation guarantees, inspired by solutions developed for the quadratic-knapsack problem, as well as scalable greedy heuristics. We complement our algorithms with instance-specific upper bounds, which are used to provide empirical approximation guarantees for the given problem instances. Our experimental evaluation shows that a proposed greedy algorithm followed by randomized local search is the algorithm of choice given its quality-vs.-efficiency trade-off.
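A greedy heuristic of the kind mentioned in the abstract can be sketched as follows. The abstract does not give the exact objective, so this example makes several assumptions that are not taken from the paper: each user has an exposure value in [-1, 1], diversity is the sum of squared exposure differences over edges, an intervention flips the sign of one user's exposure, and at most `k` users may be changed. The names `diversity` and `greedy_flip` are hypothetical, and the randomized local search step is omitted.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def diversity(edges: List[Edge], s: Dict[str, float]) -> float:
    """Assumed objective: sum of squared exposure differences across edges."""
    return sum((s[u] - s[v]) ** 2 for u, v in edges)


def greedy_flip(
    edges: List[Edge], s: Dict[str, float], k: int
) -> Tuple[List[str], Dict[str, float]]:
    """Flip up to k users' exposure, each time taking the best positive gain."""
    s = dict(s)  # work on a copy so the caller's exposures are untouched
    chosen: List[str] = []

    for _ in range(k):
        base = diversity(edges, s)
        best_node, best_gain = None, 0.0
        for node in sorted(s):  # sorted for deterministic tie-breaking
            if node in chosen:
                continue
            s[node] = -s[node]                 # tentatively flip
            gain = diversity(edges, s) - base
            s[node] = -s[node]                 # undo the tentative flip
            if gain > best_gain:
                best_node, best_gain = node, gain
        if best_node is None:
            break  # no remaining flip improves the objective
        s[best_node] = -s[best_node]
        chosen.append(best_node)

    return chosen, s


if __name__ == "__main__":
    edges = [("a", "b"), ("b", "c")]
    exposure = {"a": 1.0, "b": 1.0, "c": 1.0}
    picked, new_exposure = greedy_flip(edges, exposure, k=2)
    print(picked, diversity(edges, new_exposure))
```

On the three-node path in the usage example, flipping the middle node raises both edge terms at once, and any further flip would lower the objective, so the loop stops early even though the budget allows two changes.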
