Personalized Privacy Amplification via Importance Sampling
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Decision and Control Systems (Automatic Control). ORCID iD: 0000-0002-5530-2714
Linköping University and Uppsala University, Sweden.
Uppsala University, Sweden.
2024 (English). In: Transactions on Machine Learning Research, E-ISSN 2835-8856, Vol. 2024. Article in journal (Refereed), Published
Abstract [en]

For scalable machine learning on large data sets, subsampling a representative subset is a common approach for efficient model training. This is often achieved through importance sampling, whereby informative data points are sampled more frequently. In this paper, we examine the privacy properties of importance sampling, focusing on an individualized privacy analysis. We find that, in importance sampling, privacy is well aligned with utility but at odds with sample size. Based on this insight, we propose two approaches for constructing sampling distributions: one that optimizes the privacy-efficiency trade-off; and one based on a utility guarantee in the form of coresets. We evaluate both approaches empirically in terms of privacy, efficiency, and accuracy on the differentially private k-means problem. We observe that both approaches yield similar outcomes and consistently outperform uniform sampling across a wide range of data sets. Our code is available on GitHub.
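
As a rough illustration of the setting described in the abstract, the sketch below shows Poisson-style importance sampling with inverse-probability weights, together with the classical privacy amplification bound for uniform Poisson subsampling, log(1 + q(e^ε − 1)). It is not the paper's method or its personalized analysis: the function names are hypothetical, the inclusion probabilities are supplied by the caller, and the bound shown applies only to the uniform-rate case.

```python
import math
import random


def poisson_importance_sample(points, probs, rng):
    """Keep point i independently with probability probs[i].

    Each kept point carries the weight 1 / probs[i], so weighted sums over
    the sample are unbiased estimates of sums over the full data set.
    """
    sample = []
    for x, q in zip(points, probs):
        if not 0.0 < q <= 1.0:
            raise ValueError("inclusion probabilities must lie in (0, 1]")
        if rng.random() < q:
            sample.append((x, 1.0 / q))
    return sample


def weighted_sum(sample):
    """Inverse-probability-weighted estimate of the full-data sum."""
    return sum(w * x for x, w in sample)


def amplified_epsilon(epsilon, q):
    """Classical bound for UNIFORM Poisson subsampling at rate q.

    An epsilon-DP mechanism run on the subsample satisfies
    log(1 + q * (exp(epsilon) - 1))-DP. This is the textbook uniform case,
    not the personalized bound studied in the paper.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("sampling rate must lie in [0, 1]")
    return math.log1p(q * math.expm1(epsilon))


if __name__ == "__main__":
    rng = random.Random(0)
    data = [0.5, 2.0, 0.1, 3.0]
    # Assumed heuristic for illustration: larger values are kept more often.
    probs = [min(1.0, abs(x) / 3.0) for x in data]
    sample = poisson_importance_sample(data, probs, rng)
    print("estimate of sum:", weighted_sum(sample), "true sum:", sum(data))
    print("amplified epsilon at q=0.1:", amplified_epsilon(1.0, 0.1))
```

In the uniform case, a smaller sampling rate q gives a smaller amplified ε, which matches the tension between privacy and sample size noted in the abstract.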

Place, publisher, year, edition, pages
Transactions on Machine Learning Research, 2024. Vol. 2024
National Category
Computer Sciences; Probability Theory and Statistics; Signal Processing
Identifiers
URN: urn:nbn:se:kth:diva-361194
Scopus ID: 2-s2.0-85219504011
OAI: oai:DiVA.org:kth-361194
DiVA, id: diva2:1944149
Note

QC 20250313

Available from: 2025-03-12. Created: 2025-03-12. Last updated: 2025-03-13. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Scopus

Authority records

Fay, Dominik
