Towards generating privacy-preserving and useful time series ECG data for arrhythmia detection
KTH, School of Engineering Sciences (SCI), Mathematics (Dept.), Mathematics (Div.).
2024 (English). Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Alternative title
Mot att generera integritetsskyddande och användbara tids EKG-data i tidsserier för arytmi detektering (Swedish)
Abstract [en]

Differential privacy has emerged as the de facto standard for ensuring privacy in various data-related tasks, but it commonly comes with a loss in data utility. This thesis challenges the privacy-utility trade-off for differentially private synthetic ECG data by looking at one particular common machine learning problem: anomaly detection. To this end, two generators are designed with differential privacy guarantees. The utility of the synthetic data is then assessed by training a downstream anomaly detection model on it. This method is verified on the MIT-BIH ECG data set. We found that for anomaly detection the privacy-utility trade-off for synthetic data is more nuanced: replacing the original data with (non-privacy-preserving) synthetic data degrades the utility for anomaly detection, but there is only a slight decrease in utility when a differentially private mechanism is added to the data generation process. Furthermore, we observed that both our adapted time series data generator and the differentially private mechanism introduce robustness to the synthetic data.

Abstract [sv]

Differentiell integritet har vuxit fram som de-facto-standard för att säkerställa integritet för olika datarelaterade uppgifter, men kommer ofta med en förlust av datanytta. Denna avhandling utmanar avvägningen mellan integritet och användbarhet för differentierade privata syntetiska EKG-data genom att titta på ett särskilt vanligt maskininlärningsproblem --- anomalidetektering. Därför utformas två generatorer med differentierade integritetsgarantier. Nyttan av de syntetiska data bedöms ytterligare genom att träna en nedströms anomalidetekteringsmodell på dessa syntetiska data. Denna metod verifieras på MITBIH ECG-datauppsättningen. Vi fann att för anomalidetektering är avvägningen mellan integritet och nytta för syntetiska data mer nyanserad: att ersätta originaldata med (icke integritetsbevarande) syntetiska data försämrar nyttan för anomalidetektering, men det sker endast en liten minskning av nyttan när en differentierad privat mekanism läggs till i datagenereringsprocessen. Dessutom observerade vi att både vår anpassade tidsseriedatagenerator och den differentiella privata mekanismen ger robusthet till de syntetiska data.

Place, publisher, year, edition, pages
2024. p. 74
Series
TRITA-SCI-GRU ; 2024:428
Keywords [en]
Differential Privacy, Synthetic Data, Anomaly Detection, Heartbeat Arrhythmia, Privacy-preserving Machine Learning, Privacy-Utility Trade-off, Time Series Data Generation
Keywords [sv]
Differentiell integritet, syntetiska data, anomalidetektering, hjärtrytm, integritetsskyddande maskininlärning, integritets- och nyttoavvägning, generering av tidsseriedata
National Category
Computational Mathematics
Identifiers
URN: urn:nbn:se:kth:diva-359216
OAI: oai:DiVA.org:kth-359216
DiVA, id: diva2:1932343
External cooperation
RISE
Educational program
Master of Science - Applied and Computational Mathematics
Supervisors
Examiners
Available from: 2025-01-29 Created: 2025-01-29 Last updated: 2025-01-29 Bibliographically approved

Open Access in DiVA

fulltext (3169 kB), 46 downloads
File information
File name: FULLTEXT01.pdf
File size: 3169 kB
Checksum SHA-512: 17ca9f1a0c3cd13dea0b4bf42b981c9cd854668d60415c228437bffd782102f2b8a675de3bce96e87c2dd5796d40cb56d513a7389e3bd652f301d81f65a56ee3
Type: fulltext
Mimetype: application/pdf
