Asymptotic Utility of Spectral Anonymization

dc.contributor.author: Perkonoja, Katariina
dc.contributor.author: Virta, Joni
dc.contributor.organization: fi=terveysteknologia|en=Health Technology|
dc.contributor.organization: fi=tilastotiede|en=Statistics|
dc.contributor.organization-code: 1.2.246.10.2458963.20.28696315432
dc.contributor.organization-code: 1.2.246.10.2458963.20.42133013740
dc.converis.publication-id: 458391801
dc.converis.url: https://research.utu.fi/converis/portal/Publication/458391801
dc.date.accessioned: 2026-01-21T13:32:40Z
dc.date.available: 2026-01-21T13:32:40Z
dc.description.abstract: In the contemporary data landscape characterized by multi-source data collection and third-party sharing, ensuring individual privacy stands as a critical concern. While various anonymization methods exist, their utility preservation and privacy guarantees remain challenging to quantify. In this work, we address this gap by studying the utility and privacy of the spectral anonymization (SA) algorithm, particularly in an asymptotic framework. Unlike conventional anonymization methods that directly modify the original data, SA operates by perturbing the data in a spectral basis and subsequently reverting them to their original basis. Alongside the original version P-SA, employing random permutation transformation, we introduce two novel SA variants: J-spectral anonymization and O-spectral anonymization, which employ sign-change and orthogonal matrix transformations, respectively. We show how well, under some practical assumptions, these SA algorithms preserve the first and second moments of the original data. Our results reveal, in particular, that the asymptotic efficiency of all three SA algorithms in covariance estimation is exactly 50% when compared to the original data. To assess the applicability of these asymptotic results in practice, we conduct a simulation study with finite data and also evaluate the privacy protection offered by these algorithms using distance-based record linkage. Our research reveals that while no method exhibits clear superiority in finite-sample utility, O-SA distinguishes itself for its exceptional privacy preservation, never producing identical records, albeit with increased computational complexity. Conversely, P-SA emerges as a computationally efficient alternative, demonstrating unmatched efficiency in mean estimation.
dc.format.pagerange: 51
dc.format.pagerange: 66
dc.identifier.eisbn: 978-3-031-69651-0
dc.identifier.isbn: 978-3-031-69650-3
dc.identifier.issn: 0302-9743
dc.identifier.jour-issn: 0302-9743
dc.identifier.olddbid: 213064
dc.identifier.oldhandle: 10024/196082
dc.identifier.uri: https://www.utupub.fi/handle/11111/54655
dc.identifier.url: https://link.springer.com/chapter/10.1007/978-3-031-69651-0_4
dc.identifier.urn: URN:NBN:fi-fe2025082786877
dc.language.iso: en
dc.okm.affiliatedauthor: Perkonoja, Katariina
dc.okm.affiliatedauthor: Virta, Joni
dc.okm.discipline: 111 Mathematics [en_GB]
dc.okm.discipline: 112 Statistics and probability [en_GB]
dc.okm.discipline: 111 Matematiikka [fi_FI]
dc.okm.discipline: 112 Tilastotiede [fi_FI]
dc.okm.internationalcopublication: not an international co-publication
dc.okm.internationality: International publication
dc.okm.type: A4 Conference Article
dc.publisher.country: Switzerland [en_GB]
dc.publisher.country: Sveitsi [fi_FI]
dc.publisher.country-code: CH
dc.relation.conference: International Conference on Privacy in Statistical Databases
dc.relation.doi: 10.1007/978-3-031-69651-0_4
dc.relation.ispartofjournal: Lecture Notes in Computer Science
dc.relation.volume: 14915
dc.source.identifier: https://www.utupub.fi/handle/10024/196082
dc.title: Asymptotic Utility of Spectral Anonymization
dc.title.book: Privacy in Statistical Databases : International Conference, PSD 2024, Antibes Juan-les-Pins, France, September 25–27, 2024 Proceedings
dc.year.issued: 2024

Files