
2024 | Original Paper | Book Chapter

Privacy-Preserving Clustering for Multi-dimensional Data Randomization Under LDP

Written by: Hiroaki Kikuchi

Published in: ICT Systems Security and Privacy Protection

Publisher: Springer Nature Switzerland

Abstract

Randomization of multi-dimensional data under local differential privacy (LDP) is a significant and practical application of big data. Because of dimensionality issues, most existing works suffer from low accuracy when estimating joint probability distributions. In this paper, the set of attributes is divided into smaller clusters in which the attributes are grouped according to their dependencies. A privacy-preserving algorithm is proposed to estimate the dependencies of the attributes without disclosing the private values in the multi-dimensional data, and the scheme guarantees local differential privacy. Using the clusters of attributes, the joint probabilities for multi-dimensional data can be estimated efficiently with two building blocks, called the RR-independent and RR-Ind-Joint schemes. Experiments on open datasets demonstrate that the dependencies of attributes can be estimated accurately and that the proposed algorithm outperforms existing state-of-the-art schemes when the dimensionality is high.
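
As background for the abstract, the following is a minimal sketch of generic k-ary randomized response for a single categorical attribute under ε-LDP, together with the standard unbiased frequency estimator. It is not the chapter's RR-independent or RR-Ind-Joint scheme, whose details are not given here. The function names `krr_perturb` and `krr_estimate`, the toy domain, and the value of ε are assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def krr_perturb(value, domain, epsilon, rng):
    """Client side: keep the true value with probability p, else report another value.

    p = e^eps / (e^eps + k - 1) gives eps-LDP for a domain of size k.
    """
    k = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return value
    # Otherwise pick uniformly among the k - 1 other values.
    others = [v for v in domain if v != value]
    return rng.choice(others)


def krr_estimate(reports, domain, epsilon):
    """Server side: invert the perturbation to estimate the true frequencies.

    The expected observed frequency of v is f_v * p + (1 - f_v) * q,
    so f_v is estimated as (observed - q) / (p - q).
    """
    k = len(domain)
    n = len(reports)
    e = math.exp(epsilon)
    p = e / (e + k - 1)
    q = 1.0 / (e + k - 1)
    counts = Counter(reports)
    return {v: (counts[v] / n - q) / (p - q) for v in domain}


if __name__ == "__main__":
    rng = random.Random(0)
    domain = ["a", "b", "c"]  # hypothetical attribute values
    truth = ["a"] * 12000 + ["b"] * 6000 + ["c"] * 2000
    reports = [krr_perturb(v, domain, 1.0, rng) for v in truth]
    print(krr_estimate(reports, domain, 1.0))
```

The estimates always sum to one, but an individual estimate can fall slightly below zero for rare values. Estimating a joint distribution over many attributes in this way requires a domain that grows multiplicatively with the number of attributes, which is the dimensionality issue the abstract refers to.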

Metadata
Title
Privacy-Preserving Clustering for Multi-dimensional Data Randomization Under LDP
Written by
Hiroaki Kikuchi
Copyright year
2024
DOI
https://doi.org/10.1007/978-3-031-56326-3_2
