
2024 | Original Paper | Book Chapter

On Dark Knowledge for Distilling Generators

Authors: Chi Hong, Robert Birke, Pin-Yu Chen, Lydia Y. Chen

Published in: Advances in Knowledge Discovery and Data Mining

Publisher: Springer Nature Singapore


Abstract

Knowledge distillation has been applied to generative models such as the Variational Autoencoder (VAE) and Generative Adversarial Networks (GANs). To distill the knowledge, the synthetic outputs of a teacher generator are used to train a student model. While dark knowledge, i.e., the probabilistic output, is well explored in distilling classifiers, little is known about whether an equivalent dark knowledge exists for generative models and whether it can be extracted. In this paper, we derive the first empirical risk bound of its kind for distilling generative models from a Bayesian perspective. Our analysis shows that dark knowledge does exist for generative models, namely the Bayes probability distribution of a synthetic output given an input, and that using it achieves a lower empirical risk bound than using only the synthetic outputs of the generators. Furthermore, we propose a Dark Knowledge based Distillation, DKtill, which trains the student generator on the (approximate) dark knowledge. Our extensive evaluation on distilling VAE, conditional GANs, and translation GANs on the Facades and CelebA datasets shows that the FID of student generators trained by DKtill, which combines dark knowledge, is lower than that of student generators trained only on synthetic outputs by up to 42.66% and 78.99%, respectively.
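The idea of combining output-only distillation with a dark-knowledge term can be illustrated with a toy training loop. The sketch below is a hypothetical illustration, not the paper's DKtill implementation: it assumes the teacher's approximate dark knowledge takes the form of a per-output Gaussian with a known log-variance, and adds a likelihood-based term to the usual output-matching loss. All names (`SmallGenerator`, `dk_weight`, `log_var`) are illustrative.

```python
# Hypothetical sketch of generator distillation with an approximate
# dark-knowledge term; none of the names below come from the paper.
import torch
import torch.nn as nn

class SmallGenerator(nn.Module):
    """Toy generator: latent vector -> flat 'image'."""
    def __init__(self, z_dim=16, out_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 128), nn.ReLU(), nn.Linear(128, out_dim)
        )

    def forward(self, z):
        return self.net(z)

z_dim, out_dim = 16, 64
teacher = SmallGenerator(z_dim, out_dim)   # stands in for the pre-trained target generator
student = SmallGenerator(z_dim, out_dim)
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
dk_weight = 0.5                            # weight of the dark-knowledge term (illustrative)
log_var = torch.zeros(out_dim)             # assumed per-dimension log-variance of the teacher's output distribution

for step in range(100):
    z = torch.randn(32, z_dim)
    with torch.no_grad():
        y_teacher = teacher(z)             # synthetic output used in output-only distillation

    y_student = student(z)

    # Output-only distillation: match the teacher's synthetic sample.
    sample_loss = torch.mean((y_student - y_teacher) ** 2)

    # Approximate dark-knowledge term: negative log-likelihood of the student's
    # output under the teacher's (assumed Gaussian) output distribution.
    dk_loss = torch.mean(
        (y_student - y_teacher) ** 2 / (2 * log_var.exp()) + 0.5 * log_var
    )

    loss = sample_loss + dk_weight * dk_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In this sketch the dark-knowledge term weights the output-matching error by the teacher's output uncertainty; the paper's actual formulation of the (approximate) dark knowledge may differ.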


Footnotes

1. We interchangeably use teacher or target model/generator.

2. The analysis of this paper can be straightforwardly extended to three-channel images.
Metadata
Title
On Dark Knowledge for Distilling Generators
Authors
Chi Hong
Robert Birke
Pin-Yu Chen
Lydia Y. Chen
Copyright Year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-97-2253-2_19
