
2024 | Original Paper | Book Chapter

On Dark Knowledge for Distilling Generators

Authors: Chi Hong, Robert Birke, Pin-Yu Chen, Lydia Y. Chen

Published in: Advances in Knowledge Discovery and Data Mining

Publisher: Springer Nature Singapore


Abstract

Knowledge distillation has been applied to generative models such as the Variational Autoencoder (VAE) and Generative Adversarial Networks (GANs). To distill the knowledge, the synthetic outputs of a teacher generator are used to train a student model. While dark knowledge, i.e., the probabilistic output, is well explored in distilling classifiers, little is known about whether an equivalent dark knowledge exists for generative models and whether it can be extracted. In this paper, we derive the first empirical risk bound of its kind for distilling generative models from a Bayesian perspective. Our analysis shows that dark knowledge does exist for generative models, namely the Bayes probability distribution of a synthetic output given an input, and that using it achieves a lower empirical risk bound than using only the synthetic outputs of the generators. Furthermore, we propose a Dark Knowledge based Distillation, DKtill, which trains the student generator on the (approximate) dark knowledge. Our extensive evaluation on distilling VAE, conditional GANs, and translation GANs on the Facades and CelebA datasets shows that the FID of student generators trained by DKtill, which combines dark knowledge, is lower than that of student generators trained only on synthetic outputs by up to 42.66% and 78.99%, respectively.
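The idea of combining output-only distillation with a dark-knowledge term can be illustrated with a toy training loop. The sketch below is a hypothetical illustration, not the paper's DKtill implementation: it assumes the teacher's approximate dark knowledge takes the form of a per-output Gaussian with a known log-variance, and adds a likelihood-based term to the usual output-matching loss. All names (`SmallGenerator`, `dk_weight`, `log_var`) are illustrative.

```python
# Hypothetical sketch of generator distillation with an approximate
# dark-knowledge term; none of the names below come from the paper.
import torch
import torch.nn as nn

class SmallGenerator(nn.Module):
    """Toy generator: latent vector -> flat 'image'."""
    def __init__(self, z_dim=16, out_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 128), nn.ReLU(), nn.Linear(128, out_dim)
        )

    def forward(self, z):
        return self.net(z)

z_dim, out_dim = 16, 64
teacher = SmallGenerator(z_dim, out_dim)   # stands in for the pre-trained target generator
student = SmallGenerator(z_dim, out_dim)
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
dk_weight = 0.5                            # weight of the dark-knowledge term (illustrative)
log_var = torch.zeros(out_dim)             # assumed per-dimension log-variance of the teacher's output distribution

for step in range(100):
    z = torch.randn(32, z_dim)
    with torch.no_grad():
        y_teacher = teacher(z)             # synthetic output used in output-only distillation

    y_student = student(z)

    # Output-only distillation: match the teacher's synthetic sample.
    sample_loss = torch.mean((y_student - y_teacher) ** 2)

    # Approximate dark-knowledge term: negative log-likelihood of the student's
    # output under the teacher's (assumed Gaussian) output distribution.
    dk_loss = torch.mean(
        (y_student - y_teacher) ** 2 / (2 * log_var.exp()) + 0.5 * log_var
    )

    loss = sample_loss + dk_weight * dk_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In this sketch the dark-knowledge term weights the output-matching error by the teacher's output uncertainty; the paper's actual formulation of the (approximate) dark knowledge may differ.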


Footnotes

1. We interchangeably use teacher or target model/generator.

2. The analysis of this paper can be straightforwardly extended to three-channel images.
Metadata
Title
On Dark Knowledge for Distilling Generators
Authors
Chi Hong
Robert Birke
Pin-Yu Chen
Lydia Y. Chen
Copyright Year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-97-2253-2_19
