
2024 | Original Paper | Book Chapter

StyleAutoEncoder for Manipulating Image Attributes Using Pre-trained StyleGAN

Written by: Andrzej Bedychaj, Jacek Tabor, Marek Śmieja

Published in: Advances in Knowledge Discovery and Data Mining

Publisher: Springer Nature Singapore


Abstract

Deep conditional generative models are excellent tools for creating high-quality images and editing their attributes. However, training modern generative models from scratch is very expensive and requires large computational resources. In this paper, we introduce StyleAutoEncoder (StyleAE), a lightweight AutoEncoder module, which works as a plugin for pre-trained generative models and allows for manipulating the requested attributes of images. The proposed method offers a cost-effective solution for training deep generative models with limited computational resources, making it a promising technique for a wide range of applications. We evaluate StyleAE by combining it with StyleGAN, which is currently one of the top generative models. Our experiments demonstrate that StyleAE is at least as effective in manipulating image attributes as the state-of-the-art algorithms based on invertible normalizing flows. However, it is simpler, faster, and gives more freedom in designing neural architecture.
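The abstract describes StyleAE as a lightweight autoencoder that sits on top of a frozen pre-trained generator and exposes image attributes as editable latent coordinates. The following is a minimal sketch of that data flow only, not the authors' implementation: the linear maps, random weights, and the split into 2 attribute plus 510 style coordinates are illustrative assumptions (512 matches StyleGAN's w-space dimensionality).

```python
import numpy as np

# Sketch of the StyleAE plugin idea: an autoencoder trained on latent codes w
# of a frozen StyleGAN. The encoder separates a few labeled attribute
# coordinates from the remaining "style" coordinates; editing an attribute
# coordinate and decoding yields a modified latent for the frozen generator.
# Weights are random here purely to illustrate shapes and data flow.

rng = np.random.default_rng(0)
W_DIM, ATTR_DIM, STYLE_DIM = 512, 2, 510   # assumed sizes, not from the paper

enc = rng.normal(scale=0.01, size=(ATTR_DIM + STYLE_DIM, W_DIM))
dec = rng.normal(scale=0.01, size=(W_DIM, ATTR_DIM + STYLE_DIM))

def encode(w):
    z = enc @ w
    return z[:ATTR_DIM], z[ATTR_DIM:]        # (attribute coords, style coords)

def decode(attrs, style):
    return dec @ np.concatenate([attrs, style])

w = rng.normal(size=W_DIM)                   # stand-in for a StyleGAN w code
attrs, style = encode(w)
attrs[0] = 3.0                               # push one attribute coordinate
w_edited = decode(attrs, style)              # would be fed to the frozen generator
print(w_edited.shape)
```

Because only the small encoder/decoder is trained while the generator stays frozen, the approach avoids the cost of training a full generative model from scratch, which is the efficiency argument the abstract makes.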


DOI
https://doi.org/10.1007/978-981-97-2253-2_10
