2024 | Original Paper | Book Chapter

A New Loss for Image Retrieval: Class Anchor Margin

Authors: Alexandru Ghiţă, Radu Tudor Ionescu

Published in: Advances in Knowledge Discovery and Data Mining

Publisher: Springer Nature Singapore

Abstract

The performance of neural networks in content-based image retrieval (CBIR) is highly influenced by the chosen loss (objective) function. Most objective functions for neural models fall into one of two categories: metric learning and statistical learning. Metric learning approaches require a pair mining strategy that often lacks efficiency, while statistical learning approaches do not generate highly compact features, since they optimize the features only indirectly. To this end, we propose a novel repeller-attractor loss that falls within the metric learning paradigm, yet directly optimizes for the \(L_{2}\) metric without the need to generate pairs. Our loss comprises three components. The leading objective ensures that the learned features are attracted to their designated learnable class anchor. The second component regularizes the anchors and forces them to be separable by a margin, while the third component ensures that the anchors do not collapse to zero. Furthermore, we develop a more efficient two-stage retrieval system by harnessing the learned class anchors during the first stage of the retrieval process, eliminating the need to compare the query with every image in the database. We establish a benchmark of three datasets (CIFAR-100, Food-101, and ImageNet-200) and evaluate the proposed objective on the CBIR task, using both convolutional and transformer architectures. Compared to existing objective functions, our empirical evidence shows that the proposed objective yields superior and more consistent results.
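To make the three components of the objective concrete, the sketch below shows one plausible PyTorch implementation of a class-anchor-margin loss. It is a minimal illustration, not the authors' published formulation: the class name, the hyperparameters (margin, min_norm, w_repel, w_norm), and the way the three terms are weighted and combined are assumptions introduced here for clarity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ClassAnchorMarginLoss(nn.Module):
    """Illustrative class-anchor-margin loss (hypothetical formulation).

    (1) an attractor term pulls each embedding toward the learnable anchor
        of its class (direct L2 optimization, no pair mining);
    (2) a repeller term pushes distinct anchors apart by at least `margin`;
    (3) a norm term keeps anchors from collapsing to zero.
    """

    def __init__(self, num_classes, embed_dim, margin=1.0,
                 min_norm=1.0, w_repel=0.1, w_norm=0.01):
        super().__init__()
        # One learnable anchor vector per class.
        self.anchors = nn.Parameter(torch.randn(num_classes, embed_dim))
        self.margin = margin
        self.min_norm = min_norm
        self.w_repel = w_repel
        self.w_norm = w_norm

    def forward(self, embeddings, labels):
        # (1) Attractor: mean squared L2 distance to the true-class anchor.
        attract = ((embeddings - self.anchors[labels]) ** 2).sum(dim=1).mean()

        # (2) Repeller: hinge penalty on anchor pairs closer than the margin.
        dists = torch.cdist(self.anchors, self.anchors, p=2)
        off_diag = ~torch.eye(dists.size(0), dtype=torch.bool, device=dists.device)
        repel = F.relu(self.margin - dists[off_diag]).mean()

        # (3) Anti-collapse: penalize anchor norms that fall below min_norm.
        norm_reg = F.relu(self.min_norm - self.anchors.norm(dim=1)).mean()

        return attract + self.w_repel * repel + self.w_norm * norm_reg
```

At retrieval time, the learned anchors also enable the two-stage scheme mentioned in the abstract: the query embedding is first compared against the small set of class anchors to shortlist candidate classes, and only the database images assigned to those classes are then ranked by L2 distance, avoiding a comparison against every image in the database.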


Metadata
Title
A New Loss for Image Retrieval: Class Anchor Margin
Authors
Alexandru Ghiţă
Radu Tudor Ionescu
Copyright Year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-97-2253-2_4
