2024 | OriginalPaper | Book chapter

Generating Image Captions in Hindi Based on Encoder-Decoder Based Deep Learning Techniques

Authors: Priya Singh, Farhan Raja, Hariom Sharma

Published in: Reliability Engineering for Industrial Processes

Publisher: Springer Nature Switzerland

Abstract

Image captioning, which combines computer vision and natural language processing to describe images in words, has advanced significantly in recent years. Most approaches use an encoder-decoder architecture: an encoder maps the image to a feature representation, and a decoder turns those features into a text sequence. Typically, Convolutional Neural Networks (CNNs) serve as encoders and Recurrent Neural Networks (RNNs) as decoders. Much of the work in this domain focuses on English, and research on image captioning for regional languages remains limited. This paper focuses on Hindi, a morphologically rich language and the third most spoken language worldwide. The study presents a comparative analysis of four state-of-the-art image captioning models (ResNet50, InceptionV3, VGG16, and VGG19) applied to Hindi. The models are evaluated on the widely used Flickr8k dataset using BLEU, METEOR, and RIBES scores. The results indicate that the InceptionV3 model outperforms the other three on both BLEU and METEOR, making it a useful reference point for researchers in this field.
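To make the BLEU evaluation mentioned in the abstract more concrete, the following is a minimal sketch of sentence-level BLEU as defined by Papineni et al.: clipped n-gram precision combined with a brevity penalty. It is not the authors' evaluation code. The function names, the whitespace tokenization, and the Hindi example captions are assumptions made for illustration only, and the sketch applies no smoothing, so any caption without a matching 4-gram scores 0.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it occurs in any single reference."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def sentence_bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # no smoothing: one empty n-gram order zeroes the score
    log_avg = sum(math.log(p) for p in precisions) / max_n
    c = len(candidate)
    # Reference length closest to the candidate length (ties: the shorter one).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_avg)


# Hypothetical captions, tokenized by whitespace (an assumption of this sketch).
reference = "एक कुत्ता घास पर दौड़ रहा है".split()
candidate = "एक कुत्ता घास पर खेल रहा है".split()
score = sentence_bleu(candidate, [reference])
print(f"BLEU: {score:.3f}")
```

In practice a smoothed, corpus-level BLEU from an established toolkit is usually preferred, and Hindi text may need a proper tokenizer rather than a whitespace split.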

Metadata
Title
Generating Image Captions in Hindi Based on Encoder-Decoder Based Deep Learning Techniques
Authors
Priya Singh
Farhan Raja
Hariom Sharma
Copyright year
2024
DOI
https://doi.org/10.1007/978-3-031-55048-5_6
