2024 | OriginalPaper | Book chapter

VioNet: An Enhanced Violence Detection Approach for Videos Using a Fusion Model of Vision Transformer with Bi-LSTM and 3D Convolutional Neural Networks

Written by: Md. Akil Raihan Iftee, Md. Mominur Rahman, Sunanda Das

Published in: Proceedings of the 2nd International Conference on Big Data, IoT and Machine Learning

Publisher: Springer Nature Singapore

Abstract

Identifying violence in real-world scenarios is imperative because it enables the detection of aggressive behavior and thereby helps prevent harm to individuals and communities. It is crucial for ensuring public safety, supporting effective crime investigation, promoting child safety, safeguarding mental health, and facilitating social media moderation. Various methods, including handcrafted techniques and deep learning algorithms, can be deployed on surveillance or CCTV cameras, as well as smartphones, to detect violent behavior in time for appropriate action and intervention. In this study, we introduce VioNet, a novel approach that combines a 3D Convolutional Neural Network and a Vision Transformer with Bidirectional LSTM to accurately detect violence in video data. Since video data is inherently sequential, extracting spatiotemporal features is essential to accurate detection. Using these two deep learning methods together allows a richer set of features to be extracted, which are then fused to classify videos. We evaluate our approach on three datasets: Hockey, Movies, and Violent Flow. The proposed model achieved accuracies of 97.85%, 100.00%, and 96.33% on the Hockey, Movies, and Violent Flow datasets, respectively. These results show that our method outperforms several existing approaches in the field and is a robust and competitive solution for violence detection in videos.
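The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the two feature sets are combined or classified, so everything here is an assumption made for illustration: fusion by simple concatenation, a single linear layer with a sigmoid as the classifier, and the hypothetical names `fuse_features` and `violence_probability`. The feature values and weights are placeholders, not outputs of the actual 3D CNN or Vision Transformer with Bi-LSTM branches.

```python
import math
from typing import List, Sequence


def fuse_features(branch_a: Sequence[float], branch_b: Sequence[float]) -> List[float]:
    """Fuse two per-video feature vectors by concatenation (assumed strategy)."""
    return list(branch_a) + list(branch_b)


def violence_probability(fused: Sequence[float],
                         weights: Sequence[float],
                         bias: float = 0.0) -> float:
    """Score a fused feature vector with a linear layer followed by a sigmoid."""
    if len(fused) != len(weights):
        raise ValueError("weights must have one entry per fused feature")
    logit = sum(f * w for f, w in zip(fused, weights)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Placeholder features standing in for the outputs of the two branches.
cnn3d_features = [0.2, -0.1, 0.7]   # hypothetical 3D CNN branch output
vit_bilstm_features = [0.5, 0.3]    # hypothetical ViT + Bi-LSTM branch output

fused = fuse_features(cnn3d_features, vit_bilstm_features)

# Placeholder classifier parameters; a trained model would learn these.
weights = [1.0, 0.5, -0.25, 2.0, 0.0]
prob = violence_probability(fused, weights, bias=-0.5)
label = "violent" if prob >= 0.5 else "non-violent"
print(len(fused), round(prob, 3), label)
```

Concatenation keeps the features from both branches side by side, so the classifier can weight them independently. Other fusion schemes, such as averaging or attention-based weighting, would be equally consistent with the abstract.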

Metadata
Title
VioNet: An Enhanced Violence Detection Approach for Videos Using a Fusion Model of Vision Transformer with Bi-LSTM and 3D Convolutional Neural Networks
Written by
Md. Akil Raihan Iftee
Md. Mominur Rahman
Sunanda Das
Copyright year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-99-8937-9_10
