2024 | Original Paper | Book Chapter

35. Drone Watch: A Novel Dataset for Violent Action Recognition from Aerial Videos

Authors: Nitish Mahajan, Amita Chauhan, Harish Kumar, Sakshi Kaushal, Sarbjeet Singh

Published in: Proceedings of Congress on Control, Robotics, and Mechatronics

Publisher: Springer Nature Singapore

Abstract

Computer vision research has made considerable progress on human action recognition and violence detection. Although various datasets are available for action and violence recognition, there is a clear lack of datasets that capture non-violent and violent activities together from an aerial view. This study presents a new aerial video dataset for concurrent human action recognition, including violence detection. It consists of 60 min of fully annotated data with two action classes, namely violent and normal (non-violent). The dataset addresses several factors that existing datasets do not consider: changes in the altitude of the drone, changes in the angle at which the video is captured, video captured during motion, changes in frame rates, videos from different cameras with different configurations, multiple labels for every subject, and labels for violent activities. The resulting dataset is a multifaceted representation of real-world scenarios and addresses various shortfalls of the existing datasets. It is expected to advance computer vision applications for action recognition, particularly automated violence detection in real-time video streams from an aerial view. Furthermore, the curated dataset is validated for violence detection using machine and deep learning algorithms, namely Support Vector Machine (SVM), Long Short-Term Memory (LSTM), Bi-Directional LSTM (Bi-LSTM) and Adaptive Boosting (AdaBoost).
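
To make the annotation factors listed in the abstract concrete, the following is a minimal Python sketch of how one annotated frame might be represented. The abstract does not give the dataset's actual annotation format, so every class name, field name, and unit here (`SubjectAnnotation`, `FrameAnnotation`, `altitude_m`, `camera_angle_deg`, and so on) is a hypothetical illustration, not the authors' schema.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# The two action classes named in the abstract.
VIOLENT = "violent"
NORMAL = "normal"


@dataclass
class SubjectAnnotation:
    """One person in one frame (hypothetical layout)."""
    subject_id: int
    bbox: Tuple[int, int, int, int]  # assumed (x, y, width, height) in pixels
    labels: List[str]                # a subject may carry several labels

    def is_violent(self) -> bool:
        return VIOLENT in self.labels


@dataclass
class FrameAnnotation:
    """One annotated frame plus capture conditions (hypothetical fields)."""
    video_id: str
    frame_index: int
    fps: float               # frame rate varies between videos
    altitude_m: float        # drone altitude, assumed to be in metres
    camera_angle_deg: float  # capture angle, assumed to be in degrees
    subjects: List[SubjectAnnotation] = field(default_factory=list)

    def frame_class(self) -> str:
        # Assumption: a frame counts as violent if any subject is violent.
        if any(s.is_violent() for s in self.subjects):
            return VIOLENT
        return NORMAL


def violent_fraction(frames: List[FrameAnnotation]) -> float:
    """Share of frames classed as violent; 0.0 for an empty list."""
    if not frames:
        return 0.0
    violent = sum(1 for f in frames if f.frame_class() == VIOLENT)
    return violent / len(frames)
```

Keeping capture conditions on the frame and labels on the subject mirrors the abstract's distinction between recording factors (altitude, angle, frame rate) and per-subject labels. The rule for deriving a frame-level class is an assumption made for this sketch only.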

Metadata
Title
Drone Watch: A Novel Dataset for Violent Action Recognition from Aerial Videos
Authors
Nitish Mahajan
Amita Chauhan
Harish Kumar
Sakshi Kaushal
Sarbjeet Singh
Copyright Year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-99-5180-2_35