2024 | OriginalPaper | Book chapter

Deep Learning: How to Apply Machine Learning and Deep Learning Methods to Audio Analysis

Authors: Manan Dabral, Tejinder Kaur, Abhay Khanna, Ashish Yadav, Ojas Sharma, Nakul

Published in: Mobile Radio Communications and 5G Networks

Publisher: Springer Nature Singapore

Abstract

Before turning to deep learning, it helps to look at Artificial Intelligence (AI) and Machine Learning (ML). The purpose of AI is to train machines so that they can function like the human mind. Machine learning is a subfield of AI whose purpose is to let a machine learn from its own experience and develop skills with little human involvement. Deep learning, in turn, is the name given to very complex neural networks, and it can be seen as an advancement in machine learning. Basic machine learning had limited data processing capabilities and generally required structured data. Deep learning algorithms, by contrast, have a very high data processing capacity and do not require structured data; they can handle both structured and unstructured data. In one sentence, deep learning enables computers to think, understand, and experience like humans.

Metadata
Title
Deep Learning: How to Apply Machine Learning and Deep Learning Methods to Audio Analysis
Authors
Manan Dabral
Tejinder Kaur
Abhay Khanna
Ashish Yadav
Ojas Sharma
Nakul
Copyright year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-97-0700-3_2