2024 | Original Paper | Book Chapter

MDC-Net: Multimodal Detection and Captioning Network for Steel Surface Defects

Authors: Anthony Ashwin Peter Chazhoor, Shanfeng Hu, Bin Gao, Wai Lok Woo

Published in: Robotics, Computer Vision and Intelligent Systems

Publisher: Springer Nature Switzerland

Abstract

In the highly competitive steel sector, product quality, particularly surface integrity, is critical. Surface defect detection (SDD) is essential to maintaining high production standards, as it directly affects product quality and manufacturing efficiency. Traditional SDD approaches, which rely primarily on manual inspection or classical computer vision techniques, suffer from reduced accuracy and pose potential health risks to inspectors. This research describes a solution that uses a transformer-based sequence generation model to improve defect detection during the manufacture of hot-rolled steel sheets and to generate captions describing each defect and its spatial location. By treating object detection as a sequence generation problem, the method allows a more sophisticated understanding of image content and a complete, contextually rich analysis of surface defects, while also providing captions. Although the method can potentially improve detection accuracy, its real strength lies in its scalability and its flexibility across industrial applications. Furthermore, the technique could be extended to visual question answering, opening up opportunities for interactive and intelligent image analysis.
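The idea of treating object detection as sequence generation can be illustrated with a small sketch. This is not the authors' implementation. It only shows one common way (in the style of Pix2Seq) to turn a bounding box and a class label into discrete tokens that a sequence model could emit. The bin count, the class names, and the function names are assumptions made for illustration.

```python
from typing import List, Tuple

NUM_BINS = 1000  # assumed number of coordinate bins (hypothetical value)
CLASS_NAMES = ["scratch", "inclusion", "patch"]  # hypothetical defect labels

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def quantize(value: float, size: float, num_bins: int = NUM_BINS) -> int:
    """Map a pixel coordinate in [0, size] to an integer bin in [0, num_bins - 1]."""
    bin_index = int(value * num_bins / size)
    return min(max(bin_index, 0), num_bins - 1)


def box_to_tokens(box: Box, label: str, width: int, height: int) -> List[int]:
    """Encode one box as [y_min, x_min, y_max, x_max, class] tokens.

    Class tokens are offset by NUM_BINS so they do not collide with
    coordinate tokens in a shared vocabulary.
    """
    x_min, y_min, x_max, y_max = box
    return [
        quantize(y_min, height),
        quantize(x_min, width),
        quantize(y_max, height),
        quantize(x_max, width),
        NUM_BINS + CLASS_NAMES.index(label),
    ]


def tokens_to_box(tokens: List[int], width: int, height: int) -> Tuple[Box, str]:
    """Decode five tokens back to an approximate pixel box and a label."""
    y_min, x_min, y_max, x_max, class_token = tokens

    def to_pixel(bin_index: int, size: int) -> float:
        # Use the bin centre as the reconstructed coordinate.
        return (bin_index + 0.5) / NUM_BINS * size

    box = (
        to_pixel(x_min, width),
        to_pixel(y_min, height),
        to_pixel(x_max, width),
        to_pixel(y_max, height),
    )
    return box, CLASS_NAMES[class_token - NUM_BINS]


if __name__ == "__main__":
    tokens = box_to_tokens((20.0, 40.0, 120.0, 90.0), "scratch", width=200, height=200)
    print(tokens)
    print(tokens_to_box(tokens, width=200, height=200))
```

Under this encoding, a detector's target for an image is simply the concatenation of such five-token groups, which is what lets the same decoder vocabulary be shared with caption text.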


Title: MDC-Net: Multimodal Detection and Captioning Network for Steel Surface Defects
Authors: Anthony Ashwin Peter Chazhoor, Shanfeng Hu, Bin Gao, Wai Lok Woo
Publisher: Springer Nature Switzerland
Book: Robotics, Computer Vision and Intelligent Systems
Print ISBN: 978-3-031-59056-6
Electronic ISBN: 978-3-031-59057-3
Copyright year: 2024
DOI: https://doi.org/10.1007/978-3-031-59057-3_20
