2024 | OriginalPaper | Book chapter

MDC-Net: Multimodal Detection and Captioning Network for Steel Surface Defects

Authors: Anthony Ashwin Peter Chazhoor, Shanfeng Hu, Bin Gao, Wai Lok Woo

Published in: Robotics, Computer Vision and Intelligent Systems

Publisher: Springer Nature Switzerland

Abstract

In the highly competitive steel sector, product quality, particularly surface integrity, is critical. Surface defect detection (SDD) is essential to maintaining high production standards because it directly affects product quality and manufacturing efficiency. Traditional SDD approaches, which rely primarily on manual inspection or classical computer vision techniques, suffer from reduced accuracy and pose potential health risks to inspectors. This research describes a solution that uses a transformer-based sequence generation model to improve defect detection during the manufacture of hot-rolled steel sheets and to generate captions describing each defect and its spatial location. By treating object detection as a sequence generation problem, the method allows a richer understanding of image content and a more complete, contextually grounded analysis of surface defects, with captions produced alongside the detections. While the method can potentially improve detection accuracy, its main strength lies in its scalability and its flexibility across industrial applications. The technique could also be extended to visual question answering, opening up opportunities for interactive and intelligent image analysis.
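To illustrate what treating object detection as a sequence generation problem can mean in practice, the following minimal sketch turns bounding boxes into a flat list of integer tokens and back. It follows the general idea popularised by Pix2seq (Chen et al., 2021): normalised coordinates are quantised into bins and each box is followed by a class token. The abstract does not describe the tokenisation used by MDC-Net, so the function names, the (ymin, xmin, ymax, xmax) ordering, the bin count, and the class-token offset are all assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical box layout: (ymin, xmin, ymax, xmax), each normalised to [0, 1].
Box = Tuple[float, float, float, float]


def quantize(value: float, num_bins: int) -> int:
    """Map a normalised coordinate in [0, 1] to an integer bin in [0, num_bins - 1]."""
    clipped = min(max(value, 0.0), 1.0)
    return min(int(clipped * num_bins), num_bins - 1)


def boxes_to_tokens(boxes: List[Box], class_ids: List[int], num_bins: int = 1000) -> List[int]:
    """Flatten boxes and labels into one sequence: four coordinate tokens, then one class token."""
    tokens: List[int] = []
    for box, class_id in zip(boxes, class_ids):
        tokens.extend(quantize(c, num_bins) for c in box)
        # Assumption: class tokens occupy the vocabulary range right after the coordinate bins.
        tokens.append(num_bins + class_id)
    return tokens


def tokens_to_boxes(tokens: List[int], num_bins: int = 1000) -> List[Tuple[Box, int]]:
    """Invert boxes_to_tokens, returning bin-centre coordinates and the class id per box."""
    results: List[Tuple[Box, int]] = []
    usable = len(tokens) // 5 * 5  # ignore a trailing incomplete group
    for i in range(0, usable, 5):
        ymin, xmin, ymax, xmax = ((t + 0.5) / num_bins for t in tokens[i:i + 4])
        results.append(((ymin, xmin, ymax, xmax), tokens[i + 4] - num_bins))
    return results
```

A sequence model trained on such token lists predicts one token at a time, so detections, and in principle caption words drawn from an additional vocabulary range, can share a single decoder. The round trip loses at most half a bin width of precision per coordinate.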


Metadata
Title
MDC-Net: Multimodal Detection and Captioning Network for Steel Surface Defects
Authors
Anthony Ashwin Peter Chazhoor
Shanfeng Hu
Bin Gao
Wai Lok Woo
Copyright year
2024
DOI
https://doi.org/10.1007/978-3-031-59057-3_20