
2024 | OriginalPaper | Book chapter

Improved Remote Sensing Image Rotating Target Detection Algorithm Based on Transformer

Written by: Shujun Hui, Pengcheng Wang, Bin Luan, Xin Zhao, Shang Ma

Published in: Proceedings of the 2nd International Conference on Internet of Things, Communication and Intelligent Technology

Publisher: Springer Nature Singapore


Abstract

As satellite remote sensing and aerial photography technologies have advanced in recent years, the resolution and quality of remote sensing images have increased noticeably. Data sources have also multiplied, which makes detection more challenging. To address the small object sizes and dense distributions typical of remote sensing images, an improved rotating object detection algorithm based on the vision Transformer is proposed, with the aim of overcoming the poor robustness and low detection accuracy common in such scenes. An MS-Transformer module is introduced into the feature fusion part of YOLOv4 to strengthen the feature extraction capability of the detector. Through its self-attention mechanism, the module captures relevant information among targets, which improves the algorithm's ability to detect densely distributed targets. In addition, the YOLOv4 object detection framework is extended to a five-coordinate form, enabling multi-angle remote sensing object detection. To reduce overlapping prediction boxes on dense targets, the model adopts the soft-NMS suppression method, which further refines detection performance. Experiments on the DOTA dataset show that the proposed algorithm improves the model's detection capability.
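
The soft-NMS step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which soft-NMS variant or which overlap measure the authors use, so everything here is an assumption: the Gaussian decay, the value of `sigma`, the score threshold, and the function names `iou` and `soft_nms` are all illustrative. For brevity the sketch uses axis-aligned boxes `(x1, y1, x2, y2)`; a rotated detector would replace `iou` with a rotated-box overlap.

```python
import math


def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian soft-NMS (assumed variant, not taken from the paper).

    Instead of discarding every box that overlaps the current best box,
    each remaining score is multiplied by exp(-IoU^2 / sigma). Boxes are
    dropped only once their score falls below score_threshold.
    Returns a list of (original_index, rescored_confidence) in the
    order in which the boxes were selected.
    """
    scores = list(scores)  # work on a copy; the caller's list is untouched
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        # Select the highest-scoring box that is still a candidate.
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        kept.append((best, scores[best]))
        # Decay the scores of the other candidates by their overlap with it.
        survivors = []
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
            if scores[i] >= score_threshold:
                survivors.append(i)
        remaining = survivors
    return kept


if __name__ == "__main__":
    # Two heavily overlapping boxes plus one isolated box.
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    for index, score in soft_nms(demo_boxes, demo_scores):
        print(index, round(score, 4))
```

In the demo, the second box overlaps the first heavily, so its score is reduced rather than removed outright. This is the behavior that helps with densely packed targets, where hard NMS could delete a true neighboring detection.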


Metadata
Title
Improved Remote Sensing Image Rotating Target Detection Algorithm Based on Transformer
Written by
Shujun Hui
Pengcheng Wang
Bin Luan
Xin Zhao
Shang Ma
Copyright year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-97-2757-5_60
