Boundary-Aware Mask Refinement for YOLO11 Instance Segmentation of Aerial Imagery

K.A. Budakov, E.V. Druzhinskaya

Abstract


This study evaluates the Boundary-Aware Mask Refinement (BAMR) module as a means of improving instance segmentation quality on aerial imagery with the YOLO11 architecture. Experiments use a two-class LandCover.ai setup and a unified training protocol that starts from a pretrained baseline model. The BAMR v1 configuration shows only a directional improvement that does not reach statistical significance across five matched seeds (p = 0.130). The BAMR v2 configuration, implemented as a minimal Proto-branch modification with low-rank adapters, yields a statistically significant improvement in mask mAP50-95 under five-seed paired validation: the mean paired difference is +0.00415 ± 0.00276 (t = 3.365, p = 0.0282), with a 95% confidence interval of [+0.00073, +0.00757] and positive paired differences on all 5 seeds. Qualitative analysis indicates that the gain is most visible on elongated and curved boundaries and is small on large, smooth woodland polygons. Overall, the results support the hypothesis that the main room for improvement on this dataset lies in better boundary refinement in the mask branch while preserving the pretrained weights.
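
The paired statistics quoted for BAMR v2 can be checked for internal consistency from the summary values alone. The following is a minimal sketch, not the authors' analysis code. It assumes that the "±" value is the sample standard deviation of the five per-seed differences and that the test is a two-sided paired t-test with 4 degrees of freedom. The names `paired_summary` and `t_cdf_df4` are hypothetical helpers introduced for illustration.

```python
import math

# Summary statistics quoted in the abstract (BAMR v2 minus baseline, mask mAP50-95).
MEAN_DIFF = 0.00415  # mean paired difference
SD_DIFF = 0.00276    # ASSUMPTION: the "±" value is the sample SD of the differences
N_SEEDS = 5          # five matched seeds, so 4 degrees of freedom

# 97.5% quantile of Student's t with 4 degrees of freedom (standard table value).
T_CRIT_DF4 = 2.776445


def t_cdf_df4(t: float) -> float:
    """Closed-form CDF of Student's t distribution for exactly 4 degrees of freedom."""
    u = 1.0 + t * t / 4.0
    return 0.5 + 0.375 * (t / math.sqrt(u)) * (1.0 - t * t / (12.0 * u))


def paired_summary(mean_diff: float, sd_diff: float, n: int):
    """Return (t statistic, two-sided p-value, 95% CI) for a paired t-test with n = 5."""
    if n != 5:
        raise ValueError("this sketch hard-codes the t distribution for df = 4 (n = 5)")
    se = sd_diff / math.sqrt(n)                   # standard error of the mean difference
    t_stat = mean_diff / se
    p_two_sided = 2.0 * (1.0 - t_cdf_df4(abs(t_stat)))
    half_width = T_CRIT_DF4 * se                  # half-width of the 95% interval
    return t_stat, p_two_sided, (mean_diff - half_width, mean_diff + half_width)


if __name__ == "__main__":
    t_stat, p_value, (ci_low, ci_high) = paired_summary(MEAN_DIFF, SD_DIFF, N_SEEDS)
    print(f"t = {t_stat:.3f}, p = {p_value:.4f}, 95% CI = [{ci_low:+.5f}, {ci_high:+.5f}]")
```

Under these assumptions the recomputed t statistic, p-value, and interval agree with the reported values up to rounding of the published mean and deviation, which suggests the quoted numbers are mutually consistent.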





ISSN: 2307-8162