Deep Learning for Small and Tiny Object Detection: A Survey
Abstract
In recent years, the development of Deep Learning methods has driven significant progress in object detection and other computer vision tasks. Generic object detection is becoming less of an issue for modern algorithms, with the Average Precision for medium and large objects in the COCO dataset approaching 70 and 80 percent, respectively, but small object detection remains an unsolved problem. Limited appearance information, blurring, and a low signal-to-noise ratio cause state-of-the-art general detectors to fail when applied to small objects. Traditional feature extractors rely on downsampling, which can cause the smallest objects to disappear, and standard anchor assignment methods have proven less effective at detecting low-pixel instances. In this work, we perform an exhaustive review of the literature on small and tiny object detection. We aggregate the definitions of small and tiny objects, distinguish between small absolute and small relative sizes, and highlight the associated challenges. We comprehensively discuss datasets, metrics, and methods dedicated to small and tiny objects, and finally present a quantitative comparison on three publicly available datasets.
Keywords
Deep Learning, Small Object Detection, Tiny Object Detection, Tiny Object Detection Datasets, Tiny Object Detection Methods
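The two failure modes named in the abstract, objects vanishing under downsampling and overlap-based anchor assignment breaking down for low-pixel instances, can be illustrated with a small numerical sketch. The box sizes, the 2-pixel shift, and the stride of 32 below are illustrative assumptions, not values taken from the survey, and the helper names are hypothetical.

```python
def iou(a, b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def shifted_iou(size, shift):
    """IoU between a size x size box and a copy shifted diagonally by `shift` pixels."""
    box = (0.0, 0.0, float(size), float(size))
    moved = (float(shift), float(shift), float(size + shift), float(size + shift))
    return iou(box, moved)


def feature_map_extent(size, stride):
    """Side length of the object, in feature-map cells, after downsampling by `stride`."""
    return size / stride


# Assumed example sizes (pixels), shift (pixels), and backbone stride.
for size in (8, 32, 128):
    print(
        f"size={size:4d}px  "
        f"IoU after 2px shift={shifted_iou(size, 2):.3f}  "
        f"extent at stride 32={feature_map_extent(size, 32):.2f} cells"
    )
```

Under these assumptions, the same 2-pixel localization error that barely affects a 128-pixel box pushes an 8-pixel box below an IoU of 0.5, a value often used as a positive-match threshold, and the 8-pixel object covers only a quarter of one cell on a stride-32 feature map.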