Abstract
Automated object detectors on Unmanned Aerial Vehicles (UAVs) are increasingly employed for a wide range of tasks. However, to be accurate in their specific task they need expensive ground truth in the form of bounding boxes or positional information. Weakly-Supervised Object Detection (WSOD) overcomes this hindrance by localizing objects with only image-level labels that are faster and cheaper to obtain, but is not on par with fully-supervised models in terms of performance. In this study we propose to combine both approaches in a model that is principally apt for WSOD, but receives full position ground truth for a small number of images. Experiments show that with just 1% of densely annotated images, but simple image-level counts as remaining ground truth, we effectively match the performance of fully-supervised models on a challenging dataset with scarcely occurring wildlife on UAV images from the African savanna. As a result, with a very limited amount of precise annotations our model can be trained with ground truth that is orders of magnitude cheaper and faster to obtain while still providing the same detection performance.
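To illustrate the general idea of mixing cheap image-level counts with a small fraction of densely annotated images, the following is a minimal PyTorch sketch of a per-image loss that switches between the two supervision signals. The function name `hybrid_loss`, the candidate-assignment labels, and the choice of binary cross-entropy and smooth-L1 terms are illustrative assumptions, not the formulation used in the paper.

```python
import torch
import torch.nn as nn

def hybrid_loss(pred_obj_scores, targets, count_weight=1.0, det_weight=1.0):
    """Mixed-supervision loss for one image (illustrative sketch).

    pred_obj_scores: (N,) objectness logits for N candidate regions.
    targets: dict containing either
        'labels' -- (N,) 0/1 assignment of candidates to ground-truth boxes
                    (available only for the small densely annotated subset), or
        'count'  -- scalar image-level object count (the cheap weak label).
    """
    if "labels" in targets:
        # Densely annotated image: standard per-candidate classification loss.
        return det_weight * nn.functional.binary_cross_entropy_with_logits(
            pred_obj_scores, targets["labels"].float()
        )
    # Weakly labelled image: the sum of candidate probabilities should
    # match the annotated object count.
    predicted_count = torch.sigmoid(pred_obj_scores).sum()
    gt_count = torch.as_tensor(float(targets["count"]))
    return count_weight * nn.functional.smooth_l1_loss(predicted_count, gt_count)
```

In such a setup the detection term would only contribute for the roughly 1% of images that carry box annotations, while all remaining images are supervised through their counts alone.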
| Field | Value |
| --- | --- |
| Original language | English |
| Title of host publication | 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) |
| Place of publication | Long Beach, CA, USA |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 1414-1422 |
| Number of pages | 9 |
| Publication status | Published - 2019 |