Publication

Detecting mammals in UAV images: Best practices to address a substantially imbalanced dataset with deep learning

Type of publication Peer-reviewed
Form of publication Original article (peer-reviewed)
Author Kellenberger Benjamin, Marcos Diego, Tuia Devis
Project Multimodal machine learning for remote sensing information fusion

Original article (peer-reviewed)

Journal Remote Sensing of Environment
Volume (Issue) 216
Page(s) 139 - 153
Title of proceedings Remote Sensing of Environment
DOI 10.1016/j.rse.2018.06.028

Open Access

URL https://arxiv.org/abs/1806.11368
Type of Open Access Repository (Green Open Access)

Abstract

© 2018 Elsevier Inc. Knowledge of the number of animals in large wildlife reserves is vital for park rangers in their efforts to protect endangered species. Manual animal censuses are dangerous and expensive; hence, Unmanned Aerial Vehicles (UAVs) with consumer-level digital cameras are becoming a popular alternative tool to estimate livestock. Several works have been proposed that semi-automatically process UAV images to detect animals, some of which employ Convolutional Neural Networks (CNNs), a recent family of deep learning algorithms that has proved very effective for object detection in large computer vision datasets. However, the majority of wildlife-related works focus only on small datasets (typically subsets of UAV campaigns), which might be detrimental when presented with the sheer scale of real study areas for large mammal census. Methods may yield thousands of false alarms in such cases. In this paper, we study how to scale CNNs to large wildlife census tasks and present a number of recommendations for training a CNN on a large UAV dataset. We further introduce novel evaluation protocols that are tailored to censuses and model suitability for subsequent human verification of detections. Using our recommendations, we are able to train a CNN that reduces the number of false positives by an order of magnitude compared to the previous state of the art. At a required recall of 90%, our CNN reduces the amount of data requiring manual verification by a factor of three, thus making it possible for rangers to efficiently screen all the acquired data and to detect almost all animals in the reserve automatically.