Improving Maps from CNNs Trained with Sparse, Scribbled Ground Truths Using Fully Connected CRFs

Luca Maggiolo, Diego Marcos, Gabriele Moser, Devis Tuia

Research output: Chapter in Book/Report/Conference proceeding › Conference paper

3 Citations (Scopus)

Abstract

Convolutional Neural Networks (CNNs) have become the standard for semantic segmentation of very high resolution images. As with other methods, however, map accuracy depends on the quantity and quality of the ground truth used to train them. Densely annotated data, i.e. a detailed, pixel-level ground truth (GT), yields effective models but requires considerable annotation effort. For this reason, it is more common and efficient to work with point or scribbled annotations than with dense ones. A CNN trained with such incomplete ground truth tends to mischaracterize object shapes and to be inaccurate near object boundaries. To address these issues, we propose an approximation of a fully connected Conditional Random Field (CRF) in which long-range connections are accounted for through auxiliary nodes obtained by clustering CNN activation features. Experiments on the ISPRS Vaihingen benchmark, where a CNN is trained only with a non-dense, scribbled ground truth, show that the proposed method closes part of the performance gap with respect to models trained on the densely annotated, but unrealistic, ground truth.
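
The idea of auxiliary nodes built from clustered CNN features can be illustrated with a small sketch. This is not the authors' implementation: the function names `kmeans` and `refine_with_cluster_nodes`, the parameters `k`, `weight` and `n_steps`, the choice of k-means, and the simple mean-field-style update are all assumptions made for illustration. The paper's actual energy terms and inference procedure are not given in this record.

```python
import numpy as np


def kmeans(features, k, n_iter=10):
    """Plain Lloyd's k-means (hypothetical helper); returns one cluster index per row."""
    # Farthest-point initialisation: deterministic and spreads the centers out.
    centers = [features[0]]
    for _ in range(1, k):
        d2 = np.min([((features - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(features[d2.argmax()])
    centers = np.array(centers, dtype=float)

    for _ in range(n_iter):
        # Squared distance from every pixel feature to every center.
        d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d2.argmin(axis=1)
        for c in range(k):
            members = features[assign == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    return assign


def refine_with_cluster_nodes(probs, features, k=4, weight=1.0, n_steps=3):
    """Refine per-pixel class probabilities through cluster-level auxiliary nodes.

    probs:    (N, C) class probabilities from the CNN (assumed softmax outputs).
    features: (N, D) CNN activation features for the same N pixels.
    """
    assign = kmeans(features, k)       # one auxiliary node per feature cluster
    unary = np.log(probs + 1e-8)       # unary term from the CNN output
    q = probs.copy()
    for _ in range(n_steps):
        # Each auxiliary node summarises the current beliefs of its member pixels,
        # which links pixels that are far apart in the image but similar in features.
        cluster_q = np.zeros((k, probs.shape[1]))
        for c in range(k):
            members = q[assign == c]
            if len(members):
                cluster_q[c] = members.mean(axis=0)
        # Pixel update: CNN evidence plus the message from the pixel's auxiliary node.
        logits = unary + weight * cluster_q[assign]
        logits -= logits.max(axis=1, keepdims=True)
        q = np.exp(logits)
        q /= q.sum(axis=1, keepdims=True)
    return q
```

In this sketch each pixel exchanges information only with its cluster node, so the cost per step grows linearly with the number of pixels rather than quadratically, which is the usual motivation for approximating a fully connected model. A call such as `refine_with_cluster_nodes(probs, feats, k=2)` returns an array of the same shape as `probs` whose rows still sum to one.
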
Original language: English
Title of host publication: 2018 IEEE International Geoscience & Remote Sensing Symposium Proceedings
Subtitle of host publication: Observing, Understanding And Forecasting The Dynamics Of Our Planet
Publisher: IEEE Xplore
Pages: 2099-2102
ISBN (Electronic): 9781538671504, 9781538671498
ISBN (Print): 9781538671511
DOIs: https://doi.org/10.1109/IGARSS.2018.8517947
Publication status: Published - 5 Nov 2018
Event: IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium - Valencia, Spain
Duration: 22 Jul 2018 to 27 Jul 2018

Conference

Conference: IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium
Country: Spain
City: Valencia
Period: 22/07/18 to 27/07/18

Fingerprint

Neural networks
Image resolution
Pixels
Chemical activation
Semantics
Experiments

Cite this

Maggiolo, L., Marcos, D., Moser, G., & Tuia, D. (2018). Improving Maps from CNNs Trained with Sparse, Scribbled Ground Truths Using Fully Connected CRFs. In 2018 IEEE International Geoscience & Remote Sensing Symposium Proceedings: Observing, Understanding And Forecasting The Dynamics Of Our Planet (pp. 2099-2102). IEEE Xplore. https://doi.org/10.1109/IGARSS.2018.8517947
Maggiolo, Luca ; Marcos, Diego ; Moser, Gabriele ; Tuia, Devis. / Improving Maps from CNNs Trained with Sparse, Scribbled Ground Truths Using Fully Connected CRFs. 2018 IEEE International Geoscience & Remote Sensing Symposium Proceedings: Observing, Understanding And Forecasting The Dynamics Of Our Planet. IEEE Xplore, 2018. pp. 2099-2102
@inproceedings{926a0f65d7df404aa99a6edfab1d7158,
title = "Improving Maps from CNNs Trained with Sparse, Scribbled Ground Truths Using Fully Connected CRFs",
author = "Luca Maggiolo and Diego Marcos and Gabriele Moser and Devis Tuia",
year = "2018",
month = "11",
day = "5",
doi = "10.1109/IGARSS.2018.8517947",
language = "English",
isbn = "9781538671511",
pages = "2099--2102",
booktitle = "2018 IEEE International Geoscience \& Remote Sensing Symposium Proceedings",
publisher = "IEEE Xplore",
}

Maggiolo, L, Marcos, D, Moser, G & Tuia, D 2018, Improving Maps from CNNs Trained with Sparse, Scribbled Ground Truths Using Fully Connected CRFs. in 2018 IEEE International Geoscience & Remote Sensing Symposium Proceedings: Observing, Understanding And Forecasting The Dynamics Of Our Planet. IEEE Xplore, pp. 2099-2102, IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 22/07/18. https://doi.org/10.1109/IGARSS.2018.8517947

TY  - GEN
T1  - Improving Maps from CNNs Trained with Sparse, Scribbled Ground Truths Using Fully Connected CRFs
AU  - Maggiolo, Luca
AU  - Marcos, Diego
AU  - Moser, Gabriele
AU  - Tuia, Devis
PY  - 2018/11/5
Y1  - 2018/11/5
U2  - 10.1109/IGARSS.2018.8517947
DO  - 10.1109/IGARSS.2018.8517947
M3  - Conference paper
SN  - 9781538671511
SP  - 2099
EP  - 2102
BT  - 2018 IEEE International Geoscience & Remote Sensing Symposium Proceedings
PB  - IEEE Xplore
ER  -

Maggiolo L, Marcos D, Moser G, Tuia D. Improving Maps from CNNs Trained with Sparse, Scribbled Ground Truths Using Fully Connected CRFs. In 2018 IEEE International Geoscience & Remote Sensing Symposium Proceedings: Observing, Understanding And Forecasting The Dynamics Of Our Planet. IEEE Xplore. 2018. p. 2099-2102 https://doi.org/10.1109/IGARSS.2018.8517947