Qiu, Kevin ; Budde, Lina E. ; Bulatov, Dimitri ; Iwaszczuk, Dorota
Eds.: Schulz, Karsten ; Michel, Ulrich ; Nikolakopoulos, Konstantinos G. ; International Society for Optics and Photonics (SPIE) (2022)
Exploring fusion techniques in U-Net and DeepLab V3 architectures for multi-modal land cover classification.
SPIE Sensors + Imaging 2022. Berlin (05.09.2022-07.09.2022)
doi: 10.1117/12.2636144
Conference publication, Bibliography
Abstract
Many deep learning architectures exist for semantic segmentation. In this paper, their application to multi-modal remote sensing data is examined. Two well-known network architectures, U-Net and DeepLab V3+, developed originally for RGB image data, are modified to accept additional input channels, such as near infrared or depth information. In both networks, ResNet101 is used as the backbone, while data-preprocessing steps, including data augmentation, are identical. We compare both networks and experiment with different fusion techniques in U-Net and with hyper-parameters for weighting the input channels for fusion in DeepLab V3+. We also evaluate the effect of pre-training on RGB and non-RGB data. The results show a slightly better performance of the DeepLab V3+ model compared to U-Net, while for certain classes, such as vehicles, U-Net yields a slightly superior accuracy.
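The channel-widening step described in the abstract (adapting RGB networks to take near-infrared or depth as extra input) can be sketched as follows. This is a minimal, hypothetical PyTorch illustration of the general early-fusion technique, not the authors' actual implementation: the RGB filter weights of a stem convolution are kept, and the new modality channel is initialized with the mean of the RGB filters, a common heuristic when reusing RGB-pretrained backbones such as ResNet101.

```python
import torch
import torch.nn as nn

def widen_conv(old: nn.Conv2d, extra_channels: int = 1) -> nn.Conv2d:
    """Return a copy of `old` that accepts `extra_channels` more inputs.

    Existing filter weights are reused; each new input channel starts
    from the mean of the original channels' filters.
    """
    new = nn.Conv2d(
        old.in_channels + extra_channels,
        old.out_channels,
        kernel_size=old.kernel_size,
        stride=old.stride,
        padding=old.padding,
        bias=old.bias is not None,
    )
    with torch.no_grad():
        new.weight[:, : old.in_channels] = old.weight
        mean_w = old.weight.mean(dim=1)  # average over input channels
        for c in range(extra_channels):
            new.weight[:, old.in_channels + c] = mean_w
    return new

# A ResNet-style stem conv widened from 3 (RGB) to 4 (RGB + NIR) channels.
stem = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
stem4 = widen_conv(stem, extra_channels=1)
x = torch.randn(2, 4, 224, 224)
print(stem4(x).shape)  # torch.Size([2, 64, 112, 112])
```

The rest of the RGB-pretrained backbone is unchanged, so only the widened stem needs to adapt to the new modality during fine-tuning.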
Entry type: | Conference publication |
---|---|
Published: | 2022 |
Editors: | Schulz, Karsten ; Michel, Ulrich ; Nikolakopoulos, Konstantinos G. |
Author(s): | Qiu, Kevin ; Budde, Lina E. ; Bulatov, Dimitri ; Iwaszczuk, Dorota |
Record type: | Bibliography |
Titel: | Exploring fusion techniques in U-Net and DeepLab V3 architectures for multi-modal land cover classification |
Language: | English |
Date of publication: | 26 October 2022 |
Place: | Berlin |
Series: | Earth Resources and Environmental Remote Sensing/GIS Applications XIII |
Series volume: | 12268 |
Event title: | SPIE Sensors + Imaging 2022 |
Event location: | Berlin |
Event dates: | 05.09.2022-07.09.2022 |
DOI: | 10.1117/12.2636144 |
Abstract: | Many deep learning architectures exist for semantic segmentation. In this paper, their application to multi-modal remote sensing data is examined. Two well-known network architectures, U-Net and DeepLab V3+, developed originally for RGB image data, are modified to accept additional input channels, such as near infrared or depth information. In both networks, ResNet101 is used as the backbone, while data-preprocessing steps, including data augmentation, are identical. We compare both networks and experiment with different fusion techniques in U-Net and with hyper-parameters for weighting the input channels for fusion in DeepLab V3+. We also evaluate the effect of pre-training on RGB and non-RGB data. The results show a slightly better performance of the DeepLab V3+ model compared to U-Net, while for certain classes, such as vehicles, U-Net yields a slightly superior accuracy. |
Keywords: | Semantic segmentation, remote sensing, U-Net, DeepLab, data fusion |
Department(s): | 13 Department of Civil and Environmental Engineering > Institute of Geodesy > Remote Sensing and Image Analysis |
Date deposited: | 11 Nov 2022 09:13 |
Last modified: | 11 Nov 2022 09:13 |