PixelPyramids: Exact Inference Models from Lossless Image Pyramids

Mahajan, Shweta ; Roth, Stefan (2021)
PixelPyramids: Exact Inference Models from Lossless Image Pyramids.
IEEE/CVF International Conference on Computer Vision (ICCV'21). virtual Conference (11.10.2021-17.10.2021)
doi: 10.1109/ICCV48922.2021.00657
Conference publication, bibliography

Abstract

Autoregressive models are a class of exact inference approaches with highly flexible functional forms, yielding state-of-the-art density estimates for natural images. Yet, the sequential ordering on the dimensions makes these models computationally expensive and limits their applicability to low-resolution imagery. In this work, we propose PixelPyramids, a block-autoregressive approach employing a lossless pyramid decomposition with scale-specific representations to encode the joint distribution of image pixels. Crucially, it affords a sparser dependency structure compared to fully autoregressive approaches. Our PixelPyramids yield state-of-the-art results for density estimation on various image datasets, especially for high-resolution data. For CelebA-HQ 1024 × 1024, we observe that the density estimates (in terms of bits/dim) are improved to ~44% of the baseline despite sampling speeds superior even to easily parallelizable flow-based models.
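
To illustrate what a lossless pyramid decomposition can look like, here is a minimal sketch based on a reversible integer average/difference transform (the S-transform). This is an assumption chosen for illustration only: this record does not state which decomposition the paper uses, and the function names (`split_pairs`, `merge_pairs`, `decompose`, `reconstruct`) are hypothetical. Image height and width are assumed to be divisible by 2 at every level.

```python
import numpy as np

def split_pairs(x, axis):
    """Split neighbouring pairs along `axis` into floor-averages and differences."""
    x = np.moveaxis(x, axis, 0)
    a, b = x[0::2], x[1::2]
    d = a - b                 # detail: difference of each pair
    s = b + (d >> 1)          # coarse: floor((a + b) / 2), integer-exact
    return np.moveaxis(s, 0, axis), np.moveaxis(d, 0, axis)

def merge_pairs(s, d, axis):
    """Exact inverse of split_pairs."""
    s = np.moveaxis(s, axis, 0)
    d = np.moveaxis(d, axis, 0)
    b = s - (d >> 1)
    a = b + d
    out = np.empty((2 * s.shape[0],) + s.shape[1:], dtype=s.dtype)
    out[0::2], out[1::2] = a, b
    return np.moveaxis(out, 0, axis)

def decompose(image, levels):
    """Return the coarsest image plus per-level detail arrays (fine to coarse)."""
    coarse = image.astype(np.int64)
    details = []
    for _ in range(levels):
        coarse, d_w = split_pairs(coarse, axis=1)   # halve the width
        coarse, d_h = split_pairs(coarse, axis=0)   # halve the height
        details.append((d_w, d_h))
    return coarse, details

def reconstruct(coarse, details):
    """Rebuild the original image exactly from the pyramid."""
    for d_w, d_h in reversed(details):
        coarse = merge_pairs(coarse, d_h, axis=0)
        coarse = merge_pairs(coarse, d_w, axis=1)
    return coarse

# Usage: an 8 x 8 image with 8-bit values, decomposed over two levels.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(8, 8))
coarse, details = decompose(image, levels=2)
restored = reconstruct(coarse, details)
```

The coarse image and the detail arrays together hold exactly as many values as the input, and `restored` equals `image` element for element. In a block-autoregressive model, each level's details could then be modelled conditioned on the coarser level rather than pixel by pixel.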

Item type: Conference publication
Published: 2021
Author(s): Mahajan, Shweta ; Roth, Stefan
Entry type: Bibliography
Title: PixelPyramids: Exact Inference Models from Lossless Image Pyramids
Language: English
Publication date: 11 October 2021
Publisher: IEEE
Book title: Proceedings: 2021 IEEE/CVF International Conference on Computer Vision
Event title: IEEE/CVF International Conference on Computer Vision (ICCV'21)
Event location: virtual Conference
Event dates: 11.10.2021-17.10.2021
DOI: 10.1109/ICCV48922.2021.00657
URL / URN: https://openaccess.thecvf.com/content/ICCV2021/papers/Mahaja...

Department(s)/research area(s): 20 Department of Computer Science
20 Department of Computer Science > Visual Inference
DFG Research Training Groups
DFG Research Training Groups > Research Training Group 1994 Adaptive Preparation of Information from Heterogeneous Sources
Date deposited: 08 Mar 2022 07:54
Last modified: 08 Mar 2022 07:54