Mahajan, Shweta ; Roth, Stefan (2021)
PixelPyramids: Exact Inference Models from Lossless Image Pyramids.
IEEE/CVF International Conference on Computer Vision (ICCV'21). virtual Conference (11.10.2021-17.10.2021)
doi: 10.1109/ICCV48922.2021.00657
Conference publication, Bibliography
Abstract
Autoregressive models are a class of exact inference approaches with highly flexible functional forms, yielding state-of-the-art density estimates for natural images. Yet, the sequential ordering on the dimensions makes these models computationally expensive and limits their applicability to low-resolution imagery. In this work, we propose PixelPyramids, a block-autoregressive approach employing a lossless pyramid decomposition with scale-specific representations to encode the joint distribution of image pixels. Crucially, it affords a sparser dependency structure compared to fully autoregressive approaches. Our PixelPyramids yield state-of-the-art results for density estimation on various image datasets, especially for high-resolution data. For CelebA-HQ 1024 × 1024, we observe that the density estimates (in terms of bits/dim) are improved to ~44% of the baseline despite sampling speeds superior even to easily parallelizable flow-based models.
Item type: | Conference publication |
---|---|
Published: | 2021 |
Author(s): | Mahajan, Shweta ; Roth, Stefan |
Type of entry: | Bibliography |
Title: | PixelPyramids: Exact Inference Models from Lossless Image Pyramids |
Language: | English |
Date of publication: | 11 October 2021 |
Publisher: | IEEE |
Book title: | Proceedings: 2021 IEEE/CVF International Conference on Computer Vision |
Event title: | IEEE/CVF International Conference on Computer Vision (ICCV'21) |
Event location: | virtual Conference |
Event dates: | 11.10.2021-17.10.2021 |
DOI: | 10.1109/ICCV48922.2021.00657 |
URL / URN: | https://openaccess.thecvf.com/content/ICCV2021/papers/Mahaja... |
Division(s): | 20 Department of Computer Science; 20 Department of Computer Science > Visual Inference; DFG Research Training Groups; DFG Research Training Groups > Research Training Group 1994 Adaptive Preparation of Information from Heterogeneous Sources |
Date deposited: | 08 Mar 2022 07:54 |
Last modified: | 08 Mar 2022 07:54 |