István, Zsolt ; Woods, Louis ; Alonso, Gustavo (2014)
Histograms as a side effect of data movement for big data.
2014 International Conference on Management of Data. Snowbird, USA (22.06.2014-27.06.2014)
doi: 10.1145/2588555.2612174
Conference publication, Bibliography
Abstract
Histograms are a crucial part of database query planning but their computation is resource-intensive. As a consequence, generating histograms on database tables is typically performed as a batch job, separately from query processing. In this paper, we show how to calculate statistics as a side effect of data movement within a DBMS using a hardware accelerator in the data path. This accelerator analyzes tables as they are transmitted from storage to the processing unit, and provides histograms on the data retrieved for queries at virtually no extra performance cost. To evaluate our approach, we implemented this accelerator on an FPGA. This prototype calculates histograms faster and with similar or better accuracy than commercial databases. Moreover, the FPGA can provide various types of histograms such as Equi-depth, Compressed, or Max-diff on the same input data in parallel, without additional overhead.
Type of entry: | Conference publication |
---|---|
Published: | 2014 |
Author(s): | István, Zsolt ; Woods, Louis ; Alonso, Gustavo |
Kind of entry: | Bibliography |
Title: | Histograms as a side effect of data movement for big data |
Language: | English |
Publication date: | 18 June 2014 |
Place of publication: | New York, NY |
Publisher: | ACM |
Book title: | SIGMOD '14: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data |
Event title: | 2014 International Conference on Management of Data |
Event location: | Snowbird, USA |
Event date: | 22.06.2014-27.06.2014 |
DOI: | 10.1145/2588555.2612174 |
Department(s)/Field(s): | 20 Department of Computer Science; 20 Department of Computer Science > Distributed and Networked Systems |
Date deposited: | 23 Jan 2023 12:34 |
Last modified: | 11 May 2023 08:40 |
PPN: | 50772383X |