A Flexible K-Means Operator for Hybrid Databases

He, Zhenhao ; Sidler, David ; István, Zsolt ; Alonso, Gustavo (2018)
A Flexible K-Means Operator for Hybrid Databases.
28th International Conference on Field Programmable Logic and Applications. Dublin, Ireland (26-30 August 2018)
doi: 10.1109/FPL.2018.00069
Conference publication, Bibliography

Abstract

The K-means algorithm is widely used in unsupervised learning and data exploration. It is less used in analytical databases due to its high computational cost. K-means has been explored in great detail, mostly focusing on performance. However, in emerging hybrid CPU-FPGA databases where memory bandwidth is shared across software and hardware operators, two additional requirements arise. One is parameterization to avoid frequent reprogramming. The other is concurrent use to balance memory bandwidth and computation. Our design supports two operational modes that can be chosen at runtime, one for high query throughput and one for evaluating multiple clusters concurrently. The former targets speedup, while the latter targets efficient bandwidth utilization by increasing the amount of computation per input byte. Our design is competitive when compared to both existing FPGA-based solutions and highly optimized multi-core software implementations.

Item type: Conference publication
Published: 2018
Author(s): He, Zhenhao ; Sidler, David ; István, Zsolt ; Alonso, Gustavo
Entry type: Bibliography
Title: A Flexible K-Means Operator for Hybrid Databases
Language: English
Publication date: 6 December 2018
Publisher: IEEE
Book title: Proceedings: 2018 International Conference on Field-Programmable Logic and Applications (FPL 2018)
Event title: 28th International Conference on Field Programmable Logic and Applications
Event location: Dublin, Ireland
Event dates: 26-30 August 2018
DOI: 10.1109/FPL.2018.00069
URL / URN: https://doi.org/10.1109/FPL.2018.00069
Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Distributed and Networked Systems
Date deposited: 23 Jan 2023 10:15
Last modified: 03 Apr 2023 14:01
PPN: 506548775