A distributed B+Tree indexing method for processing range queries over streaming data

Safaee, Shahab ; Mirabi, Meghdad ; Rahmani, Amir Masoud ; Safaei, Ali Asghar (2023)
A distributed B+Tree indexing method for processing range queries over streaming data.
In: Cluster Computing, 2023
doi: 10.1007/s10586-023-04015-9
Article, Bibliography

Abstract

A data stream is a massive, unbounded sequence of data elements generated continuously at a high rate. Stream databases raise new challenges for query processing, both because streaming data changes constantly over time and because users submit a wider range of queries than in traditional databases. In this paper, we propose a system architecture that includes components for both distributed indexing of streaming data and distributed processing of range queries on streaming data. Instead of creating one large, centralized B+Tree index structure, we create a set of small B+Tree indexes so that a B+Tree index can be created for every partition of the streaming data. We also design a distributed range search algorithm that each machine in a Spark cluster can use to process range queries independently on each partition of the streaming data. With the proposed architecture, both indexing and querying of streaming data can be performed in a distributed and parallel manner. Through several experiments, we demonstrate that the proposed indexing method is scalable and efficient for processing range queries on streaming data compared with existing centralized B+Tree indexing methods; it can therefore be used for applications involving data streams with a large volume of data elements and a large number of range queries.
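
The idea of one small index per partition, searched independently and then merged, can be illustrated with a minimal sketch. This is not the authors' implementation: it runs on a single machine without Spark, a thread pool stands in for the cluster workers, and a sorted key list searched with `bisect` stands in for the leaf level of a B+Tree. The names `PartitionIndex`, `partition_stream`, and `distributed_range_query`, as well as the hash-based partitioning, are hypothetical choices made for illustration only.

```python
from bisect import bisect_left, bisect_right
from concurrent.futures import ThreadPoolExecutor


class PartitionIndex:
    """Small index over one partition.

    Assumption: a sorted key list searched with bisect stands in for
    the ordered leaf level of a B+Tree; it is not a real B+Tree.
    """

    def __init__(self, records):
        # records: iterable of (key, value) pairs belonging to one partition
        self._records = sorted(records, key=lambda kv: kv[0])
        self._keys = [k for k, _ in self._records]

    def range_search(self, low, high):
        """Return all (key, value) pairs with low <= key <= high."""
        start = bisect_left(self._keys, low)
        end = bisect_right(self._keys, high)
        return self._records[start:end]


def partition_stream(records, num_partitions):
    """Split a batch of records by key hash (hypothetical partitioning)."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[hash(key) % num_partitions].append((key, value))
    return partitions


def distributed_range_query(indexes, low, high):
    """Search every partition index in parallel, then merge the results."""
    with ThreadPoolExecutor() as pool:
        partial = list(pool.map(lambda idx: idx.range_search(low, high), indexes))
    merged = [rec for part in partial for rec in part]
    return sorted(merged, key=lambda kv: kv[0])


# Usage: index one batch of stream elements in three partitions,
# then ask for all keys in the closed range [10, 60].
batch = [(k, f"reading-{k}") for k in [42, 7, 19, 3, 88, 56, 21, 64]]
indexes = [PartitionIndex(p) for p in partition_stream(batch, 3)]
result = distributed_range_query(indexes, 10, 60)
print([k for k, _ in result])  # [19, 21, 42, 56]
```

Because each partition index is built and searched without reference to the others, both steps parallelize naturally; only the final merge needs the partial results in one place.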

Item type: Article
Published: 2023
Author(s): Safaee, Shahab ; Mirabi, Meghdad ; Rahmani, Amir Masoud ; Safaei, Ali Asghar
Type of entry: Bibliography
Title: A distributed B+Tree indexing method for processing range queries over streaming data
Language: English
Date of publication: 7 May 2023
Publisher: Springer
Title of journal, newspaper, or series: Cluster Computing
Volume of a journal: 2023
DOI: 10.1007/s10586-023-04015-9
URL / URN: https://link.springer.com/article/10.1007/s10586-023-04015-9...

Department(s)/Field(s): 20 Department of Computer Science
20 Department of Computer Science > Data and AI Systems
Date deposited: 07 Mar 2024 13:22
Last modified: 07 Mar 2024 13:22