Bormann, Pascal (2024)
Improving the efficiency of point cloud data management.
Technische Universität Darmstadt
doi: 10.26083/tuprints-00027526
Dissertation, Erstveröffentlichung, Verlagsversion
Kurzbeschreibung (Abstract)
The collection of point cloud data has increased drastically in recent years, which poses challenges for the data management layer. Multi-billion point datasets are commonplace and users are getting accustomed to real-time data exploration in the Web. To make this possible, existing point cloud data management approaches rely on optimized data formats which are time- and resource-intensive to generate. This introduces long wait times before data can be used and frequent data duplication, since these optimized formats are often domain- or application-specific. As a result, data management is a challenging and expensive aspect when developing applications that use point cloud data. We observe that the interaction between applications and the point cloud data management layer can be modeled as a series of queries similar to those found in traditional databases. Based on this observation, we evaluate current point cloud data management using three query metrics: Responsiveness, throughput, and expressiveness. We contribute to the current state of the art by improving these metrics for both the handling of raw files without preprocessing, as well as indexed point clouds. In the domain of unindexed point cloud data, we introduce the concept of ad-hoc queries, which are queries executed ad-hoc on raw point cloud files. We demonstrate that ad-hoc queries can improve query responsiveness significantly as they do not require long wait times for indexing or database imports. Using columnar memory layouts, queries on datasets of up to a billion points can be answered in interactive or near-interactive time, with throughputs of more than one hundred million points per second on unindexed data. A demonstration of an adaptive indexing method shows that spending a few seconds per query on index creation can improve responsiveness by up to an order of magnitude. Our experiments also confirm the importance of high-throughput systems when querying point cloud data, as the overhead of data transmission has a significant effect on the overall query performance. For situations where indexing is mandatory, we demonstrate improvements to the runtime performance of existing point cloud indexing tools. We developed a fast indexer based on task-parallel programming, using Morton indices to efficiently sort and distribute point batches onto worker threads. This system, called Schwarzwald, outperformed existing indexers by up to a factor 9 when it was first published, and still has competitive performance to current out-of-core capable indexers. Additionally we adapted our indexing algorithm for distributed processing in a Cloud-environment and demonstrate that its horizontal scalability allows it to outperform all existing indexers by up to a factor of 3. Lastly we demonstrated point cloud indexing in real-time during Light Detection And Ranging (LiDAR) capturing, based on a similar task-based algorithm but optimized for progressive indexing. Our real-time indexer is able to keep up with current LiDAR sensors in a real-world test, with end-to-end latencies as low as 0.1 seconds. Together, our improvements significantly reduce wait times for working with point cloud data and increase the overall efficiency of the data access layer.
Typ des Eintrags: | Dissertation | ||||
---|---|---|---|---|---|
Erschienen: | 2024 | ||||
Autor(en): | Bormann, Pascal | ||||
Art des Eintrags: | Erstveröffentlichung | ||||
Titel: | Improving the efficiency of point cloud data management | ||||
Sprache: | Englisch | ||||
Referenten: | Fellner, Prof. Dr. Dieter W. ; Reiterer, Prof. Dr. Alexander | ||||
Publikationsjahr: | 1 Juli 2024 | ||||
Ort: | Darmstadt | ||||
Kollation: | xxiv, 158 Seiten | ||||
Datum der mündlichen Prüfung: | 8 Mai 2024 | ||||
DOI: | 10.26083/tuprints-00027526 | ||||
URL / URN: | https://tuprints.ulb.tu-darmstadt.de/27526 | ||||
Kurzbeschreibung (Abstract): | The collection of point cloud data has increased drastically in recent years, which poses challenges for the data management layer. Multi-billion point datasets are commonplace and users are getting accustomed to real-time data exploration in the Web. To make this possible, existing point cloud data management approaches rely on optimized data formats which are time- and resource-intensive to generate. This introduces long wait times before data can be used and frequent data duplication, since these optimized formats are often domain- or application-specific. As a result, data management is a challenging and expensive aspect when developing applications that use point cloud data. We observe that the interaction between applications and the point cloud data management layer can be modeled as a series of queries similar to those found in traditional databases. Based on this observation, we evaluate current point cloud data management using three query metrics: Responsiveness, throughput, and expressiveness. We contribute to the current state of the art by improving these metrics for both the handling of raw files without preprocessing, as well as indexed point clouds. In the domain of unindexed point cloud data, we introduce the concept of ad-hoc queries, which are queries executed ad-hoc on raw point cloud files. We demonstrate that ad-hoc queries can improve query responsiveness significantly as they do not require long wait times for indexing or database imports. Using columnar memory layouts, queries on datasets of up to a billion points can be answered in interactive or near-interactive time, with throughputs of more than one hundred million points per second on unindexed data. A demonstration of an adaptive indexing method shows that spending a few seconds per query on index creation can improve responsiveness by up to an order of magnitude. Our experiments also confirm the importance of high-throughput systems when querying point cloud data, as the overhead of data transmission has a significant effect on the overall query performance. For situations where indexing is mandatory, we demonstrate improvements to the runtime performance of existing point cloud indexing tools. We developed a fast indexer based on task-parallel programming, using Morton indices to efficiently sort and distribute point batches onto worker threads. This system, called Schwarzwald, outperformed existing indexers by up to a factor 9 when it was first published, and still has competitive performance to current out-of-core capable indexers. Additionally we adapted our indexing algorithm for distributed processing in a Cloud-environment and demonstrate that its horizontal scalability allows it to outperform all existing indexers by up to a factor of 3. Lastly we demonstrated point cloud indexing in real-time during Light Detection And Ranging (LiDAR) capturing, based on a similar task-based algorithm but optimized for progressive indexing. Our real-time indexer is able to keep up with current LiDAR sensors in a real-world test, with end-to-end latencies as low as 0.1 seconds. Together, our improvements significantly reduce wait times for working with point cloud data and increase the overall efficiency of the data access layer. |
||||
Alternatives oder übersetztes Abstract: |
|
||||
Status: | Verlagsversion | ||||
URN: | urn:nbn:de:tuda-tuprints-275267 | ||||
Sachgruppe der Dewey Dezimalklassifikatin (DDC): | 000 Allgemeines, Informatik, Informationswissenschaft > 004 Informatik | ||||
Fachbereich(e)/-gebiet(e): | 20 Fachbereich Informatik 20 Fachbereich Informatik > Fraunhofer IGD |
||||
Hinterlegungsdatum: | 01 Jul 2024 12:06 | ||||
Letzte Änderung: | 06 Jul 2024 08:52 | ||||
PPN: | |||||
Referenten: | Fellner, Prof. Dr. Dieter W. ; Reiterer, Prof. Dr. Alexander | ||||
Datum der mündlichen Prüfung / Verteidigung / mdl. Prüfung: | 8 Mai 2024 | ||||
Export: | |||||
Suche nach Titel in: | TUfind oder in Google |
Frage zum Eintrag |
Optionen (nur für Redakteure)
Redaktionelle Details anzeigen |