Shi, Meiling ; Heinz, Tobias ; Rüppel, Uwe
Ed.: Scherer, Raimar (2023)
Retrieve information from construction documents with BERT and unsupervised learning.
14th European Conference on Product & Process Modelling (ECPPM 2022). Trondheim, Norway (14.09.2022-16.09.2022)
doi: 10.1201/9781003354222-51
Conference publication, Bibliography
Abstract
The use of text documents from previous projects for decision-making in the construction industry remains at a low level. One reason is that information formulated in unstructured natural language cannot be processed directly by computer programs, and search is conducted by keyword matching, which is inefficient and imprecise. To make the information in unstructured text documents accessible in digital processes without introducing additional manual work, we propose using natural language processing (NLP) and unsupervised learning methods to automatically extract information from unstructured textual documents. This paper describes an NLP-based pipeline that includes methods for data acquisition and preprocessing, different transformer-based embedding methods, and subsequent downstream tasks. Our proof of concept is trained on German-language documents from different waterway construction projects. Because it relies on unsupervised learning and available language models, this pipeline can be generalized to other languages and construction types.
Item type: | Conference publication |
---|---|
Published: | 2023 |
Editor: | Scherer, Raimar |
Author(s): | Shi, Meiling ; Heinz, Tobias ; Rüppel, Uwe |
Entry type: | Bibliography |
Title: | Retrieve information from construction documents with BERT and unsupervised learning |
Language: | English |
Year of publication: | March 2023 |
Place of publication: | London |
Publisher: | CRC Press |
Book title: | ECPPM 2022 - eWork and eBusiness in Architecture, Engineering and Construction 2022 |
Event title: | 14th European Conference on Product & Process Modelling (ECPPM 2022) |
Event location: | Trondheim, Norway |
Event date: | 14.09.2022-16.09.2022 |
Edition: | 1st edition |
DOI: | 10.1201/9781003354222-51 |
URL / URN: | https://www.taylorfrancis.com/books/9781003354222/chapters/1... |
Department(s)/field(s): | 13 Fachbereich Bau- und Umweltingenieurwissenschaften; 13 Fachbereich Bau- und Umweltingenieurwissenschaften > Institut für Numerische Methoden und Informatik im Bauwesen |
Date deposited: | 03 Nov 2023 10:41 |
Last modified: | 05 Jul 2024 07:59 |