Information Overload in Crisis Management: Bilingual Evaluation of Embedding Models for Clustering Social Media Posts in Emergencies

Bayer, Markus ; Kaufhold, Marc-André ; Reuter, Christian (2022)
Information Overload in Crisis Management: Bilingual Evaluation of Embedding Models for Clustering Social Media Posts in Emergencies.
European Conference on Information Systems (ECIS 2021). Marrakech, Morocco (14.-16.06.2021)
doi: 10.26083/tuprints-00022167
Conference or Workshop Item, Secondary publication, Publisher's Version

Warning: There is a more recent version of this item available.

Abstract

Past studies in the domain of information systems have analysed the potential and barriers of social media in emergencies. While information disseminated in social media can lead to valuable insights, emergency services and researchers face the challenge of information overload as data quickly exceeds the manageable amount. We propose an embedding-based clustering approach and a method for the automated labelling of clusters. Given that the clustering quality is highly dependent on embeddings, we evaluate 19 embedding models with respect to time, internal cluster quality, and language invariance. The results show that it may be sensible to use embedding models that were already trained on other crisis datasets. However, one must ensure that the training data generalizes well enough that the clustering can adapt to new situations. Confirming this, we found that some embeddings did not perform as well on a German dataset as on an English dataset.

Item Type: Conference or Workshop Item
Published: 2022
Creators: Bayer, Markus ; Kaufhold, Marc-André ; Reuter, Christian
Type of entry: Secondary publication
Title: Information Overload in Crisis Management: Bilingual Evaluation of Embedding Models for Clustering Social Media Posts in Emergencies
Language: English
Date: 2022
Place of Publication: Darmstadt
Year of primary publication: 2021
Publisher: AIS
Book Title: ECIS 2021 Research-in-Progress Papers
Series: ECIS 2021 Research Papers
Collation: 18 pages
Event Title: European Conference on Information Systems (ECIS 2021)
Event Location: Marrakech, Morocco
Event Dates: 14.-16.06.2021
DOI: 10.26083/tuprints-00022167
URL / URN: https://tuprints.ulb.tu-darmstadt.de/22167
Origin: Secondary publication service
Uncontrolled Keywords: Social Media Clustering, Information Overload, Crisis Informatics, Unsupervised Machine Learning
Status: Publisher's Version
URN: urn:nbn:de:tuda-tuprints-221672
Classification DDC: 000 Generalities, computers, information > 004 Computer science
000 Generalities, computers, information > 070 News media, journalism, publishing
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Science and Technology for Peace and Security (PEASEC)
Research Fields
Research Fields > Information and Intelligence
Research Fields > Information and Intelligence > Cybersecurity & Privacy
Date Deposited: 05 Sep 2022 13:38
Last Modified: 07 Sep 2022 09:07