Information Overload in Crisis Management: Bilingual Evaluation of Embedding Models for Clustering Social Media Posts in Emergencies

Bayer, Markus ; Kaufhold, Marc-André ; Reuter, Christian (2022)
Information Overload in Crisis Management: Bilingual Evaluation of Embedding Models for Clustering Social Media Posts in Emergencies.
European Conference on Information Systems (ECIS 2021). Marrakech, Morocco (14-16 June 2021)
doi: 10.26083/tuprints-00022167
Conference publication, secondary publication, publisher's version

Abstract

Past studies in the domain of information systems have analysed the potentials and barriers of social media in emergencies. While information disseminated in social media can lead to valuable insights, emergency services and researchers face the challenge of information overload as data quickly exceeds the manageable amount. We propose an embedding-based clustering approach and a method for the automated labelling of clusters. Given that the clustering quality is highly dependent on embeddings, we evaluate 19 embedding models with respect to time, internal cluster quality, and language invariance. The results show that it may be sensible to use embedding models that were already trained on other crisis datasets. However, one must ensure that the training data generalizes well enough for the clustering to adapt to new situations. Confirming this, we found that some embeddings did not perform as well on a German dataset as on an English dataset.

Item type: Conference publication
Published: 2022
Author(s): Bayer, Markus ; Kaufhold, Marc-André ; Reuter, Christian
Type of entry: Secondary publication
Title: Information Overload in Crisis Management: Bilingual Evaluation of Embedding Models for Clustering Social Media Posts in Emergencies
Language: English
Year of publication: 2022
Place: Darmstadt
Publisher: AIS
Book title: ECIS 2021 Research-in-Progress Papers
Series: ECIS 2021 Research Papers
Collation: 18 pages
Event title: European Conference on Information Systems (ECIS 2021)
Event location: Marrakech, Morocco
Event date: 14-16 June 2021
DOI: 10.26083/tuprints-00022167
URL / URN: https://tuprints.ulb.tu-darmstadt.de/22167
Origin: Secondary publication service
Uncontrolled keywords: Social Media Clustering, Information Overload, Crisis Informatics, Unsupervised Machine Learning
Status: Publisher's version
URN: urn:nbn:de:tuda-tuprints-221672
Dewey Decimal Classification (DDC) subject group: 000 Generalities, computer science, information science > 004 Computer science
000 Generalities, computer science, information science > 070 News media, journalism, publishing
Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Science and Technology for Peace and Security (PEASEC)
Research fields
Research fields > Information and Intelligence
Research fields > Information and Intelligence > Cybersecurity & Privacy
Date deposited: 05 Sep 2022 13:38
Last modified: 07 Sep 2022 09:07