Rapid relevance classification of social media posts in disasters and emergencies: A system and evaluation featuring active, incremental and online learning

Kaufhold, Marc-André ; Bayer, Markus ; Reuter, Christian (2020)
Rapid relevance classification of social media posts in disasters and emergencies: A system and evaluation featuring active, incremental and online learning.
In: Information Processing & Management, 57 (1)
doi: 10.1016/j.ipm.2019.102132
Article, Bibliography

Abstract

The research field of crisis informatics examines, among other things, the potentials and barriers of social media use during disasters and emergencies. Social media allow emergency services to receive valuable information (e.g., eyewitness reports, pictures, or videos). However, the vast amount of data generated during large-scale incidents can lead to information overload. Research indicates that supervised machine learning techniques are suitable for identifying relevant messages and filtering out irrelevant ones, thus mitigating information overload. Still, they require a considerable amount of labeled data, clear criteria for relevance classification, a usable interface to facilitate the labeling process, and a mechanism to rapidly deploy retrained classifiers. To overcome these issues, we present (1) a system for social media monitoring, analysis and relevance classification, (2) abstract and precise criteria for relevance classification in social media during disasters and emergencies, (3) the evaluation of a well-performing Random Forest algorithm for relevance classification incorporating metadata from social media into a batch learning approach (e.g., 91.28%/89.19% accuracy, 98.3%/89.6% precision and 80.4%/87.5% recall with a fast training time with feature subset selection on the European floods/BASF SE incident datasets), as well as (4) an approach and preliminary evaluation for relevance classification including active, incremental and online learning to reduce the amount of required labeled data and to correct misclassifications of the algorithm by feedback classification. Using the latter approach, we achieved a well-performing classifier based on the European floods dataset while requiring only a quarter of the labeled data compared to the traditional batch learning approach. Despite a lesser effect on the BASF SE incident dataset, a substantial improvement could still be determined.
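The abstract names active, incremental and online learning without showing how they interact. The following is a minimal sketch of that loop, not the authors' implementation: it swaps in a toy bag-of-words logistic model for the Random Forest described above, and every name in it (`OnlineLogisticClassifier`, `partial_fit`, `most_uncertain`, the example posts) is a hypothetical chosen for illustration.

```python
import math


class OnlineLogisticClassifier:
    """Toy bag-of-words logistic regression, updated one post at a time.

    Assumption: this stands in for the paper's classifier only to show the
    incremental-update mechanics; it is not the Random Forest from the abstract.
    """

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate
        self.weights = {}  # token -> weight
        self.bias = 0.0

    def predict_proba(self, text):
        """Return the estimated probability that a post is relevant."""
        tokens = text.lower().split()
        score = self.bias + sum(self.weights.get(tok, 0.0) for tok in tokens)
        return 1.0 / (1.0 + math.exp(-score))

    def partial_fit(self, text, label):
        """Incremental learning: one gradient step on a single labeled post."""
        error = label - self.predict_proba(text)
        self.bias += self.learning_rate * error
        for tok in text.lower().split():
            self.weights[tok] = self.weights.get(tok, 0.0) + self.learning_rate * error


def most_uncertain(classifier, unlabeled_posts):
    """Active learning by uncertainty sampling: pick the post closest to p = 0.5."""
    return min(unlabeled_posts, key=lambda post: abs(classifier.predict_proba(post) - 0.5))


# Seed the model with a few labeled posts (1 = relevant, 0 = irrelevant).
clf = OnlineLogisticClassifier()
for _ in range(20):
    clf.partial_fit("river flooding street evacuate", 1)
    clf.partial_fit("great concert tonight", 0)

# Ask the annotator about the post the model is least sure of.
pool = ["flooding near bridge", "concert tickets", "street concert"]
query = most_uncertain(clf, pool)

# Online feedback: the annotator marks the queried post as irrelevant,
# and the model is corrected immediately without retraining from scratch.
before = clf.predict_proba(query)
clf.partial_fit(query, 0)
after = clf.predict_proba(query)
```

Querying only the posts the model is least certain about is what lets an active learner reach a given quality with fewer labels than batch training on a randomly labeled set; the single-example update is what makes feedback corrections cheap to deploy.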

Item type:	Article
Published:	2020
Author(s):	Kaufhold, Marc-André ; Bayer, Markus ; Reuter, Christian
Entry type:	Bibliography
Title:	Rapid relevance classification of social media posts in disasters and emergencies: A system and evaluation featuring active, incremental and online learning
Language:	English
Publication date:	1 January 2020
Publisher:	Elsevier
Journal, newspaper, or series title:	Information Processing & Management
Journal volume:	57
Issue number:	1
DOI:	10.1016/j.ipm.2019.102132
Uncontrolled keywords:	A-Paper, CORE-A, Crisis, SecUrban, SocialMedia, WKWI-B, emergenCITY, emergenCITY_INF, emergenCITY_SG
Department(s)/field(s):	20 Department of Computer Science ; 20 Department of Computer Science > Science and Technology for Peace and Security (PEASEC) ; Profile Areas ; Profile Areas > Cybersecurity (CYSEC) ; LOEWE ; LOEWE > LOEWE Centres ; LOEWE > LOEWE Centres > CRISP - Center for Research in Security and Privacy ; LOEWE > LOEWE Centres > emergenCITY ; Central Facilities ; Central Facilities > Interdisciplinary Research Group on Science, Technology and Security (IANUS)
TU projects:	HMWK\|III L6-519/03/05.001-(0016)\|emergenCity TP Bock
Date deposited:	20 Aug 2020 07:32
Last modified:	27 Oct 2021 09:47