Lessons Learned from a Citizen Science Project for Natural Language Processing

Klie, Jan-Christoph ; Lee, Ji-Ung ; Stowe, Kevin ; Şahin, Gözde Gül ; Moosavi, Nafise Sadat ; Bates, Luke ; Petrak, Dominic ; Castilho, Richard Eckart de (2023)
Lessons Learned from a Citizen Science Project for Natural Language Processing.
17th Conference of the European Chapter of the Association for Computational Linguistics. Dubrovnik, Croatia (2-6 May 2023)
Conference publication, Bibliography

Abstract

Many Natural Language Processing (NLP) systems use annotated corpora for training and evaluation. However, labeled data is often costly to obtain and scaling annotation projects is difficult, which is why annotation tasks are often outsourced to paid crowdworkers. Citizen Science is an alternative to crowdsourcing that is relatively unexplored in the context of NLP. To investigate whether and how well Citizen Science can be applied in this setting, we conduct an exploratory study into engaging different groups of volunteers in Citizen Science for NLP by re-annotating parts of a pre-existing crowdsourced dataset. Our results show that this can yield high-quality annotations and attract motivated volunteers, but also requires considering factors such as scalability, participation over time, and legal and ethical issues. We summarize lessons learned in the form of guidelines and provide our code and data to aid future work on Citizen Science.

Item type: Conference publication
Published: 2023
Author(s): Klie, Jan-Christoph ; Lee, Ji-Ung ; Stowe, Kevin ; Şahin, Gözde Gül ; Moosavi, Nafise Sadat ; Bates, Luke ; Petrak, Dominic ; Castilho, Richard Eckart de
Type of entry: Bibliography
Title: Lessons Learned from a Citizen Science Project for Natural Language Processing
Language: English
Publication date: 2 May 2023
Publisher: ACL
Book title: The 17th Conference of the European Chapter of the Association for Computational Linguistics - proceedings of the conference
Event title: 17th Conference of the European Chapter of the Association for Computational Linguistics
Event location: Dubrovnik, Croatia
Event date: 2-6 May 2023
URL / URN: https://aclanthology.org/2023.eacl-main.261/
Uncontrolled keywords: UKP_p_EVIDENCE, UKP_p_square, UKP_p_INCEpTION, UKP_p_PEER, UKP_p_seditrah_factcheck
Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Central Institutions
Central Institutions > hessian.AI - The Hessian Center for Artificial Intelligence
Date deposited: 12 Jun 2023 12:31
Last modified: 09 Aug 2023 12:29
PPN: 510469019