Eckart de Castilho, Richard (2016)
Automatic Analysis of Flaws in Pre-Trained NLP Models.
Osaka, Japan
Konferenzveröffentlichung, Bibliographie
Kurzbeschreibung (Abstract)
Most tools for natural language processing (NLP) today are based on machine learning and come with pre-trained models. In addition, third-parties provide pre-trained models for popular NLP tools. The predictive power and accuracy of these tools depends on the quality of these models. Downstream researchers often base their results on pre-trained models instead of training their own. Consequently, pre-trained models are an essential resource to our community. However, to be best of our knowledge, no systematic study of pre-trained models has been conducted so far. This paper reports on the analysis of 274 pre-models for six NLP tools and four potential causes of problems: encoding, tokenization, normalization, and change over time. The analysis is implemented in the open source tool Model Investigator. Our work 1) allows model consumers to better assess whether a model is suitable for their task, 2) enables tool and model creators to sanity-check their models before distributing them, and 3) enables improvements in tool interoperability by performing automatic adjustments of normalization or other pre-processing based on the models used.
Typ des Eintrags: | Konferenzveröffentlichung |
---|---|
Erschienen: | 2016 |
Autor(en): | Eckart de Castilho, Richard |
Art des Eintrags: | Bibliographie |
Titel: | Automatic Analysis of Flaws in Pre-Trained NLP Models |
Sprache: | Englisch |
Publikationsjahr: | Dezember 2016 |
Buchtitel: | Proceedings of the Third International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI3nOIAF2) at COLING 2016 |
Veranstaltungsort: | Osaka, Japan |
URL / URN: | http://www.aclweb.org/anthology/W16-5203 |
Kurzbeschreibung (Abstract): | Most tools for natural language processing (NLP) today are based on machine learning and come with pre-trained models. In addition, third-parties provide pre-trained models for popular NLP tools. The predictive power and accuracy of these tools depends on the quality of these models. Downstream researchers often base their results on pre-trained models instead of training their own. Consequently, pre-trained models are an essential resource to our community. However, to be best of our knowledge, no systematic study of pre-trained models has been conducted so far. This paper reports on the analysis of 274 pre-models for six NLP tools and four potential causes of problems: encoding, tokenization, normalization, and change over time. The analysis is implemented in the open source tool Model Investigator. Our work 1) allows model consumers to better assess whether a model is suitable for their task, 2) enables tool and model creators to sanity-check their models before distributing them, and 3) enables improvements in tool interoperability by performing automatic adjustments of normalization or other pre-processing based on the models used. |
Freie Schlagworte: | CEDIFOR;UKP_s_DKPro_Core;UKP_p_DKPro;UKP_reviewed;UKP_p_OpenMinTeD |
ID-Nummer: | TUD-CS-2016-14654 |
Fachbereich(e)/-gebiet(e): | 20 Fachbereich Informatik 20 Fachbereich Informatik > Ubiquitäre Wissensverarbeitung DFG-Graduiertenkollegs DFG-Graduiertenkollegs > Graduiertenkolleg 1994 Adaptive Informationsaufbereitung aus heterogenen Quellen |
Hinterlegungsdatum: | 31 Dez 2016 14:29 |
Letzte Änderung: | 05 Okt 2018 09:02 |
PPN: | |
Export: | |
Suche nach Titel in: | TUfind oder in Google |
Frage zum Eintrag |
Optionen (nur für Redakteure)
Redaktionelle Details anzeigen |