Teaching "Unstructured Information Management: Theory and Applications" to Computational Linguistics Students

Gurevych, Iryna ; Müller, Christof ; Zesch, Torsten (2007)
Teaching "Unstructured Information Management: Theory and Applications" to Computational Linguistics Students.
Conference publication, Bibliography

Abstract

Students in Computational Linguistics often lack experience in building robust and scalable software components. As a result, student projects tend to be unstable and to work only under very special preconditions (e.g., a project has to be installed in a certain directory, or handles only single files instead of whole directories). Furthermore, if students have to build a system from scratch, they have to concentrate on input and output issues, as well as on connecting numerous preprocessing components that were not designed to work together. This limits the scope of feasible course tasks to relatively simple ones, like implementing yet another tokenizer. When offering the course “Unstructured Information Management: Theory and Applications” (http://www.ukp.tu-darmstadt.de/teaching/ws0607/UIMseminar) as part of the B.A./M.A. program of International Studies in Computational Linguistics at the University of Tübingen, our motivation was to familiarize students with fundamental concepts in unstructured information management and Natural Language Processing (NLP) middleware. This should enable students of computational linguistics to work on more challenging tasks and to gain first experience with building complex software systems. The course goals were supported by providing basic preprocessing components, like a tokenizer or a PoS tagger, on the basis of the Unstructured Information Management Architecture (UIMA) (Ferrucci and Lally, 2004). Thus, students of computational linguistics can concentrate on their core competence and work on more challenging tasks, both in terms of theoretical complexity and industrial relevance. As a side effect, components developed in the course are robust and scalable, which enables re-use by the research community. UIMA allows us to shift the focus from software engineering to research-relevant tasks, like thorough evaluation of the projects.

Item type: Conference publication
Published: 2007
Author(s): Gurevych, Iryna ; Müller, Christof ; Zesch, Torsten
Entry type: Bibliography
Title: Teaching "Unstructured Information Management: Theory and Applications" to Computational Linguistics Students
Language: German
Year of publication: 2007
Book title: Proceedings of the First Workshop on Unstructured Information Management Architecture at Biannual Conference of the Society for Computational Linguistics and Language Technology
URL / URN: https://public.ukp.informatik.tu-darmstadt.de/UKP_Webpage/pu...

Department(s)/Field(s): 20 Department of Computer Science
20 Department of Computer Science > Telecooperation
20 Department of Computer Science > Ubiquitous Knowledge Processing
Date deposited: 31 Dec 2016 12:59
Last modified: 24 Jan 2020 12:03