Analyzing Formulaic Patterns in Historical Corpora

Moulin, Claudine and Gurevych, Iryna and Filatkina, Natalia and Eckart de Castilho, Richard
Gippert, Jost and Gehrke, Ralf (eds.) (2015):
Analyzing Formulaic Patterns in Historical Corpora.
In: Historical Corpora. Challenges and Perspectives., Narr Publishing House, pp. 51-64, [Online-Edition: http://narr-starter.de/magento/index.php/historical-corpora....],
[Book Section]

Abstract

This paper aims to point out a linguistic phenomenon that, due to the current stage of research, can be analysed only insufficiently with the help of an electronic text corpus. In this way, the paper adds a new aspect to the discussion about historical corpora by tackling the question of how they should be designed in order to be useful for linguistic research on so‐called formulaic patterns. The novelty of the question becomes apparent considering the fact that at present such historical corpora do not exist. In section 1, we define the term formulaic pattern because a clear understanding of this phenomenon is a prerequisite for collaborative research on it by historians of language and corpus and computer linguists. Section 2 gives a brief outline of the state of the art in the field of modern formulaic language within the framework of corpus and computer linguistics. Section 3 shows that some well-known problems in this area are exacerbated when applied to historical texts. Section 4 presents a possible solution that has been implemented by the HiFoS Researchers' Group at the University of Trier (Germany). Joint research efforts planned with UKP Lab at the TU Darmstadt (section 5) demonstrate that the restrictions posed by historical formulaic patterns are challenges to be overcome, rather than insurmountable obstacles.

Item Type: Book Section
Published: 2015
Editors: Gippert, Jost and Gehrke, Ralf
Creators: Moulin, Claudine and Gurevych, Iryna and Filatkina, Natalia and Eckart de Castilho, Richard
Title: Analyzing Formulaic Patterns in Historical Corpora
Language: English
Title of Book: Historical Corpora. Challenges and Perspectives.
Series Name: Corpus Linguistics and Interdisciplinary Perspectives on Language (CLIP)
Number: 5
Publisher: Narr Publishing House
ISBN: 978-3-8233-6922-6
Uncontrolled Keywords: UKP_reviewed;UKP_a_LangTech4eHum;UKP_a_LTDH
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
DFG-Graduiertenkollegs
DFG-Graduiertenkollegs > Research Training Group 1994 Adaptive Preparation of Information from Heterogeneous Sources
Date Deposited: 31 Dec 2016 14:29
Official URL: http://narr-starter.de/magento/index.php/historical-corpora....
Identification Number: TUD-CS-2012-0294