Halvani, Oren ; Winter, Christian ; Pflug, Anika (2016)
Authorship Verification for Different Languages, Genres and Topics.
In: Digital Investigation, 16 (supplement)
doi: 10.1016/j.diin.2016.01.006
Article, Bibliography
Abstract
Authorship verification is a branch of forensic authorship analysis addressing the following task: given a number of sample documents of an author A and a document allegedly written by A, decide whether the author of the latter document is truly A or not. We present a scalable authorship verification method that copes with this problem across different languages, genres and topics. The central concept of our method is a model that is trained on Dutch, English, Greek, Spanish and German text documents. For each language, the model sets specific parameters and a threshold that accepts or rejects the alleged author as A. The proposed method offers a wide range of benefits, e.g., a universal (static) threshold for each language and scalability regarding almost any involved component (classification function, ensemble strategy, features, etc.). Furthermore, the method has a low runtime because it involves neither natural language processing techniques nor other computationally intensive methods. In our experiments, we applied the method to 28 test corpora comprising 4525 verification cases across 16 genres and a large number of mixed topics, where we achieved competitive results (75% median accuracy). With these results we were able to outperform two state-of-the-art baselines, given the same training and test corpora.
Item type: | Article |
---|---|
Published: | 2016 |
Author(s): | Halvani, Oren ; Winter, Christian ; Pflug, Anika |
Entry type: | Bibliography |
Title: | Authorship Verification for Different Languages, Genres and Topics |
Language: | English |
Year of publication: | 2016 |
Title of journal, newspaper or series: | Digital Investigation |
Volume: | 16 |
Issue number: | supplement |
Event location: | Lausanne, Switzerland |
DOI: | 10.1016/j.diin.2016.01.006 |
Uncontrolled keywords: | Secure Data; Digital text forensics; Intrinsic authorship verification; One-class-classification; Cross-genre; Cross-topic |
ID number: | TUD-CS-2016-0183 |
Department(s)/field(s): | LOEWE > LOEWE-Zentren > CASED – Center for Advanced Security Research Darmstadt; LOEWE > LOEWE-Zentren; LOEWE |
Date deposited: | 30 Dec 2016 20:23 |
Last modified: | 17 May 2018 13:02 |