
An Evaluation of Efficient Multilabel Classification Algorithms for Large-Scale Problems in the Legal Domain

Loza Mencía, Eneldo ; Fürnkranz, Johannes
Eds.: Montemagni, Simonetta ; Tiscornia, Daniela ; Francesconi, Enrico ; Peters, Wim (2008)
An Evaluation of Efficient Multilabel Classification Algorithms for Large-Scale Problems in the Legal Domain.
Conference publication, Bibliography

Abstract

In this paper we evaluate the performance of multilabel classification algorithms on the EUR-Lex database of legal documents of the European Union. On the same set of underlying documents, we defined three different large-scale multilabel problems with up to 4000 classes. On these datasets, we compared three algorithms: (i) the well-known one-against-all approach (OAA); (ii) the multiclass multilabel perceptron algorithm (MMP), which modifies the OAA ensemble by respecting dependencies between the base classifiers in the training protocol of the classifier ensemble; and (iii) the multilabel pairwise perceptron algorithm (MLPP), which, unlike the other two algorithms, trains one base classifier for each pair of classes. All three algorithms use the simple but very efficient perceptron algorithm as the underlying classifier, which makes them well suited to large-scale multilabel classification problems. While previous work has already shown that the pairwise approach outperforms the other two in terms of predictive accuracy, its key problem is that it has to store one classifier for each pair of classes. The key contribution of this work is to demonstrate a novel technique that makes the pairwise approach feasible for problems with a large number of classes, such as those studied in this work. Our results on the EUR-Lex database illustrate the effectiveness of the pairwise approach and the efficiency of the MMP algorithm. We also show that it is feasible to efficiently and effectively handle very large multilabel problems.
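
To make the contrast between the one-against-all and the pairwise decomposition concrete, the following is a minimal Python sketch. It is not the authors' implementation: the function names, the toy data, the number of epochs, and the voting-based ranking are all assumptions made for illustration. MMP and the storage-reduction technique mentioned in the abstract are not shown, since the abstract does not describe them in detail.

```python
from itertools import combinations


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def perceptron_step(w, x, target):
    """Classic perceptron update: move w toward target * x on a mistake."""
    if target * dot(w, x) <= 0:
        return [wi + target * xi for wi, xi in zip(w, x)]
    return w


def train_oaa(X, Y, n_classes, epochs=10):
    """One-against-all: one perceptron per class (relevant vs. not relevant)."""
    W = [[0.0] * len(X[0]) for _ in range(n_classes)]
    for _ in range(epochs):
        for x, labels in zip(X, Y):
            for c in range(n_classes):
                W[c] = perceptron_step(W[c], x, 1 if c in labels else -1)
    return W


def train_pairwise(X, Y, n_classes, epochs=10):
    """Pairwise: one perceptron per class pair (a, b) with a < b.

    A pair is trained only on examples where exactly one of its two
    classes is relevant; +1 means "the first class of the pair wins".
    """
    W = {p: [0.0] * len(X[0]) for p in combinations(range(n_classes), 2)}
    for _ in range(epochs):
        for x, labels in zip(X, Y):
            for a in labels:
                for b in range(n_classes):
                    if b in labels:
                        continue
                    pair, target = ((a, b), 1) if a < b else ((b, a), -1)
                    W[pair] = perceptron_step(W[pair], x, target)
    return W


def rank_oaa(W, x):
    """Rank classes by the score of their own perceptron."""
    return sorted(range(len(W)), key=lambda c: -dot(W[c], x))


def rank_pairwise(W, x, n_classes):
    """Rank classes by the number of pairwise votes they receive."""
    votes = [0] * n_classes
    for (a, b), w in W.items():
        votes[a if dot(w, x) > 0 else b] += 1
    return sorted(range(n_classes), key=lambda c: -votes[c])


def n_pairwise_classifiers(n_classes):
    """Number of base classifiers the pairwise approach has to store."""
    return n_classes * (n_classes - 1) // 2


# Hypothetical toy data: 3 indicator features plus a constant bias feature.
X = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1], [1, 1, 0, 1]]
Y = [{0}, {1}, {2}, {0, 1}]  # label sets; the last example has two labels

oaa = train_oaa(X, Y, n_classes=3)
pairwise = train_pairwise(X, Y, n_classes=3)
print(rank_oaa(oaa, [0, 0, 1, 1]))
print(rank_pairwise(pairwise, [0, 0, 1, 1], 3))
print(n_pairwise_classifiers(4000))
```

The last line illustrates the storage problem named in the abstract: with 4000 classes, one-against-all keeps 4000 perceptrons, whereas a naive pairwise ensemble keeps 4000 · 3999 / 2 = 7,998,000.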

Item type: Conference publication
Published: 2008
Editors: Montemagni, Simonetta ; Tiscornia, Daniela ; Francesconi, Enrico ; Peters, Wim
Author(s): Loza Mencía, Eneldo ; Fürnkranz, Johannes
Entry type: Bibliography
Title: An Evaluation of Efficient Multilabel Classification Algorithms for Large-Scale Problems in the Legal Domain
Language: English
Year of publication: 2008
Book title: Proceedings of the LREC 2008 Workshop on Semantic Processing of Legal Texts
URL / URN: http://www.ke.informatik.tu-darmstadt.de/publications/papers...

Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Knowledge Engineering
Date deposited: 24 Jun 2011 15:08
Last modified: 03 Jun 2018 21:24