On the rate of convergence of a classifier based on a Transformer encoder

Gurevych, Iryna ; Kohler, Michael ; Şahin, Gözde Gül (2022)
On the rate of convergence of a classifier based on a Transformer encoder.
In: IEEE Transactions on Information Theory, 68 (12)
doi: 10.1109/TIT.2022.3191747
Article, Bibliography

Abstract

Pattern recognition based on a high-dimensional predictor is considered. A classifier is defined which is based on a Transformer encoder. The rate of convergence of the misclassification probability of the classifier towards the optimal misclassification probability is analyzed. It is shown that this classifier is able to circumvent the curse of dimensionality provided the a posteriori probability satisfies a suitable hierarchical composition model. Furthermore, the difference between the Transformer classifiers theoretically analyzed in this paper and the ones used in practice today is illustrated by means of classification problems in natural language processing.

Entry type: Article
Published: 2022
Author(s): Gurevych, Iryna ; Kohler, Michael ; Şahin, Gözde Gül
Record type: Bibliography
Title: On the rate of convergence of a classifier based on a Transformer encoder
Language: English
Publication date: 1 December 2022
Publisher: IEEE
Title of journal, newspaper, or series: IEEE Transactions on Information Theory
Volume: 68
Issue number: 12
DOI: 10.1109/TIT.2022.3191747
Department(s)/Research area(s): 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Date deposited: 25 Jul 2022 11:27
Last modified: 12 Jan 2023 07:47