On the rate of convergence of a classifier based on a Transformer encoder

Gurevych, Iryna ; Kohler, Michael ; Şahin, Gözde Gül (2022)
On the rate of convergence of a classifier based on a Transformer encoder.
In: IEEE Transactions on Information Theory, 68 (12)
doi: 10.1109/TIT.2022.3191747
Article, Bibliography

Abstract

Pattern recognition based on a high-dimensional predictor is considered. A classifier is defined which is based on a Transformer encoder. The rate of convergence of the misclassification probability of the classifier towards the optimal misclassification probability is analyzed. It is shown that this classifier is able to circumvent the curse of dimensionality provided the a posteriori probability satisfies a suitable hierarchical composition model. Furthermore, the difference between the Transformer classifiers theoretically analyzed in this paper and the ones used in practice today is illustrated by means of classification problems in natural language processing.

Item Type: Article
Published: 2022
Creators: Gurevych, Iryna ; Kohler, Michael ; Şahin, Gözde Gül
Type of entry: Bibliography
Title: On the rate of convergence of a classifier based on a Transformer encoder
Language: English
Date: 1 December 2022
Publisher: IEEE
Journal or Publication Title: IEEE Transactions on Information Theory
Volume of the journal: 68
Issue Number: 12
DOI: 10.1109/TIT.2022.3191747
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Date Deposited: 25 Jul 2022 11:27
Last Modified: 12 Jan 2023 07:47