Braun, Alina (2021)
In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates.
Technische Universität Darmstadt
doi: 10.26083/tuprints-00019052
Dissertation, First Publication, Publisher's Version
Abstract
In theory, recent results in nonparametric regression show that neural network estimates achieve good rates of convergence, provided suitable assumptions on the structure of the regression function are imposed. However, these theoretical analyses cannot explain the practical success of neural networks: the theoretically studied estimates are defined by minimizing the empirical L_2 risk over a class of neural networks, and in practice solving this kind of minimization problem is not feasible. Consequently, the neural networks examined in theory cannot be implemented as they are defined. This means that the neural networks used in applications differ from the ones that are analyzed theoretically.
In this thesis we narrow the gap between theory and practice. We deal with neural network regression estimates for (p,C)-smooth regression functions m that satisfy a projection pursuit model. We construct three implementable neural network estimates and show that each of them achieves, up to a logarithmic factor, the optimal univariate rate of convergence.
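For orientation, the setting can be written out as follows. This is a sketch in standard notation, not a quotation from the thesis: the symbols K, g_r, a_r, m_n, c and the unspecified exponent \gamma of the logarithmic factor are assumptions introduced here. The rate n^{-2p/(2p+1)} is the classical optimal rate for univariate (p,C)-smooth regression functions.

```latex
% Projection pursuit model (notation assumed): a sum of K univariate
% (p,C)-smooth functions g_r applied to the projections a_r^T x.
\[
  m(x) = \sum_{r=1}^{K} g_r\!\left(a_r^{\top} x\right),
  \qquad x \in \mathbb{R}^d, \quad a_r \in \mathbb{R}^d .
\]
% Type of result described in the abstract: the optimal univariate rate
% up to a logarithmic factor (\gamma \ge 0 is left unspecified here).
\[
  \mathbf{E} \int \left| m_n(x) - m(x) \right|^2 \, \mathbf{P}_X(dx)
  \;\le\; c \cdot (\log n)^{\gamma} \cdot n^{-\frac{2p}{2p+1}} .
\]
```

The exponent does not involve the dimension d, which is the sense in which such a bound avoids the curse of dimensionality.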
Firstly, for univariate regression functions with p contained in [1/2,1] we construct a neural network estimate with one hidden layer where the weights are learned via gradient descent. The starting weights are chosen randomly from an interval, independently of the data. The interval is large enough to guarantee that the estimate is close to a piecewise constant approximation.
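The first construction can be illustrated with a minimal sketch, assuming a logistic activation, uniformly drawn starting weights, outer weights starting at zero, and plain full-batch gradient descent on the empirical L_2 risk. The names `fit_by_gradient_descent`, `init_range`, `lr`, `steps` and `n_hidden`, and all default values, are hypothetical and not taken from the thesis. The code uses only the Python standard library.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def empirical_l2_risk(params, xs, ys):
    # Mean squared error of the one-hidden-layer network on the sample.
    w, b, c, c0 = params
    total = 0.0
    for x, y in zip(xs, ys):
        pred = c0 + sum(ck * sigmoid(wk * x + bk) for wk, bk, ck in zip(w, b, c))
        total += (pred - y) ** 2
    return total / len(xs)


def fit_by_gradient_descent(xs, ys, n_hidden=10, init_range=30.0,
                            lr=0.02, steps=300, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    # Starting inner weights: uniform on an interval, independent of the data.
    # A wide interval makes many units steep, i.e. close to step functions.
    w = [rng.uniform(-init_range, init_range) for _ in range(n_hidden)]
    b = [rng.uniform(-init_range, init_range) for _ in range(n_hidden)]
    c = [0.0] * n_hidden  # outer weights (assumed to start at zero)
    c0 = 0.0              # outer bias
    risks = [empirical_l2_risk((w, b, c, c0), xs, ys)]
    for _ in range(steps):
        gw = [0.0] * n_hidden
        gb = [0.0] * n_hidden
        gc = [0.0] * n_hidden
        gc0 = 0.0
        for x, y in zip(xs, ys):
            h = [sigmoid(wk * x + bk) for wk, bk in zip(w, b)]
            resid = c0 + sum(ck * hk for ck, hk in zip(c, h)) - y
            gc0 += 2.0 * resid / n
            for k in range(n_hidden):
                dh = h[k] * (1.0 - h[k])  # derivative of the logistic function
                gc[k] += 2.0 * resid * h[k] / n
                gb[k] += 2.0 * resid * c[k] * dh / n
                gw[k] += 2.0 * resid * c[k] * dh * x / n
        # One gradient descent step on all weights simultaneously.
        w = [wk - lr * g for wk, g in zip(w, gw)]
        b = [bk - lr * g for bk, g in zip(b, gb)]
        c = [ck - lr * g for ck, g in zip(c, gc)]
        c0 -= lr * gc0
        risks.append(empirical_l2_risk((w, b, c, c0), xs, ys))
    return (w, b, c, c0), risks
```

Calling `fit_by_gradient_descent(xs, ys)` on lists of scalar inputs and responses returns the fitted weights together with the empirical risk after each step, which makes it easy to check that the descent actually reduces the risk.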
Secondly, for multivariate regression functions with p contained in (0,1] we construct a neural network estimate with one hidden layer where the weights are learned via gradient descent. The initial weights are chosen from specific intervals that depend on the data and the projection directions. This choice guarantees that the estimate is close to a piecewise constant approximation. The projection directions are repeatedly chosen at random.
Lastly, for multivariate regression functions with p>0 we construct a multilayer neural network estimate. The values of the inner weights are prescribed, depending on the projection directions, by a new approximation result for a projection pursuit model by piecewise polynomials. The outer weights are chosen by solving a system of linear equations. The projection directions are repeatedly chosen at random.
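The idea of fixing the inner weights, obtaining the outer weights from a linear system, and repeating this for randomly drawn projection directions can be sketched as follows. This is a heavily simplified illustration and not the estimate from the thesis: it uses a single projection direction and one layer of steep logistic units instead of the multilayer piecewise polynomial construction, a small ridge term to keep the system well-posed, and a held-out half of the sample to pick the best direction. All names and defaults (`fit_with_random_directions`, `n_knots`, `slope`, `ridge`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def random_direction(d, rng):
    # Uniformly distributed unit vector in R^d.
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(t * t for t in v))
    return [t / norm for t in v]


def features(x, a, knots, slope):
    # Fixed inner weights: steep logistic units along the projection a^T x.
    t = sum(ai * xi for ai, xi in zip(a, x))
    return [1.0] + [sigmoid(slope * (t - k)) for k in knots]


def solve_linear_system(matrix, rhs):
    # Gaussian elimination with partial pivoting, then back substitution.
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[r][j] -= f * aug[col][j]
    sol = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = s / aug[i][i]
    return sol


def fit_outer_weights(xs, ys, a, knots, slope=20.0, ridge=1e-6):
    # Outer weights solve the (ridge-regularized) normal equations.
    rows = [features(x, a, knots, slope) for x in xs]
    n, k = len(rows), len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) / n + (ridge if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * y for r, y in zip(rows, ys)) / n for i in range(k)]
    return solve_linear_system(gram, rhs)


def risk(xs, ys, a, knots, c, slope=20.0):
    # Empirical L_2 risk of the fitted ridge function on a sample.
    total = 0.0
    for x, y in zip(xs, ys):
        pred = sum(ci * fi for ci, fi in zip(c, features(x, a, knots, slope)))
        total += (pred - y) ** 2
    return total / len(xs)


def fit_with_random_directions(xs, ys, n_directions=20, n_knots=8, seed=0):
    rng = random.Random(seed)
    d, half = len(xs[0]), len(xs) // 2
    knots = [-1.0 + 2.0 * (i + 0.5) / n_knots for i in range(n_knots)]
    best = None
    for _ in range(n_directions):
        a = random_direction(d, rng)
        c = fit_outer_weights(xs[:half], ys[:half], a, knots)
        r = risk(xs[half:], ys[half:], a, knots, c)  # held-out risk
        if best is None or r < best[0]:
            best = (r, a, c)
    return best  # (held-out risk, direction, outer weights)
```

Because the inner weights are fixed for each direction, no iterative optimization is needed: each candidate costs one small linear solve, and repeating the random draw trades computation for a better direction.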
Since we are able to show a rate of convergence that is independent of the dimension of the data, our second and third estimates are able to circumvent the curse of dimensionality.
Item type: | Dissertation |
---|---|
Published: | 2021 |
Author(s): | Braun, Alina |
Type of entry: | First publication |
Title: | In Theory and Practice - On the Rate of Convergence of Implementable Neural Network Regression Estimates |
Language: | English |
Referees: | Kohler, Prof. Dr. Michael ; Betz, Prof. Dr. Volker |
Year of publication: | 2021 |
Place: | Darmstadt |
Collation: | x, 219 pages |
Date of oral examination: | 25 June 2021 |
DOI: | 10.26083/tuprints-00019052 |
URL / URN: | https://tuprints.ulb.tu-darmstadt.de/19052 |
Status: | Publisher's version |
URN: | urn:nbn:de:tuda-tuprints-190528 |
Dewey Decimal Classification (DDC) subject group: | 500 Science and mathematics > 510 Mathematics |
Department(s)/field(s): | 04 Department of Mathematics; 04 Department of Mathematics > Stochastics |
Date deposited: | 11 Aug 2021 08:47 |
Last modified: | 16 Aug 2021 07:42 |