Domont, Xavier ; Heckmann, Martin ; Wersing, Heiko ; Joublin, Frank ; Menzel, Stefan ; Sendhoff, Bernhard ; Goerick, Christian (2007)
Word recognition with a hierarchical network.
International Conference on Nonlinear Speech Processing 2007. Paris, France (22.05.2007-25.05.2007)
doi: 10.1007/978-3-540-77347-4_11
Conference publication, bibliography
Abstract
In this paper we propose a feedforward neural network for syllable recognition. The core of the recognition system is based on a hierarchical architecture initially developed for visual object recognition. We show that, given the similarities between the primary auditory and visual cortexes, such a system can successfully be used for speech recognition. Syllables are used as basic units for the recognition. Their spectrograms, computed using a Gammatone filterbank, are interpreted as images and subsequently fed into the neural network after a preprocessing step that enhances the formant frequencies and normalizes the length of the syllables. The performance of our system has been analyzed on the recognition of 25 different monosyllabic words. The parameters of the architecture have been optimized using an evolutionary strategy. Compared to the Sphinx-4 speech recognition system, our system achieves better robustness and generalization capabilities in noisy conditions.
Item type: | Conference publication |
---|---|
Published: | 2007 |
Author(s): | Domont, Xavier ; Heckmann, Martin ; Wersing, Heiko ; Joublin, Frank ; Menzel, Stefan ; Sendhoff, Bernhard ; Goerick, Christian |
Entry type: | Bibliography |
Title: | Word recognition with a hierarchical network |
Language: | English |
Year of publication: | 2007 |
Publisher: | Springer |
Book title: | Advances in Nonlinear Speech Processing - NOLISP 2007 |
Series: | Lecture Notes in Computer Science |
Series volume: | 4885 |
Event title: | International Conference on Nonlinear Speech Processing 2007 |
Event location: | Paris, France |
Event date: | 22.05.2007-25.05.2007 |
DOI: | 10.1007/978-3-540-77347-4_11 |
Department(s)/field(s): | 18 Fachbereich Elektrotechnik und Informationstechnik; 18 Fachbereich Elektrotechnik und Informationstechnik > Institut für Automatisierungstechnik und Mechatronik; 18 Fachbereich Elektrotechnik und Informationstechnik > Institut für Automatisierungstechnik und Mechatronik > Regelungsmethoden und Robotik (renamed Regelungsmethoden und Intelligente Systeme as of 01.08.2022) |
Date deposited: | 20 Nov 2008 08:28 |
Last modified: | 21 Apr 2023 07:09 |