Domont, Xavier ; Heckmann, Martin ; Wersing, Heiko ; Joublin, Frank ; Menzel, Stefan ; Sendhoff, Bernhard ; Goerick, Christian (2007)
Word recognition with a hierarchical network.
International Conference on Nonlinear Speech Processing 2007, Paris, France (22-25 May 2007)
doi: 10.1007/978-3-540-77347-4_11
Conference or Workshop Item, Bibliography
Abstract
In this paper we propose a feedforward neural network for syllable recognition. The core of the recognition system is based on a hierarchical architecture initially developed for visual object recognition. We show that, given the similarities between the primary auditory and visual cortices, such a system can successfully be used for speech recognition. Syllables are used as the basic units of recognition. Their spectrograms, computed using a Gammatone filterbank, are interpreted as images and subsequently fed into the neural network after a preprocessing step that enhances the formant frequencies and normalizes the length of the syllables. The performance of our system has been analyzed on the recognition of 25 different monosyllabic words. The parameters of the architecture have been optimized using an evolutionary strategy. Compared to the Sphinx-4 speech recognition system, our system achieves better robustness and generalization capabilities in noisy conditions.
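The abstract mentions that syllable spectrograms are length-normalized before being treated as images, but it does not say how. As an illustration only, the sketch below resamples each frequency channel to a fixed number of frames by linear interpolation. The interpolation scheme, the function name `normalize_length`, and the channel-by-frame list layout are assumptions made for this example, not the authors' documented method.

```python
from typing import List


def normalize_length(spectrogram: List[List[float]], target_frames: int) -> List[List[float]]:
    """Resample every frequency channel to target_frames frames.

    Assumption: plain linear interpolation along the time axis. The record
    does not state which normalization the paper actually uses.
    """
    if target_frames < 1:
        raise ValueError("target_frames must be at least 1")
    resampled = []
    for channel in spectrogram:
        n = len(channel)
        if n == 0:
            raise ValueError("every channel needs at least one frame")
        row = []
        for j in range(target_frames):
            # Map output frame j onto the input time axis [0, n - 1].
            pos = j * (n - 1) / (target_frames - 1) if target_frames > 1 else 0.0
            lo = int(pos)
            hi = min(lo + 1, n - 1)
            frac = pos - lo
            row.append((1.0 - frac) * channel[lo] + frac * channel[hi])
        resampled.append(row)
    return resampled


# A one-channel toy "spectrogram" with 3 frames, stretched to 5 frames.
print(normalize_length([[0.0, 2.0, 4.0]], 5))  # [[0.0, 1.0, 2.0, 3.0, 4.0]]
```

After a step of this kind, syllables of different durations share the same image width, which is what a fixed-size feedforward network requires of its input.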
Item Type: | Conference or Workshop Item |
---|---|
Published: | 2007 |
Creators: | Domont, Xavier ; Heckmann, Martin ; Wersing, Heiko ; Joublin, Frank ; Menzel, Stefan ; Sendhoff, Bernhard ; Goerick, Christian |
Type of entry: | Bibliography |
Title: | Word recognition with a hierarchical network |
Language: | English |
Date: | 2007 |
Publisher: | Springer |
Book Title: | Advances in Nonlinear Speech Processing - NOLISP 2007 |
Series: | Lecture Notes in Computer Science |
Series Volume: | 4885 |
Event Title: | International Conference on Nonlinear Speech Processing 2007 |
Event Location: | Paris, France |
Event Dates: | 22-25 May 2007 |
DOI: | 10.1007/978-3-540-77347-4_11 |
Divisions: | 18 Department of Electrical Engineering and Information Technology ; 18 Department of Electrical Engineering and Information Technology > Institut für Automatisierungstechnik und Mechatronik ; 18 Department of Electrical Engineering and Information Technology > Institut für Automatisierungstechnik und Mechatronik > Control Methods and Robotics (renamed Control Methods and Intelligent Systems from 01.08.2022) |
Date Deposited: | 20 Nov 2008 08:28 |
Last Modified: | 21 Apr 2023 07:09 |