Word recognition with a hierarchical network

Domont, Xavier ; Heckmann, Martin ; Wersing, Heiko ; Joublin, Frank ; Menzel, Stefan ; Sendhoff, Bernhard ; Goerick, Christian (2007)
Word recognition with a hierarchical network.
International Conference on Nonlinear Speech Processing 2007. Paris, France (22.-25.05.2007)
doi: 10.1007/978-3-540-77347-4_11
Conference or Workshop Item, Bibliography

Abstract

In this paper we propose a feedforward neural network for syllable recognition. The core of the recognition system is based on a hierarchical architecture initially developed for visual object recognition. We show that, given the similarities between the primary auditory and visual cortexes, such a system can successfully be used for speech recognition. Syllables are used as the basic units for recognition. Their spectrograms, computed using a Gammatone filterbank, are interpreted as images and subsequently fed into the neural network after a preprocessing step that enhances the formant frequencies and normalizes the length of the syllables. The performance of our system has been analyzed on the recognition of 25 different monosyllabic words. The parameters of the architecture have been optimized using an evolutionary strategy. Compared to the Sphinx-4 speech recognition system, our system achieves better robustness and generalization capabilities in noisy conditions.

Item Type: Conference or Workshop Item
Published: 2007
Creators: Domont, Xavier ; Heckmann, Martin ; Wersing, Heiko ; Joublin, Frank ; Menzel, Stefan ; Sendhoff, Bernhard ; Goerick, Christian
Type of entry: Bibliography
Title: Word recognition with a hierarchical network
Language: English
Date: 2007
Publisher: Springer
Book Title: Advances in Nonlinear Speech Processing - NOLISP 2007
Series: Lecture Notes in Computer Science
Series Volume: 4885
Event Title: International Conference on Nonlinear Speech Processing 2007
Event Location: Paris, France
Event Dates: 22.-25.05.2007
DOI: 10.1007/978-3-540-77347-4_11
Divisions: 18 Department of Electrical Engineering and Information Technology
18 Department of Electrical Engineering and Information Technology > Institut für Automatisierungstechnik und Mechatronik
18 Department of Electrical Engineering and Information Technology > Institut für Automatisierungstechnik und Mechatronik > Control Methods and Robotics (from 01.08.2022 renamed Control Methods and Intelligent Systems)
Date Deposited: 20 Nov 2008 08:28
Last Modified: 21 Apr 2023 07:09