Word recognition with a hierarchical network

Domont, Xavier ; Heckmann, Martin ; Wersing, Heiko ; Joublin, Frank ; Menzel, Stefan ; Sendhoff, Bernhard ; Goerick, Christian (2007)
Word recognition with a hierarchical network.
International Conference on Nonlinear Speech Processing 2007. Paris, France (22-25 May 2007)
doi: 10.1007/978-3-540-77347-4_11
Conference publication, bibliography

Abstract

In this paper we propose a feedforward neural network for syllable recognition. The core of the recognition system is based on a hierarchical architecture initially developed for visual object recognition. We show that, given the similarities between the primary auditory and visual cortexes, such a system can successfully be used for speech recognition. Syllables are used as basic units for the recognition. Their spectrograms, computed using a Gammatone filterbank, are interpreted as images and subsequently fed into the neural network after a preprocessing step that enhances the formant frequencies and normalizes the length of the syllables. The performance of our system has been analyzed on the recognition of 25 different monosyllabic words. The parameters of the architecture have been optimized using an evolutionary strategy. Compared to the Sphinx-4 speech recognition system, our system achieves better robustness and generalization capabilities in noisy conditions.
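The length-normalization step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the syllable length is normalized, so the linear interpolation along the time axis used here is an assumption, and the function name `normalize_length` and its parameters are hypothetical. The spectrogram is represented as a list of frequency channels, each holding one value per time frame.

```python
from typing import List


def normalize_length(spectrogram: List[List[float]],
                     target_frames: int) -> List[List[float]]:
    """Linearly resample each frequency channel (row) to target_frames columns.

    Assumption: linear interpolation in time; the paper's actual
    normalization method is not given in the abstract.
    """
    if target_frames < 1:
        raise ValueError("target_frames must be at least 1")
    resampled = []
    for row in spectrogram:
        n = len(row)
        if n == 0:
            raise ValueError("each channel needs at least one frame")
        new_row = []
        for j in range(target_frames):
            # Position of output frame j on the input time axis.
            if target_frames == 1:
                pos = 0.0
            else:
                pos = j * (n - 1) / (target_frames - 1)
            left = int(pos)
            right = min(left + 1, n - 1)
            frac = pos - left
            # Weighted average of the two neighbouring input frames.
            new_row.append((1.0 - frac) * row[left] + frac * row[right])
        resampled.append(new_row)
    return resampled
```

With this sketch, syllables of different durations map to spectrogram "images" of the same width, which is what a network with a fixed input size would require.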

Item type: Conference publication
Published: 2007
Author(s): Domont, Xavier ; Heckmann, Martin ; Wersing, Heiko ; Joublin, Frank ; Menzel, Stefan ; Sendhoff, Bernhard ; Goerick, Christian
Entry type: Bibliography
Title: Word recognition with a hierarchical network
Language: English
Year of publication: 2007
Publisher: Springer
Book title: Advances in Nonlinear Speech Processing - NOLISP 2007
Series: Lecture Notes in Computer Science
Series volume: 4885
Event title: International Conference on Nonlinear Speech Processing 2007
Event location: Paris, France
Event date: 22-25 May 2007
DOI: 10.1007/978-3-540-77347-4_11

Department(s)/field(s): 18 Department of Electrical Engineering and Information Technology
18 Department of Electrical Engineering and Information Technology > Institute of Automatic Control and Mechatronics
18 Department of Electrical Engineering and Information Technology > Institute of Automatic Control and Mechatronics > Control Methods and Robotics (renamed Control Methods and Intelligent Systems as of 1 Aug 2022)
Date deposited: 20 Nov 2008 08:28
Last modified: 21 Apr 2023 07:09