Domont, Xavier ; Heckmann, Martin ; Joublin, Frank ; Goerick, Christian (2008)
Hierarchical Spectro-Temporal Features for Robust Speech Recognition.
2008 IEEE International Conference on Acoustics, Speech, and Signal Processing. Las Vegas, USA (30.03.2008-04.04.2008)
doi: 10.1109/ICASSP.2008.4518635
Conference publication, Bibliography
Abstract
Previously we presented an auditory-inspired feed-forward architecture which achieves good performance in noisy conditions on a segmented word recognition task. In this paper we propose to use a modified version of this hierarchical model to generate features for standard hidden Markov models. To obtain these features we first compute the spectrograms using a Gammatone filterbank. Filtering over the channels enhances the formant frequencies, which are afterwards detected using Gabor-like receptive fields. The responses of the receptive fields are then combined into complex features which span the whole frequency range and extend over three different time windows. The features have been evaluated on a single digit recognition task. The results show that their combination with MFCCs or RASTA features yields improved recognition scores in noise.
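
The detection step named in the abstract, Gabor-like receptive fields applied to a spectrogram, can be illustrated with a minimal sketch. This is not the authors' implementation: the kernel size, wavelength, orientation, and width are hypothetical values chosen for illustration, and a synthetic array with one steady ridge stands in for a Gammatone spectrogram.

```python
import numpy as np


def gabor_kernel(size=9, wavelength=4.0, theta=0.0, sigma=2.0):
    """Real 2D Gabor kernel: a Gaussian envelope times a cosine carrier.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    half = size // 2
    # Rows index frequency channels, columns index time frames.
    f, t = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Coordinate along the direction in which the carrier oscillates.
    u = t * np.cos(theta) + f * np.sin(theta)
    envelope = np.exp(-(t**2 + f**2) / (2.0 * sigma**2))
    kernel = envelope * np.cos(2.0 * np.pi * u / wavelength)
    # Subtract the mean so that flat regions produce no response.
    return kernel - kernel.mean()


def correlate2d_valid(image, kernel):
    """Plain 'valid' 2D cross-correlation, written out with NumPy only."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


# Toy stand-in for a spectrogram: 32 channels x 40 frames with one
# steady ridge (a formant-like track) in channel 16.
spectrogram = np.zeros((32, 40))
spectrogram[16, :] = 1.0

# theta = pi/2 makes the carrier vary along frequency, so the kernel
# responds to ridges that are constant over time.
kernel = gabor_kernel(theta=np.pi / 2)
response = correlate2d_valid(spectrogram, kernel)

# Output row of the strongest response in the first output column.
peak_row = int(np.argmax(response[:, 0]))
```

With a "valid" correlation the output is shifted by half the kernel size, so adding `kernel.shape[0] // 2` to `peak_row` gives the corresponding spectrogram channel.
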
Entry type: | Conference publication |
---|---|
Published: | 2008 |
Author(s): | Domont, Xavier ; Heckmann, Martin ; Joublin, Frank ; Goerick, Christian |
Type of record: | Bibliography |
Title: | Hierarchical Spectro-Temporal Features for Robust Speech Recognition |
Language: | English |
Publication date: | 12 May 2008 |
Publisher: | IEEE |
Book title: | 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing: Proceedings |
Event title: | 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing |
Event location: | Las Vegas, USA |
Event date: | 30.03.2008-04.04.2008 |
DOI: | 10.1109/ICASSP.2008.4518635 |
Additional information: | Print ISBN: 978-1-4244-1483-3 |
Department(s)/field(s): | 18 Department of Electrical Engineering and Information Technology; 18 Department of Electrical Engineering and Information Technology > Institute of Automatic Control and Mechatronics; 18 Department of Electrical Engineering and Information Technology > Institute of Automatic Control and Mechatronics > Control Methods and Robotics (renamed Control Methods and Intelligent Systems as of 01.08.2022) |
Date deposited: | 16 Aug 2010 14:32 |
Last modified: | 02 May 2023 11:41 |