Schnall, Andrea ; Heckmann, Martin (2016)
Comparing speaker independent and speaker adapted classification for word prominence detection.
In: 2016 IEEE Spoken Language Technology Workshop (SLT)
doi: 10.1109/SLT.2016.7846271
Book Section, Bibliography
Abstract
Prosodic cues are an important part of human communication. One of these cues is word prominence, which is used, for example, to highlight important information. Since individual speakers express prominence in different ways, it is not easily extracted and incorporated into a dialog system. As a consequence, prominence has so far played only a marginal role in human-machine communication. In this paper we compare DNNs and SVMs trained speaker independently with the results of SVM classification using a speaker adaptation method we recently developed. This adaptation method is based on the radial basis function of the SVM with a Gaussian regularization, which is derived from fMLLR. With this adaptation, we can notably reduce the problem of speaker variations. We present detailed evaluations of the methods and discuss advantages and shortcomings of the proposed approaches for word prominence detection.
Item Type: | Book Section |
---|---|
Published: | 2016 |
Author(s): | Schnall, Andrea ; Heckmann, Martin |
Type of Entry: | Bibliography |
Title: | Comparing speaker independent and speaker adapted classification for word prominence detection |
Language: | English |
Year of Publication: | 2016 |
Place of Publication: | San Diego, California, USA |
Book Title: | 2016 IEEE Spoken Language Technology Workshop (SLT) |
DOI: | 10.1109/SLT.2016.7846271 |
Additional Information: | Date of Conference: 13-16 December 2016 |
Department(s)/Field(s): | 18 Department of Electrical Engineering and Information Technology; 18 Department of Electrical Engineering and Information Technology > Institute of Automatic Control and Mechatronics; 18 Department of Electrical Engineering and Information Technology > Institute of Automatic Control and Mechatronics > Control Methods and Robotics (renamed Control Methods and Intelligent Systems as of 01.08.2022) |
Date Deposited: | 25 Nov 2016 15:55 |
Last Modified: | 29 May 2024 09:25 |