Distributed Multi-Speaker Voice Activity Detection for Wireless Acoustic Sensor Networks

Bahari, M. H. ; Hamaidi, L. K. ; Muma, M. ; Plata-Chaves, J. ; Moonen, M. ; Zoubir, A. M. ; Bertrand, A. (2017)
Distributed Multi-Speaker Voice Activity Detection for Wireless Acoustic Sensor Networks.
Report, Bibliography

Abstract

A distributed multi-speaker voice activity detection (DM-VAD) method for wireless acoustic sensor networks (WASNs) is proposed. DM-VAD is required in many signal processing applications, e.g., distributed speech enhancement based on multi-channel Wiener filtering, but no such method exists to date. The proposed method requires neither a fusion center nor prior knowledge of the node positions, the microphone array orientations, or the number of observed sources. It consists of two steps: (i) distributed source-specific energy signal unmixing and (ii) energy-signal-based voice activity detection. Existing computationally efficient methods for extracting source-specific energy signals from the mixed observations, e.g., multiplicative non-negative independent component analysis (MNICA), quickly lose performance as the number of sources increases, and they require a fusion center. To overcome these limitations, we introduce a distributed energy signal unmixing method based on a source-specific node clustering method that locates the nodes around each source. To determine the number of sources observed in the WASN, a source enumeration method that uses a Lasso-penalized Poisson generalized linear model is developed. Each identified cluster estimates the energy signal of a single (dominant) source by applying a two-component MNICA. The VAD problem is transformed into a clustering task by extracting features from the energy signals and applying K-means-type clustering algorithms. All steps of the proposed method are evaluated using numerical experiments. A VAD accuracy of >85% is achieved for a challenging scenario in which 20 nodes observe 7 sources in a simulated reverberant rectangular room.
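
The last step named in the abstract, treating VAD as a clustering task on features of an energy signal, can be illustrated with a minimal sketch. This is not the authors' implementation: the log-energy feature, the plain two-cluster K-means, the function name `kmeans_vad`, and the synthetic energy signal are all assumptions chosen for illustration, and the unmixing and source enumeration steps are not covered.

```python
import numpy as np

def kmeans_vad(energy, n_iter=50):
    """Label frames as active (True) or inactive (False) using 1-D 2-means.

    Hypothetical helper: the feature (log-energy) and the clustering
    variant are assumptions, not taken from the report.
    """
    # Log-energy feature; the small offset avoids log(0).
    feat = np.log(np.asarray(energy, dtype=float) + 1e-12)
    # Deterministic initialisation: one centre at each extreme.
    centers = np.array([feat.min(), feat.max()])
    for _ in range(n_iter):
        # Assign every frame to its nearest centre.
        labels = np.abs(feat[:, None] - centers[None, :]).argmin(axis=1)
        new_centers = centers.copy()
        for k in range(2):
            if np.any(labels == k):
                new_centers[k] = feat[labels == k].mean()
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # The cluster with the higher mean log-energy is taken as "speech".
    return labels == centers.argmax()

# Synthetic, made-up energy signal: frames 50..119 are active.
rng = np.random.default_rng(0)
active = np.zeros(200, dtype=bool)
active[50:120] = True
energy = np.where(active, 1.0, 0.01) * rng.uniform(0.5, 1.5, size=200)

vad = kmeans_vad(energy)
accuracy = float(np.mean(vad == active))
```

In this toy signal the two energy levels are well separated, so the two clusters coincide with the active and inactive frames. Real energy signals from a reverberant room would overlap far more, which is presumably why the report extracts dedicated features before clustering.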

Item type: Report
Published: 2017
Author(s): Bahari, M. H. ; Hamaidi, L. K. ; Muma, M. ; Plata-Chaves, J. ; Moonen, M. ; Zoubir, A. M. ; Bertrand, A.
Type of entry: Bibliography
Title: Distributed Multi-Speaker Voice Activity Detection for Wireless Acoustic Sensor Networks
Language: English
Date of publication: 16 March 2017
Publisher: arXiv
URL / URN: https://arxiv.org/abs/1703.05782
Department(s)/Field(s): 18 Department of Electrical Engineering and Information Technology
18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications
18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications > Robust Data Science
18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications > Signal Processing
Excellence Initiative
Excellence Initiative > Graduate Schools
Excellence Initiative > Graduate Schools > Graduate School of Computational Engineering (CE)
Date deposited: 20 Aug 2019 05:42
Last modified: 19 Dec 2024 08:55