The Terminating-Knockoff Filter: Fast High-Dimensional Variable Selection with False Discovery Rate Control

Machkour, J. ; Muma, M. ; Palomar, D. P. (2021)
The Terminating-Knockoff Filter: Fast High-Dimensional Variable Selection with False Discovery Rate Control.
doi: 10.48550/arXiv.2110.06048
Report, Bibliography

Abstract

We propose the Terminating-Knockoff (T-Knock) filter, a fast variable selection method for high-dimensional data. The T-Knock filter controls a user-defined target false discovery rate (FDR) while maximizing the number of selected true positives. This is achieved by fusing the solutions of multiple early terminated random experiments. The experiments are conducted on a combination of the original data and multiple sets of randomly generated knockoff variables. A finite sample proof based on martingale theory for the FDR control property is provided. Numerical simulations show that the FDR is controlled at the target level while allowing for a high power. We prove under mild conditions that the knockoffs can be sampled from any univariate distribution. The computational complexity of the proposed method is derived and it is demonstrated via numerical simulations that the sequential computation time is multiple orders of magnitude lower than that of the strongest benchmark methods in sparse high-dimensional settings. The T-Knock filter outperforms state-of-the-art methods for FDR control on a simulated genome-wide association study (GWAS), while its computation time is more than two orders of magnitude lower than that of the strongest benchmark methods.
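
The abstract evaluates the method by two quantities: the false discovery rate (FDR) and the power. As a minimal sketch of how these are commonly estimated in simulations, the function below computes the false discovery proportion (FDP) and true positive proportion (TPP) of a single selection result; averaging them over many simulation runs gives empirical estimates of the FDR and the power. The function name `fdp_and_tpp` and its arguments are hypothetical and are not taken from the report; only the standard definitions of these quantities are assumed.

```python
def fdp_and_tpp(selected, true_support):
    """Return (FDP, TPP) for one variable-selection result.

    selected:     indices of variables chosen by some selection method
    true_support: indices of the truly active variables (known in simulations)

    Hypothetical helper for illustration; not the authors' code.
    """
    selected = set(selected)
    true_support = set(true_support)

    false_positives = len(selected - true_support)
    true_positives = len(selected & true_support)

    # Convention: both proportions are 0 when their denominator is empty.
    fdp = false_positives / max(len(selected), 1)
    tpp = true_positives / max(len(true_support), 1)
    return fdp, tpp


# One selected variable (7) is a false discovery; 3 of 5 true variables are found.
fdp, tpp = fdp_and_tpp(selected=[0, 1, 2, 7], true_support=[0, 1, 2, 3, 4])
print(fdp, tpp)  # 0.25 0.6
```

A method controls the FDR at a target level such as 0.1 if the expected FDP does not exceed that level.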

Item type: Report
Published: 2021
Author(s): Machkour, J. ; Muma, M. ; Palomar, D. P.
Entry type: Bibliography
Title: The Terminating-Knockoff Filter: Fast High-Dimensional Variable Selection with False Discovery Rate Control
Language: English
Date of publication: 12 October 2021
Publisher: arXiv
Series: Methodology
Edition: Version 1
DOI: 10.48550/arXiv.2110.06048
URL / URN: https://arxiv.org/abs/2110.06048v1
Additional information:

Title changed as of version 5

Department(s)/field(s): 18 Department of Electrical Engineering and Information Technology
18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications
18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications > Robust Data Science
18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications > Signal Processing
LOEWE
LOEWE > LOEWE Centres
LOEWE > LOEWE Centres > emergenCITY
Central Facilities
Central Facilities > University Computing Centre (HRZ)
Central Facilities > University Computing Centre (HRZ) > High-Performance Computer
Date deposited: 25 Oct 2021 05:36
Last modified: 17 Apr 2024 11:50