Machkour, J. ; Muma, M. ; Palomar, D. P. (2021)
The Terminating-Knockoff Filter: Fast High-Dimensional Variable Selection with False Discovery Rate Control.
doi: 10.48550/arXiv.2110.06048
Report, Bibliography
Abstract
We propose the Terminating-Knockoff (T-Knock) filter, a fast variable selection method for high-dimensional data. The T-Knock filter controls a user-defined target false discovery rate (FDR) while maximizing the number of selected true positives. This is achieved by fusing the solutions of multiple early-terminated random experiments. The experiments are conducted on a combination of the original data and multiple sets of randomly generated knockoff variables. A finite-sample proof based on martingale theory for the FDR control property is provided. Numerical simulations show that the FDR is controlled at the target level while allowing for a high power. We prove under mild conditions that the knockoffs can be sampled from any univariate distribution. The computational complexity of the proposed method is derived, and it is demonstrated via numerical simulations that the sequential computation time is multiple orders of magnitude lower than that of the strongest benchmark methods in sparse high-dimensional settings. The T-Knock filter outperforms state-of-the-art methods for FDR control on a simulated genome-wide association study (GWAS), while its computation time is more than two orders of magnitude lower than that of the strongest benchmark methods.
Item type: | Report |
---|---|
Published: | 2021 |
Author(s): | Machkour, J. ; Muma, M. ; Palomar, D. P. |
Type of entry: | Bibliography |
Title: | The Terminating-Knockoff Filter: Fast High-Dimensional Variable Selection with False Discovery Rate Control |
Language: | English |
Publication date: | 12 October 2021 |
Publisher: | arXiv |
Series: | Methodology |
Edition: | Version 1 |
DOI: | 10.48550/arXiv.2110.06048 |
URL / URN: | https://arxiv.org/abs/2110.06048v1 |
Additional information: | Title changed as of version 5 |
Department(s)/field(s): | 18 Fachbereich Elektrotechnik und Informationstechnik; 18 Fachbereich Elektrotechnik und Informationstechnik > Institut für Nachrichtentechnik; 18 Fachbereich Elektrotechnik und Informationstechnik > Institut für Nachrichtentechnik > Robust Data Science; 18 Fachbereich Elektrotechnik und Informationstechnik > Institut für Nachrichtentechnik > Signalverarbeitung; LOEWE; LOEWE > LOEWE-Zentren; LOEWE > LOEWE-Zentren > emergenCITY; Zentrale Einrichtungen; Zentrale Einrichtungen > Hochschulrechenzentrum (HRZ); Zentrale Einrichtungen > Hochschulrechenzentrum (HRZ) > Hochleistungsrechner |
Date deposited: | 25 Oct 2021 05:36 |
Last modified: | 17 Apr 2024 11:50 |