Knockpy

Powerful knockoffs via minimizing reconstructability. Model-X knockoffs ( extit{J. R. Stat. Soc. Ser. B. Stat. Methodol.} 80 (2018) 551--577) allows analysts to perform feature selection using almost any machine learning algorithm while provably controlling the expected proportion of false discoveries. This procedure involves constructing synthetic variables, called knockoffs, which effectively act as controls during feature selection. The gold standard for constructing knockoffs has been to minimize the mean absolute correlation (MAC) between features and their knockoffs, but, surprisingly, we prove this procedure can be powerless in extremely easy settings, including Gaussian linear models with correlated exchangeable features. The key problem is that minimizing the MAC creates joint dependencies between the features and knockoffs, which allow machine learning algorithms to reconstruct the effect of the features on the response using the knockoffs. To improve power, we propose generating knockoffs which minimize the reconstructability (MRC) of the features, and we demonstrate our proposal for Gaussian features by showing it is computationally efficient, robust, and powerful. We also prove that certain MRC knockoffs minimize a notion of estimation error in Gaussian linear models. Through extensive simulations, we show MRC knockoffs often dramatically outperform MAC-minimizing knockoffs, and we find no settings in which MAC-minimizing knockoffs outperform MRC knockoffs by more than a slight margin. We implement our methods and many others from the knockoffs literature in a new python package (mathtt{knockpy}).

Keywords for this software

Anything in here will be replaced on browsers that support the canvas element


References in zbMATH (referenced in 1 article , 1 standard article )

Showing result 1 of 1.
Sorted by year (citations)

  1. Spector, Asher; Janson, Lucas: Powerful knockoffs via minimizing reconstructability (2022)