Using mixed integer programming for matching in an observational study of kidney failure after surgery

This article presents a new method for optimal matching in observational studies based on mixed integer programming. Unlike widely used matching methods based on network algorithms, which attempt to achieve covariate balance by minimizing the total sum of distances between treated units and matched controls, this new method achieves covariate balance directly, either by minimizing both the total sum of distances and a weighted sum of specific measures of covariate imbalance, or by minimizing the total sum of distances while constraining the measures of imbalance to be less than or equal to certain tolerances. The inclusion of these extra terms in the objective function or the use of these additional constraints explicitly optimizes or constrains the criteria that will be used to evaluate the quality of the match. For example, the method minimizes or constrains differences in univariate moments, such as means, variances, and skewness; differences in multivariate moments, such as correlations between covariates; differences in quantiles; and differences in statistics, such as the Kolmogorov-Smirnov statistic, to minimize the differences in both location and shape of the empirical distributions of the treated units and matched controls. While balancing several of these measures, it is also possible to impose constraints for exact and near-exact matching, and fine and near-fine balance for more than one nominal covariate, whereas network algorithms can finely or near-finely balance only a single nominal covariate. From a practical standpoint, this method eliminates the guesswork involved in current optimal matching methods, and offers a controlled and systematic way of improving covariate balance by focusing the matching efforts on certain measures of covariate imbalance and their corresponding weights or tolerances.
A matched case-control study of acute kidney injury after surgery among Medicare patients illustrates these features in detail. A new R package called mipmatch implements the method.
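
The constrained variant described in the abstract (minimize the total sum of distances subject to a tolerance on a measure of imbalance) can be illustrated on a toy problem. The sketch below is not the mipmatch API and does not call a mixed integer programming solver: it enumerates every feasible 1:1 assignment by brute force, which finds the same optimum only because the example is tiny. The function name `match_with_balance`, the single covariate, the absolute-difference distance, and the data are all assumptions made for illustration.

```python
from itertools import permutations


def match_with_balance(treated, controls, tol):
    """Exhaustively solve a tiny 1:1 matching problem (illustrative only).

    Minimizes the total absolute distance between each treated unit and its
    matched control, subject to the absolute difference in covariate means
    between the treated group and the matched controls being at most `tol`.
    Returns (total_distance, assignment), or None if nothing is feasible.
    """
    n = len(treated)
    treated_mean = sum(treated) / n
    best = None
    # Each permutation assigns control idx[i] to treated unit i, without reuse.
    for idx in permutations(range(len(controls)), n):
        matched = [controls[j] for j in idx]
        imbalance = abs(treated_mean - sum(matched) / n)
        if imbalance > tol + 1e-12:  # balance constraint violated
            continue
        total = sum(abs(t - c) for t, c in zip(treated, matched))
        if best is None or total < best[0]:
            best = (total, idx)
    return best


# Hypothetical covariate values for two treated units and four controls.
treated = [5.0, 9.0]
controls = [4.0, 6.5, 8.0, 10.25]

# Distance-only matching: smallest total distance, mean difference of 1.0.
print(match_with_balance(treated, controls, float("inf")))  # (2.0, (0, 2))

# Same objective with the mean difference constrained to at most 0.3.
print(match_with_balance(treated, controls, 0.3))  # (2.25, (0, 3))
```

In this example the balance constraint changes the selected controls: the constrained match accepts a slightly larger total distance (2.25 instead of 2.0) in exchange for a mean difference of 0.125 instead of 1.0. A real application would hand the same objective and constraints to a MIP solver rather than enumerate assignments.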

References in zbMATH (referenced in 17 articles, 1 standard article)

  1. Hochbaum, Dorit S.; Rao, Xu; Sauppe, Jason: Network flow methods for the minimum covariate imbalance problem (2022)
  2. Du, Xin; Sun, Lei; Duivesteijn, Wouter; Nikolaev, Alexander; Pechenizkiy, Mykola: Adversarial balancing-based representation learning for causal effect inference with observational data (2021)
  3. Rischard, Maxime; Branson, Zach; Miratrix, Luke; Bornn, Luke: Do school districts affect NYC house prices? Identifying border differences using a Bayesian nonparametric approach to geographic regression discontinuity designs (2021)
  4. Zhang, Yumin; Sabbaghi, Arman: The designed bootstrap for causal inference in big observational data (2021)
  5. Bennett, Magdalena; Vielma, Juan Pablo; Zubizarreta, José R.: Building representative matched samples with multi-valued treatments in large observational studies (2020)
  6. Pimentel, Samuel D.; Kelz, Rachel R.: Optimal tradeoffs in matched designs comparing US-trained and internationally trained surgeons (2020)
  7. Yu, Ruoqi; Silber, Jeffrey H.; Rosenbaum, Paul R.: Matching methods for observational studies derived from large administrative databases (2020)
  8. Yu, Ruoqi; Silber, Jeffrey H.; Rosenbaum, Paul R.: Rejoinder: Matching methods for observational studies derived from large administrative databases (2020)
  9. Karmakar, Bikram; Small, Dylan S.; Rosenbaum, Paul R.: Using approximation algorithms to build evidence factors and related designs for observational studies (2019)
  10. Zhao, Qingyuan; Keele, Luke J.; Small, Dylan S.: Comment: Will competition-winning methods for causal inference also succeed in practice? (2019)
  11. Pimentel, Samuel D.; Page, Lindsay C.; Lenard, Matthew; Keele, Luke: Optimal multilevel matching using network flows: an application to a summer reading intervention (2018)
  12. Cho, Wendy K. Tam: Causal inferences from many experiments (2017)
  13. Lopez, Michael J.; Gutman, Roee: Estimation of causal effects with multiple treatments: a review and new ideas (2017)
  14. Sauppe, Jason J.; Jacobson, Sheldon H.; Sewell, Edward C.: Complexity and approximation results for the balance optimization subset selection model for causal inference in observational studies (2014)
  15. Zubizarreta, José R.; Paredes, Ricardo D.; Rosenbaum, Paul R.: Matching for balance, pairing for heterogeneity in an observational study of the effectiveness of for-profit and not-for-profit high schools in Chile (2014)
  16. Zubizarreta, José R.; Small, Dylan S.; Rosenbaum, Paul R.: Isolation in the construction of natural experiments (2014)
  17. Zubizarreta, José R.: Using mixed integer programming for matching in an observational study of kidney failure after surgery (2012)