PyHealth: A Python Library for Health Predictive Models. Despite the explosion of interest in healthcare AI research, the reproducibility and benchmarking of those research works are often limited due to the lack of standard benchmark datasets and diverse evaluation metrics. To address this reproducibility challenge, we develop PyHealth, an open-source Python toolbox for developing various predictive models on healthcare data. PyHealth consists of data preprocessing module, predictive modeling module, and evaluation module. The target users of PyHealth are both computer science researchers and healthcare data scientists. With PyHealth, they can conduct complex machine learning pipelines on healthcare datasets with fewer than ten lines of code. The data preprocessing module enables the transformation of complex healthcare datasets such as longitudinal electronic health records, medical images, continuous signals (e.g., electrocardiogram), and clinical notes into machine learning friendly formats. The predictive modeling module provides more than 30 machine learning models, including established ensemble trees and deep neural network-based approaches, via a unified but extendable API designed for both researchers and practitioners. The evaluation module provides various evaluation strategies (e.g., cross-validation and train-validation-test split) and predictive model metrics. With robustness and scalability in mind, best practices such as unit testing, continuous integration, code coverage, and interactive examples are introduced in the library’s development. PyHealth can be installed through the Python Package Index (PyPI) or

Keywords for this software

Anything in here will be replaced on browsers that support the canvas element

References in zbMATH (referenced in 1 article , 1 standard article )

Showing result 1 of 1.
Sorted by year (citations)

  1. Yue Zhao, Zhi Qiao, Cao Xiao, Lucas Glass, Jimeng Sun: PyHealth: A Python Library for Health Predictive Models (2021) arXiv