Abstract
Background: Understanding early predictors of treatment outcomes allows better outcome prediction and resource allocation for efficient tuberculosis (TB) management. Objectives: This study aimed to predict treatment outcomes of TB patients from a real-world, population-wide health record dataset with a significant rate of incomplete observations. In addition, potential risk factors associated with death during TB treatment were investigated. Methods: We used an upweighting approach and multiple imputation analysis (MIA) to address the extreme imbalance in responses and the missing data, respectively. Three algorithms were employed for TB treatment outcome prediction: logistic regression (LOGIT), random forest, and stochastic gradient boosting. Moreover, the LOGIT model was interpreted, adjusted odds ratios (aORs) were computed, and the interpretation results were compared between MIA and complete case analysis (CCA). Results: The three models exhibited similar performance in predicting the treatment outcomes. MIA was an appropriate method for coping with missing data. In addition, compared to CCA, the interpretation results of the MIA-derived LOGIT showed more statistically significant covariates associated with TB treatment outcomes. In MIA, factors such as a TB clinical form involving both pulmonary TB and extrapulmonary TB [aOR = 3.077, 95% confidence interval (CI) = 2.994–3.163], retreatment after abandonment (aOR = 2.272, 95% CI = 2.209–2.338), and the absence of isoniazid (aOR = 2.072, 95% CI = 1.892–2.269) or rifampicin (aOR = 1.968, 95% CI = 1.746–2.218) in the treatment regimen were associated with increased odds of death. Conclusion: Our results shed light on the potential risk factors for death during TB treatment and suggest the use of the simple yet interpretable LOGIT for the prediction of TB treatment outcomes.
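The abstract reports aORs with 95% CIs from a LOGIT fitted under multiple imputation, but it does not state how estimates were pooled across imputed datasets. The sketch below assumes the common approach of Rubin's rules, followed by a normal-approximation interval on the log-odds scale. The function names and all numeric inputs are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates  : per-imputation log-odds coefficients (length m >= 2)
    std_errors : per-imputation standard errors of those coefficients
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    pooled = mean(estimates)                       # pooled point estimate
    within = mean([se ** 2 for se in std_errors])  # mean within-imputation variance
    between = variance(estimates)                  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between         # total variance
    return pooled, math.sqrt(total)


def odds_ratio_ci(beta, se, z=1.96):
    """Convert a log-odds coefficient and its SE to an odds ratio with a CI.

    Uses a normal approximation (z = 1.96 for 95%); a full Rubin's-rules
    analysis would normally use a t reference distribution instead.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


if __name__ == "__main__":
    # Hypothetical coefficients for one covariate from five imputed datasets.
    betas = [1.10, 1.14, 1.12, 1.09, 1.15]
    ses = [0.020, 0.021, 0.019, 0.020, 0.022]
    beta, se = pool_rubin(betas, ses)
    aor, lower, upper = odds_ratio_ci(beta, se)
    print(f"aOR = {aor:.3f}, 95% CI = {lower:.3f}-{upper:.3f}")
```

Because the between-imputation variance is added to the average within-imputation variance, the pooled standard error is never smaller than the typical single-imputation one. This is how the uncertainty caused by missing data enters the reported interval.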
| Original language | English |
|---|---|
| Article number | 301 |
| Pages (from-to) | 301 |
| Journal | BMC Medical Informatics and Decision Making |
| Volume | 25 |
| Issue number | 1 |
| State | Published - 11 08 2025 |
Bibliographical note
© 2025. The Author(s).
Keywords
- Imputation
- Machine learning
- Statistical learning
- Treatment outcome prediction
- Tuberculosis
- Humans
- Risk Factors
- Middle Aged
- Male
- Treatment Outcome
- Tuberculosis/drug therapy
- Machine Learning
- Young Adult
- Female
- Adult
- Antitubercular Agents/therapeutic use
- Aged
- Outcome Assessment, Health Care
- Databases, Factual