TY - JOUR
T1 - Diagnostic Performance of Machine Learning-Derived OSA Prediction Tools in Large Clinical and Community-Based Samples
AU - Holfinger, Steven J.
AU - Lyons, M. Melanie
AU - Keenan, Brendan T.
AU - Mazzotti, Diego R.
AU - Mindel, Jesse
AU - Maislin, Greg
AU - Cistulli, Peter A.
AU - Sutherland, Kate
AU - McArdle, Nigel
AU - Singh, Bhajan
AU - Chen, Ning Hung
AU - Gislason, Thorarinn
AU - Penzel, Thomas
AU - Han, Fang
AU - Li, Qing Yun
AU - Schwab, Richard
AU - Pack, Allan I.
AU - Magalang, Ulysses J.
N1 - Publisher Copyright: © 2021 American College of Chest Physicians
PY - 2022/3
Y1 - 2022/3
N2 - Background: Prediction tools without patient-reported symptoms could facilitate widespread identification of OSA. Research Question: What is the diagnostic performance of OSA prediction tools derived from machine learning using readily available data without patient responses to questionnaires? Also, how do they compare with STOP-BANG, an OSA prediction tool, in clinical and community-based samples? Study Design and Methods: Logistic regression and machine learning techniques, including artificial neural network (ANN), random forests (RF), and kernel support vector machine, were used to determine the ability of age, sex, BMI, and race to predict OSA status. A retrospective cohort of 17,448 subjects from sleep clinics within the international Sleep Apnea Global Interdisciplinary Consortium (SAGIC) were randomly split into training (n = 10,469) and validation (n = 6,979) sets. Model comparisons were performed by using the area under the receiver-operating curve (AUC). Trained models were compared with the STOP-BANG questionnaire in two prospective testing datasets: an independent clinic-based sample from SAGIC (n = 1,613) and a community-based sample from the Sleep Heart Health Study (n = 5,599). Results: The AUCs (95% CI) of the machine learning models were significantly higher than logistic regression (0.61 [0.60-0.62]) in both the training and validation datasets (ANN, 0.68 [0.66-0.69]; RF, 0.68 [0.67-0.70]; and kernel support vector machine, 0.66 [0.65-0.67]). In the SAGIC testing sample, the ANN (0.70 [0.68-0.72]) and RF (0.70 [0.68-0.73]) models had AUCs similar to those of the STOP-BANG (0.71 [0.68-0.72]). In the Sleep Heart Health Study testing sample, the ANN (0.72 [0.71-0.74]) had AUCs similar to those of STOP-BANG (0.72 [0.70-0.73]). Interpretation: OSA prediction tools using machine learning without patient-reported symptoms provide better diagnostic performance than logistic regression. In clinical and community-based samples, the symptomless ANN tool has diagnostic performance similar to that of a widely used prediction tool that includes patient symptoms. Machine learning-derived algorithms may have utility for widespread identification of OSA.
KW - OSA
KW - artificial neural network
KW - electronic medical record
KW - kernel support vector machine
KW - machine learning
KW - prediction model
KW - random forest
UR - http://www.scopus.com/inward/record.url?scp=85124398725&partnerID=8YFLogxK
U2 - 10.1016/j.chest.2021.10.023
DO - 10.1016/j.chest.2021.10.023
M3 - Article
C2 - 34717928
AN - SCOPUS:85124398725
SN - 1931-3543
VL - 161
SP - 807
EP - 817
JO - Chest
JF - Chest
IS - 3
ER -