Accurately Identifying Cerebroarterial Stenosis from Angiography Reports Using Natural Language Processing Approaches

Ching Heng Lin, Kai Cheng Hsu, Chih Kuang Liang, Tsong Hai Lee, Ching Sen Shih, Yang C. Fann*

*Corresponding author for this work

Research output: Contribution to journal › Journal Article › peer-review

1 Scopus citation


Patients with intracranial artery stenosis have a high incidence of stroke. Angiography reports contain rich but underutilized information that can enable the detection of cerebrovascular diseases. This study evaluated several natural language processing (NLP) techniques for accurately identifying stenosis in eleven intracranial arteries from angiography reports. Three NLP models, a rule-based model, a recurrent neural network (RNN), and a contextualized language model (XLNet), were developed and evaluated by internal–external cross-validation. Angiography reports from two independent medical centers were assessed: 9614 for training and internal validation testing, and 315 for external validation. In internal testing, XLNet performed best, with an area under the receiver operating characteristic curve (AUROC) ranging from 0.97 to 0.99 across the eleven targeted arteries. The rule-based model attained an AUROC from 0.92 to 0.96, and the RNN long short-term memory model attained an AUROC from 0.95 to 0.97. The study showed the potential of NLP techniques such as the XLNet model for routine, automatic screening of patients at high risk of intracranial artery stenosis using angiography reports. However, the NLP models were investigated on relatively small samples with very different report writing styles and differing prevalence distributions of stenosis cases, revealing challenges for model generalization.
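The abstract reports AUROC separately for each targeted artery. As a rough illustration of what that metric measures, the sketch below computes AUROC as the probability that a randomly chosen positive report receives a higher score than a randomly chosen negative one, with ties counted as half. This is not the authors' evaluation code: the function names, the artery labels (`artery_A`, `artery_B`), and the toy labels and scores are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUROC: P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def per_artery_auroc(
    labels_by_artery: Dict[str, Sequence[int]],
    scores_by_artery: Dict[str, Sequence[float]],
) -> Dict[str, float]:
    """Apply auroc() independently to each artery's labels and model scores."""
    return {
        artery: auroc(labels_by_artery[artery], scores_by_artery[artery])
        for artery in labels_by_artery
    }


# Hypothetical toy data: 1 = stenosis reported, 0 = no stenosis reported.
toy_labels = {"artery_A": [0, 0, 1, 1], "artery_B": [0, 1, 1, 0]}
toy_scores = {"artery_A": [0.1, 0.4, 0.35, 0.8], "artery_B": [0.2, 0.9, 0.7, 0.3]}

results = per_artery_auroc(toy_labels, toy_scores)
print(results)
```

The pairwise form is quadratic in the number of reports and is meant only to make the definition concrete; for data sets on the scale described in the abstract, a rank-based implementation from an established library would be the more practical choice.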

Original language: English
Article number: 1882
Issue number: 8
State: Published - 08 2022
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2022 by the authors.


Keywords

  • cerebrovascular diseases
  • deep learning
  • intracranial artery stenosis
  • natural language processing
  • rule-based model

