Knowledge Distillation on Extractive Summarization

Ying Jia Lin, Daniel Tan, Tzu Hsuan Chou, Hung Yu Kao*, Hsin Yang Wang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Large-scale pre-trained frameworks have shown state-of-the-art performance on several natural language processing tasks. However, their costly training and slow inference pose major challenges when deploying such models in real-world applications. In this work, we conduct an empirical study of knowledge distillation on an extractive text summarization task. We first use a pre-trained model as the teacher model for extractive summarization and extract its learned knowledge as soft targets. We then leverage both the hard targets and the soft targets as the objective for training a much smaller student model to perform extractive summarization. Our results show that the student model scores only 1 point lower on the three ROUGE metrics for extractive summarization on the CNN/DM dataset, while being 40% smaller than the teacher model and 50% faster at inference time.
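
The abstract describes training the student with a combination of hard (oracle) labels and the teacher's soft targets. As an illustration only, below is a minimal PyTorch-style sketch of such a combined distillation objective, assuming the extractive setting scores each sentence with a binary logit; the function name, temperature, and alpha weighting are hypothetical choices and not taken from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, hard_labels,
                      temperature=2.0, alpha=0.5):
    """Combine hard-target and soft-target terms for extractive
    summarization, where each sentence receives a binary "include" logit.
    Hyperparameter names and values are illustrative, not the paper's."""
    # Hard-target term: binary cross-entropy against the oracle labels.
    hard_loss = F.binary_cross_entropy_with_logits(student_logits, hard_labels)

    # Soft-target term: match the teacher's temperature-softened
    # per-sentence probabilities (binary cross-entropy against soft targets).
    t_prob = torch.sigmoid(teacher_logits / temperature)
    s_prob = torch.sigmoid(student_logits / temperature)
    soft_loss = F.binary_cross_entropy(s_prob, t_prob) * (temperature ** 2)

    return alpha * hard_loss + (1.0 - alpha) * soft_loss

# Example: a batch of 2 documents with 5 sentence scores each.
student = torch.randn(2, 5)
teacher = torch.randn(2, 5)
labels = torch.randint(0, 2, (2, 5)).float()
loss = distillation_loss(student, teacher, labels)
```

The temperature softens the teacher's sentence probabilities so the student can also learn from the relative confidence assigned to non-selected sentences; scaling the soft term by the squared temperature is the usual distillation convention to keep gradient magnitudes comparable across the two terms.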

Original language: English
Title of host publication: Proceedings - 2020 IEEE 3rd International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 71-76
Number of pages: 6
ISBN (Electronic): 9781728187082
DOIs
State: Published - December 2020
Externally published: Yes
Event: 3rd IEEE International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2020 - Irvine, United States
Duration: 9-11 December 2020

Publication series

Name: Proceedings - 2020 IEEE 3rd International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2020

Conference

Conference: 3rd IEEE International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2020
Country/Territory: United States
City: Irvine
Period: 09/12/20 - 11/12/20

Bibliographical note

Publisher Copyright:
© 2020 IEEE.

Keywords

  • knowledge distillation
  • text summarization
