NV-DNN: Towards fault-tolerant DNN systems with N-version programming

Hui Xu, Zhuangbin Chen, Weibin Wu, Zhi Jin, Sy Yen Kuo, Michael Lyu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

19 Scopus citations

Abstract

Employing deep learning algorithms in real-world applications is becoming a trend. However, a bottleneck that impedes their further adoption in safety-critical systems is reliability. It is challenging to develop reliable neural network models because the theory of deep learning is not yet well established and neural network models are very sensitive to data perturbations. Inspired by the classic paradigm of N-version programming for fault tolerance, this paper investigates the feasibility of developing fault-tolerant deep learning systems through model redundancy. We hypothesize that if we train several simplex models independently, these models are unlikely to produce erroneous results for the same test cases. In this way, we can design a fault-tolerant system whose output is determined by all these models cooperatively. We propose several independence factors that can be introduced when generating multiple versions of neural network models, including training, network, and data. Experimental results on both MNIST and CIFAR-10 verify that our approach can improve the fault tolerance of a deep learning system. In particular, independent training data plays the most significant role in generating multiple models that share the fewest mutual faults.
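The idea of letting several independently trained models determine the output cooperatively can be illustrated with a small sketch. The abstract does not say which decision scheme the paper uses, so the strict majority vote below is an assumption, and the names `majority_vote` and `system_error_rate` as well as the toy predictions are hypothetical, made up purely for illustration.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence


def majority_vote(predictions: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the label chosen by a strict majority of versions, else None."""
    if not predictions:
        return None
    label, count = Counter(predictions).most_common(1)[0]
    # Require more than half of the versions to agree; otherwise report
    # "no decision" so the caller can fall back to a safe action.
    return label if count > len(predictions) / 2 else None


def system_error_rate(per_model_preds: List[Sequence[Hashable]],
                      truths: Sequence[Hashable]) -> float:
    """Fraction of test cases where the voted output is wrong or undecided."""
    wrong = 0
    # zip(*per_model_preds) regroups predictions per test case.
    for preds, truth in zip(zip(*per_model_preds), truths):
        if majority_vote(preds) != truth:
            wrong += 1
    return wrong / len(truths)


# Toy data (invented, not from the paper): three versions, five test cases.
truths = [0, 1, 2, 3, 4]
model_a = [0, 1, 2, 3, 9]  # wrong on the last case
model_b = [0, 1, 7, 3, 4]  # wrong on the third case
model_c = [0, 8, 2, 3, 4]  # wrong on the second case

voted = [majority_vote(p) for p in zip(model_a, model_b, model_c)]
```

In this toy setting each model errs on a different test case, so every single fault is outvoted by the other two versions. If two versions failed on the same case with the same wrong label, the vote would fail as well, which is why the hypothesis about models not sharing faults matters.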

Original language: English
Title of host publication: Proceedings - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshop, DSN-W 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 44-47
Number of pages: 4
ISBN (Electronic): 9781728130309
DOIs
State: Published - June 2019
Externally published: Yes
Event: 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshop, DSN-W 2019 - Portland, United States
Duration: 24 June 2019 to 27 June 2019

Publication series

Name: Proceedings - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshop, DSN-W 2019

Conference

Conference: 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshop, DSN-W 2019
Country/Territory: United States
City: Portland
Period: 24/06/19 to 27/06/19

Bibliographical note

Publisher Copyright:
© 2019 IEEE.

Keywords

  • deep learning
  • fault tolerance
  • NV DNN
