Theoretical analysis for communication-induced checkpointing protocols with rollback-dependency trackability

Jichiang Tsai*, Sy Yen Kuo, Yi Min Wang

*Corresponding author for this work

Research output: Contribution to journalJournal Article peer-review

43 Scopus citations

Abstract

Rollback-Dependency Trackability (RDT) is a property that states that all rollback dependencies between local checkpoints are on-line trackable by using a transitive dependency vector. In this paper, we address three fundamental issues in the design of communication-induced checkpointing protocols that ensure RDT. First, we prove that the following intuition commonly assumed in the literature is in fact false: If a protocol forces a checkpoint only at a stronger condition, then it must take, at most, as many forced checkpoints as a protocol based on a weaker condition. This result implies that the common approach of sharpening the checkpoint-inducing condition by piggybacking more control information on each message may not always yield a more efficient protocol. Next, we prove that there is no optimal on-line RDT protocol that takes fewer forced checkpoints than any other RDT protocol for all possible communication patterns. Finally, since comparing checkpoint-inducing conditions is not sufficient for comparing protocol performance, we present some formal techniques for comparing the performance of several existing RDT protocols.

Original languageEnglish
Pages (from-to)963-971
Number of pages9
JournalIEEE Transactions on Parallel and Distributed Systems
Volume9
Issue number10
DOIs
StatePublished - 1998
Externally publishedYes

Keywords

  • Checkpointing
  • Communication-induced protocols
  • On-line algorithms
  • Rollback recovery
  • Rollback-dependency trackability

Fingerprint

Dive into the research topics of 'Theoretical analysis for communication-induced checkpointing protocols with rollback-dependency trackability'. Together they form a unique fingerprint.

Cite this