Date of Award
12-2009
Degree Name
Master of Science
Department
Computer Science
First Advisor
Gupta, Dr. Bidyut
Abstract
In this work we address the complex problem of recovery from concurrent failures in a distributed computing environment. We propose a new checkpointing and recovery approach that enables each process to restart from its most recent checkpoint and therefore guarantees that the least amount of recomputation is needed after recovery. The proposed approach deals effectively with orphan and lost messages. We introduce two new ideas. First, the value of the common checkpointing interval is chosen so that only the messages sent in the recent checkpoints of the processes need to be logged. Second, the lost messages are always determined a priori by the initiator process, in parallel with the normal distributed computation, so this step does not delay recovery in any way.
Access
This thesis is available for download only to the SIUC community. Current SIUC affiliates may also access this paper off campus by searching Dissertations & Theses @ Southern Illinois University Carbondale from ProQuest. Others should contact the interlibrary loan department of their local library or ProQuest's Dissertation Express service.