Date of Award
Master of Science
In this paper, we address the complex problem of recovery from concurrent failures in a distributed computing environment. We propose a new approach that effectively handles both orphan and lost messages. The proposed checkpointing and recovery approaches enable a process to restart from its most recent checkpoint and hence guarantee the least amount of recomputation after recovery. This also means that a process needs to save only its most recent local checkpoint. The proposed value of the common checkpointing interval enables an initiator process to log the minimum number of messages sent by each application process. The message complexity of both the proposed checkpointing algorithm and the recovery approach is O(n).
This thesis is available for download only to the SIUC community. Others should
contact the interlibrary loan department of their local library.