Monday, January 3, 2011

Two-Phase Commit Protocol

Overview

The two-phase commit protocol is a distributed algorithm that lets all sites in a distributed system agree to commit or roll back a transaction, based on the consensus of all participating sites. It is designed to ensure that either all of the databases are updated or none of them are, so that the databases remain synchronized. The protocol achieves its goal even in many cases of temporary system failure.

Assumptions

In the two-phase commit protocol, one node acts as the coordinator and all other participating nodes are known as cohorts. The assumptions made in this protocol are listed below:
  1. Each cohort should have its own stable storage.
  2. To provide atomicity and durability, each participating node should write a log record before performing any operation. This type of logging is called write-ahead logging.
  3. If any cohort suffers a system failure, its storage can be recovered and the information in the log file can still be retrieved.
  4. No participating node can fail permanently.
  5. In case of permanent failure of any cohort, or complete loss of its storage, the data cannot be recovered.
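
The write-ahead logging assumption can be illustrated with a minimal sketch. It assumes a local file stands in for stable storage; the names WriteAheadLog, apply_with_logging, and recover are hypothetical and chosen only for this example.

```python
import json
import os


class WriteAheadLog:
    """Hypothetical append-only log; a local file stands in for stable storage."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        # Force the record to disk before the caller performs the operation.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def replay(self):
        # After a crash, read back every record that reached the log file.
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]


def apply_with_logging(log, state, key, value):
    # Log first, then apply: this ordering is what makes it "write-ahead".
    log.append({"op": "set", "key": key, "value": value})
    state[key] = value


def recover(log):
    # Rebuild the in-memory state from the log alone.
    state = {}
    for record in log.replay():
        if record["op"] == "set":
            state[record["key"]] = record["value"]
    return state
```

Because every change is logged before it is applied, a cohort that restarts after a temporary failure can call recover to rebuild the state it had before the crash.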

Basic Algorithm

As the name suggests, the two-phase commit protocol involves two phases: the first is the "Commit Request" phase and the second is the "Commit" phase.
  • Commit Request Phase:
    1. To commit the transaction, the coordinator sends a "ready to commit?" request to each cohort.
    2. The coordinator waits until it has received a reply (a "vote") from every cohort.
    3. Each participant votes by sending a message back to the coordinator, as follows:
      • It votes YES if it is prepared to commit.
      • It may vote NO for any reason, usually because it cannot prepare the transaction due to a local failure.
      • It may delay voting indefinitely, for example because its system is busy with other work or because it has failed.
  • Commit Phase:
    1. If the coordinator receives a YES response from all cohorts, it decides to commit. The transaction is now officially committed. Otherwise, it has either received a NO response or given up waiting for some participants, so it decides to abort.
    2. The coordinator sends its decision to all participants (i.e. COMMIT or ABORT).
    3. Participants acknowledge receipt of the commit or abort decision by replying DONE.
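
The two phases above can be sketched as a small in-memory simulation. This is only an illustration under simplifying assumptions: there is no network or logging, a missing reply is modelled by returning None, and the names Vote, Cohort, and two_phase_commit are hypothetical.

```python
from enum import Enum


class Vote(Enum):
    YES = "YES"
    NO = "NO"


class Cohort:
    """Hypothetical in-memory participant; a real one would log to stable storage."""

    def __init__(self, name, can_commit=True, responds=True):
        self.name = name
        self.can_commit = can_commit
        self.responds = responds
        self.state = "INIT"

    def prepare(self):
        # Commit request phase: return a vote, or None if no reply ever arrives.
        if not self.responds:
            return None
        if self.can_commit:
            self.state = "PREPARED"
            return Vote.YES
        self.state = "ABORTED"
        return Vote.NO

    def finish(self, decision):
        # Commit phase: apply the coordinator's decision and acknowledge it.
        if not self.responds:
            return None
        self.state = "COMMITTED" if decision == "COMMIT" else "ABORTED"
        return "DONE"


def two_phase_commit(cohorts):
    # Commit request phase: collect a vote from every cohort.
    votes = [c.prepare() for c in cohorts]
    # A missing vote (None) stands in for the coordinator giving up waiting.
    decision = "COMMIT" if all(v is Vote.YES for v in votes) else "ABORT"
    # Commit phase: send the decision to everyone and gather acknowledgements.
    acks = [c.finish(decision) for c in cohorts]
    return decision, acks
```

Calling two_phase_commit([Cohort("a"), Cohort("b")]) yields a COMMIT decision, while a single cohort created with can_commit=False or responds=False is enough to turn the decision into ABORT for all participants.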

Disadvantages

The two-phase commit protocol is a blocking protocol; hence it has some disadvantages, which are listed below:
  1. A cohort locks the required resources while it is waiting for a message from the coordinator. Other processes competing for those resources have to wait for the locks to be released.
  2. A single node will continue to wait even if all other sites have failed.
  3. If the coordinator fails permanently, some cohorts will never resolve their transactions, causing resources to be locked up forever.
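
The blocking behaviour can be made concrete with a short sketch of a cohort's local state. The class InDoubtCohort and its methods are hypothetical. The sketch assumes the usual rule that a cohort which has not yet voted may abort on its own, while one that has voted YES must wait for the coordinator.

```python
class InDoubtCohort:
    """Hypothetical participant used only to illustrate the blocking window."""

    def __init__(self):
        self.state = "INIT"
        self.locks = set()

    def vote_yes(self, resources):
        # Voting YES is a promise to commit if asked, so the locks must be kept.
        self.locks = set(resources)
        self.state = "PREPARED"

    def on_timeout(self):
        # Before voting, the cohort may still abort on its own and free its locks.
        if self.state == "INIT":
            self.state = "ABORTED"
            self.locks.clear()
        # After voting YES it cannot decide alone, so it stays blocked.
        return self.state

    def on_decision(self, decision):
        # Only the coordinator's message ends the wait and releases the locks.
        self.state = "COMMITTED" if decision == "COMMIT" else "ABORTED"
        self.locks.clear()
        return "DONE"
```

In this sketch, a cohort that has called vote_yes keeps returning PREPARED from on_timeout and keeps its locks until on_decision is called, which is the situation described in the list above when the coordinator never comes back.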