Clock Synchronization

Figure 11.1 as motivation

Lamport's paper

WWV clock

Fort Collins, CO has atomic clock built on vibratiions of Cesium 133 atom.

Universal Coordinated Time (introduction of leap seconds). Can have a receiver on the net tuned to WWV.

Physical Clock Synchronization

Want to define a constant such that the maximum drift rate of the clock ($C$) is bounded.


\begin{displaymath}
1 - \rho \leq dC/dt \leq 1 + \rho
\end{displaymath}

Want maximum drift between two clocks to be $\delta$, must be resampled every $\delta/2\rho$ (each clock can drift amount in opposite directions)

Cristian's algorithm. Send a message to the time server and get back a reply. Adjust clock to the time. Fig. 11-6. Considerations:

Berkeley algorithm. Time server periodically computes a network time and sends it out to everyone else. Does not synchronize to an external time.

Averaging algorithms. Broadcast time--decentralized. Could also pick a random node to send to, less overhead, but slower convergence.

Multiple External Time Sources. Intersect them, average, and throw out any outlyers. Fig. 11-8 for example of OSF's DCE approach. Universal Coordinated Time (UTC).

Network Time Protocol (NTP). Standardized time protocol. Use a hierarchy of servers with those at the top receiving UTC directly (Fig 10.3) Three modes:

  1. multicast--on a LAN
  2. procedure call--like Cristian's algorithm
  3. symmetric--servers communicate and maintain a timing association

Protocol uses symmetric mode to calculate offset $o$ and delay $d$ between two clocks (Fig 10.4).

Mutual Exclusion

Have talked about Lamport's algorithm.

Look at centralized algorithm. Fig. 11-9.

Distributed algorithms. Ricart-Agrawala have an updated version of Lamport's algorithm. Merge Release and Reply messages.

Algorithm:

  1. Send timestamped request message on a request.
  2. Send a reply if not currently requesting/executing the critical section or if requestor's time stamp is smaller than the timestamp on receiver's current request.
  3. Execute critical section when a reply is received from all sites in the request set.
  4. When a site exits its critical section it sends the delayed reply.

Token Based Algorithms

Create a ring (logical or physical) and pass a token between the nodes. If the node needs the critical section then it grabs the token.

Suzuki-Kasami's Broadcast algorithm--keep a vector of current state and broadcast requests. Each machine broadcasts a request for the token when it is needed.

Singhal has a heuristic improvement to only send to the sites it thinks may have the token.

Comparison

Look at Fig 11-12.

Election Algorithms

Bully Algorithm

A process holds an election to elect a leader. The election is held as follows:

  1. P sends an election message to all processes (sites) with a higher number.
  2. If no one responds, P wins and becomes the leader.
  3. If a higher-up answers then it takes over and starts an election. P is done.

Ring-Based

Similar to bully algorithm in trying to elect the highest numbered node.

Any node can start and marks itself as a participant (vs. non-participant) in an election. It puts its identifier in the election message. Successive nodes mark themselves as participants and if they have a greater value then they substitute their id in the elected message, otherwise they just pass the value on.

When the receiver gets its id back then it is the coordinator and sends an elected message. The other nodes use this message to mark themselves as non-participants.

Notion of participation is used to quelch other elections. At worst the protocol could take $3n-1$ messages ($n-1$ to find leader, $n$ before leader sees its id again and $n$ to send around elected message).

Show a picture with 3-9-15-6-12-18 (start with 3).

Byzantine Agreement Problem

One processor initiates the process to try and get all processors to agree on a value (in the possible presense of faulty processors).

Outcome: all non-faulty processors agree on the same value. If the source processor is non-faulty then the common agreed upon value by all non-faulty processors should be the initial value of the source.

Also referred to as the Byzantine Generals Problem where distributed generals are trying to reach agreement on an attack plan. They communicate only through messengers. Some generals are traitors and try and mislead the others (faulty processors).

Lamport showed that with n processors, it is impossible to reach agreement with more than $\lfloor(n-1)/3\rfloor$ processors being faulty. For example, with three processors there cannot be any faulty processors.

With $m$ faulty processors, must have $3m+1$ total processors to reach agreement.

Lamport-Shostak-Pease Algorithm

Oral Message algorithm for $3m+1$ processors with $m$ faulty processors OM(m).

For $m==0$, the source processor simply sends its value to every processor, with a default value of 0 if none received.

Algorithm OM(m) $m > 0$:

  1. Source processor sends its value to every processor.
  2. For each i, let $v_i$ be the value received from source (use default value of 0 if none received). Processor is the new source and executes OM(m-1) and sends value to $n-2$ other processors.
  3. Each processor takes the majority of the values it receives as its value.

Complexity is $O(n^m)$

Must know maximum number of faulty processors ahead of time.

Assume that all processors respond in a finite amount of time. If messages get lost then the problem cannot be solved--like blue army/white army problem.

Transactions

skip for now

Deadlock

skip for now

Termination Detection

How to know when a distributed computation (election, deadlock detection, distributed query) terminates??

Huang, 1989.

Use a weighting algorithm with a controlling agent. Initially the controlling agent has a weight of one and all other nodes have a weight of zero. Show picture.

Invariant: sum of weights in the system is always one.

Algorithm

  1. When a computation is sent by a process with weight $W$ to another process then divide $W$ into $W_1 + W_2$. Reassign $W=W_1$ and send $W_2$ to P.

  2. On receipt of message, the process P adds the weight to its current weight.

  3. If a process is no longer active then it sends its entire weight $W$ to the controlling agent.

  4. On receiving a message the controlling agent adds the weight to its current value. When the weight returns to one the computation is terminated.