Aman Goyal

LeetCode LeetCode

Leader Election: Ensuring a Single Owner in a Distributed System

Core Concept


Do You Even Need Leader Election?

Use leader election only when:


Core Primitive: Compare-And-Swap

lock := sync.Mutex{}
store := map[string]string{}

func compareAndSwap(key, nextValue, currentValue string) (bool, error) {
  lock.Lock()
  defer lock.Unlock()
  _, containsKey := store[key]
  if !containsKey {
    if len(currentValue) == 0 {
      store[key] = nextValue
      return true, nil
    }
    return false, fmt.Errorf(
      "Expected value %s for key %s, but found empty", currentValue, key)
  }
  if store[key] == currentValue {
    store[key] = nextValue
    return true, nil
  }
  return false, nil
}

Foundation of distributed locking


Basic Lock

Acquire

func (Lock l) simpleLock() boolean {
  // compare and swap "1" for "0"
  locked, _ = compareAndSwap(l.lockName, "1", "0")
  return locked
}

Handle missing key

func (Lock l) simpleLock() boolean {
  // compare and swap "1" for "0"
  locked, error = compareAndSwap(l.lockName, "1", "0")
  // lock doesn't exist, try to write "1" with a previous value of
  // non-existent
  if error != nil {
    locked, _ = compareAndSwap(l.lockName, "1", nil)
  }
  return locked
}

Blocking lock

func (Lock l) lock() {
  while (!l.simpleLock()) {
    sleep(2)
  }
}

Event-based (better)

func (Lock l) lock() {
  while (!l.simpleLock()) {
    waitForChanges(l.lockName)
  }
}

Unlock

func (Lock l) unlock() {
  compareAndSwap(l.lockName, "0", "1")
}

Problem: Crash While Holding Lock

Lock never released → system stuck


⏳ Solution: TTL-based Lock

func (Lock l) simpleLock() boolean {
  // compare and swap "1" for "0"
  locked, error = compareAndSwap(l.lockName, "1", "0", l.ttl)
  // lock doesn't exist, try to write "1" with a previous value of
  // non-existent
  if error != nil {
    locked, _ = compareAndSwap(l.lockName, "1", nil, l.ttl)
  }
  return locked
}

Lock auto-expires


Critical Bug (TTL + Unlock)


Fix: Resource Version

Lock with version

func (Lock l) simpleLock() boolean {
  // compare and swap "1" for "0"
  locked, l.version, error = compareAndSwap(l.lockName, "1", "0", l.ttl)
  // lock doesn't exist, try to write "1" with a previous value of
  // non-existent
  if error != null {
    locked, l.version, _ = compareAndSwap(l.lockName, "1", null, l.ttl)
  }
  return locked
}

Safe unlock

func (Lock l) unlock() {
  compareAndSwap(l.lockName, "0", "1", l.version)
}

Prevents unlocking someone else’s lock


Renewable Lock (Leader Election)

func (Lock l) renew() boolean {
  locked, _ = compareAndSwap(l.lockName, "1", "1", l.version, ttl)
  return locked
}

Renewal loop

for {
  if !l.renew() {
    handleLockLost()
  }
  sleep(ttl/2)
}

Leader keeps renewing lease


etcd Example

Create lock

kubectl exec my-etcd-cluster-0000 -- sh -c \
  "ETCD_API=3 etcdctl --endpoints=${ETCD_ENDPOINTS} set my-lock unlocked"

Acquire lock

kubectl exec my-etcd-cluster-0000 -- sh -c \
  "ETCD_API=3 etcdctl --endpoints=${ETCD_ENDPOINTS} \
      set --swap-with-value unlocked my-lock alice"

Failed acquisition

kubectl exec my-etcd-cluster-0000 -- sh -c \
  "ETCD_API=3 etcdctl --endpoints=${ETCD_ENDPOINTS} \
      set --swap-with-value unlocked my-lock bob"
Error:  101: Compare failed ([unlocked != alice]) [6]

Release lock

kubectl exec my-etcd-cluster-0000 -- sh -c \
  "ETCD_API=3 etcdctl --endpoints=${ETCD_ENDPOINTS} \
      set --swap-with-value alice my-lock unlocked"

Lease-based Lock (etcd)

kubectl exec my-etcd-cluster-0000 -- \
    sh -c "ETCD_API=3 etcdctl --endpoints=${ETCD_ENDPOINTS} \
        --ttl=10 mk my-lock alice"

Renew lease

kubectl exec my-etcd-cluster-0000 -- \
    sh -c "ETCD_API=3 etcdctl --endpoints=${ETCD_ENDPOINTS} \
        set --ttl=10 --swap-with-value alice my-lock alice"

Advanced Problem: Dual Leaders

Temporary overlap can happen


Check before acting

func (Lock l) isLocked() boolean {
  return l.locked && l.lockTime + 0.75 * l.ttl > now()
}

Final Insights


Final Trade-off


One-line Summary

Leader election uses distributed locks (CAS + TTL + versioning) to ensure a single owner in a distributed system while handling failures and race conditions.

#Distributed Systems #System Design #Leader Election #Etcd #Distributed Locks #Consensus