Understanding Push Sum and Gossip Protocols

October 24, 2020 5 min read • Code 

Last week I worked with Gossip algorithms as a part of the Distributed Operating System class under Prof. Alin Dobra. Distributed Systems have always been an interesting topic and once you read it's a rabbit hole down from there.

As the scale of distributed systems increases, there is a massive instability in the network because of numerous nodes. Gossip protocols work similarly to the covid-19 epidemic situation we are living in right now. As the name suggests, these protocols work similarly to the spread of the virus to effectively communicate information between nodes. Sometimes these Gossip algorithms are also known as Epidemic algorithm because of their behavior. The goal of this project was to determine the convergence of such algorithms through a simulator based on actors written in F# on various network topologies.

Topologies

The actual network topology plays a critical role in the dissemination speed of Gossip protocols. As part of this project, I've experimented with various topologies. The topology determines who is considered a neighbor in the above algorithms.

Full Network: Every actor is a neighbor of all other actors. That is, every actor can talk directly to any other actor.

2D Grid: Actors form a 2D grid. The actors can only talk to the grid neighbors.

Line: Actors are arranged in a line. Each actor has only 2 neighbors (one left and one right, unless you are the first or last actor).

Imperfect 2D Grid: Grid arrangement but one random other neighbor is selected from the list of all actors (4+1 neighbors).

Gossip


_10
ProcessInformation ->
_10
if rumourCount < 11 then
_10
let rnd = Random().Next(0, neighbours.Length)
_10
if not converged_nodes.[neighbours.[rnd]] then
_10
neighbours.[rnd] <! CallWorker
_10
mailbox.Self <! ProcessInformation

The node (A) in the network randomly selects another node to share information. Here, it is assumed that each node in the network maintains a list of all other nodes, or obtains information from a centralized server (in the code snippet, "neighbors" are lists that maintains list of neighboring nodes).

On receipt of information, the receiving node (B) processes the information. The above snippet is a straightforward implementation of actor that accepts a message and sends a message to random neighbor


_10
CallWorker ->
_10
if rumourCount = 0 then
_10
mailbox.Self <! ActivateWorker
_10
if (rumourCount = 10) then
_10
supervisor <! ReportMsgRecvd "Rumor"
_10
converged_nodes.[mailbox.Self] <- true
_10
rumourCount <- rumourCount + 1

In the next round of this process, both A and B again select nodes randomly and transmit the information

Each actor keeps track of rumors and how many times it has heard the rumor. It stops transmitting once it has heard the rumor 10 times (10 is arbitrary, you can select other values)

The entire network is said to be converge once all the nodes have received information for at least 10 times.

Result

Gossip Gossip Log Scale
  • Line topology is the slowest to converge. This is because it has only access to 2 neighbors (left and right node).

  • Full topology is the fastest to converge in all scenarios. This is because they connect it to all the nodes and convergence is faster to achieve in this scenario.

  • As expected, 2D and imperfect 2D would achieve the convergence time in between the line and full with imperfect 2D performing slightly better or equal to 2D performance.

PushSum


_10
StartPushSum delta ->
_10
let index = Random().Next(0, neighbours.Length)
_10
_10
sum <- sum / 2.0
_10
weight <- weight / 2.0
_10
neighbours.[index] <! ComputePushSum(sum, weight, delta)

The push sum protocol is one of the gossip-based aggregation algorithms that is based on the interactive pairwise distribution of aggregated values among particulars entities.


_25
ComputePushSum (s: float, w, delta) ->
_25
let newsum = sum + s
_25
let newweight = weight + w
_25
_25
let cal = sum / weight - newsum / newweight |> abs
_25
_25
if alreadyConverged then
_25
_25
let index = Random().Next(0, neighbours.Length)
_25
neighbours.[index] <! ComputePushSum(s, w, delta)
_25
else
_25
if cal > delta then
_25
termRound <- 0
_25
else
_25
termRound <- termRound + 1
_25
_25
if termRound = 3 then
_25
termRound <- 0
_25
alreadyConverged <- true
_25
supervisor <! Result(sum, weight)
_25
_25
sum <- newsum / 2.0
_25
weight <- newweight / 2.0
_25
let index = Random().Next(0, neighbours.Length)
_25
neighbours.[index] <! ComputePushSum(sum, weight, delta)

To properly fulfill the mentioned functionality, the entities store two values: the value of the averaged quantity and the weight. The entities are assigned an initial value, which is updated at each iteration in terms of the information from the adjacent nodes and the inner state. The parameter weight is set to 1 for each entity at the beginning of the whole process

It sends this entity the half of the value of its average quantity and the half of its weight. After all, complete this procedure, each entity calculates the average estimation by counting the ratio of the sums of these two parameters

Result

PushSum PushSum

If you want to get started with some of these algorithms, I would highly recommend reading Epidemic algorithms for replicated database maintenance and Gossip-Based Computation of Aggregate Information


 Guide to Hacktoberfest and winning a free T-ShirtConfigure Nginx Proxy Manager with Bitwarden 

Want more?

Subscribe to get my latest content via email. I won’t send you spam, and you can unsubscribe at any time.