Understanding Push Sum and Gossip Protocols

October 24, 2020 5 min read • Code 

Last week I worked with Gossip algorithms as a part of the Distributed Operating System class under Prof. Alin Dobra. Distributed Systems have always been an interesting topic and once you read it’s a rabbit hole down from there.

Gossip protocol meme

As the scale of distributed systems increases, there is a massive instability in the network because of numerous nodes. Gossip protocols work similarly to the covid-19 epidemic situation we are living in right now. As the name suggests, these protocols work similarly to the spread of the virus to effectively communicate information between nodes. Sometimes these Gossip algorithms are also known as Epidemic algorithm because of their behavior. The goal of this project was to determine the convergence of such algorithms through a simulator based on actors written in F# on various network topologies.

Topologies

The actual network topology plays a critical role in the dissemination speed of Gossip protocols. As part of this project, I’ve experimented with various topologies. The topology determines who is considered a neighbor in the above algorithms.

Full Network: Every actor is a neighbor of all other actors. That is, every actor can talk directly to any other actor.

2D Grid: Actors form a 2D grid. The actors can only talk to the grid neighbors.

Line: Actors are arranged in a line. Each actor has only 2 neighbors (one left and one right, unless you are the first or last actor).

Imperfect 2D Grid: Grid arrangement but one random other neighbor is selected from the list of all actors (4+1 neighbors).

Gossip

converged_nodes.[mailbox.Self] <- true
ProcessInformation ->
if rumourCount < 11 then
let rnd = Random().Next(0, neighbours.Length)
if not converged_nodes.[neighbours.[rnd]] then
neighbours.[rnd] <! CallWorker
mailbox.Self <! ProcessInformation
converged_nodes.[mailbox.Self] <- true
ProcessInformation ->

The node (A) in the network randomly selects another node to share information. Here, it is assumed that each node in the network maintains a list of all other nodes, or obtains information from a centralized server (in the code snippet, “neighbors” are lists that maintains list of neighboring nodes).

On receipt of information, the receiving node (B) processes the information. The above snippet is a straightforward implementation of actor that accepts a message and sends a message to random neighbor

In the next round of this process, both A and B again select nodes randomly and transmit the information

Each actor keeps track of rumors and how many times it has heard the rumor. It stops transmitting once it has heard the rumor 10 times (10 is arbitrary, you can select other values)

The entire network is said to be converge once all the nodes have received information for at least 10 times.

Result

Gossip

Gossip Log Scale

  • Line topology is the slowest to converge. This is because it has only access to 2 neighbors (left and right node).

  • Full topology is the fastest to converge in all scenarios. This is because they connect it to all the nodes and convergence is faster to achieve in this scenario.

  • As expected, 2D and imperfect 2D would achieve the convergence time in between the line and full with imperfect 2D performing slightly better or equal to 2D performance.

PushSum

StartPushSum delta ->
let index = Random().Next(0, neighbours.Length)
neighbours.[index] <! ComputePushSum(sum, weight, delta)
sum <- sum / 2.0
weight <- weight / 2.0
neighbours.[index] <! ComputePushSum(sum, weight, delta)
neighbours.[index] <! ComputePushSum(sum, weight, delta)

The push sum protocol is one of the gossip-based aggregation algorithms that is based on the interactive pairwise distribution of aggregated values among particulars entities.

To properly fulfill the mentioned functionality, the entities store two values: the value of the averaged quantity and the weight. The entities are assigned an initial value, which is updated at each iteration in terms of the information from the adjacent nodes and the inner state. The parameter weight is set to 1 for each entity at the beginning of the whole process

It sends this entity the half of the value of its average quantity and the half of its weight. After all, complete this procedure, each entity calculates the average estimation by counting the ratio of the sums of these two parameters

Result

PushSum

PushSum

If you want to get started with some of these algorithms, I would highly recommend reading Epidemic algorithms for replicated database maintenance and Gossip-Based Computation of Aggregate Information


 Guide to Hacktoberfest and winning a free T-Shirt

Want more?

Subscribe to get my latest content via email. I won’t send you spam, and you can unsubscribe at any time.