Last week I worked with Gossip algorithms as a part of the Distributed Operating System class under Prof. Alin Dobra. Distributed Systems have always been an interesting topic and once you read it's a rabbit hole down from there.
As the scale of distributed systems increases, there is a massive instability in the network because of numerous nodes. Gossip protocols work similarly to the covid-19 epidemic situation we are living in right now. As the name suggests, these protocols work similarly to the spread of the virus to effectively communicate information between nodes. Sometimes these Gossip algorithms are also known as Epidemic algorithm because of their behavior. The goal of this project was to determine the convergence of such algorithms through a simulator based on actors written in F# on various network topologies.
The actual network topology plays a critical role in the dissemination speed of Gossip protocols. As part of this project, I've experimented with various topologies. The topology determines who is considered a neighbor in the above algorithms.
Full Network: Every actor is a neighbor of all other actors. That is, every actor can talk directly to any other actor.
2D Grid: Actors form a 2D grid. The actors can only talk to the grid neighbors.
Line: Actors are arranged in a line. Each actor has only 2 neighbors (one left and one right, unless you are the first or last actor).
Imperfect 2D Grid: Grid arrangement but one random other neighbor is selected from the list of all actors (4+1 neighbors).
_10ProcessInformation ->_10if rumourCount < 11 then_10let rnd = Random().Next(0, neighbours.Length)_10if not converged_nodes.[neighbours.[rnd]] then_10neighbours.[rnd] <! CallWorker_10mailbox.Self <! ProcessInformation
The node (A) in the network randomly selects another node to share information. Here, it is assumed that each node in the network maintains a list of all other nodes, or obtains information from a centralized server (in the code snippet, "neighbors" are lists that maintains list of neighboring nodes).
On receipt of information, the receiving node (B) processes the information. The above snippet is a straightforward implementation of actor that accepts a message and sends a message to random neighbor
_10CallWorker ->_10if rumourCount = 0 then_10mailbox.Self <! ActivateWorker_10if (rumourCount = 10) then_10supervisor <! ReportMsgRecvd "Rumor"_10converged_nodes.[mailbox.Self] <- true_10rumourCount <- rumourCount + 1
In the next round of this process, both A and B again select nodes randomly and transmit the information
Each actor keeps track of rumors and how many times it has heard the rumor. It stops transmitting once it has heard the rumor 10 times (10 is arbitrary, you can select other values)
The entire network is said to be converge once all the nodes have received information for at least 10 times.
Line topology is the slowest to converge. This is because it has only access to 2 neighbors (left and right node).
Full topology is the fastest to converge in all scenarios. This is because they connect it to all the nodes and convergence is faster to achieve in this scenario.
As expected, 2D and imperfect 2D would achieve the convergence time in between the line and full with imperfect 2D performing slightly better or equal to 2D performance.
_10StartPushSum delta ->_10let index = Random().Next(0, neighbours.Length)_10_10sum <- sum / 2.0_10weight <- weight / 2.0_10neighbours.[index] <! ComputePushSum(sum, weight, delta)
The push sum protocol is one of the gossip-based aggregation algorithms that is based on the interactive pairwise distribution of aggregated values among particulars entities.
_25ComputePushSum (s: float, w, delta) ->_25let newsum = sum + s_25let newweight = weight + w_25_25let cal = sum / weight - newsum / newweight |> abs_25_25if alreadyConverged then_25_25let index = Random().Next(0, neighbours.Length)_25neighbours.[index] <! ComputePushSum(s, w, delta)_25else_25if cal > delta then_25termRound <- 0_25else_25termRound <- termRound + 1_25_25if termRound = 3 then_25termRound <- 0_25alreadyConverged <- true_25supervisor <! Result(sum, weight)_25_25sum <- newsum / 2.0_25weight <- newweight / 2.0_25let index = Random().Next(0, neighbours.Length)_25neighbours.[index] <! ComputePushSum(sum, weight, delta)
To properly fulfill the mentioned functionality, the entities store two values: the value of the averaged quantity and the weight. The entities are assigned an initial value, which is updated at each iteration in terms of the information from the adjacent nodes and the inner state. The parameter weight is set to 1 for each entity at the beginning of the whole process
It sends this entity the half of the value of its average quantity and the half of its weight. After all, complete this procedure, each entity calculates the average estimation by counting the ratio of the sums of these two parameters
If you want to get started with some of these algorithms, I would highly recommend reading Epidemic algorithms for replicated database maintenance and Gossip-Based Computation of Aggregate Information