Tuesday, 2 December 2014

Using negative feedback to stabilize software - 2

OK, now is the time to see how negative feedback makes a system stable. We saw in the previous post that the system doesn't know how much load it is going to be under, so there must be some mechanism for increasing or decreasing the calculation power. If we assume the whole system runs as a single binary, we can have worker classes; if it runs as separate binaries, we can have worker binaries or even worker machines. They all need the same feedback mechanism, so here we talk about that mechanism in general.

Using feedback loops to make the software ecosystem stable.
Look at the diagram and compare it with the one we had in the previous post. As you can see, I've tried to show the feedback loop clearly.

Note that we already assumed that the line queue (LQ) is a thread-safe or multi-access queue. The feedback mechanism for the read process reads the LQ size at fixed intervals and sends the result to the read process, which is also responsible for controlling the feedback loop. The read process gets the LQ size and, if it reaches some critical point, stops reading. This mechanism simply prevents blindly reading files and putting lines in the queue. So the code for the sensor and control section of the feedback loop can be something like this, which must be executed repeatedly, say every second, in a separate thread or binary.

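if (lineQueue.size() > 0.80 * LINE_QUEUE_SIZE) {
    readProcess.stop();   // queue is nearly full: stop reading (method name assumed)
} else if (readProcess.status() == STOPPED) {
    readProcess.start();  // queue has drained: resume reading (method name assumed)
}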
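
As a rough illustration, the sensor-and-control step for the read process could look like the following Python sketch. The `ReadProcess` class, its `stop()`/`start()` methods, and the `control_read` function are hypothetical names made up for this example; only the 0.80 threshold comes from the post.

```python
from dataclasses import dataclass

HIGH_WATERMARK = 0.80  # fraction of the queue capacity at which reading stops


@dataclass
class ReadProcess:
    """Hypothetical stand-in for the read process; it only tracks its status."""
    status: str = "RUNNING"

    def stop(self) -> None:
        self.status = "STOPPED"

    def start(self) -> None:
        self.status = "RUNNING"


def control_read(queue_size: int, capacity: int, reader: ReadProcess) -> None:
    """One feedback step: sense the queue size, then stop or resume the reader."""
    if queue_size > HIGH_WATERMARK * capacity:
        reader.stop()    # queue is nearly full, so stop feeding it
    elif reader.status == "STOPPED":
        reader.start()   # queue is back below the threshold, so resume
```

In a running system this function would be called from a timer, for example once per second, with the current size of the line queue.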

For controlling the workers, things may be a little more difficult. Suppose we don't know the performance of each worker: they may run on different machines, the resources allocated to them may differ, and so on. We have divided the responsibilities of the calculation process into three parts: the "Worker manager & Dispatcher", which distributes lines to workers and adds new or removes old workers; the "Load Control Feedback", which is a sensor that monitors the calculation process; and the "Calculation Processes Worker". The worker's pseudocode could be something like the following:

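while (true) {
  line := lineQueue.pop();
  if (line == null) {
    // queue is empty: the worker is idle and its idle time keeps growing
  } else {
    // the worker is busy: run the calculation on this line
  }
  sleep(1);  // just in case ...
}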
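
One iteration of such a worker loop might be sketched in Python as below. This is only an assumption about what the worker does: the `Worker` class, its `step` method, the `results` list, and the placeholder calculation (the line length) are all invented for the example. The `status` and `idle_time` fields mirror what the feedback code inspects.

```python
import queue


class Worker:
    """Hypothetical worker that tracks the status and idle time the sensor reads."""

    def __init__(self) -> None:
        self.status = "IDLE"
        self.idle_time = 0
        self.results = []

    def step(self, line_queue: queue.Queue, tick: int = 1) -> None:
        """Try to take one line; become BUSY if there is one, IDLE otherwise."""
        try:
            line = line_queue.get_nowait()
        except queue.Empty:
            self.status = "IDLE"
            self.idle_time += tick  # idle time grows by one tick per empty poll
            return
        self.status = "BUSY"
        self.idle_time = 0
        self.results.append(len(line))  # placeholder for the real calculation
```

Calling `step` in a loop with a one-unit sleep between calls gives the same shape as the pseudocode above.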

The feedback sensor and control could be something like the following which, like the read-process feedback, must be executed periodically, here every 5 seconds:

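tempWorker := workersList.first();
while (tempWorker != null) {
  nextWorker := tempWorker.next();     // fetch the successor before any removal
  if (tempWorker.status() == IDLE)
    if (tempWorker.idleTime > 60)
      workersList.remove(tempWorker);  // idle for too long: remove it (method name assumed)
  tempWorker := nextWorker;
}

if (workersList.allBusy())
  if (lineQueue.size() > 0.50 * LINE_QUEUE_SIZE) {
    workersList.addWorker();           // backlog is growing: add a worker (method name assumed)
  }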
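
The same worker feedback step can be sketched in Python as follows. The `Worker` dataclass and the `control_workers` function are hypothetical names for this example; the 60 and 0.50 values are the ones used in the post. The sketch also assumes that a newly added worker starts out idle until it is given a line.

```python
from dataclasses import dataclass
from typing import List

IDLE_LIMIT = 60              # idle time after which a worker is removed
ADD_WORKER_WATERMARK = 0.50  # queue fill level at which a worker is added


@dataclass
class Worker:
    """Hypothetical worker record holding only what the sensor needs."""
    status: str = "IDLE"
    idle_time: float = 0.0


def control_workers(workers: List[Worker], queue_size: int, capacity: int) -> List[Worker]:
    """One feedback step: drop long-idle workers, add one if all are busy."""
    kept = [
        w for w in workers
        if not (w.status == "IDLE" and w.idle_time > IDLE_LIMIT)
    ]
    # all() is True for an empty list, so an empty pool also counts as "all busy".
    all_busy = all(w.status == "BUSY" for w in kept)
    if all_busy and queue_size > ADD_WORKER_WATERMARK * capacity:
        kept.append(Worker())  # new worker starts idle (assumption)
    return kept
```

Returning a new list instead of removing items while iterating avoids the usual problem of modifying a collection during traversal.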

There is also dispatcher code which repeatedly checks for idle workers and sends them lines for calculation.

There are some fixed numbers above; let us see what they are. The 0.80 * LINE_QUEUE_SIZE is just a threshold at which we can conclude that something must be wrong in the processing section, so we stop reading files. The 60 is the idle time after which we remove a worker. And 0.50 * LINE_QUEUE_SIZE is the threshold at which we decide to add a new worker to help the calculation processes. This threshold should be low enough that the system never reaches the other threshold (0.80).
