Wednesday 26 November 2014

Stability patterns in a software ecosystem

If you have ever been involved in a big software project in which many different binaries have to integrate and work together, you have probably noticed that the malfunction or failure of one of these binaries sometimes breaks the whole system. Most of the time, though, such a big software ecosystem should not really break just because one part malfunctions.

You know what this means? Consider the human body as a closed ecosystem. Should the whole body break down (die) if something suddenly happens to one of the feet? Of course not. There should be some mechanism that doesn't let a fault or error spread across the ecosystem.



Sample of a software ecosystem in which we need to consider stability patterns for each module.
OK, let us get back to software. Consider an ecosystem of software products working together; I call them M1 ... M5. It is not essential, but in a good module/component design you usually (not always) have a hierarchy of processing or cooperation like the one in the picture.

Before continuing, let me explain why I use the term "ecosystem". I always try to model things with something more tangible, and in fact most human inventions, if not all, are somehow related to something in nature. Since these modules or software binaries have to work together to fulfill our requirements, I consider the whole system an ecosystem. This modeling gives us a better sense of the problem. For this example, each of these modules can be interpreted as an animal or a plant, and so on.

Working scenario
I don't want to get into the details of these patterns in this post, so just assume that M5 and M4 are doing something and send their results to M2, and M2 does something else with those results and sends them to M1. M3, on the other hand, sends some information to M1, so M1 processes the information it gets from M2 and M3. The system can keep working with just one of M4 or M5 alive, and it will also work even if M3 doesn't function properly. Just keep in mind that there are always rules between modules in a software ecosystem. These rules or policies help us enforce the stability patterns that keep the system stable.

What exactly are stability patterns?
They are nothing but pieces of code that we have to implement at every connection point we have with other software creatures in the ecosystem, and that enforce the already defined rules or policies. These patterns make sure the whole system is as stable as possible.

For example, M2 has an input that gets data from M5, M4, or both. Since M2 should work unless both of them are dead, M2 must have some background process that checks that at least one of M5 and M4 is alive. As long as it doesn't find both of them dead, M2 should not tell M1 that there is something wrong with the lower-level processes. If, for example, M5 dies, M2 should just ignore it for a while and check M5's health once in a while; if M2 finds M5 alive later, it can start collecting and processing its data again.
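To make the M2 rule a bit more concrete, here is a minimal Python sketch of such a health check. Everything in it is an assumption made for illustration: the class name UpstreamMonitor, the probe callables (in a real system they might be a ping, a heartbeat file, or a socket check), and the 5-second retry interval are all hypothetical and not part of any real library.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class UpstreamMonitor:
    """Hypothetical helper for M2: tracks whether M4 and M5 are alive."""

    probes: Dict[str, Callable[[], bool]]          # name -> assumed health probe
    retry_interval: float = 5.0                    # assumed pause before re-checking a dead upstream
    clock: Callable[[], float] = time.monotonic    # injectable so the logic can be tested
    _alive: Dict[str, bool] = field(default_factory=dict, init=False)
    _next_check: Dict[str, float] = field(default_factory=dict, init=False)

    def poll(self) -> None:
        """Probe each upstream; a dead one is ignored until its retry time."""
        now = self.clock()
        for name, probe in self.probes.items():
            is_dead = not self._alive.get(name, True)
            if is_dead and now < self._next_check.get(name, 0.0):
                continue  # ignore the dead upstream for a while
            try:
                ok = bool(probe())
            except Exception:
                ok = False  # a probe that blows up counts as a dead upstream
            self._alive[name] = ok
            if not ok:
                self._next_check[name] = now + self.retry_interval

    def active_sources(self) -> List[str]:
        """Upstreams M2 may currently collect data from."""
        return sorted(name for name, ok in self._alive.items() if ok)

    def should_report_fault(self) -> bool:
        """True only when every known upstream is dead."""
        return bool(self._alive) and not any(self._alive.values())
```

M2 would call poll() periodically from its background process, read only from active_sources(), and notify M1 only when should_report_fault() returns True. The clock is passed in so the retry behaviour can be exercised without actually waiting.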

Consider another rule or policy in the system, one that says the data throughput from M2 to M1 must be between 100 and 500 messages/sec. Stability patterns force us as developers to run a process in M1 that monitors the module's input to make sure the throughput always stays within 100-500, and if it does not, there are choices. If the throughput goes higher than 500, M1 can stop the system, raise a fault error, or drop the extra messages; if the throughput falls below 100, it can again stop and raise a fault, or just give a warning.
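The throughput rule could be sketched like this in Python. This is only one of the possible choices mentioned above: it drops the extra messages above 500 and records a warning below 100. The ThroughputGuard class, the fixed one-second counting window, and the tick() method are assumptions for the sake of the example.

```python
from enum import Enum
from typing import List, Optional, Tuple


class Action(Enum):
    ACCEPT = "accept"
    DROP = "drop"


class ThroughputGuard:
    """Hypothetical input guard for M1, counting messages per whole second."""

    def __init__(self, min_rate: int = 100, max_rate: int = 500) -> None:
        self.min_rate = min_rate
        self.max_rate = max_rate
        self._window: Optional[int] = None        # second currently being counted
        self._count = 0                           # messages seen in that second
        self.warnings: List[Tuple[int, int]] = [] # (second, count) below min_rate

    def on_message(self, timestamp: float) -> Action:
        """Call for every message from M2; says whether to process or drop it."""
        self._roll(int(timestamp))
        self._count += 1
        return Action.ACCEPT if self._count <= self.max_rate else Action.DROP

    def tick(self, timestamp: float) -> None:
        """Call periodically so quiet seconds are noticed even without messages."""
        self._roll(int(timestamp))

    def _roll(self, second: int) -> None:
        if self._window is None:
            self._window = second
            return
        # Close every finished window, including seconds with no messages at all.
        while self._window < second:
            if self._count < self.min_rate:
                self.warnings.append((self._window, self._count))
            self._window += 1
            self._count = 0
```

With the defaults, 600 messages arriving within the same second give 500 ACCEPT and 100 DROP results, and a second with only 50 messages ends up in warnings once the window closes. Swapping the drop for a fault or a full stop is a policy decision, which is exactly why the rule has to be written down and enforced in M1.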

The important thing is that it is totally wrong to ignore these rules or policies between binary modules and leave them unenforced. The system may work at first under good conditions, but M2 may become unstable some day, and at that point nobody knows, for example, how M1 will respond if the throughput goes higher than 500.

Defensive programming
We may talk about defensive programming some day. You can actually use these patterns even within a single binary, between any modules that interact with each other. They make sure none of the modules reacts in an unwanted way to the inputs it is given.
