Wednesday, 2 September 2015

Coding lessons from Hollywood: Readable Python conditionals with visitor pattern

Programs have conditional statements. They read like the following

if condition do action 
else do something else. 

These can get long, hairy and will read like

if condition_1 do this 
else if condition_2 do that 
else if condition_3 do that 
..... 
.....
finally if nothing matches our expected conditions do something else. 

Switch-case statements which allow you to code this in a more reader friendly manner are not present in Python which means that we do it with chained conditional statements. Chained conditional statements can be ok when there are small operations to be done with each condition. For example, 



However as the number of conditions, context and operations to be performed get complicated, the code can become unwieldy, difficult to understand and thus non-maintainable. The code architecture of an application can go south. 

On the lighter side if you need Hollywood to convince you on this, have a look at this guy reading his colleague's code. Your / Others life may depend on it. :)) 

Chains of code like that will qualify as a hack to sleep in less than 30 seconds. For example if we are reading data off a network and need to perform different operations on the packet type. Imagine 20 different packet types. Something like the following can be cumbersome to read after 3-6 months of initial release. The following is an example of using long conditional to do a job.




Where as the following

looks better, reads better and takes the operation done away to another place (decoupled). Even better we can code the handler to take the packet itself and figure things out inside the handler. This can be done in Python using the visitor pattern. This pattern comes across in data structures but serves well in this context too. Plus this approach has been around for quite sometime in Python. But less frequently utilized for a long switch-case scenario.

So given are a number of options and corresponding operations to be performed. We can use different methods of a Python class to handle the different options. This means that we need to know which method for which option? Python makes this very easy as methods are also attributes in the dynamic __dict__ key value pair of objects. i.e we can query a method in an object, add a method in run-time or even change it for a class. 

Finding out which method handles which option can be done as shown in the following code.

We take the option and check if there is a method tied to that option in __dict__ attribute set. If we find one we use that. If no method is found we just do the default operation corresponding to default switch-case statement. Also we can use Python's handy *args and **kwargs to pass data into the handlers.


A class that implements the different methods for the various options can be like the following. Here we do some work on numeric options and some operations based on country codes.




These can be called as follows


To make things better, the Base class can look for functions named after the PacketTypeClass too, if we have different packet classes like IPv4Packet, IPv6Packet etc. i.e handler methods are named after the object type that they handle. 

It is arguable that introducing a class and object everywhere to handle switch-case statements or long conditionals is not helpful all the time. If the conditional chains are small and context is not that complicated it is acceptable to do conditionals as such. A simple dict with options as keys and method names as values will suffice. But that is exactly what a class in Python does with its __dict__ and whichever way chosen, handler code need to go somewhere in the code structure. Better done in a readable way.