Saturday, April 28, 2012

Multidimensional programming

From time to time I reverse engineer some piece of open source code just to understand how it works. The biggest problem with that is the number of things I need to juggle in my head until they find a place on the canvas of the reversed blueprint. If there are too many, I give up or put the project on the back burner. This was the case with Stackless and Twisted, but I am really glad to finally get to them.

Quite often it starts with some bug that seems easy to fix, and I go for it. In an ideal world it shouldn't take more than 15 minutes of studying the code to understand what should be changed, but it is also important to be confident that the change won't break anything.

I won't tell you how to deal with that complexity, but I'd like to share an idea that I got from the Large Hadron Collider. =) Let me tell you that story...

I have always had trouble trying to squeeze more than 3 dimensions into my head. I could easily imagine a dot moving along the X or Y axis, and could imagine it moving along the Z axis, but anything more than that caused confusion.

My brain could have exploded some time ago when a friend of mine tried to explain what's going on at the LHC. He said that physicists and mathematicians are trying to figure out how many dimensions our universe has. They assume that our universe has more than 4 dimensions (3 coordinates and 1 time value). In fact, they argue that the truth is somewhere between 9 and 26!

I am not a fan of scientific theories (my neurons are pretty calm on that matter), but when you just need to understand, because the person in front of you is knowledgeable and tries his best to explain, that is when you start to feel brain cells colliding under your skull. To my friend's credit, he didn't try to build up suspense (as I would probably do before explaining properly) and managed to draw a clear picture in my mind: a standard plot with two axes, X and Y, and a dot.

"Look at this dot on the X/Y plane. This dot has coordinates in two dimensions, X and Y," he explained. "These are the only two you can see here. But this dot is from the real world, so it also has a Z coordinate," he added, putting a label "Z=0.1" next to the dot. "This dot also has speed," and he drew "V=0.1 m/s". "Now we can see values for all 4 traditional dimensions of our dot, but we can add more. There are many things by which we can classify our dot. It probably has a color," and he drew a color box and a hex value. "It has a temperature," and a new label "t = 25°C" appeared in the column. "There are 6 already, and you can add your own." I immediately imagined a dot travelling across the canvas of the "Stars!" game, with all those labels nearby constantly changing their values as the dot moved. "Wow! Now I see." That was a nice feeling, and I must admit I can be pretty dumb sometimes. =) It didn't make a science fan out of me, but it did make an important short circuit in the depths of my head, which popped up a few months later.

A few months later.

As usually happens in software development, you need at least some basic design before sitting down to code, but with time the design phase completely faded from our process, replaced by the motto "whoever made it first wins". It is hard to argue with that, so I didn't. But as a result, after some time the stability of our releases dropped. People were not communicating and started to forget about some aspects of our system that could fire at any time in completely unexpected places. Even though all our code undergoes review, the reviewers tend to forget about those aspects as well. The process went out of control: we couldn't keep all the aspects in our heads when planning the next feature to be released, and I found myself in the same state I was in when trying to understand the complexity of multidimensional string theories. That led me to the idea that we need to control the number of aspects we have to keep in mind when coding, and restructure our architecture to keep those aspects to a minimum. This will help us regain confidence in what we are doing, and save some money on the bills from the nearest bar in the name of our release manager.

So, the `multidimensional programming` concept means that at any point in your code there are multiple things that you should be aware of. These things are the 'dimensions', and the more you have, the more complex your application is. Basically, these are the things that can be broken by any change at this place. Good application architecture is orthogonal: you can work on only one dimension at a time without thinking too much about all the others. But you need to know all of them anyway to gain a sense of confidence.

For example, a recent change in Spyder IDE requires me to rename some file. This breaks the code, which I grep and fix - that's one dimension. But it is also likely to break the translation of the strings in that file, because the strings are now at a different place. I imagine that nobody will be interested in checking and translating the same stuff over and over again, so that's one more thing I'd like to avoid and should keep in mind.
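To make the "grep and fix" step cover both dimensions at once, here is a minimal Python sketch that looks for mentions of the old file name in source files and in translation catalogs. It is only an illustration: `find_references` is a hypothetical helper, the `.py` and `.po` extensions are assumptions about the project layout, and it relies on gettext tools usually recording string locations in `.po` files as `#: file:line` comments.

```python
import os


def find_references(root, old_name, extensions=(".py", ".po")):
    """Return sorted (relative_path, line_number) pairs that mention old_name.

    Hypothetical helper: it scans source files and translation catalogs
    together, so both 'dimensions' of a rename show up in one report.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            # str.endswith accepts a tuple of suffixes.
            if not filename.endswith(extensions):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if old_name in line:
                        hits.append((os.path.relpath(path, root), number))
    return sorted(hits)
```

Calling something like `find_references(".", "oldname")` before the rename gives a list of places to revisit, including the catalogs that a plain search over the code would miss.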

Another example is web applications. You need to keep in mind 'user privileges', and the type of HTTP request ('ajax', ...) and response ('json', ...) required. You need to make sure `critical errors` are handled and reported, and `static files` are correctly served by the web server. You need to save incomplete `data between requests`, and clean it up where possible. Make sure there is sufficient `XSRF protection` and `browser compatibility`. There is a lot more to it, and so far these have nothing to do with the logic of your web application. Frameworks help to deal with that, but inside they are still multidimensional. If a framework is not flexible enough for you, that probably means it tries to keep some dimensions orthogonal, and there could be a good reason for that.
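One way to picture keeping such dimensions orthogonal is a minimal Python sketch in which each concern lives in its own decorator and the handler holds only application logic. Everything here is an assumption made for illustration: the request is a plain dict, and `Forbidden`, `require_privilege`, `require_xsrf_token`, `as_json` and `rename_page` are hypothetical names, not the API of any real framework.

```python
import json
from functools import wraps


class Forbidden(Exception):
    """Raised when one of the dimension checks rejects the request."""


def require_privilege(name):
    """Dimension: user privileges."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(request):
            if name not in request.get("privileges", ()):
                raise Forbidden("missing privilege: " + name)
            return handler(request)
        return wrapper
    return decorator


def require_xsrf_token(handler):
    """Dimension: XSRF protection (simplified; real code would use a
    constant-time comparison and a properly generated token)."""
    @wraps(handler)
    def wrapper(request):
        token = request.get("xsrf_token")
        if request.get("method") == "POST" and (
            not token or token != request.get("session_token")
        ):
            raise Forbidden("bad XSRF token")
        return handler(request)
    return wrapper


def as_json(handler):
    """Dimension: response format."""
    @wraps(handler)
    def wrapper(request):
        return json.dumps(handler(request), sort_keys=True)
    return wrapper


@as_json
@require_privilege("edit")
@require_xsrf_token
def rename_page(request):
    # Application logic only; the other dimensions are handled above.
    return {"renamed_to": request["new_name"]}


request = {
    "method": "POST",
    "privileges": ["edit"],
    "xsrf_token": "abc",
    "session_token": "abc",
    "new_name": "home",
}
print(rename_page(request))  # prints {"renamed_to": "home"}
```

Each decorator can be read, tested and changed on its own, which is the sense of "orthogonal" used here. The list of decorators above a handler also doubles as a reminder of which dimensions that piece of code touches.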

Maybe that's not much, but at least now you can argue that the Large Hadron Collider experiments have much in common with software engineering, and when somebody asks about your job, you can proudly state that you're on par with the scientists and their string theory, but in your own big enterprise application universe. =)


  1. I do not find the comparison to “dimensions” to be very useful, but your overall point — that frameworks, libraries, and well-written code reduce the number of things that we have to think of at the same time — is an excellent one, and is a point to which programmers do not pay enough attention.

    I often find myself stopping and thinking hard if a Python module that I am writing is importing *three* quite different kinds of libraries up at the top: for example, the CherryPy mechanism for building a plug-in, *and* the "threading" library for managing workers, *and* the "socket" library for letting those workers make connections. I find code easier to read and test (which is huge, since just writing the tests can force me to untangle the code!) if I have one module that deals just with connecting the plugin to the threads, and another module that focuses on the sockets. For very small modules or programs I will bend this rule, but when program logic starts to get hairy, I often restrict myself to only two moving parts at a time!

  2. I agree that the idea is far from refined and that "multidimensional" may sound too vague and abstract. The point is to give a tool to understand and deal with complexity. The tool is far from ideal, but I hope it can already help, together with the notion of "technical debt", to argue about bad design decisions.

    By the way - here is another way of looking at "Dimensions of programming" by Peter Norvig that you may find interesting too -