“Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away” – Antoine de Saint-Exupéry. If you’ve ever invited me to do some pair programming with you, you probably have a good idea what this quote is all about. I often wind up asking questions like:
- Why do you need this boolean named retVal? Could it be eliminated through the use of early return statements?
- Is the else clause in this if statement necessary? Could it be avoided with a return statement? Or break/continue in a loop?
- I noticed that this method has parts that are nested five levels of braces deep. Is there anything we can do to reduce that?
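To make these questions concrete, here is a minimal Java sketch of the kind of change I usually have in mind. The class and method names are invented for illustration; the first method uses a retVal flag and nested blocks, the second does the same job with a guard clause and early returns.

```java
import java.util.List;

// Hypothetical example class; the names are illustrative only.
class EarlyReturnExample {

    // Before: a retVal flag and three levels of nesting.
    static boolean containsNegativeNested(List<Integer> values) {
        boolean retVal = false;
        if (values != null) {
            for (int value : values) {
                if (value < 0) {
                    retVal = true;
                }
            }
        }
        return retVal;
    }

    // After: a guard clause and early returns remove the flag
    // and one level of nesting.
    static boolean containsNegative(List<Integer> values) {
        if (values == null) {
            return false;
        }
        for (int value : values) {
            if (value < 0) {
                return true;
            }
        }
        return false;
    }
}
```

Both versions return the same answers. The second also stops scanning at the first negative value, which the flag version does not.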
Two recent posts have reminded me of this subject: Spartan Programming (Jeff Atwood, Coding Horror), drawn from the Spartan Programming page on the Technion wiki, and Life After If, For and Switch (Scott Hanselman, ComputerZen).
Most of the Spartan ideas make sense to me:
- Minimize depth of nesting of control structures
- Minimize the number of lines occupied: the idea is that it’s easier to make sense of a method if it’s all on one screen. I would prefer that most methods be fewer than 10 lines long. Some people read this suggestion to mean all blank lines should be eliminated. I disagree: I prefer to use blank lines like paragraph breaks, to indicate we’re changing gears.
- Minimize token and character count, which suggests removing braces and every other non-essential character. I think these measures can be harmful and wouldn’t use them.
- Minimize the number of Parameters to a method and avoid the use of out params.
- Reduce the number of variables – a number of excellent techniques are shown. I disagree with only two: the use of the ternary operator and the encouragement of very terse variable names.
Jeff Atwood codifies the frugal use of variables:
- Minimize number of variables. Inline variables which are used only once. Take advantage of foreach loops.
- Minimize visibility of variables and other identifiers. Define variables at the smallest possible scope.
- Minimize accessibility of variables. Prefer the greater encapsulation of private variables.
- Minimize variability of variables. Strive to make variables final in Java and const in C++. Use annotations or restrictions whenever possible. The only caveat here: having tried making both parameters and local variables final, I found no benefit and so gave up.
- Minimize lifetime of variables. Prefer ephemeral variables to longer lived ones. Avoid persistent variables such as files.
- Minimize names of variables. Short-lived, tightly scoped variables can use concise, terse names. On this we completely disagree.
- Minimize use of array variables. Replace them with collections provided by your standard libraries.
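A small Java sketch of how a few of these rules play out in practice. The class and method names are made up for this example; it covers inlining a single-use variable, dropping the index variable in favour of foreach, and taking a collection instead of an array.

```java
import java.util.List;

// Hypothetical example class; the names are illustrative only.
class FrugalVariables {

    // Before: an array, an index variable declared outside the loop,
    // and a temporary that is used only once.
    static int totalLengthVerbose(String[] words) {
        int total = 0;
        int index;
        for (index = 0; index < words.length; index++) {
            String word = words[index];
            int length = word.length();
            total = total + length;
        }
        return total;
    }

    // After: a collection, a foreach loop, and the single-use
    // temporary inlined. Only two variables remain, both tightly scoped.
    static int totalLength(List<String> words) {
        int total = 0;
        for (String word : words) {
            total += word.length();
        }
        return total;
    }
}
```

The loop variable keeps a descriptive name (word rather than w), in line with my disagreement about terse names.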
My additional rules:
- Booleans make bad parameters. If you have a method that is called as doStuff(realParam, true, false), how does someone reading the call know what true and false mean? At the very minimum use enums with meaningful names.
- One letter variable names like i, j, k and e come from the early days of Fortran, where variable names were 1-6 characters long. With so few characters it’s no surprise that common convention dictated that i, j, k were loop/index variables. We’ve come a long way since then: use names that, while concise, get the point across.
- Rather than for(int index = 0; index < size(); index++) use Java (or .NET’s) foreach. They save an unnecessary indexing variable.
- Check arguments at the start of any public or package level method and after that assume they’re right. You can find libraries to simplify this for you (I’ve written one at work) and if you don’t work with me see: “I take exception to that argument” on Code Project.
- Minimize the use of instanceof – as your class structure grows these will be hard to maintain. More on this in an upcoming post.
- The if/switch statement you don’t write is one you don’t have to test.
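Here is a rough Java sketch of two of these rules together: enums in place of boolean parameters, and argument checks at the start of a public method. ReportPrinter, Paging and Header are hypothetical names invented for this example, and the checks use the standard java.util.Objects.requireNonNull rather than any particular validation library.

```java
import java.util.Objects;

// Hypothetical example class; all names here are illustrative only.
class ReportPrinter {

    // Enums with meaningful names replace two boolean parameters.
    enum Paging { PAGED, CONTINUOUS }
    enum Header { WITH_HEADER, NO_HEADER }

    // Arguments are checked once, at the public boundary.
    public static String render(String title, Paging paging, Header header) {
        Objects.requireNonNull(title, "title must not be null");
        Objects.requireNonNull(paging, "paging must not be null");
        Objects.requireNonNull(header, "header must not be null");
        if (title.isEmpty()) {
            throw new IllegalArgumentException("title must not be empty");
        }
        return format(title, paging, header);
    }

    // Private helper: assumes its arguments were already validated.
    private static String format(String title, Paging paging, Header header) {
        String text = title;
        if (header == Header.WITH_HEADER) {
            text = "== " + title + " ==";
        }
        if (paging == Paging.PAGED) {
            text = text + " [page 1]";
        }
        return text;
    }
}
```

A call such as ReportPrinter.render("Sales", ReportPrinter.Paging.PAGED, ReportPrinter.Header.NO_HEADER) explains itself at the call site in a way that render("Sales", true, false) cannot.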
My cardinal rule: Always program as if the person maintaining your code six months from now is an axe murderer. The last thing you want to do is give him (or her) an excuse to come visit you in frustration.
So next time you invite me to pair program, at least you have an idea about some of the questions I will ask.
Mark Levison has been helping Scrum teams and organizations with Agile, Scrum and Kanban style approaches since 2001. From certified scrum master training to custom Agile courses, he has helped well over 8,000 individuals, earning him respect and top rated reviews as one of the pioneers within the industry, as well as a raft of certifications from the ScrumAlliance. Mark has been a speaker at various Agile Conferences for more than 20 years, and is a published Scrum author with eBooks as well as articles on InfoQ.com, ScrumAlliance.org and AgileAlliance.org.
Pete Verdon says
I think I agree with more or less everything here.
An exception: I don’t think there’s anything wrong with using i/j/k as loop index variables. I agree that if it’s a simple loop it should be a foreach, but if you need the index for some reason then i is the best possible name for it. Unless the author is particularly perverse, I know as soon as I see it that i is the loop index – it’s one of the most powerful conventions going.
An observation: Objective C attempts to solve the object.doStuff(realParam, true, false) problem by naming the parameters in the call as well as the definition.
As I understand it (I don’t actually program in Objective C) that call might look like [object doStuff:realparam useWidget:true bePersistant:false] . It looks a bit alien at first when you’re used only to function(arguments) syntax inherited from C, but I think I could get used to it.
Tim Fischer says
I agree with most – have issues with foreach. At the moment, I am dealing with serious speed issues handling a massive amount of rows in a table. I killed foreach early on due to its speed. Actually, in an almost direct conflict with “fewer lines of code”, when programming for speed, I’ve found the need in many places to *add* more lines of code (unrolling loops, etc., causes quite a bit more code).
Dealing with JavaScript (or most interpreted languages), small variables in a loop massively speed things up as well.
Just from experience…
Tim Fischer says
Mark,
Well – it’s JavaScript that I am working in, but most interpreted languages work the same. To be honest, I searched the net for timings on all things JS before attempting to make my grid work faster. I got massive speed increases when not using foreach and by unrolling the loops, probably around 400% faster or better. Other speed increases came from using arrays to join strings instead of concatenating, etc. – but when dealing with over 10,000 rows consisting of 5 or more cells each – every microsecond is seriously worth the trouble.
I wish I could give you more on the foreach other than my personal experience and “best guess” – which is that in a foreach loop, each iteration has more calculations etc. to do in order to “figure” out what it is that “each” means, but if I iterate through an array of objects I am directly accessing the pointer via its index – no calcs required.
My grid loads 3000 rows in about 17 seconds with all of the functionality of the infragistics WebGrid (which takes closer to 2 and 1/2 minutes to render). Through the above tweaks, I gained much in the manner of speed.
For Compiled languages – I may just do the same for consistency sake though – unless I find a bottleneck! 😉
-Tim
Mark Levison says
JavaScript – ahh now that’s an interesting beast. I really don’t know it and can’t make any comment. However I wouldn’t generalize lessons learned about performance from JavaScript to any other language. I’m certain that foreach in Java and its .NET equivalent have been well thought out by their implementors and don’t add a performance hit.
Andrew Binstock says
This reminds me a lot of the rules for OO practice from Thoughtworks’ Jeff Bay at https://binstock.blogspot.com/2008/04/perfecting-oos-small-classes-and-short.html
Rafferty Uy says
Great post!
May I ask why foreach is better than for loops? How is having an indexing variable a bad thing?
I believe I read somewhere that the for loop is more efficient than the foreach loop.
Rafferty
Mark Levison says
Tim – I would be interested in knowing more about your experience with foreach.
Have you disassembled the foreach loop to see how its byte code differs from regular for loops? What additional instructions are there? Have you written tests that time the two varieties of loops? Would you share those timings? What JVM version are you working with?
I’ve never encountered a situation where the foreach loop was slower in any way that I can measure. Hence my interest in your findings.
In addition I’m certain that modern compilers and JVMs can make much better decisions about loop unrolling than I can.
Finally I assume that you’re doing these optimisations in code that has already proven to be a bottleneck.