|
This is a cross-post from my blog, Game Dev Without a Cause.
Any technically-inclined person on the net should be familiar with the acronym RTFM, which stands for the phrase "Read the Freaking Manual" (the non-family-friendly version is more colorful, of course). It's the standard response to a question that you think wouldn't have been asked had the asker read the (freaking) manual in the first place.
For the programming set, a natural variant for RTFM is RTFC, "Read the Freaking Code". For any given program, the source code is, by definition, the most accurate description of how a given program will behave. Barring inaccurate comments and misleading variable names, source code is the only form of documentation guaranteed to be 100% accurate. What keeps source code from being great documentation is that it is so hard to read. It's so hard in fact that:
Most Programmers Can't Read Code
Programmers can't read code? But isn't that their job? Strange as it may seem, it's true and there are reasons for this.
Now that I have your attention, I'll take a moment to admit my cheekiness in the above statement. Almost any programmer can read code in the strictest sense, i.e. they can parse a given line of code and have a reasonable idea of what it will do. But, being able to understand large amounts of code, the designs and intentions implicit in how code is written and by whom, taking note of what code is not there, and being able to extrapolate on how the code will evolve over time, is a different story. These abilities can be unfortunately rare in many programming teams. In other words, most programmers have some facility in code reading, but they are actually deficient in code comprehension.
So, to refine my original statement:
Because most programmers have poor code comprehension, Most Programmers Can't Read Code Effectively
Joel Spolsky of Joel on Software fame made a similar observation with this line from one of his posts:
It’s harder to read code than to write it.
So why is code so much harder to read than to write? There are a couple of reasons that I think are primarily responsible for this phenomenon.
The first reason code is harder to read than to write has to do with the sheer amount of data you need to keep in your head in order to read code. When you write code, you only need to remember the variables, algorithms, data, etc. relevant to the feature you are currently writing. On the other hand, when you read code, you need to keep data not just about the feature you are currently investigating but also about other potentially relevant functionality. In order to understand code well enough to leverage it, you need to build up a working model of the design in your head, gleaned from clues hinted at by the code you read. While you write code, you can ignore exception and error cases with the expectation that you'll get to them once you get the core logic working. In most cases, when reading code, all those exception and error cases are already implemented and embedded into the code that you're reading. Not only do you have to keep several times more stuff in your head when you're reading code versus writing it, but you also have to pretend that you're Sherlock Holmes as you try to deduce the intended design and usage of the code you see and fit it into a larger mental picture of the software project. It's no wonder so many programmers chicken out at the thought of understanding a large codebase and opt to just write their own. They essentially mind-trick themselves into thinking that they are too dumb to read code.
The other major factor that confounds reading code is pride. Programmer pride. Writing code is a surprisingly personal endeavor. When you spend a lot of time thinking deeply about a problem and crafting code to solve that problem, you can easily become emotionally attached to the code that you created. This can be advantageous because it encourages programmers to constantly refine their favorite code, but it also leads to programmers having territorial feelings over a particular problem domain and biases towards their own code. This sort of pride can make it extremely difficult for a programmer to read someone else's implementation and fairly compare it with their own. Indeed, if they start reading code with the assumption that they won't like it, they will never be able to apply the brain power necessary to understand the code they are supposed to read.
My boss said something recently that spoke to this element of programmer pride and I'll paraphrase it here:
You can't look down on someone when you read their code. If you don't respect the person writing the code, you won't be able to apply the energy needed to understand it.
For many programmers, the toughest barrier preventing them from reading code may be saying to themselves: "The person who wrote this just might be smarter than I am."
There you have it. Reading code is hard because it can be both mentally and emotionally taxing. So that's the problem, now what can we do about it?
Honestly, I wish I knew. For recruiting, my best recommendation is to try to find ways to test for code comprehension. For folks you already work with, try to find the programmers who actively read code, are able to grok it, and are able to leverage that understanding so your team doesn't have to write code that has already been written. Once you find these guys, hold them up and encourage others to follow their example.
As for individual programmers, learn to read code. The skills needed to read code are not innate; they are learned and honed through practice. Don't be scared of trying to read code that seems inscrutable at first glance. With enough work, you will be able to understand it. Remember that you are smart enough to understand any code that comes before you, just not as smart as the guy who actually wrote it. The only way to get as smart as that guy is by reading his code.
|
A few things I have learned as a programmer:
1 - Names matter. Good names for variables, constants, functions, classes all matter a great deal. Name these entities in such a way that when you use them in your code, it results in almost-English that even a non-programmer could mostly understand. Don't abbreviate names, but also don't use overly long names, either.
2 - Whitespace matters. Good indentation and extra blank lines around blocks of code can make it much easier to read. Extra spaces or tabs to align operators in similar lines of code (such as variable initialization) help make it clear that you're performing a bunch of similar operations. Format your code to make it look structured.
3 - Comments matter (but don't comment where clarity could be improved directly in the code). Write comments for an audience of people who are not familiar with your project, and use comments to help them understand why you're doing what you're doing in this bit of code. If, while reading some block of code, you have a thought pop into your head "Why didn't I... " or "What about..." then answer that question in a comment, so you don't waste time re-answering it every time you look at the code.
4 - Code organization matters. Methods should be short and simple and do one thing only. When you do this, it becomes easy and obvious what the method should be named. When you use this well-named method in your code, the code becomes easier to understand when you read it.
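To make points 1 and 4 concrete, here is a minimal sketch in Python. Everything in it (the function names, the payment tuples, the grace period) is invented for illustration and is not from anyone's real codebase. Both versions do the same thing; the second just reads closer to English.

```python
# Hypothetical example: all names and data shapes are invented for illustration.

# Terse version: correct, but the reader has to reverse-engineer the intent.
def f(d, t):
    r = []
    for x in d:
        if x[1] > t:
            r.append(x[0])
    return r


# Same behavior, split into short, single-purpose, well-named functions.
def is_late(days_late, grace_period_days):
    """A payment counts as late once it exceeds the grace period."""
    return days_late > grace_period_days


def late_customer_names(payments, grace_period_days):
    """Return the customer name of each late payment, in input order.

    `payments` is a list of (customer_name, days_late) tuples.
    """
    return [
        name
        for name, days_late in payments
        if is_late(days_late, grace_period_days)
    ]
```

A call like `late_customer_names(payments, grace_period_days=5)` tells the reader what is going on without opening the function body, which `f(d, 5)` does not.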
If a comment is a waste, then it's the content of that comment that is worthless, not the concept of commenting. Comments are designed for the very thing this article is pointing at: recovering the intent of the original writer from the code alone is not always easy. If the author of the code puts thought into their comments and lays out the pieces the code doesn't tell you, the comments fill in the gaps and let a reader comprehend that code much faster. Redundant comments about what code does are a waste, it's true; the code already tells you that.
So maybe the core problem there is some engineers don't know how to convey their intent well enough in spoken form so they just write effective code and hope it bleeds through.
That's just my opinion anyway. I've seen plenty of comments that are a complete waste of time, I've also seen some that have saved me a lot of headache understanding something.
I've noticed that code is often clear enough in what it does, but not why it does it that way or when it might be appropriate to change it and how.
One might be jumping into a big code base and lack the necessary experience and knowledge of context to understand what led to a decision.
In one case I wrote a DFA that performed what one could perform with a regex. Why??? I spelled out the whole scenario that led to it and the relevant analysis in the comment. One can easily judge later if it's still appropriate to maintain the code, rather than having to guess whether the prior programmer was some kind of an idiot.
I'm not talking about functions that fail. I'm talking about appropriateness of algorithms and sometimes factors outside of the actual code. Typical user behavior, expected tendencies in data, even the occasional hack to deal with a bug outside of the project's control.
It's unusual that an explanation is warranted, but it occurs.
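As a rough sketch of what such a "why" comment might look like, here is a small hand-written DFA in Python. The pattern, the streaming rationale, and all names are assumptions made up for this example; they are not the scenario from the comment above.

```python
# WHY a hand-written DFA instead of re.fullmatch(r"[0-9]+\.[0-9]+", text)?
# (Hypothetical rationale, invented for this example.)
# - Characters arrive one at a time from a stream, and we want to carry a
#   single small state value between them instead of buffering the token.
# - WHEN TO REVISIT: if callers ever have the complete string in hand,
#   the one-line regex is simpler and this code should be replaced.

DIGITS = "0123456789"

# DFA states for the language: one or more digits, a dot, one or more digits.
START, INT_PART, DOT, FRAC_PART, REJECT = range(5)


def next_state(state, ch):
    """Advance the DFA by one character."""
    is_digit = ch in DIGITS
    if state == START:
        return INT_PART if is_digit else REJECT
    if state == INT_PART:
        if is_digit:
            return INT_PART
        return DOT if ch == "." else REJECT
    if state in (DOT, FRAC_PART):
        return FRAC_PART if is_digit else REJECT
    return REJECT  # REJECT is a trap state


def is_decimal(chars):
    """Return True if the characters spell digits, a dot, then digits."""
    state = START
    for ch in chars:
        state = next_state(state, ch)
    return state == FRAC_PART
```

The code alone says what it matches. Only the comment at the top says why it exists and when it would be fine to delete it.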
A little humility goes a long way. Even if you think your code is as clear as can be, someone else who encounters it later may not. Spending 5 seconds writing a comment now so someone else doesn't bang their head against their desk trying to figure it out later is a pretty reasonable thing to do.
5 - Don't use magic numbers. I hate finding a line of code with a magic number in there and having no idea of what it's supposed to represent. Use a const, use a define, just make sure you do something to indicate what that number is.
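A tiny Python sketch of the difference, with made-up values (the cache policy and the function names are hypothetical):

```python
# Before: what are 86400 and 3 supposed to represent?
def is_stale_magic(age_seconds):
    return age_seconds > 86400 * 3


# After: the same check, with the numbers named.
SECONDS_PER_DAY = 24 * 60 * 60
CACHE_MAX_AGE_DAYS = 3  # hypothetical policy value, invented for this example


def is_stale(age_seconds):
    """Return True once an entry is older than the cache's maximum age."""
    return age_seconds > CACHE_MAX_AGE_DAYS * SECONDS_PER_DAY
```

If the policy ever changes, there is one named constant to update instead of a hunt for every stray 3 in the code.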
Programmer pride kicks in when I'm asked to fix up someone else's codebase. Not only do I have to read their code, I have to fix their bugs - and I most likely have no respect for the programmer who wrote the code, nor any personal pride in the work since I did not write it. I'm probably not even learning anything from their code. When an employer wants to know if you can read code, usually they want you to read code to fix up someone else's codebase, not to learn from it - and trying to package this as a "learning experience" is just trying to put a pretty label on rotten meat. You don't need to read much code to work with another programmer's clean API/interface - it is only when you have to get into the internals that you have to read the code, and this is most often for debugging.
A good analogy is that an artist will often learn from another's painting, but would you ask an artist to finish someone's incomplete painting? This is not to say I won't do it, or I can't do it, or it is not a useful skill to learn, but I'd just like to point out that most of the time when a boss complains that a programmer can't read code, it's more that they are trying to push a crap debugging job on them.
Seems like shuffling the responsibility back onto the code writer is committing the same sin Robert identified - not respecting the author of the code.
A lead programmer I worked with stated that "programmers should never write code for themselves." It's not an easy mantra or mindset to adopt, but in terms of making sure the entire programming team is on the same page, it makes perfect sense. The core idea behind it was that you write the code, comment as much as you can in order to set up a kind of "thought process guide," and generally follow a standard guide of naming conventions for various variable types (nothing drastic, just starting functions with capitals and variables with lowercase names). Just a thought. I have read poorly written code, and I have had others approach me and let me know my code is confusing to them. 9 times out of 10, comments are the answer.
The way this is somewhat handled at the company I work at is code review. No code is checked in without a review. Each folder on the project has an "OWNERS" file. Those are people who know the code in that folder: its design, its assumptions, exceptions, etc. Someone listed in that file is required to review patches to stuff in that folder.
I've found this tremendously helpful. A good reviewer will notice the things you weren't aware of. They'll point out the solutions that already exist. And they'll help you get familiar with the code.
If you happen to be someone who ends up committing changes to that folder often, eventually you'll get added to the OWNERS file for that folder. If the folder gets too big, it's split.
It helps that we have a reasonably good review system. You make a patch in git and type "upload-patch". If your patch is new, a new issue is created on the review site. If it's an existing one, your changes will be uploaded to the issue. On the site you can review the changes side by side and comment on any line. The patch will also be run through the continuous build system's pre-commit patch testing, so you have some chance of knowing beforehand whether your patch will break anything.
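For anyone curious what the OWNERS rule boils down to, here is a rough Python sketch. It is not the actual tool: the function name, the in-memory mapping from folder to owners, and the usernames are all assumptions for illustration, and a real system would read the OWNERS files from disk.

```python
from pathlib import PurePosixPath


def folders_missing_owner_approval(changed_files, approvers, owners_by_folder):
    """Return the folders touched by a patch that no owner has approved.

    changed_files:     repo-relative paths, e.g. "engine/render/mesh.cpp"
    approvers:         usernames who approved the patch
    owners_by_folder:  hypothetical stand-in for the OWNERS files, mapping
                       a folder path to the set of usernames listed there
    """
    approvers = set(approvers)
    missing = set()
    for changed_file in changed_files:
        folder = str(PurePosixPath(changed_file).parent)
        owners = owners_by_folder.get(folder, set())
        # The patch needs at least one approver who is an owner of this folder.
        if not owners & approvers:
            missing.add(folder)
    return sorted(missing)
```

A patch would be allowed in only when this returns an empty list, i.e. every folder it touches has been looked at by someone who knows that folder.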
1. In mainstream programming, you are commonly hired at a company and your first few years tend to be spent modifying and fixing already written code (hopefully by people who are still at the company).
2. At game companies there tends to be a common practice of always rewriting the entire game engine. This is commonly caused by the entire team being completely new hires.
When you spend 2 or 3 years working with the company's code base, you develop the skills needed to read and understand code written by anyone.