As Phil Karlton once said, “There are only two hard things in Computer Science: cache invalidation and naming things”. If naming things is hard — why actually spend time and energy on it? But before we can judge whether thinking about good variable names is worth the time, we should look into how we read code.
You can argue about which version is more readable, and I think rightfully so. But it was eye-opening to me that he was very much aware of his own strategy for reading code. Consequently, he wrote code in a way that made it easier for him to read.
Looking back, I had strong opinions on what “good” code should look like. Interestingly, they were mostly based on colleagues’ recommendations or books. But I was really unaware of how I read code, of what my personal strategy for understanding code was. Knowing this, I was keen to find out how I read code, and whether the way I wrote code, or even more importantly the suggestions I made in code reviews, actually made it easier for me to read code.
What is my personal strategy?
If you take the ignoreCSP.js file in BackstopJS as a reference, this is how I would read it in my first pass.
The first thing is obvious: I skip all comments. Although the original version below shows that the comments in this file seem to be accurate, I would rather jump directly into the code, because I don’t trust comments. I find it very frustrating to spend mental capacity on parsing comments when I cannot be one hundred percent sure that they are accurate, or when they don’t add relevant information (the comment in line 33 is a good example). I also found that the more comments there are in a codebase, the more likely I am to ignore them, even in my second pass. On the other hand, if comments are rare in a codebase, I tend to include them earlier, because then they usually add very relevant information. I especially skip comments at the top of a file, because more often than not they contain irrelevant information about licenses and authors, or information that can be derived from the filename anyway.
Function and variable names
If you look at lines 23–24 you can see something interesting. Based on the variable name and the context, I feel like I have enough information about the assignment of the variable (the part after the =) to skip it. You can also see the same pattern in lines 35–37.
If you look at line 25, the variable name agent doesn’t give me enough information to derive the assignment, so I read the assignment up to the part where I feel like I have enough information to skip the details (in this case the parameters).
A similar pattern can be found in lines 38–44. The variable name options, together with the context I have at line 38, doesn’t give me enough knowledge about the usage or implementation, so I need to read the full object assignment. From line 38 the pattern repeats until the end of the file. The names of the variables no longer give me enough information to skip assignments or implementations with confidence, mostly because variable names like buffer are very generic in this context (what does buffer mean in a function called
How does that influence my coding style?
The first thing is again really obvious. Since I don’t include comments in my first pass I try to avoid them wherever possible.
Luckily, I feel like it is getting less and less popular to add multiline comments at the top of a file. Especially in the era of VCS-managed code, there should be no need to include @author tags. Even though I feel like the current trend is moving away from putting the copyright notice in every file, GPLv3 still states that you should attach the copyright information to every file (either the full notice or at least a pointer to where the full notice can be found). At least the big projects, like spring-framework, angular and react, put the license information in every file, so there seems to be a legal obligation to keep it. However, I also feel that multiline comments at the top of a file describing what the file is about are often a code smell. If a file is so complex that it needs a description, in most cases there is a way to split it into smaller files whose names can be self-explanatory.
The same goes for single-line comments within the file. In the majority of cases, I try to replace them with code-level alternatives. Could we rename a variable so that it carries the same information as the comment? Perhaps we need to extract a variable or function so we can express what the comment says. Or, in some cases, the information is just redundant, like in line 33 of our example above. In my experience, it’s fairly easy to replace the majority of inline comments with code constructs, which fits my style of reading code much better. Also in my experience, the names of variables or functions are more likely to be kept in sync with what the function or variable does than inline comments are. As a cherry on top, and as mentioned above, if comments are used rarely in a codebase I start to pay much more attention to them, because I feel like the chance is higher that they hold relevant, up-to-date information.
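As a minimal sketch of replacing an inline comment with code constructs, consider the following. It is not taken from BackstopJS; the functions isStaleBefore and isStale, the constant MAX_CACHE_AGE_IN_MS and the five-minute threshold are all hypothetical and only serve as illustration.

```javascript
// Before: the comment carries the intent, the expression itself does not.
function isStaleBefore(lastFetchedAt, now) {
  // cache entries older than 5 minutes are stale
  return now - lastFetchedAt > 300000;
}

// After: an extracted constant and an extracted variable say what the
// comment said, so the comment is no longer needed.
const MAX_CACHE_AGE_IN_MS = 5 * 60 * 1000; // hypothetical threshold

function isStale(lastFetchedAt, now) {
  const cacheAgeInMs = now - lastFetchedAt;
  return cacheAgeInMs > MAX_CACHE_AGE_IN_MS;
}
```

Both versions behave the same. The difference is that the second one can only drift out of sync with its explanation if somebody renames things badly, whereas a comment can go stale without anyone touching it.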
Names and context
As you can see above, my style of reading code depends heavily on good variable names in a context where you can understand them without diving into the implementation.
Take the options variable in line 38. The variable name itself might not be too bad, but in the context of the intercept method it’s hard to derive its implementation. But what if we could identify two separate concerns in the intercept method: e.g. a fetchRequest and a
options as a variable name would be good enough within the context of a function called fetchRequest? The same goes for the buffer variable in line 48. But maybe if we extract a
body would be good enough?
When the context is not good enough, it usually has one of two reasons: the name of the class or function is not expressive, or the class or function violates the single responsibility principle. If you take care of both of those aspects, I have a much easier time reading code.
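The idea of narrowing the context by extracting functions could be sketched roughly like this. This is not the actual BackstopJS code: only the names intercept, fetchRequest and options come from the text above, while withoutCspHeader, the injected fetchImpl parameter, the shape of the request and response objects and the lowercase header key are assumptions made for illustration.

```javascript
// Inside a function called fetchRequest, a variable named `options`
// can only mean "the options for this fetch call".
// `fetchImpl` is an injected, fetch-like function (an assumption of this sketch).
async function fetchRequest(request, fetchImpl) {
  const options = {
    method: request.method,
    headers: request.headers,
    body: request.body,
  };
  return fetchImpl(request.url, options);
}

// Hypothetical helper: returns a copy of the headers without the CSP entry.
// Assumes header names are stored in lowercase.
function withoutCspHeader(headers) {
  const { 'content-security-policy': _removed, ...remainingHeaders } = headers;
  return remainingHeaders;
}

// With both concerns extracted, intercept reads as two named steps.
async function intercept(request, fetchImpl) {
  const response = await fetchRequest(request, fetchImpl);
  return { ...response, headers: withoutCspHeader(response.headers) };
}
```

Each function now has a single concern, so a generic name like options becomes specific enough through the function that surrounds it.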
As you might guess, my performance in reading code is heavily influenced by the number of mutable variables and side effects. I feel like there is a general trend towards immutable data structures, which obviously works in my favor. But if I see mutable code in code reviews, I try really hard to work on an immutable alternative. Interestingly enough, the main reason for doing so is not that I think chains of fold, etc. are easier to read than loops (which I do think). The main reason is that, if the context and the variable names are good, it lets me avoid diving into the implementation.
Take a look at lines 60–70 of App.js in the BackstopJS examples. A mutable variable index is initialized and then conditionally set in a loop. I’d need to parse the whole block because I don’t know when or how the index variable is mutated, or what else is potentially being done in the loop.
If you replaced the loop with a .find call and changed the variable name, you could potentially have enough information when you read this line to skip the implementation.
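To illustrate the difference, here is a small sketch of both variants. It does not reproduce the App.js code; the scenarios data, the label property and both function names are hypothetical.

```javascript
// Hypothetical data, loosely modelled on a list of test scenarios.
const scenarios = [
  { label: 'Homepage', url: 'https://example.test/' },
  { label: 'Pricing', url: 'https://example.test/pricing' },
];

// Before: `index` is mutable, so the whole loop has to be read to learn
// when it changes and whether anything else happens along the way.
function selectedScenarioWithLoop(allScenarios, selectedLabel) {
  let index = -1;
  for (let i = 0; i < allScenarios.length; i++) {
    if (allScenarios[i].label === selectedLabel) {
      index = i;
      break;
    }
  }
  return index === -1 ? undefined : allScenarios[index];
}

// After: the variable name and Array.prototype.find state the intent
// on a single line, and nothing is reassigned.
function selectedScenario(allScenarios, selectedLabel) {
  const scenarioWithSelectedLabel = allScenarios.find(
    (scenario) => scenario.label === selectedLabel
  );
  return scenarioWithSelectedLabel;
}
```

Both functions return the first scenario with a matching label, or undefined if there is none. In the second version, a reader who trusts the name scenarioWithSelectedLabel can skip the callback entirely.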
What about everything else?
Indentation, braces, order of arguments and many more. I feel like more and more of the things related to plain formatting (like where to put braces or indentation) are tackled by style checkers or code formatters. And while I have opinions about, e.g., the order of arguments, I put extra focus on immutability, good variable names and good context, because they benefit my style of reading code so massively.
Is this without risk?
No, obviously not. It happens that the name of a function suggests something different from what the function actually does, but I still skip it. Or variable names are misleading, but when I read them I feel that I can (in these cases wrongly) derive the implementation. Then I reach the end of the function or file without finding the piece of functionality that I’m looking for, and I need to go over it a second time, this time skipping fewer details. Nevertheless, as long as I adjust the amount of things I skip to the codebase, and to how well variable names and side effects are managed, I feel quite comfortable.
However, this should not be a case for writing code in a specific way, nor do I want to make a case for reading code my way. But if you take away one thing, I can only recommend taking some time to evaluate your own way of reading and interpreting code. How do you navigate between functions and files? Which parts of the code do you read more carefully, and which parts do you skip when skimming or debugging code? Being aware of your own style of reading code may help you to write code in a way that is easier for you, and potentially others, to understand.