René Wiersma - Software Made Simple

Readability

May 31, 2022

Recently there was some discussion on Twitter about how to refactor a particular piece of code to make it more readable. The author of the initial tweet posted the following code (the original was in Java, I will be using C# here)

if (x % 2 == 0 && x > y)
{
	// Do something
}

The proposed improvement was:

bool isEven = x % 2 == 0;
bool isBigger = x > y;

if (isEven && isBigger)
{
	// Do something
}

In the comments of that particular thread, some people like the improved version better. Some preferred the original. Readability of code is a subjective thing. What one person thinks is more readable, another might think of too abtract, or too obscure, etc.

In our field, there is little consensus about what makes some code objectively better than other (functionally the same) code.

I personally think the author took a step in the right direction with the second version. The intention is good, but the execution is not quite there. Sometimes half an improvement leads to a worse situation.

One problem with the second version, as I see it, is that the if-statement doesn't scan properly. What is even? What is bigger than what? I would propose the following:

if (x.IsEven && x.IsBiggerThan(y))
{
	// Do something
}

Now the line reads more like a sentence in natural English. The implementation of IsEven and IsBiggerThan can be moved to extension methods. IsBiggerThan could be replaced by IsGreaterThan, which is the more commonly used mathematical term. So:

if (x.IsEven && x.IsGreaterThan(y))
{
	// Do something
}

Of course, the real question if we should be doing this refactoring in the first place. We have abstracted away the implementation of IsEven and IsGreaterThan, but abstractions are not free. You win something and you lose something when adding an abstraction. But do we win more than we lose?

In this particular case I would argue to just leave the original code intact. The x % 2 == 0 construct to determine evenness will be familiar to all but the most junior of programmers, and x > y will be obvious to anyone with even the most basic knowledge of math. So, we don't win much by this abstraction.

There are two other things that bother me about this code. The first is the use of the names x and y for the variables. The second one is that mysterious comment // Do something. Let's start with the variable names.

If this code is about positioning, or coordinates or some such, then x and y make sense. In most other domains, x and y are just placeholder names. This stems from mathematics where the first unknown is often simply named x, and the second one is named y, etc.

However, when programming, not naming your variables after what they represent is a missed opportunity to clarify the intent of the code.

The code was given without context or domain, so we can only guess what x and y represent, and what kind of business rule is behind the IsEven and IsGreaterThan-line.

A suggestion given by @AstridSawatzky on Twitter is that it represents a dice roll in a Dungeons and Dragons style game, where the roll is successful if the result is even and greater than a certain threshold. So, let's go with that. How can we express that idea in code?

if (diceTotal.IsEven && diceTotal.IsGreaterThan(threshold))
{
	// Do something
}

Now at least the variable names make more sense, and the code is becoming a lot expressive. Given the domain, the code becomes more readable, not just for programmers, but also for non-programming domain experts

Now for that // Do something. That seems to imply that some side effect is happening. Let's say something like:

if (diceTotal.IsEven && diceTotal.IsGreaterThan(threshold))
{
	Console.WriteLine("Attack was successful");
}

The problem with that code is that it is mixing domain logic with control flow logic and infrastructure logic. In the domain we should not care about Consoles and WriteLines (infrastructure), or even the particular if-statement (control flow).

Another way to put it is that it violates the CQRS principle (mixing command and query).

The domain code could look like this:

public bool IsSuccessFul(int diceTotal, int threshold) => diceTotal.IsEven && diceTotal.IsGreaterThan(threshold);

Now IsSuccessFul does not have any side-effects and can be tested easily. It's a pure domain concept, and the language used in the code reflects the language used in the domain.

When we learn more about the domain, we can determine where this function should be placed.

This article turned out as a bit of a brain dump. The main take aways, for me, are:

Abstractions are not free
Readability is in the eye of the beholder, especially when taken out of its (domain) context
Clarity of the code improves more objectively when the language used makes sense in the domain
Separating domain code from control flow code and infrastructural code improves clarity
Useful variable names go a long way towards making code more readable

-- Rene Wiersma