how to write self commenting code

16 Sep 2023 :: 7 min read

okay let me just start with this: i think those who say “you shouldn’t write comments because your code should comment itself” have likely only ever written hello world programs and have never actually had to work on a team.

the idea that your code should always comment itself seems a bit short-sighted to me, because even if we ignore all the situations where it can’t do that (think high performance code, driver code, or weird hacks), how easy your code is to read is so subjective. and this isn’t just a matter of differing skill levels between developers, although that’s definitely something to consider.

take for example code that i normally write. these days i’m coming at it from the perspective of an embedded software engineer, since that’s the domain that i work in primarily. my code is (hopefully) quite easily readable for someone else who works in a similar area, but if you were to look at my code as a full stack engineer then it’d likely make zero sense. and the same goes the other way–the same full stack engineer may write some code informed by their own experience and perspective and it just wouldn’t make sense to me.

but this isn’t to say i don’t believe in self commenting code, just that i don’t think it’s the silver bullet some may lead you to believe it is. it’s a useful skill to be able to write self commenting code (one that i hope i can help you develop by the end of this post!) but please don’t let it replace writing actual good comments.

oh, and for the purposes of this post i’ll be writing all the examples in c++. the concepts map over to other languages just fine though :)

naming things

it seems stupid but it’s shocking how often i see people get this wrong. the first step in making your code readable is to actually name your stuff meaningfully. for example:

void foo(uint32_t* a, size_t b, size_t c) {
    constexpr uint32_t d = 0xFFFFFF;
    for (size_t i = 0; i < b*c; ++i) {
        a[i] = d;
    }
}

this is terrible to read. some of you reading this may have a vague idea of what this does, but really it’s anyone’s guess what’s happening here. if we name things more meaningfully though:

void clear_white(uint32_t* frame_buffer, size_t width, size_t height) {
    constexpr uint32_t kWhite = 0xFFFFFF;
    for (size_t i = 0; i < width*height; ++i) {
        frame_buffer[i] = kWhite;
    }
}

suddenly it’s a lot clearer that this function is for clearing a frame buffer to white. a very simple example, yeah, but i regularly see truly awful symbol names that make no sense, even in context.

as a little bonus, this is also why code style matters so much when working on teams. take for example the following:

#define PI 3.14f
constexpr float kTau = PI * 2;

class MyClass {
public:
    MyClass();
    ~MyClass() = default;

    void set_enabled(bool enabled);
    bool get_enabled();

private:
    bool m_enabled = false;
};

just by looking at the names of these we can determine a few things:

MyClass is a type since it’s written in pascal case
PI is a macro because it’s in screaming snake case
kTau is a constant since it’s in pascal case with a k prefix (i believe this is hungarian notation?)
m_enabled is a member variable because it’s in snake case with an m_ prefix (again, i believe this is hungarian notation)

this is incredibly helpful for when you’re just scanning through a codebase trying to get a grasp for what’s what. so please, keep to the code style of whatever code you’re working on :) (even if you don’t like it!)

grouping data

sometimes it just makes more sense to group data semantically into a structure, for example:

float dot(float a_x, float a_y, float b_x, float b_y) {
    return a_x * b_x + a_y * b_y;
}

this kinda sucks cause there’s a lot to write out when you’re calling the function, and there’s loads of arguments too. if you create a struct though:

struct Vector2 {
    float x;
    float y;
};

float dot(Vector2 a, Vector2 b) {
    return a.x * b.x + a.y * b.y;
}

now we have a shorter signature that’s easier to read, and it’s easier to understand overall because we can see how the data all relates to each other.
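to show how this pays off once you start building on top of it, here’s a small sketch. `is_perpendicular` and `kEpsilon` are names i’ve made up for illustration (they’re not from any library), and the tolerance value is just an assumption:

```cpp
#include <cmath>

struct Vector2 {
    float x;
    float y;
};

float dot(Vector2 a, Vector2 b) {
    return a.x * b.x + a.y * b.y;
}

// hypothetical helper: two vectors are perpendicular when their
// dot product is (near enough to) zero
bool is_perpendicular(Vector2 a, Vector2 b) {
    constexpr float kEpsilon = 1e-6f;  // assumed tolerance, tune for your data
    return std::fabs(dot(a, b)) < kEpsilon;
}
```

at the call site, something like `is_perpendicular(velocity, normal)` reads a lot better than passing four loose floats and hoping you got the order right.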

minimise inheritance tree size

this one goes out to all my object oriented enjoyers out there–please keep inheritance to a minimum i’m begging you. i know that it’s useful, but it becomes incredibly painful when you’re however many layers deep and you have to jump to 5 different definitions just to find the original source of a member in a class. it may mean less code for you to write, but it will ultimately mean that things are harder to understand.

might i suggest instead creating interfaces and inheriting from many of those to keep the inheritance tree shallow while still allowing you to do any of the polymorphism shenanigans you need? it’ll still have the issue of “where the hell is this member coming from”, but it’s at least easier to search through the interfaces when you know everything that’s being inherited from upfront.
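here’s a rough sketch of what that could look like. `IEnableable`, `IDimmable`, `Backlight` and `dim_to_half` are all names i’ve made up for illustration, so treat this as one possible shape rather than the way to do it:

```cpp
#include <cstdint>

// interface: anything that can be switched on and off
class IEnableable {
public:
    virtual ~IEnableable() = default;
    virtual void set_enabled(bool enabled) = 0;
    virtual bool get_enabled() const = 0;
};

// interface: anything with an adjustable brightness
class IDimmable {
public:
    virtual ~IDimmable() = default;
    virtual void set_brightness(uint8_t brightness) = 0;
    virtual uint8_t get_brightness() const = 0;
};

// only one level deep: everything Backlight inherits from is listed right here
class Backlight : public IEnableable, public IDimmable {
public:
    void set_enabled(bool enabled) override { m_enabled = enabled; }
    bool get_enabled() const override { return m_enabled; }

    void set_brightness(uint8_t brightness) override { m_brightness = brightness; }
    uint8_t get_brightness() const override { return m_brightness; }

private:
    bool m_enabled = false;
    uint8_t m_brightness = 0;
};

// code that only cares about dimming doesn't need to know Backlight exists
void dim_to_half(IDimmable& light) {
    light.set_brightness(static_cast<uint8_t>(light.get_brightness() / 2));
}
```

you still get polymorphism (`dim_to_half` works on anything dimmable), but finding where a member comes from means checking at most the two interfaces named on the class line.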

avoid code golfing!

i know how enticing it is to write really short snippets of code that do a lot. but let’s be honest with ourselves for a second here, this kinda sucks:

if ((a && b && (! c || d)) || e) {
    // do something
}

because not only is it hard to understand what it does, it’s also really awful to debug. consider something like this:

const bool condition1 = a && b;
const bool condition2 = ! c || d;
const bool condition3 = condition1 && condition2;
if (condition3 || e) {
    // do something
}

yeah it’s more lines of code, but it’s so much easier to read, and it’s also easier to debug. consider that now you can write

printf("c1 (%d), c2 (%d), c3(%d), e (%d)\n",
    condition1, condition2, condition3, e);

whenever you need, without having to do a quick refactor just to be able to extract that info. plus, having distinct lines like this means you’re giving your debugger more places to be able to break on if you need to.

another neat effect of writing your code like this is that you’re now given the chance to give meaningful names to the conditions you’re checking for, which makes this kinda code so much more readable. really, i think exercising this opportunity to give names to evaluated values can do a lot for making your code comment itself meaningfully, since it allows your code to read more like an actual thought process instead of a bunch of arcane computer-y stuff.
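as a sketch of what that might look like, here’s the same boolean shape as above with names that actually mean something. the charging scenario, `DeviceState` and `should_charge` are all invented for this example:

```cpp
// hypothetical state for a battery powered device
struct DeviceState {
    bool usb_connected;
    bool battery_present;
    bool charger_fault;
    bool fault_override;
    bool force_charge;
};

// same structure as (a && b && (!c || d)) || e, just with names
bool should_charge(const DeviceState& state) {
    const bool has_power_and_battery = state.usb_connected && state.battery_present;
    const bool safe_to_charge = !state.charger_fault || state.fault_override;
    const bool can_charge_normally = has_power_and_battery && safe_to_charge;
    return can_charge_normally || state.force_charge;
}
```

none of the logic changed, but now each line answers a question you might actually ask while debugging.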

don’t get sucked into the idea of “modern code”

hoo boy, this one may ruffle some feathers, but i feel like i need to get this out there:

shorter code does not mean better code.

mind you, i think this comes from a good place. once upon a time we had to write so much boilerplate just to get things done (see: c/c++ header files, java get/set method patterns) and it was pretty awful. one of the best things modern languages can do is reduce boilerplate like this which serves no helpful purpose to the programmer. however i think we may get a bit too eager to reduce our code, extending this treatment to things beyond boilerplate and instead onto regular logic. take for example views in c++:

auto ints = { 0, 1, 2, 3, 4, 5 };
const auto is_even = [](int num) { return num % 2 == 0; };
const auto double_value = [](int num) -> int { return num * 2; };
for (int i : ints | std::views::filter(is_even) | std::views::transform(double_value)) {
    printf("%d\n", i);
}

you can probably make a good guess at what this code does, but it’s kinda unclear without looking at the documentation or seeing the output. this is going to be the case for a lot of people too, since this is a newer c++20 feature (and how many of you are actually keeping up to date with new language/standard library features?). so what if we were to write this in simpler, more verbose code?

std::vector<int> ints = { 0, 1, 2, 3 , 4, 5 };
for (int i : ints) {
    bool is_even = i % 2 == 0;
    if (! is_even) continue;

    int doubled = i * 2;
    printf("%d\n", doubled);
}

now the logic flows downwards, and as we read the code we can accumulate the steps of the logic in our head. because of this, it’s become so much clearer what the code actually does, and only at the cost of a couple of extra lines.

now look, i get that this is a very simple example, but at least in my own programming experience i’m yet to come across situations where the “modern” approach (read: coated in syntactical sugar and uses more obscure new language features) has read better, because it ends up being less expressive. and that’s the key thing i want you to take away from this, even if you ignore everything else: you should be writing code that is expressive.

when to write a comment

you will need to write comments occasionally, this is unavoidable. but perhaps think of them less as a written guided tour through the code, and instead as a way to explain the broad strokes of a section of code, plus any rationale involved. for example:

void LEDDriver::set_brightness(float value) {
    // we want as much brightness as we can get, so always
    // max out the dot current
    const uint8_t brightness = static_cast<uint8_t>(value * 255);
    m_i2c.write(Registers::kDotCurrent, 0xFF);
    m_i2c.write(Registers::kPWMBrightness, brightness);
}

the code itself is already easy enough to read, but you can’t convey reason through the code particularly well. without the comment, it would seem strange to always max out the dot current, but with the rationale we now have a better idea of why we’re doing that. in cases like these, 100% write a comment to justify the existence of that code. especially cause if you don’t, then someone may come along one day thinking “this seems silly, i should remove it” only to break everything in weird ways.
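another place where a rationale comment earns its keep is when the order of operations looks arbitrary but isn’t. here’s a small sketch, assuming a made-up 12-bit adc with a 3.3v reference (`adc_to_millivolts` and both constants are hypothetical):

```cpp
#include <cstdint>

// hypothetical: convert a raw 12-bit adc reading to millivolts
uint32_t adc_to_millivolts(uint16_t raw) {
    constexpr uint32_t kReferenceMillivolts = 3300;  // assumed 3.3v reference
    constexpr uint32_t kMaxReading = 4095;           // assumed 12-bit adc

    // multiply before dividing: this is integer maths, so dividing first
    // would truncate to 0 for every reading below full scale
    return (raw * kReferenceMillivolts) / kMaxReading;
}
```

without that comment, a well-meaning “tidy up” that swaps the multiply and divide would compile fine and quietly return 0 for almost every input.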

closing thoughts

once again, i think that the idea of ditching comments in favour of self commenting code is a silly one. you need a healthy mix of readable, expressive code and helpful comments to ensure that any code you produce is highly maintainable. but, i do also want to stress that none of what i’ve said should be taken as gospel–to be prescriptive about how you write code ignores the reality of your situation. that, and these ideas shouldn’t be applied everywhere since it can start going in the other direction–overly verbose code creates too much noise that obscures the actual functionality of the code.

it may be the more “current year internet” thing to be incredibly absolute and assertive about “x way is the only way you should be doing things”, but reality is never that simple and the same thing rings true here. really, i encourage you to take what i’ve said onboard and try to apply it to your own code. over time you’ll learn when to apply these ideas and when you maybe shouldn’t.