Code Comments

29 September 2024

Two things happened at work recently that made me think about how I comment code. One of my more junior colleagues asked for feedback on their work and we ended up talking about how to comment code; on the same day I was trying to understand some code written by a colleague who left recently and wishing they had done a better job adding comments.

Thinking about how I comment code, I realized that I’ve ended up with my own idiosyncratic conventions for different categories of code comments and I decided to write them up here, together with some reflections on what I could improve.

Interface documentation

The first category is comments that document an interface. I don’t mean formal interface documentation that you’d use to document an API you offer to your customers, but things like the comment you add right before a function declaration. Interfaces internal to a codebase.

Interface documentation lets other programmers use an interface without having to read and understand the code that implements it, so it’s part of writing modular code. Sometimes programmers forget that and instead treat writing interface documentation as a chore. Tools that check you’ve documented each parameter, return value, and so on encourage that mindset.
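To make that concrete, here is a minimal sketch in Python of the kind of comment I mean. The function, its name and its parameters are made up for illustration. The docstring doesn’t tick off every parameter; it tries to state what a caller can’t see from the signature: the boundary case, the timezone requirement and the error.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_expired(created_at: datetime, ttl_seconds: int,
               now: Optional[datetime] = None) -> bool:
    """Report whether a message has outlived its time-to-live.

    A message expires once `ttl_seconds` have passed since `created_at`.
    A message exactly at the boundary counts as expired.

    `created_at` and `now` must be timezone-aware. `now` defaults to the
    current UTC time and exists mainly so callers can test with a fixed clock.

    Raises ValueError if `ttl_seconds` is negative.
    """
    if ttl_seconds < 0:
        raise ValueError("ttl_seconds must be non-negative")
    if now is None:
        now = datetime.now(timezone.utc)
    return now - created_at >= timedelta(seconds=ttl_seconds)
```

Someone calling this function should be able to answer “what happens at exactly 60 seconds?” from the comment alone, without opening the function body.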

Most programming languages have conventions for interface documentation. It’s tricky for some other types of interfaces, though. For example, a database schema is an interface shared by all functions that access the database directly. How do you document that? For the Postgres databases we use in my team at work we haven’t found a good way, so right now those are undocumented just due to a lack of tooling.

Describe things that aren’t clear from the code

This follows the conventional advice for commenting: explain code that’s hard to understand or easy to misunderstand; add comments to warn of potential pitfalls; and explain why you made certain choices. It’s about writing for the reader, anticipating what information would help a programmer reading your code in the future.

The code I mentioned in the introduction (the one that could have benefitted from better comments) included a database query somewhat like this:

select id
from messages
where id in (select id from messages)
limit 10 for update skip locked

Ignoring the last line for now, this uses a subquery to get all message IDs, then gets the IDs of those messages, which is a bit redundant. Reading this code, I’m 90% sure the subquery was just a mistake… but still, maybe it somehow tricks the database into running the query differently? Either way, I wish the person who wrote the code had taken the time to add a comment: if there’s a good reason for the subquery, they could have explained it, and if not they would have realized that while writing the comment and simplified the query.

There’s one detail that I didn’t appreciate until the more junior colleague I mentioned before pointed it out. Remember my slightly pretentious remark about “writing for the reader”? I always think of the reader of the code as someone just like me, but that’s misleading since most of my teammates are less experienced than me. (Some of them are students working part-time with us.)

With the SQL query above, would a comment that explains the for update skip locked part be a good idea? Until recently, I would have said no – I already know what it means, and otherwise I wouldn’t mind looking it up in the Postgres documentation. But someone more junior would probably benefit from a short explanation.
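Such a comment might look roughly like the sketch below. The query is the one from above, wrapped in a Python string only for the sake of the example; the constant name is hypothetical, and the remark about concurrent workers is my assumption about why the clause is there, not something the original code stated.

```python
# Hypothetical constant name; the query mirrors the one shown earlier,
# including the subquery whose purpose is still unexplained.
CLAIM_MESSAGES_SQL = """
select id
from messages
where id in (select id from messages)
limit 10
-- "for update" locks the returned rows until the current transaction ends,
-- so other transactions cannot update, delete or lock them in the meantime.
-- "skip locked" leaves out rows that another transaction has already
-- locked instead of waiting for them. Assumption: this is here so that
-- several workers can pick up messages at once without blocking each other.
for update skip locked
"""
```

It’s a few extra lines, and it would save a less experienced reader a trip to the Postgres documentation.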

TODO comments

This is something I do a lot while implementing new functionality: add a comment prefixed with “TODO” to indicate changes I still need to make. This way there are fewer things I have to keep in mind, and it makes it easy to see where I still need to make changes. Usually, all of the TODOs will be gone by the time I finish the task I’m working on.

Some codebases have a lot of TODO (or similar) comments that stay in the code for a long time. I think that’s distracting for anyone reading the code so I normally avoid it. A TODO should only be there if the code is actually broken.

Section headers

This last one is something I do a lot but I don’t know how common it is. If I have a long function or method, I split its code into sections separated by blank lines and add a comment before each section that concisely says what it does. It looks a bit like this:

(Screenshot of pseudocode with comments)
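In text form, a minimal Python sketch of the same layout could look like this. The function and its input format are invented for illustration, and a real candidate for this treatment would be quite a bit longer.

```python
def summarize_orders(lines: list[str]) -> dict[str, float]:
    """Total the amount spent per customer from "customer,amount" lines."""

    # Parse each non-empty line into a (customer, amount) pair
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        customer, amount = line.split(",")
        records.append((customer.strip(), float(amount)))

    # Add up the amounts per customer
    totals: dict[str, float] = {}
    for customer, amount in records:
        totals[customer] = totals.get(customer, 0.0) + amount

    # Round to cents for reporting
    return {customer: round(total, 2) for customer, total in totals.items()}
```

Reading only the three comments gives the outline: parse, add up, round.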

The purpose of this is that when I look at the function in the future, I can just look at the comments first to get a quick overview of what the different parts of the function do. Then when I’ve found the relevant section I’ll actually read the code. (This works best with syntax highlighting that helps you focus only on comments.)

The section headers sometimes include trivial comments that seem to explain very obvious code. I think that’s fine in this case.