
Documenting Legacy Code


We’re almost done. We covered how to build the project, how to navigate it using grep, ctags and UML, and how to track down an issue in a structured way. The last step is to plan for the future and document our findings.

So let’s start with the non-obvious: I am not a fan of documenting code. I actually refrain from documenting, because the code itself needs to be readable. If we have to accompany it with an English description of what it does, it’s probably not that readable. Regardless, there are cases where documentation becomes necessary, so let’s see what we should document:

  1. The whys: Documenting why the code makes certain decisions is far more valuable, especially when multiple options are available to solve the same problem. I can see what the code does; I can’t work out why it does it.

  2. The flags: If a program needs flags or environment settings to run, these need to be in the documentation as well. People can’t magically know what gets read seven layers deep after the application has started. This is especially true for environments that load their components dynamically, such as Spring Boot.

  3. The architecture: UML is a great fit for this. A lot of the time, behavior is dictated by the component’s place in the overall system. I find a diagram that puts it in context to be of utmost importance. Imagine all you get is the code of some shared library. To reason about the impact of a change, we need to see how the library ties into the system.
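To make the first two points concrete, here is a minimal Python sketch. Everything in it is hypothetical and exists only for illustration: the `legacy-report` program, the `--batch-size` and `--dry-run` flags, the `REPORT_BATCH_SIZE` environment variable, and the rationale in the "WHY" comment. The idea is that flags and environment settings are declared in one place with help text, and that a comment records the reason behind a decision rather than restating the code.

```python
import argparse
import os


def build_parser():
    """Declare every flag in one place so that --help doubles as documentation."""
    parser = argparse.ArgumentParser(
        prog="legacy-report",  # hypothetical program name
        description="Hypothetical tool used to illustrate documented flags.",
    )
    parser.add_argument(
        "--batch-size",
        type=int,
        # The environment variable is read when the parser is built.
        default=int(os.environ.get("REPORT_BATCH_SIZE", "100")),
        help="Rows processed per batch. Falls back to the REPORT_BATCH_SIZE "
             "environment variable, then to 100.",
    )
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="Report what would change without writing anything.",
    )
    return parser


def retry_delays(attempts):
    # WHY (hypothetical rationale): a fixed delay is used instead of
    # exponential backoff because the imagined upstream service rate-limits
    # per minute, so growing delays would only waste time.
    return [0.5] * attempts
```

Running such a program with `--help` prints the flag descriptions, so the documentation lives next to the code that reads the settings and is less likely to drift.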

Documentation serves as an easy way for us to re-enter the project’s domain after some time away. We just recheck our notes, and we are faster at delving back into the project. As time passes, we’ll see that the knowledge solidifies in our heads, so some of the diagrams (especially the architectural ones) are needed less. That’s just us growing in knowledge.

PS: Of all the things I’ve spoken about, I use some more than others:

  • CI/CD always

  • grep/find almost all the time

  • UML - when spaghetti code is spaghetti

  • ctags - especially on C++ in multi-project setups, or open source. I index the projects together, since it’s faster than a full-fledged IDE setup.

  • scientific testing - when the bug is really convoluted, or the search space is too big (i.e. too much code to cover)

  • documenting - always. I even keep a private separate "documentation" that’s just a bunch of small notes

OK, that wraps up our series. Have fun working with legacy code.