How do I write great documentation for my tech project? 

To answer simply, you don’t. If you are reading this blog post looking for a guide on how to write great documentation I’m afraid you are going to leave disappointed. It is my opinion that such a guide can’t exist. Documentation, at all levels in tech, is a complex problem and for anyone to presume they had a solution to that problem would be particularly bold. To paraphrase Socrates, “I am the wisest man alive, for I know one thing, and that is that great documentation does not exist.” 

That said, over the years we are getting better at it, and if you care a lot about this problem you might be able to move the needle a bit and achieve some truly decent documentation. It’s often the most we can hope for. 

What I’m hoping to do in this blog post is to shed some light on why the problem is so hard to solve, go over some of the attempts to solve it, and generally open up a discussion about what we, as members of the tech community, can try to do about it. 

Why is documentation so hard? 

In one of my first-ever lectures in school, the professor said to all the students, “It’s easy to succeed in my course, all you have to do is pay attention and take notes.” At the time it felt like a pretty innocuous statement. All we would have to do was show up, provide our undivided attention, and also write down everything that was happening. 

Then came the first actual lecture. I started by listening intently, taking in what the professor was saying, and trying to break down the concepts in my head such that I could understand and internalize them. I looked down, my notepaper was blank. I had failed part two of how to succeed in the class. 

So next lecture I tried to take better notes. I scribbled line after line of detail as the professor went over the material. Great result, my notepaper was full! I read over my notes after class, and in my head there was a lot of, “Wait, what does that mean?” and, “I don’t understand any of this.” I had failed part one of how to succeed in the class. 

While this story isn’t directly relevant to documentation in tech, it highlights what I see as one of the main problems: it’s really hard to think or do while also documenting what you are thinking about or doing. 

As humans, we are awful at multitasking. No developer really needs convincing of this; even before memes, Dilbert was making jokes about the damage caused by the dreaded “context switch”. So our options traditionally have been to either think, then document, then do, or think, then do, then document. I must urge you not to start with anything other than thinking; I believe that is how we ended up with PHP. 

In the world of sequential development, you see a lot of “think, document, do” going on. We talk about how we are going to solve the problem, or what the API is going to do, someone writes it all up on a wiki page, and then other people go away and develop that based on what the wiki page says.

The biggest problem with this is that it only documents what was expected. Beyond what we can predict before starting, there is a wealth of chaos that gets injected into a project before we are done. Some limitation or other will create an edge behavior which, if it is not in the original documentation, becomes a piece of implicit knowledge, a “known unknown”.

Back in my early days with OpenMarket, the company that Infobip has acquired, there was a famous wiki page titled simply, “Undocumented Magic” with a body that said only, “Good luck out there….” and linked to a second page, “Documented Magic”. 

In more agile development we tend to see more of the “think, do, document” approach. We think a little about what the problem is and what might solve it, do something, evaluate and repeat if needed. Once we’ve finally settled and solved the problem we then document what we did. Great! And like many things agile vs sequential, I do think this is an improvement.

Unfortunately, if you’ve ever finished something you iterated on multiple times and then tried to write down how it ended up, you know it’s a pretty heavy process, and it’s really hard to do a good job when you just want to think about something else. Think of it like short-timer’s syndrome for a project: it’s done, and you kind of just want to wash your hands of it, but now you have all this boring writing to do. 

So what do we do? 

There’s genuinely only one real approach that I have seen be effective in tackling this problem. That is to take the documentation from being a static, disconnected entity and make it a living piece of the actual work being done. I’ll run over some examples of how I’ve seen this done at different levels. And, we are in luck, because our industry has developed some solid open-source tools to help support this, so a special shoutout to all the contributors out there. 

At the code level, the technique is pretty simple. All you have to do is structure your code with readability in mind and name everything perfectly. The pros are obvious: your code becomes self-documenting in its existence. Consider the following examples: 

Example 1: 

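private final Map<String, Thing> map = new HashMap<>();

public void purge() {
    // Remove via the iterator; removing from the map inside a for-each loop throws ConcurrentModificationException.
    Iterator<Map.Entry<String, Thing>> it = map.entrySet().iterator();
    while (it.hasNext()) {
        if (it.next().getValue().getAgeMillis() >= 10000) {
            it.remove();
        }
    }
}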

Example 2: 

private final Map<String, Thing> thingsRegistry = new HashMap<>(); 

public void deleteOldThings() { 
   thingsRegistry.values().removeIf(Thing::isOld); 
} 

You get it, right? The cons here are that no one teaches you how to do this well in school, you are expected to write code like Example 1 in interviews, and naming things is really, really hard. But it’s not that hard to do a decent job. I’m looking at you, Spring… TransactionAwarePersistenceManagerFactoryProxy… 
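Example 2 only reads well because the rule for "old" has a name and a home. The post never shows the Thing class, so the following is a minimal sketch under assumptions: the class bodies, the constructor, and the 10-second threshold (borrowed from the magic number in Example 1) are all hypothetical.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

// Hypothetical Thing: the fields and the 10-second threshold are assumptions
// made for illustration, not something the original code shows.
final class Thing {
    private static final Duration MAX_AGE = Duration.ofSeconds(10);

    private final long createdAtMillis;

    Thing(long createdAtMillis) {
        this.createdAtMillis = createdAtMillis;
    }

    // The definition of "old" lives in exactly one named place.
    boolean isOld() {
        return System.currentTimeMillis() - createdAtMillis >= MAX_AGE.toMillis();
    }
}

// Hypothetical wrapper so the snippet from Example 2 has somewhere to live.
final class ThingsRegistry {
    private final Map<String, Thing> thingsRegistry = new HashMap<>();

    void register(String name, Thing thing) {
        thingsRegistry.put(name, thing);
    }

    void deleteOldThings() {
        // Removing through the values() view also removes the map entries.
        thingsRegistry.values().removeIf(Thing::isOld);
    }

    int size() {
        return thingsRegistry.size();
    }
}
```

With this shape, changing what "old" means is a one-line edit to MAX_AGE, and nobody reading deleteOldThings() needs a comment to know what it does.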

At the next level, we have project documentation. This is where tooling comes in handy. Both at OpenMarket and now at Infobip, we’ve seen great success with Swagger generating YAML from our API annotations and having our deployables self-host the Swagger UI, which displays that YAML. I just did a Google search, and there are so many other options like this out there.

The one real pro here is super strong: your API documentation is accurate and up to date with the function described. We used to document all of our APIs only in wiki pages, and I can’t tell you how many times I’ve referenced a wiki page while writing a curl command only to have to submit a support ticket because the wiki was out of date. I’d rather there just wasn’t a wiki page so I didn’t waste my time.

On the downside, the Swagger generation isn’t perfect – you often have to put additional Swagger-specific annotations on your APIs to get the YAML it creates to best describe the function. It also tends to rely on specific frameworks being in use, and this couples you to a technology stack a bit more than I would like. 
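To make the idea concrete without pulling in a framework, here is a rough sketch of documentation generated from annotations using only the Java standard library. This is not Swagger's API: the Endpoint annotation, the ThingApi class, and its paths and summaries are all invented for illustration, and a real generator produces a much richer description.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical annotation standing in for the framework-specific ones.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Endpoint {
    String method();
    String path();
    String summary();
}

// Hypothetical API class; the paths and summaries are made up.
class ThingApi {
    @Endpoint(method = "GET", path = "/things/{id}", summary = "Fetch one thing by id")
    String getThing(String id) {
        return "thing-" + id;
    }

    @Endpoint(method = "DELETE", path = "/things/old", summary = "Delete all old things")
    void deleteOldThings() {
    }
}

final class EndpointDocs {
    // Builds one documentation line per annotated method. The lines are
    // sorted because reflection does not guarantee any method order.
    static List<String> describe(Class<?> api) {
        List<String> lines = new ArrayList<>();
        for (Method m : api.getDeclaredMethods()) {
            Endpoint endpoint = m.getAnnotation(Endpoint.class);
            if (endpoint != null) {
                lines.add(endpoint.method() + " " + endpoint.path() + " - " + endpoint.summary());
            }
        }
        Collections.sort(lines);
        return lines;
    }
}
```

Calling EndpointDocs.describe(ThingApi.class) yields one line per endpoint. Because the description sits right on the method it describes, renaming a path without touching its documentation becomes hard to do by accident, which is the property the wiki page never had.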

One more level up and we tend to get into a harder space to talk about. But I’ll try anyway! Two tools/approaches I’ve found really useful at this level are behavior-driven development/testing (BDD) and contract testing. Tools like Cucumber for BDD allow not only for behavior to be clearly specified in a “given, when, then” format, but they also take that specification and run it as tests against the product. The result of something like this is that at the end you have not only a clear description of the product’s behavior but an indication that it really does that thing as described.
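The "given, when, then" idea can be sketched in plain Java. This is a hypothetical mini-harness, not Cucumber's API (Cucumber reads the specification from separate feature files and binds it to step definitions); the Scenario class, the step texts, and the 10-second rule are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini-harness: each step pairs a readable sentence with the
// code that sets it up, performs it, or checks it.
final class Scenario {
    private final String title;
    private final List<String> steps = new ArrayList<>();

    Scenario(String title) {
        this.title = title;
    }

    Scenario given(String text, Runnable setup) {
        return step("Given " + text, setup);
    }

    Scenario when(String text, Runnable action) {
        return step("When " + text, action);
    }

    Scenario then(String text, Runnable check) {
        return step("Then " + text, check);
    }

    private Scenario step(String text, Runnable body) {
        body.run();       // a failing check throws and stops the scenario
        steps.add(text);  // only steps that completed are recorded
        return this;
    }

    String report() {
        return "Scenario: " + title + "\n  " + String.join("\n  ", steps);
    }
}

final class OldThingsSpec {
    static String run() {
        List<Integer> agesInSeconds = new ArrayList<>();
        return new Scenario("Old things are deleted")
            .given("a registry with things aged 2 and 30 seconds", () -> {
                agesInSeconds.add(2);
                agesInSeconds.add(30);
            })
            .when("old things are deleted", () -> agesInSeconds.removeIf(age -> age >= 10))
            .then("only the 2-second-old thing remains", () -> {
                if (!agesInSeconds.equals(List.of(2))) {
                    throw new AssertionError("unexpected registry contents: " + agesInSeconds);
                }
            })
            .report();
    }
}
```

The text returned by OldThingsSpec.run() is the documentation, and it can only be produced if every step actually ran without failing, so the description and the behavior cannot silently drift apart.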

The drawback with these tools is that someone has to write the infrastructure that makes them work, and that’s not free. For contract testing, a tool like pact.io can be really good. It forces you to define contracts between you and your consumers and makes it hard to break those contracts. Again, the downside here is that it’s not magic – someone has to put in the work and make it happen. Further, contracts only work when both sides buy in. 
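Stripped to its core, a contract is a consumer's written-down expectation that the provider checks itself against. The sketch below is hand-rolled and hypothetical; it does not use Pact's API, and a real tool also handles recording, sharing, and verifying contracts across teams. The field names are invented for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical, hand-rolled contract check (not Pact's API).
final class ThingContract {
    // What the consumer relies on: field name -> expected Java type.
    static final Map<String, Class<?>> REQUIRED_FIELDS = Map.of(
        "id", String.class,
        "ageMillis", Long.class);

    // Lists every way the provider's response breaks the contract.
    // Missing fields count as violations; extra fields are allowed.
    static List<String> violations(Map<String, Object> response) {
        return REQUIRED_FIELDS.entrySet().stream()
            .filter(field -> !field.getValue().isInstance(response.get(field.getKey())))
            .map(field -> field.getKey() + " must be a " + field.getValue().getSimpleName())
            .sorted()
            .collect(Collectors.toList());
    }
}
```

If the provider's build runs this check against its own responses, the contract doubles as documentation of what consumers actually depend on, and an empty list from violations() is evidence that the documentation is still true.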

Where do we go from here? 

That’s always the question, and the discussion around documentation is alive and well. Tooling improvements for living documentation keep coming, and the approach is regularly being validated as probably the best. 

Putting all that aside, no matter how much better the tools and approaches get, I think it’s important to always keep in mind that the best we can ever hope for is decent. So, ultimately, it’s on all of us as software professionals to keep that in mind: to always be wary – validate and verify risky assumptions, and never ever trust that six-year-old wiki page. 

Good luck out there….