How To Devise a Logging Strategy
Updated: Oct 21, 2019
Logging tends to be a bit of a shambles on most projects.
Development teams have a tendency to either avoid logging completely or do it excessively. You often see both extremes out in the wild.
It’s easy to identify the teams that have treated logging as an afterthought. They’re the ones that have an ongoing “Logging Project” that never seems to end. Or they might have an endless list of logging-related User Stories in the backlog.
At the other end of the scale, you see codebases that are dominated by excessive error handling and logging that serves only to obscure logic. Defensive coding is a good thing but too much leaves programs fat, slow and not very pretty.
Excessive logging statements littered all over the codebase add complexity and seriously degrade readability. Often this happens when developers don’t know what to do with exceptions, so they just log them and carry on.
This creates bloat, and swallowed exceptions leave the program in an unknown state, which leads to hard-to-find bugs. The usual response is more logging, which only makes the problem worse.
The truth is logging is hard. It’s hard because it’s a cross-cutting concern that touches every level of the codebase. Logging makes us violate the single responsibility principle and feel dirty.
Even if we try to separate our logic from our logging, we end up with more complexity by adding decorators, dynamic proxies and reflection. This isn’t ideal despite how ‘decoupled’ and clever it seems.
Learn when to handle exceptions
A good rule is that you don’t catch exceptions unless you have a specific and useful recovery to perform. When you do catch an exception, you’ll probably feel an urge to add in some logging. So if you violate this rule by catching exceptions without fixing the problem, you’ll end up with log statements all over the place.
A better approach would be to catch them higher up the stack (ideally as close to the service boundary or program entry point as you can). In many cases, the code in-between shouldn’t have to deal with the exceptions or won’t know what to do with them anyway.
If you do need to catch an exception to record specific details, rethrow it (or wrap it as a new exception’s InnerException) in the catch block so the call stack is preserved. Don’t silently swallow it. And log the exception only once, not repeatedly as it bubbles upwards.
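To make this concrete, here’s a minimal sketch in Python (chosen for brevity; the same pattern applies in .NET). The names `load_record`, `process_order` and `handle_request` are hypothetical, invented for illustration. Only the service boundary catches; it logs once and rethrows with the original exception attached (`raise ... from`, the Python analogue of InnerException), so the call stack survives.

```python
import logging

logger = logging.getLogger("orders")

# Hypothetical low-level code: simulates a failure deep in the stack.
def load_record(order_id):
    raise KeyError(order_id)

def process_order(order_id):
    # Mid-level code: no try/except here. It can't usefully recover
    # from a missing record, so it lets the exception bubble upwards.
    return load_record(order_id)

def handle_request(order_id):
    # Service boundary: the one place we catch, log once, and wrap the
    # original exception so its traceback is preserved for diagnosis.
    try:
        return process_order(order_id)
    except KeyError as exc:
        logger.error("Order %s could not be processed", order_id)
        raise RuntimeError(f"request failed for order {order_id}") from exc
```

Note that the mid-level function stays free of both error handling and logging, which is exactly what keeps it readable.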
Logging Rules of Thumb
There’s a lot of information out there on HOW and WHY to log. But there’s not much guidance on WHEN and WHAT to log.
So I’ve scoured the literature and summarised what I believe are the 3 most useful rules of thumb.
Rule 1: Always log to an audience
The number one most important thing when it comes to logging is to know the audience.
This might sound a bit trivial but it’s really important because it signposts the strategy we need to take. Another way of putting this is to think about the requirements of those that depend on the log.
Maybe the audience is other developers, actual users, support staff, system administrators, or the ops team. Put yourself in their shoes and ask yourself ‘Is this message I’m about to log useful?’
But how do you know what’s useful?
Before answering that question, it’s helpful to differentiate between two types of logging: support logging and diagnostic logging.
Support logging refers to errors and info that you’d want to present to your audience (i.e. developers, users, admins, support staff or whoever). Your audience might want to diagnose failures, monitor progress or performance, perform auditing, or just track what’s happened.
Diagnostic logging, on the other hand, refers to debug and trace messages that are primarily used by yourself as the developer and other developers picking up your code. It’s basically infrastructure for programmers that helps them understand what’s going on in the systems they’re building.
The key difference, however, between these two types of log is in how they should be approached. The right approach will help you find the right balance between bloat and bareness.
This leads us nicely to rule 2.
Rule 2: Support logs should be Test-Driven based on the end-user’s requirements
Basing your logging on unit tests will ensure that you’ve designed a logging strategy rooted in actual requirements. You’ll then know what each log message is for and be sure that it works because it’s covered by tests. As a bonus, your test coverage metric will also be much higher.
It’s pretty easy to mock out logging frameworks, especially on a modern platform like .NET Core. The hard part is persuading yourself and other developers that testing logging is important.
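As a sketch of what a test-driven support log looks like, here’s a Python example using the standard library’s `unittest.TestCase.assertLogs` (again, the idea translates directly to .NET with a mocked `ILogger`). The `refund` function and its messages are hypothetical; the point is that the log message itself is pinned down by a test, because the ops team depends on it.

```python
import logging
import unittest

logger = logging.getLogger("payments")

# Hypothetical function under test. The warning it emits is a support
# log aimed at the ops team, so it deserves its own test.
def refund(amount):
    if amount <= 0:
        logger.warning("Refund rejected: non-positive amount %s", amount)
        return False
    logger.info("Refund of %s issued", amount)
    return True

class RefundLoggingTests(unittest.TestCase):
    def test_rejected_refund_is_logged_for_ops(self):
        # assertLogs captures records on the named logger, letting us
        # assert both the level and the message content.
        with self.assertLogs("payments", level="WARNING") as captured:
            self.assertFalse(refund(0))
        self.assertIn("Refund rejected", captured.output[0])
```

If someone later “tidies up” that warning message, this test fails, which is the whole point: the log line is a requirement, not decoration.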
Diagnostic logging is just scaffolding so doesn’t need to be Test-Driven or as consistent as support logs. If you’re doing Test-Driven Development you’ll probably find that you don’t need to do much diagnostic logging anyway. Your unit tests, not log statements, will detect and catch faults.
One final point on log levels. You’re probably using a logging hierarchy to denote the severity of your logs. Let’s say you’re using Trace, Debug, Info, Warn, Error and Fatal. You might think that Trace or Debug level logs can be excluded from your tests because they’re ‘diagnostic’.
However, if they’re designed to be helpful to and read by other people then they deserve tests. You clearly have an audience in mind to present to when writing the logging code. In this case, they are by definition support logs.
The only logs that don’t require tests are ones you’re probably going to delete anyway.
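For completeness, the level hierarchy itself works the same way in most frameworks: a threshold on the logger decides which messages survive. A quick sketch using Python’s hierarchy (DEBUG, INFO, WARNING, ERROR, CRITICAL), with a hypothetical list-backed handler so the result is easy to inspect:

```python
import logging

logger = logging.getLogger("demo")
logger.setLevel(logging.INFO)  # threshold: DEBUG messages are dropped

records = []

class ListHandler(logging.Handler):
    # Minimal handler that collects formatted messages into a list.
    def emit(self, record):
        records.append(record.getMessage())

logger.addHandler(ListHandler())
logger.debug("diagnostic detail")  # below the threshold: filtered out
logger.info("support message")     # at the threshold: kept
```

The threshold controls what reaches the audience in each environment, but it doesn’t change whether a message deserves a test; that depends on who reads it.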
Rule 3: Treat logging as a feature
Logging shouldn’t be something that developers do if they feel like it. Instead, it should be an important part of the development and deployment cycle. It should be part of your definition of ‘Done’.
As we’ve seen, useful and relevant logging comes from having an end-user in mind. Therefore, logging is a feature of the application not a nice-to-have. When you treat logging as a feature, it’s less likely to get forgotten about.
So there you have it. Three simple rules that can be used to form the basis of a successful logging strategy.
The main takeaway from this article is that if logging is good enough to get into production then it deserves its own tests. This will ensure the logging you do is focused, relevant and serves a purpose.