Logging Best Practices with Example

Alperen Bayramoğlu
5 min readJul 29, 2023

--

Logging is essential for the detection of failures and should be noted first, logging is not only for diagnostics it is an important concern for an application. Logging is used for application performance monitoring, profiling, diagnostics, business dashboards, visualization, app analytics, etc. In this article, I will try to summarize logging best practices with an example.

Photo by NEOM on Unsplash

Firstly we should use a logging library (like log4j, JCL, Logback, Winston …) and not reinvent the wheel and we should log at the appropriate level.

Logging Levels

These logging levels are not mandatory but they are traditional and logging levels may vary by application requirements.

Trace

  • It should not be used in production.
  • It is used for tracing bugs in development time by the developer.
  • Trace logs should not be committed to the VCS. (We may use Git hooks for preventing this)

Debug

  • It should not be used in production.
  • It is used for diagnostic and error-catching.
  • It can be committed to the VCS.
  • Debug level generally logs the application’s inner workings. (Like function calls, parameter passing between functions, etc.):
    — Dbg: Beginning operation XXX
    — Dbg: Operation XXX is done the result is YYY

Info

  • Client processes can be logged at the info level. For example web request of client.

Notice

  • At this level, important events which are not errors are logged.
  • It may be most suitable for a production environment.
  • The validation errors of the sent data of the client may be logged at this level.

Warning

  • Events that may result in an error may be logged at this level.
  • For example, the memory cache is getting closer to the capacity, database call is taking longer than expected, etc.
  • Actions that need to be taken for precaution can be shown to the log reader with a warning message.

Error

  • All possible errors should be logged at this level.
  • API endpoints that return 5xx status should be logged at this level.
  • The Logged errors at this level are not user/client errors rather they are application errors like exceptions.

Fatal

  • The worst errors are logged at this level. (e.g. Segmentation faults)
  • Generally after logging the error at this level, the application will exit with a status non-zero.

Best Practices

Use timestamps at logs and consider these timestamps are in format UTC and standard ISO8601 date format.

Log messages should be comprehensible and include context.

  • Bad: Transaction failed
  • Good: Transaction 324 is failed: the checksum is incorrect
  • Bad: User is registered
  • Good: User with id = 324 is registered as username = uname and email = user@email.com

For easier parsing of the automation tools payload may be separated — Created user records | { count: 5, total: 8 }

Automatic parsing of log messages is essential in most situations we should write log messages considering this.

We should not log Debug, Trace logs at the Production, and logging at the debug level in production should do nothing.

We may include stack-trace while logging exceptions

Thread name, id may be included in logs if we are logging a multithreaded application.

We should not log sensitive information for example API keys, credentials, and passwords as clear text. We can use placeholders:

  • Bad: ‘Error: API Key a4jdg23d49wGDQYhtM6 is incorrect’
  • Good: ‘Error: API Key |HIDDEN| is incorrect.

While writing log messages we should include the location of the unit that logs to the log message so we can easily find the source of error in the large codebase. We can use macros like FILE, and __filenamewhich are defined by the programming environment.

We should implement abstraction for the logging framework and use the logging framework behind an abstraction (framework independent)

Log messages should be consistent. For example, if we are using customer_idfor stating the id of the customer, we should use exactly customer_idin all of the other messages.

Log messages should be deleted or rotated after a specific duration(e.g. last 1 month) or file size. If we are logging into files, we may compress log files for reducing disk usage.

Send log messages to a single service in a distributed environment. (SoC, Fluentd, Sentry) In the distributed environment we may want to store our log messages in a single location and in a structured format(like JSON) so we can automatically parse log messages and do some analysis of them.

Log messages may be written to multiple locations simultaneously like console, file, and remote service(these locations may be named Sinks, Transports)

Structured Logging

Structured logging means logging with a specific structure like JSON, or XML. Automation tools can parse structured logs easily.

Example unstructured log: ‘Error: This is an error message with abc value of 5 at 2023–01–01T00 …

Example structured log:

{
"level": "error",
"timestamp": "2023-01-01T00",
"message": "This is error message",
"context": {
"abc": 5
}
}

For example, we can use the jqcommand line tool to parse from JSON log message:

echo '{"fruit":{"name":"apple","color":"green"}}' | jq '.[] | select(.color=="green")'

Output:


{
"name": "apple",
"color": "green"
}

Key Points

  • We should care about logging because logging is not only used for diagnostics it is also used for profiling, performance monitoring, analytics …
  • Use logging levels like Debug, Trace, Info, Notice, Warning, Error, and Fatal.
  • We should use logging frameworks/libraries and we should use them behind an abstraction.
  • We should include timestamps, context, log level, and log location in log messages. Log messages should be consistent.
  • For a clean look, understandable, and easily parseable logs we should use structured logging.
  • Don’t log sensitive information as clear text. Use placeholders.
  • Log files should be rotated or deleted after a specific period of time or file size and they should be compressed for reducing disk usage.
  • We may send log messages to the logging service in distributed systems.

Example

Thanks for reading. For the example: Logging Abstraction in NodeJS Using Winston, Typescript

References

--

--