Sinclair Target

How to Make Your Logs Less Noisy in Python

Mar 13, 2024

The logging module in the Python standard library has an extremely flexible design. You can use it to configure almost any conceivable logging scheme. If you wanted every log message from your application read aloud to you, Python’s logging module could do that!

The downside of this flexibility is that it can be hard to set things up when all you want is to log messages to the console. The logging module attempts to solve this problem by providing a helper function called basicConfig(). This function allows you to set a log level for your application and a destination for your log messages without having to understand the underlying object hierarchy of the logging module.

But basicConfig() sets things up in a way that doesn’t work so well, especially if your application has any third-party dependencies. Every time I set up logging for a new project, all I can remember is that basicConfig() is not what I want, even if I’m just trying to log to the console. I then turn with a groan once again to the fairly involved documentation for the logging module to re-learn how to get it right.

So, mostly for the benefit of future me, here’s what I’ve found to be the best way to configure logging in Python while avoiding the problems that come with relying on basicConfig().

Noisy Logs Are The Default

Most people probably first reach for basicConfig() because they want to set the log level of their application to INFO. The problem with basicConfig() is that it sets a log level on something known as the root logger. You tell basicConfig() that you want to set a log level of INFO, and it says, “Sure, how about I make everything log at the INFO level!” The root logger sits at the root of the logger tree (more on this in a minute), so, in the absence of any other configuration, its log level determines the log level of every logger that exists in your application.
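Here’s a minimal sketch of that behavior (the logger name some_library is just a stand-in for whatever loggers a third-party package might create):

import logging

# basicConfig() configures the *root* logger: it attaches a console handler
# and sets the root log level to INFO
logging.basicConfig(level=logging.INFO)

# A third-party logger with no level of its own inherits INFO from the root
# logger, so this message lands in your console output too
library_logger = logging.getLogger("some_library")
library_logger.info("chatty dependency message")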

If you are writing a script or something with no third-party dependencies, this is fine. If you are building an application in Python with lots of third-party dependencies, you’ve just set the log level for all those dependencies too, so your logs are likely to become cluttered with the log output of the libraries you depend on. In particular, I find that the DEBUG log level becomes unusable—some third-party libraries might not log too many messages at the INFO level, but most are very chatty at the DEBUG log level. If you want to emit DEBUG log messages from your own code, good luck spotting them in the wall of DEBUG log messages coming from code you didn’t write.

Of course, you could use grep or a similar tool to filter the log output down to only the messages you care about. But when I’m developing an application I assume my dependencies work. I don’t care about the DEBUG messages emitted by a third-party library unless I’m actively trying to track down a problem with that library. Why gum up the log output from my application with messages I don’t need?

When I’m setting up logging for my application and choosing a log level, what I’m really trying to do is set a log level for log messages from my own code. If I’m debugging an issue and want to turn on more logging, I want to set the log level to DEBUG and see the extra DEBUG log messages emitted by my own application. It might also be useful to control the log level of specific third-party libraries—but that’s a separate slider that I’ll touch only rarely.

Separate Control of Log Levels

I like this image of a slider controlling a log level. Ideally, we’d have multiple sliders—a control panel of sliders—to control the log levels for different parts of our application. Minimally, we’d like one slider for our application code and another for third-party code, or perhaps a slider for our application code and one separate slider for each third-party library we depend on. If we’re working at a company or organization with internal tools, then maybe we’d also want an additional slider for first-party dependencies.

How can we set up this control panel? There are just two crucial steps.

The first thing we have to do is follow best practices for naming our loggers. I mentioned before that under the hood the Python logging module organizes “loggers” into a tree. The logging module is designed to be used this way, but it doesn’t happen automatically. If you only ever log messages by calling logging.info() (i.e. the module-level info() function), then implicitly you are only ever making use of the root logger.

You can and should create other loggers with different names. You can do this using the getLogger() function in the logging module and passing __name__ (a special variable containing the name of the current module) as the first argument. A logger object is just an interface into the whole logging system that allows you to track where log messages came from. The convention recommended by the standard library documentation is to create one logger for each module in your application using the name of the module as the logger name. That way it’s easy to trace each log message back to the module that emitted it.
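In practice, that convention is a short incantation near the top of each module; the function and message below are just placeholders:

import logging

# Named after the current module, e.g. "toplevelpackage.nextpackage.bar"
logger = logging.getLogger(__name__)


def do_something():
    # Log through the module's logger rather than the root logger
    logger.info("doing something")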

If you’ve organized your code into packages, then your module names will look something like toplevelpackage.nextpackage.bar. If you create a logger with this name, then the Python logging module turns the dot-delimited name into a branch of loggers to add to the logger tree. It will create a logger called toplevelpackage that sits just below the root logger; that logger is the parent of the toplevelpackage.nextpackage logger, which in turn is the parent of the toplevelpackage.nextpackage.bar logger.

This tree structure is useful for a few different reasons. What’s most important for us though is that the logging level you set on a parent logger is inherited by all of its descendant loggers (provided the descendant loggers don’t have their own log levels set). If we follow the recommended naming conventions, we then have a way to set log levels for parts of our application module by module or even package by package.
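Here’s a quick sketch of that inheritance, reusing the made-up package names from above:

import logging

# Dotted names form a branch of the tree:
# root -> toplevelpackage -> ... -> toplevelpackage.nextpackage.bar
parent = logging.getLogger("toplevelpackage")
child = logging.getLogger("toplevelpackage.nextpackage.bar")

# Setting a level on the ancestor...
parent.setLevel(logging.DEBUG)

# ...is inherited by descendants that have no level set of their own
print(child.getEffectiveLevel() == logging.DEBUG)  # True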

The second thing we have to do is this: when our application starts up and we want to, say, set the log level of our application based on what was passed to a --loglevel command-line flag (a very common pattern), we should set that level on the logger highest in the tree that is not the root logger. If we are developing a package, that logger should be named after the top-level package in our application. If there is no top-level package because we didn’t organize our code that way, we can still explicitly create a top-level logger for our application (perhaps using the name of the application as the logger name) and ensure our other loggers are children of it.

Any third-party libraries we depend on might create their own loggers, but these will be children of the root logger and not children of our own application-specific logger. So when we set a log level on the top-level logger in our application but leave the root logger alone, then we don’t have to worry about enabling the DEBUG log level, since it’s only our own application code that we’d be affecting!

I’m glossing over some details here. Let’s take a look at how we might implement this and I’ll explain the parts I’ve skipped over so far.

import logging


def configure_logging(log_level):
    # Choose formatting of log messages
    console_formatter = logging.Formatter(
        "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
    )

    # Create console handler. No log level is set, so it handles every record
    # the loggers pass to it
    console_handler = logging.StreamHandler()
    console_handler.setFormatter(console_formatter)

    # Root logger is responsible for logging to console. (Has handler attached.)
    root_logger = logging.getLogger()
    root_logger.addHandler(console_handler)
    root_logger.setLevel(logging.WARNING)  # High default threshold for logging

    # Set up application logger. Accepts either a level name like "DEBUG"
    # or a numeric level
    application_logger = logging.getLogger(__package__)

    if isinstance(log_level, str):
        log_level = getattr(logging, log_level.upper())
    application_logger.setLevel(log_level)

The above code block contains a single function that we can call to set the log level of our application on startup. I’ve chosen to format my log messages here in a way that I like, but otherwise this code configures the root logger in a manner very similar to how the root logger would be configured by default. We attach a StreamHandler to the root logger, which means all of our log messages from throughout our application (including third-party libraries) will get logged to the stderr stream. We set the root logger to have a log level of WARNING, which means that we will only hear from our third-party libraries if something bad happens. This is also the default level in Python if you don’t explicitly configure logging.

The important part of the configure_logging() function is that it takes the caller-supplied log_level and sets it on the top-level application logger. This implementation assumes that the module containing configure_logging() resides in the top-level package of your application, so it uses the special variable __package__ as the top-level application logger name. If you’ve named your other loggers according to the recommended convention, then this function has set you up so that only your own logger objects are logging at the given log_level.
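To connect this back to the --loglevel flag mentioned earlier, an application entry point might look roughly like this. The flag name and argparse setup are just one possible arrangement, and configure_logging() is assumed to be the function defined above, imported from wherever you put it:

import argparse


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--loglevel",
        default="INFO",
        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
        help="Log level for application code (not for dependencies)",
    )
    args = parser.parse_args()

    # Sets the chosen level on the application's top-level logger only;
    # the root logger (and so third-party loggers) stays at WARNING
    configure_logging(args.loglevel)


if __name__ == "__main__":
    main()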

If you later decide you want to see those intrusive third-party logs, all you need to do is lower the log level set on the root logger. Alternatively, you could set a lower log level on the logger for just the particular third-party lib you need log messages from.
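For example, a line like the following lets one specific dependency through at DEBUG while everything else under the root logger stays at WARNING (urllib3 is just an example name; most libraries name their loggers after their own packages):

# One chatty-but-currently-interesting dependency gets its own, lower level
logging.getLogger("urllib3").setLevel(logging.DEBUG)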

As you can see, we’ve had to dip our toes into the logging object hierarchy here. But the effort gives us a configuration miles better than basicConfig().

First-Party Dependencies

I mentioned before that we might also want to have a slider for first-party dependencies. Maybe you find yourself needing to debug the behavior of your colleague’s code more often than you find yourself debugging your third-party dependencies. In that case, it would be nice to be able to enable the DEBUG log level for that first-party code without affecting the log level of your third-party dependencies.

One way to accomplish that could be to wrap getLogger() with a function like this:

import logging

ORG_NAME = "myorg"


def get_logger(logger_name):
    # Every first-party logger becomes a child of the "myorg" logger
    return logging.getLogger(f"{ORG_NAME}.{logger_name}")

If you use this function to create loggers across your organization, then all you need to do is set the log level of the ORG_NAME logger, which will be a child of the root logger, to whatever you desire. Only your first-party code will be affected; third-party code will create and use loggers that are not children of the ORG_NAME logger.
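A quick usage sketch, assuming the get_logger() helper above is importable across your codebase:

# In any first-party module: the resulting logger is named "myorg.<module name>",
# so it sits under the shared "myorg" logger in the tree
logger = get_logger(__name__)

# At startup: one knob for all first-party code, independent of the root logger
# and of any third-party loggers
logging.getLogger(ORG_NAME).setLevel(logging.DEBUG)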

Having good logging is critical to understanding what your application is doing as it runs. It might not seem like a big deal to have noisy third-party log messages in your logs, because, after all, that’s just more information than you need, right? The information you do need is still there too. But I’ve found that having to sift through lots of stuff you don’t care about to get to the stuff you do makes logs harder to read and less likely to be relied upon. So do yourself a favor and Marie Kondo those unnecessary third-party log messages out of existence. It will make understanding and debugging your application easier.