1. Introduction
1.导言
Application logs are important resources for troubleshooting, measuring performance, or simply checking the behavior of a software application.
应用程序日志是排除故障、衡量性能或简单检查软件应用程序行为的重要资源。
In this tutorial, we’ll learn how to implement structured logging in Java and the advantages of this technique over unstructured logging.
在本教程中,我们将学习如何在 Java 中实现结构化日志,以及这种技术相对于非结构化日志的优势。
2. Structured vs. Unstructured Logs
2.结构化日志与非结构化日志
Before jumping into code, let’s understand the key difference between unstructured and structured logs.
在开始编写代码之前,我们先来了解一下非结构化日志和结构化日志的主要区别。
An unstructured log is a piece of information printed with consistent formatting but without structure. It’s simply a block of text with some variables concatenated and formatted within it.
非结构化日志是格式一致但没有结构的打印信息。它只是一个文本块,其中包含一些变量的连接和格式化。
Let’s look at one example of an unstructured log taken from a demo Spring application:
让我们来看一个非结构化日志的例子,它来自一个 Spring 应用程序演示:
22:25:48.111 [restartedMain] INFO o.s.d.r.c.RepositoryConfigurationDelegate - Finished Spring Data repository scanning in 42 ms. Found 1 JPA repository interfaces.
The above log shows the timestamp, log level, fully-qualified class name, and a description of what Spring is doing at that time. It’s a helpful piece of information when we’re observing application behavior.
上面的日志显示了时间戳、日志级别、全称类名以及 Spring 当时正在做什么的描述。这对我们观察应用程序行为很有帮助。
However, it’s harder to extract information from an unstructured log. For instance, it’s not trivial to identify and extract the class name that generated that log, as we might need to use String manipulation logic to find it.
然而,从非结构化日志中提取信息的难度更大。例如,要识别和提取生成该日志的类名并非易事,因为我们可能需要使用 String 操作逻辑才能找到它。
In counterpart, structured logs show each piece of information individually in a dictionary-like way. We can think of them as informational objects instead of Strings. Let’s look at a possible structured log solution applied to the unstructured log example:
与之相对应的是,结构化日志以类似字典的方式单独显示每条信息。我们可以将其视为信息对象,而不是字符串。让我们看看应用于非结构化日志示例的可能的结构化日志解决方案:
{
"timestamp": "22:25:48.111",
"logger": "restartedMain",
"log_level": "INFO",
"class": "o.s.d.r.c.RepositoryConfigurationDelegate",
"message": "Finished Spring Data repository scanning in 42 ms. Found 1 JPA repository interfaces."
}
In a structured log, it’s easier to extract a specific field value since we can access it using its name. Hence, we don’t need to process text and find specific patterns therein to extract information. For example, in our code, we can simply use the class field to access the class name that generated the log.
在结构化日志中,提取特定字段值更加容易,因为我们可以使用其名称进行访问。因此,我们不需要处理文本并在其中找到特定模式来提取信息。例如,在我们的代码中,我们只需使用 class 字段即可访问生成日志的类名。
3. Configuring Structured Logs
3.配置结构化日志
In this section, we’ll dive into the details of implementing structured logging in Java applications using logback and slf4j libraries.
在本节中,我们将深入探讨在 Java 应用程序中使用 logback 和 slf4j 库实现结构化日志的细节。
3.1. Dependencies
3.1 依赖性
To make things work properly, we need to set a few dependencies into our pom.xml file:
为使工作正常进行,我们需要在 pom.xml 文件中设置一些依赖项:
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>2.0.9</version>
</dependency>
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-classic</artifactId>
<version>1.4.14</version>
</dependency>
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-core</artifactId>
<version>1.4.14</version>
</dependency>
The slf4j-api dependency is a facade to the logback-classic and logback-core dependencies. They work together to implement the logging mechanism with ease in Java applications. Note that if we’re using Spring Boot, then we don’t need to add these three dependencies because they are a child of spring-boot-starter-logging.
slf4j-api 依赖项是 logback-classic 和 logback-core 依赖项的门面。它们协同工作,在 Java 应用程序中轻松实现日志机制。请注意,如果我们使用的是 Spring Boot,则无需添加这三个依赖项,因为 它们是 spring-boot-starter-logging. 的子依赖项。
Let’s add another dependency, logstash-logback-encoder, that helps to implement structured log formats and layouts:
让我们添加另一个依赖项,logstash-logback-encoder,它有助于实现结构化日志格式和布局:
<dependency>
<groupId>net.logstash.logback</groupId>
<artifactId>logstash-logback-encoder</artifactId>
<version>7.4</version>
</dependency>
Remember to always use the latest possible version of the dependencies mentioned.
请记住,一定要尽可能使用上述依赖项的最新版本。
3.2. Configuring the Basics of logback for Structured Logs
3.2.为结构化日志配置 logback 的基本功能
To log information in a structured way, we need to configure logback. To do so, let’s create a logback.xml file with some initial content:
要以结构化的方式记录信息,我们需要配置 logback 。为此,让我们创建一个包含一些初始内容的 logback.xml 文件:
<configuration>
<appender name="jsonConsoleAppender" class="ch.qos.logback.core.ConsoleAppender">
<encoder class="net.logstash.logback.encoder.LogstashEncoder">
</encoder>
</appender>
<root level="INFO">
<appender-ref ref="jsonConsoleAppender"/>
</root>
</configuration>
In the above file, we configured an appender named jsonConsoleAppender that uses the existing ConsoleAppender class from logback-core as its appender.
在上述文件中,我们配置了一个名为 jsonConsoleAppender 的 应用程序,它使用 logback-core 中现有的 ConsoleAppender 类作为其应用程序。
We’ve also set an encoder pointing to the LogstashEncoder class from the logback-encoder library. That encoder is responsible for transforming a log event into JSON format and outputting the information.
我们还设置了 encoder 指向 logback-encoder 库中的 LogstashEncoder 类。该编码器负责将日志事件转换为 JSON 格式并输出信息。
With all that set, let’s see a sample log entry:
准备就绪后,让我们来看看日志条目示例:
{"@timestamp":"2023-12-20T22:16:25.2831944-03:00","@version":"1","message":"Example log message","logger_name":"info_logger","thread_name":"main","level":"INFO","level_value":20000,"custom_message":"my_message","password":"123456"}
The above log line is structured in a JSON format with metadata information and customized fields like message and password.
上述日志行采用 JSON 格式,包含元数据信息和自定义字段(如 message 和 password )。
3.3. Improving Structured Logs
3.3.改进结构化日志
To make our logger more readable and secure, let’s modify our logback.xml:
为了使我们的日志记录器更具可读性和安全性,让我们修改logback.xml:
<configuration>
<appender name="jsonConsoleAppender" class="ch.qos.logback.core.ConsoleAppender">
<encoder class="net.logstash.logback.encoder.LogstashEncoder">
<includeCallerData>true</includeCallerData>
<jsonGeneratorDecorator class="net.logstash.logback.decorate.CompositeJsonGeneratorDecorator">
<decorator class="net.logstash.logback.decorate.PrettyPrintingJsonGeneratorDecorator"/>
<decorator class="net.logstash.logback.mask.MaskingJsonGeneratorDecorator">
<defaultMask>XXXX</defaultMask>
<path>password</path>
</decorator>
</jsonGeneratorDecorator>
</encoder>
</appender>
<root level="INFO">
<appender-ref ref="jsonConsoleAppender"/>
</root>
</configuration>
Here, we’ve added a few tags to improve the readability of the output, added more metadata, and obfuscated some fields. Let’s look at each one individually:
在这里,我们添加了一些标签以提高输出的可读性,添加了更多元数据,并对一些字段进行了混淆处理。让我们逐一查看:
- configuration: The root tag containing the logging configuration
- appender name: The appender name that we’ve defined to reuse in other tags
- appender class: The fully-qualified name of the class that implements the logging appender. We’ve used the ConsoleAppender class from logback-core.
- encoder class: The logging encoder implementation, which in our case is the LogstashEncoder from logstash-logback-encoder
- includeCallerData: Adds more information about the caller code that originated that log line
- jsonGeneratorDecorator: To print JSON in a pretty format, we’ve added that tag with a nested decorator tag that references the CompositeJsonGeneratorDecorator class.
- decorator class: We’ve used the PrettyPrintingJsonGeneratorDecorator class to print the JSON output in a prettier way, showing each field in a different line.
- decorator class: Here, the MaskingJsonGeneratorDecorator class obfuscates any field data.
- defaultMask: The mask that substitutes the fields defined in the path tag. This is useful to mask sensitive data and make our applications PII complainant when using structured logs.
- path: The field name to apply the mask defined in the defaultMask tag
With the new configuration, the same logs of section 3.2. should look similar to:
使用新配置后,3.2.节中的相同日志应与之相似:
{
"@timestamp" : "2023-12-20T22:44:58.0961616-03:00",
"@version" : "1",
"message" : "Example log message",
"logger_name" : "info_logger",
"thread_name" : "main",
"level" : "INFO",
"level_value" : 20000,
"custom_message" : "my_message",
"password" : "XXXX",
"caller_class_name" : "StructuredLog4jExampleUnitTest",
"caller_method_name" : "givenStructuredLog_whenUseLog4j_thenExtractCorrectInformation",
"caller_file_name" : "StructuredLog4jExampleUnitTest.java",
"caller_line_number" : 16
}
4. Implementing Structured Logs
4.实施结构化日志
To illustrate structured logging, we’ll use a demo application with a User class.
为了说明结构化日志,我们将使用一个带有 User 类的演示应用程序。
4.1. Creating the Demo User Class
4.1.创建演示用户类 4.1.
Let’s first create a Java POJO named User:
让我们首先创建一个名为 User 的 Java POJO:
public class User {
private String id;
private String name;
private String password;
// getters, setters, and all-args constructor
}
4.2. Exercising Use-Cases of Structured Loggers
4.2.结构化日志记录器用例演练
Let’s create a test class to illustrate the usage of structured logging:
让我们创建一个测试类来说明结构化日志的用法:
public class StructuredLog4jExampleUnitTest {
Logger logger = LoggerFactory.getLogger("logger_name_example");
//...
}
Here, we created a variable to store an instance of the Logger interface. We’ve used the LoggerFactory.getLogger() method with an arbitrary name as a parameter to get a valid implementation of Logger.
在这里,我们创建了一个变量来存储 Logger 接口的实例。我们使用了带有任意名称作为参数的 LoggerFactory.getLogger() 方法来获取 Logger 的有效实现。
Now, let’s define a test case to print a message at info level:
现在,让我们定义一个测试用例,在 info 级别打印一条信息:
@Test
void whenInfoLoggingData_thenFormatItCorrectly() {
User user = new User("1", "John Doe", "123456");
logger.atInfo().addKeyValue("user_info", user)
.log();
}
In the above code, we’ve defined a User with some data. Then, we used the addKeyValue() method of the LoggingEventBuilder to append the user_info information to the logger variable created before.
在上述代码中,我们定义了一个包含一些数据的 User 。然后,我们使用 LoggingEventBuilder 的 addKeyValue() 方法将 user_info 信息附加到之前创建的 logger 变量中。
Let’s see how the logger outputs the log with the newly added information user_info:
让我们看看 logger 如何输出带有新添加信息 user_info 的日志:
{
"@timestamp" : "2023-12-21T23:58:03.0581889-03:00",
"@version" : "1",
"message" : "Processed user succesfully",
"logger_name" : "logger_name_example",
"thread_name" : "main",
"level" : "INFO",
"level_value" : 20000,
"user_info" : {
"id" : "1",
"name" : "John Doe",
"password" : "XXXX"
},
"caller_class_name" : "StructuredLog4jExampleUnitTest",
"caller_method_name" : "whenInfoLoggingData_thenFormatItCorrectly",
"caller_file_name" : "StructuredLog4jExampleUnitTest.java",
"caller_line_number" : 21
}
Logs are also helpful in identifying errors in our code. Thus, we can also use LoggingEventBuilder to illustrate error logging in a catch block:
日志还有助于识别代码中的错误。因此,我们还可以使用 LoggingEventBuilder 来说明 catch 块中的错误日志:
@Test
void givenStructuredLog_whenUseLog4j_thenExtractCorrectInformation() {
User user = new User("1", "John Doe", "123456");
try {
throwExceptionMethod();
} catch (RuntimeException ex) {
logger.atError().addKeyValue("user_info", user)
.setMessage("Error processing given user")
.addKeyValue("exception_class", ex.getClass().getSimpleName())
.addKeyValue("error_message", ex.getMessage())
.log();
}
}
In the test above, we’ve added more key-value pairs for the exception message and class name. Let’s see the log output:
在上面的测试中,我们为异常消息和类名添加了更多键值对。让我们看看日志输出:
{
"@timestamp" : "2023-12-22T00:04:52.8414988-03:00",
"@version" : "1",
"message" : "Error processing given user",
"logger_name" : "logger_name_example",
"thread_name" : "main",
"level" : "ERROR",
"level_value" : 40000,
"user_info" : {
"id" : "1",
"name" : "John Doe",
"password" : "XXXX"
},
"exception_class" : "RuntimeException",
"error_message" : "Error saving user data",
"caller_class_name" : "StructuredLog4jExampleUnitTest",
"caller_method_name" : "givenStructuredLog_whenUseLog4j_thenExtractCorrectInformation",
"caller_file_name" : "StructuredLog4jExampleUnitTest.java",
"caller_line_number" : 35
}
5. Advantages of Structured Logging
5.结构化日志的优势
Structured logging has some advantages over unstructured logging, like readability and efficiency.
结构化日志与非结构化日志相比有一些优势,比如可读性和效率。
5.1. Readability
5.1.可读性
Logs are typically one of the best tools to troubleshoot software, measure performance, and check if the applications behave as expected. Thus, creating a system where we can read log lines more easily is important.
日志通常是排除软件故障、衡量性能和检查应用程序是否按预期运行的最佳工具之一。因此,创建一个能让我们更轻松地读取日志行的系统非常重要。
Structured logs show data as a dictionary, which makes it easier for the human brain to search for a specific field across the log line. It’s the same concept as searching for a specific chapter in a book using an index versus reading the content page by page.
结构化日志以字典形式显示数据,使人脑更容易在日志行中搜索特定字段。这与使用索引搜索一本书中的特定章节与逐页阅读内容的概念相同。
5.2. Efficiency
5.2 效率
In general, data visualization tools like Kibana, New Relic, and Splunk use a query language to search for a specific value across all log lines in a specific time window. Log search queries are easier to write when using structured logging since the data is in a key-value format.
一般而言,Kibana、New Relic 和 Splunk 等数据可视化工具使用查询语言在特定时间窗口内搜索所有日志行中的特定值。使用结构化日志时,日志搜索查询更易于编写,因为数据是 key-value 格式。
Additionally, using structured logging, it’s easier to create business metrics about the data provided. In that case, searching for business data in a consistent, structured format is easier and more efficient than searching for specific words in the whole log text.
此外,使用结构化日志更容易创建有关所提供数据的业务指标。在这种情况下,以一致的结构化格式搜索业务数据要比在整个日志文本中搜索特定字词更容易、更高效。
Finally, queries to search structured data use less complex algorithms, which might decrease cloud computing costs depending on the tool used.
最后,搜索结构化数据的查询使用不太复杂的算法,这可能会降低云计算成本,具体取决于所使用的工具。
6. Conclusion
6.结论
In this article, we saw one way to implement structured logging in Java using slf4j and logback.
在本文中,我们介绍了使用 slf4j 和 logback 在 Java 中实现结构化日志的一种方法。
Using formatted, structured logs allows machines and humans to read them faster, making our application easier to troubleshoot and reducing the complexity of consuming log events.
使用格式化、结构化的日志可以让机器和人类更快地读取日志,从而使我们的应用程序更容易排除故障,并降低消耗日志事件的复杂性。
As always, the source code is available over on GitHub.
与往常一样,源代码可在 GitHub 上获取。