Mask Sensitive Data in Logs With Logback

Last modified: June 14, 2021

1. Overview

With so much data being logged, it’s important to mask users’ sensitive details before they reach the log files. Under the GDPR in particular, we must pay special attention to how we log personal data.

In this tutorial, we’ll see how to mask sensitive data in logs with Logback. This approach isn’t a complete solution to the problem; it’s better seen as a last line of defense for our log files.

2. Logback

Logback is one of the most widely used logging frameworks in the Java community. It was designed as a successor to Log4j: it offers a faster implementation, more configuration options, and more flexibility in archiving old log files.

Sensitive data is any information that is meant to be protected from unauthorized access. This includes personally identifiable information (PII) such as Social Security numbers, as well as banking information, login credentials, postal addresses, email addresses, and more.

We’ll mask the sensitive data that belongs to users while logging within our application.

3. Masking Data

Let’s say we log user details in the context of a web request, so we need to mask the sensitive data related to users. Let’s assume our application logs the following request or response body:

{
    "user_id":"87656",
    "SSN":"786445563",
    "address":"22 Street",
    "city":"Chicago",
    "Country":"U.S.",
    "ip_address":"192.168.1.1",
    "email_id":"spring@baeldung.com"
}

Here, we can see that we have sensitive data such as SSN, address, ip_address, and email_id. Hence, we have to mask this data while logging.

We’ll mask the logs centrally by configuring masking rules for all log entries produced by Logback. To do that, we have to extend ch.qos.logback.classic.PatternLayout with a custom layout class.

3.1. PatternLayout

The idea behind the configuration is to equip every Logback appender we need with a custom layout. In our case, we’ll write a MaskingPatternLayout class as a subclass of PatternLayout. Each mask pattern is a regular expression that matches one type of sensitive data, with a capturing group around the part that should be masked.

Let’s build the MaskingPatternLayout class:

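import ch.qos.logback.classic.PatternLayout;
import ch.qos.logback.classic.spi.ILoggingEvent;

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class MaskingPatternLayout extends PatternLayout {

    private Pattern multilinePattern;
    private final List<String> maskPatterns = new ArrayList<>();

    // Invoked by Logback once for every <maskPattern> element in logback.xml
    public void addMaskPattern(String maskPattern) {
        maskPatterns.add(maskPattern);
        // Recompile all patterns collected so far into a single alternation
        multilinePattern = Pattern.compile(maskPatterns.stream().collect(Collectors.joining("|")), Pattern.MULTILINE);
    }

    @Override
    public String doLayout(ILoggingEvent event) {
        // Let PatternLayout format the event first, then mask the formatted text
        return maskMessage(super.doLayout(event));
    }

    private String maskMessage(String message) {
        if (multilinePattern == null) {
            return message; // no mask patterns configured
        }
        StringBuilder sb = new StringBuilder(message);
        Matcher matcher = multilinePattern.matcher(sb);
        while (matcher.find()) {
            // Overwrite every capturing group that took part in this match with '*'
            IntStream.rangeClosed(1, matcher.groupCount()).forEach(group -> {
                if (matcher.group(group) != null) {
                    IntStream.range(matcher.start(group), matcher.end(group)).forEach(i -> sb.setCharAt(i, '*'));
                }
            });
        }
        return sb.toString();
    }
}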

Our override of doLayout() first lets PatternLayout format the log event and then masks every part of the resulting message that matches one of the configured patterns.

The maskPattern entries from logback.xml are combined into a single multiline pattern. Unfortunately, the Logback engine doesn’t support constructor injection. Instead, when a property is given as a list of entries, Logback invokes addMaskPattern once for every config entry. So, we have to recompile the combined pattern every time a new regex is added to the list.
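
To make the effect of these repeated calls concrete, here is a minimal sketch that reproduces the same idea without Logback. The class name MaskPatternSketch and the currentPattern() helper are hypothetical and exist only for illustration; the logic mirrors MaskingPatternLayout but relies on nothing beyond java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, Logback-free stand-in for MaskingPatternLayout (illustration only)
class MaskPatternSketch {

    private final List<String> maskPatterns = new ArrayList<>();
    private Pattern combined;

    // Called once per configured regex, so the alternation is rebuilt each time
    void addMaskPattern(String regex) {
        maskPatterns.add(regex);
        combined = Pattern.compile(String.join("|", maskPatterns), Pattern.MULTILINE);
    }

    // Hypothetical helper: exposes the combined regex for inspection
    String currentPattern() {
        return combined == null ? "" : combined.pattern();
    }

    String mask(String message) {
        if (combined == null) {
            return message; // nothing configured yet
        }
        StringBuilder sb = new StringBuilder(message);
        Matcher matcher = combined.matcher(message);
        while (matcher.find()) {
            // Only the group belonging to the alternative that matched is non-null
            for (int group = 1; group <= matcher.groupCount(); group++) {
                if (matcher.group(group) != null) {
                    for (int i = matcher.start(group); i < matcher.end(group); i++) {
                        sb.setCharAt(i, '*');
                    }
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        MaskPatternSketch sketch = new MaskPatternSketch();

        sketch.addMaskPattern("\"SSN\"\\s*:\\s*\"(.*?)\"");
        System.out.println(sketch.currentPattern()); // one alternative

        sketch.addMaskPattern("(\\d+\\.\\d+\\.\\d+\\.\\d+)");
        System.out.println(sketch.currentPattern()); // two alternatives joined by '|'

        String line = "{\"SSN\":\"786445563\",\"ip_address\":\"192.168.1.1\",\"city\":\"Chicago\"}";
        System.out.println(sketch.mask(line));
    }
}
```

Because each alternative contributes its own capturing group, only the group of the alternative that actually matched is overwritten, and everything outside the groups (keys, quotes, unrelated fields) is left untouched.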

3.2. Configuration

In general, we can use regex patterns for masking sensitive user details.

For example, for the SSN, we can use a regex like:

\"SSN\"\s*:\s*\"(.*?)\"

And for the address, we can use:

\"address\"\s*:\s*\"(.*?)\" 

Furthermore, for IPv4 addresses (such as 192.168.1.1), we can use the regex:

(\d+\.\d+\.\d+\.\d+)

Finally, for email, we can write:

(\w+@\w+\.\w+)
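
The address pattern above uses the lazy quantifier .*? rather than the greedy .*, and the difference matters when the whole JSON document is logged on a single line. The following sketch compares both forms on a shortened version of our sample; the class name QuantifierSketch and the firstCapture() helper are hypothetical names chosen for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing lazy and greedy capturing groups
class QuantifierSketch {

    // Returns the text captured by group 1 of the first match, or null if nothing matches
    static String firstCapture(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String json = "{\"SSN\":\"786445563\",\"city\":\"Chicago\"}";

        // Lazy: stops at the first closing quote, so only the SSN value is captured
        System.out.println(firstCapture("\"SSN\"\\s*:\\s*\"(.*?)\"", json));

        // Greedy: runs to the last quote on the line, so the capture also swallows the city field
        System.out.println(firstCapture("\"SSN\"\\s*:\\s*\"(.*)\"", json));
    }
}
```

With the greedy form, the masking layout would overwrite everything up to the last quote on the line, including fields that were never meant to be masked, which is why the lazy form is the safer default for quoted JSON values.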

Now, we’ll add these regex patterns as maskPattern elements inside our logback.xml file:

<configuration>
    <appender name="mask" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
            <layout class="com.baeldung.logback.MaskingPatternLayout">
                <maskPattern>\"SSN\"\s*:\s*\"(.*?)\"</maskPattern> <!-- SSN JSON pattern -->
                <maskPattern>\"address\"\s*:\s*\"(.*?)\"</maskPattern> <!-- Address JSON pattern -->
                <maskPattern>(\d+\.\d+\.\d+\.\d+)</maskPattern> <!-- IPv4 address pattern -->
                <maskPattern>(\w+@\w+\.\w+)</maskPattern> <!-- Email pattern -->
                <pattern>%-5p [%d{ISO8601,UTC}] [%thread] %c: %m%n%rootException</pattern>
            </layout>
        </encoder>
    </appender>

    <!-- Attach the masking appender so that log events are routed through it -->
    <root level="INFO">
        <appender-ref ref="mask"/>
    </root>
</configuration>

3.3. Execution

Now, we’ll create the JSON for the above example and use logger.info() to log the details:

// A HashMap doesn't preserve insertion order, so the field order in the output may differ
Map<String, String> user = new HashMap<>();
user.put("user_id", "87656");
user.put("SSN", "786445563");
user.put("address", "22 Street");
user.put("city", "Chicago");
user.put("Country", "U.S.");
user.put("ip_address", "192.168.1.1");
user.put("email_id", "spring@baeldung.com");
JSONObject userDetails = new JSONObject(user);

// The layout masks the formatted message before it is written out
logger.info("User JSON: {}", userDetails);

After executing this, we can see the output:

INFO  [2021-06-01 16:04:12,059] [main] com.baeldung.logback.MaskingPatternLayoutExample: User JSON: {"email_id":"*******************","address":"*********","user_id":"87656","city":"Chicago","Country":"U.S.","ip_address":"***********","SSN":"*********"}

Here, we can see that the user JSON in our log output has been masked. Pretty-printed in the original field order, it looks like this:

{
    "user_id":"87656",
    "SSN":"*********",
    "address":"*********",
    "city":"Chicago",
    "Country":"U.S.",
    "ip_address":"***********",
    "email_id":"*******************"
}

With this approach, we can only mask data for which we’ve defined a regular expression in a maskPattern element in logback.xml. Anything that no pattern matches is written to the log unchanged.
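
As a small illustration of this limitation, the sketch below applies only the email pattern to a line that also contains a phone number. The "phone" field, its value, and the class name UnmaskedFieldSketch are hypothetical and not part of the tutorial’s sample data; the masking helper simply repeats the group-overwriting idea with plain java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo: a field without a matching pattern is logged as-is
class UnmaskedFieldSketch {

    // Replaces every character captured by a group of the pattern with '*'
    static String mask(Pattern pattern, String message) {
        StringBuilder sb = new StringBuilder(message);
        Matcher matcher = pattern.matcher(message);
        while (matcher.find()) {
            for (int group = 1; group <= matcher.groupCount(); group++) {
                if (matcher.group(group) != null) {
                    for (int i = matcher.start(group); i < matcher.end(group); i++) {
                        sb.setCharAt(i, '*');
                    }
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Only the email pattern is configured; no pattern covers the phone field
        Pattern emailOnly = Pattern.compile("(\\w+@\\w+\\.\\w+)");
        String line = "{\"email_id\":\"spring@baeldung.com\",\"phone\":\"555-0100\"}";

        // The email value is replaced by asterisks, while the phone number stays readable
        System.out.println(mask(emailOnly, line));
    }
}
```

In practice, this means every new kind of sensitive field needs its own maskPattern entry before it can be considered protected.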

4. Conclusion

In this tutorial, we covered how to use a custom PatternLayout to mask sensitive data in application logs with Logback. We also saw how to add regex patterns in logback.xml for masking specific data.

As usual, code snippets are available over on GitHub.
