Using Transactions for Read-Only Operations

Last modified: May 23, 2022


1. Overview

In this article, we’ll discuss read-only transactions. We’ll look at their purpose and how to use them, and examine some of their nuances related to performance and optimization. For the sake of simplicity, we’ll focus on MySQL’s InnoDB engine. Keep in mind, though, that some of the information described here can change depending on the database or storage engine.

2. What Is a Transaction?

A transaction is an atomic operation that consists of one or more statements. It’s atomic because all statements within the operation either succeed (are committed) or fail (are rolled back): all or nothing. The letter ‘A’ in ACID stands for this atomicity.

Another critical thing to understand is that every statement in the InnoDB engine runs inside a transaction, either explicitly or implicitly. This gets a lot harder to reason about when we add concurrency to the equation, which is where another ACID property comes in: the ‘I’, for isolation.

Understanding isolation levels is essential for reasoning about the trade-off between performance and consistency guarantees. Before going into details, though, remember that since all statements in InnoDB run in transactions, they can be committed or rolled back. If no transaction is specified, the database creates one, and the autocommit property determines whether it’s committed automatically.

2.1. Isolation Levels

For this article, we’ll assume MySQL’s default isolation level, repeatable read. It provides consistent reads within the same transaction: the first read establishes a snapshot (a point in time), and all subsequent reads are consistent with it. The official MySQL documentation has more information on this. Keeping such snapshots has a cost, but it guarantees a good level of consistency.

Other databases may use different names or offer different isolation level options, but they’ll most likely be similar.
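The idea that the first read fixes the snapshot can be illustrated with a toy model. The sketch below is plain Java and is not how InnoDB is implemented: SnapshotReadSketch, Transaction, write, and read are hypothetical names, and copying the whole map stands in for the engine’s real versioning machinery. It only shows the visible behavior described above.

```java
import java.util.HashMap;
import java.util.Map;

public class SnapshotReadSketch {

    // Latest committed values, visible to any transaction that has not read yet.
    private final Map<String, String> committed = new HashMap<>();

    /** Simulates an autocommitted write coming from some other session. */
    public void write(String key, String value) {
        committed.put(key, value);
    }

    public Transaction begin() {
        return new Transaction();
    }

    public class Transaction {
        // Taken lazily: the first read fixes the point in time (toy model only).
        private Map<String, String> snapshot;

        public String read(String key) {
            if (snapshot == null) {
                snapshot = new HashMap<>(committed);
            }
            return snapshot.get(key);
        }
    }

    public static void main(String[] args) {
        SnapshotReadSketch db = new SnapshotReadSketch();
        db.write("title", "v1");

        Transaction tx = db.begin();
        db.write("title", "v2");                      // committed before the first read
        System.out.println(tx.read("title"));         // v2: the snapshot is taken here
        db.write("title", "v3");                      // committed after the snapshot
        System.out.println(tx.read("title"));         // v2: same snapshot as before
        System.out.println(db.begin().read("title")); // v3: new transaction, new snapshot
    }
}
```

Note that in this model the snapshot is established by the first read, not by begin(), which matches the description of repeatable read given above.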

3. Why and Where to Use a Transaction?

Now that we have a better understanding of what a transaction is and of its properties, let’s talk about read-only transactions. As explained earlier, in the InnoDB engine all statements run in transactions, so they may involve things like locking and snapshots. However, some of the overhead of transaction coordination, such as marking rows with transaction IDs and maintaining other internal structures, may not be necessary for plain queries. That’s where read-only transactions come into play.

We can explicitly define a read-only transaction using the syntax START TRANSACTION READ ONLY. MySQL also tries to detect read-only transactions automatically, but it can apply further optimizations when we declare one explicitly. Read-intensive applications can leverage those optimizations and reduce resource utilization on the database cluster.

3.1. Application vs. Database

Dealing with persistence in our application may involve many layers of abstraction, each with a different responsibility. To simplify, let’s say that in the end those layers affect either how our application deals with the database or how the database deals with data manipulation.

Of course, not all applications have all of those layers, but they represent a good generalization. Assuming we have a Spring application, these layers serve the following purposes:

  • DAO: Acts as a bridge between business logic and persistence nuances
  • Transactional abstraction: Takes care of the application level complexity of transactions (Begin, Commit, Rollback)
  • JPA Abstraction: Java specification that offers a standard API between vendors
  • ORM Framework: The actual implementation behind JPA (for example, Hibernate)
  • JDBC: Responsible for actually communicating with the database

The main takeaway is that many of those layers may affect how our transactions behave. Here, let’s focus on a particular group of properties that directly impacts this behavior. Clients can usually define those properties at the global or session level. The full list of properties is extensive, so we’ll only discuss the two that are crucial, both of which should already be familiar.

3.2. Transaction Management

The JDBC driver starts a transaction from the application side by turning off the autocommit property. It’s the equivalent of a START TRANSACTION statement: from that moment on, all subsequent statements must be committed or rolled back in order to finish the transaction.

When autocommit is turned off at the global level, the database treats all incoming requests as manual transactions and requires the user to commit or roll back. However, this no longer holds if the user overrides the definition at the session level. As a result, many drivers turn this property off by default to guarantee consistent behavior and ensure the application has control over it.

Next, we can use the transaction access mode (READ WRITE or READ ONLY) to define whether write operations are allowed. There’s a caveat: even in a read-only transaction, it’s possible to manipulate tables created with the TEMPORARY keyword. This property also has global and session scope, though we normally deal with it and other properties at the session level in our applications.

Another caveat applies when using connection pools. Because connections are opened once and then reused, the frameworks or libraries that manage transactions and connections have to ensure that each session is in a clean state before starting a new transaction.

For this reason, a few statements may be executed to discard any pending changes and set the session up properly.

We already saw that read-heavy applications can leverage read-only transactions to optimize and save resources in the database cluster. However, many developers forget that switching between these setups also causes round trips to the database, which affects the throughput of the connections.

In MySQL, we can define those properties at the global level as:


SET GLOBAL TRANSACTION READ WRITE;
SET autocommit = 0;
/* transaction */
commit;

Or, we can set the properties at the session level:


SET SESSION TRANSACTION READ ONLY;
SET autocommit = 1;
/* transaction */
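From the application side, the same session-level setup can be reached through the standard JDBC Connection API. The following is a minimal sketch rather than what any particular framework does: ReadOnlyJdbcSketch, SqlWork, and inReadOnlyTransaction are hypothetical names introduced for illustration, and how setReadOnly(true) is translated into SQL depends on the driver and its configuration.

```java
import java.sql.Connection;
import java.sql.SQLException;

public class ReadOnlyJdbcSketch {

    /** Hypothetical callback type: the queries to run inside the transaction. */
    @FunctionalInterface
    public interface SqlWork<T> {
        T apply(Connection connection) throws SQLException;
    }

    public static <T> T inReadOnlyTransaction(Connection connection, SqlWork<T> work)
            throws SQLException {
        // Remember the session state so the connection can be handed back unchanged.
        boolean previousAutoCommit = connection.getAutoCommit();
        boolean previousReadOnly = connection.isReadOnly();

        connection.setReadOnly(true);     // must be set before the transaction starts
        connection.setAutoCommit(false);  // from here on, we commit or roll back manually
        try {
            T result = work.apply(connection);
            connection.commit();
            return result;
        } catch (SQLException | RuntimeException e) {
            connection.rollback();
            throw e;
        } finally {
            // Each of these calls may cost a round trip, which is the overhead
            // discussed above when sessions switch between setups.
            connection.setAutoCommit(previousAutoCommit);
            connection.setReadOnly(previousReadOnly);
        }
    }
}
```

A caller would pass a lambda that runs its SELECT statements on the given connection. Restoring the previous flags in the finally block is the kind of cleanup a pool or framework otherwise has to perform before reusing the connection.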

3.3. Hints

For transactions that execute only one query, enabling the autocommit property may save us round trips. If that’s the most common case in our application, using a separate data source configured as read-only, with autocommit enabled by default, will work even better.

If transactions have more queries, we should use an explicit read-only transaction. Creating a read-only data source can also save round trips by avoiding the switch between read-write and read-only transactions. But if we have mixed workloads, the complexity of managing an additional data source may not be justified.

Another important point when dealing with a transaction with multiple statements is the behavior determined by the isolation level, as it can change our transaction’s result and may impact performance. For the sake of simplicity, we’ll only consider the default one (repeatable read) in our examples.

4. Putting It Into Practice

Now let’s look at how to deal with those properties from the application side and which layers have access to them. Again, there are many different ways of doing this, and the details change depending on the framework. Taking JPA and Spring as an example should give us a good idea of what it would look like in other situations as well.

4.1. JPA

Let’s see how we can effectively define a read-only transaction in our application using JPA/Hibernate:

EntityManagerFactory entityManagerFactory = Persistence.createEntityManagerFactory("jpa-unit");
EntityManager entityManager = entityManagerFactory.createEntityManager();
entityManager.unwrap(Session.class).setDefaultReadOnly(true);
entityManager.getTransaction().begin();
entityManager.find(Book.class, id);
entityManager.getTransaction().commit();

It’s important to note that there’s no standard way to define a read-only transaction in JPA. For that reason, we needed to get the underlying Hibernate session to define it as read-only.

4.2. JPA+Spring

When using the Spring transaction management system, it gets even more straightforward, as we see next:

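// Injected by Spring: a shared, transaction-aware EntityManager proxy
@PersistenceContext
private EntityManager entityManager;

@Transactional(readOnly = true)
public Book getBookById(long id) {
    // Uses the EntityManager bound to the surrounding read-only transaction
    return entityManager.find(Book.class, id);
}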

By doing this, Spring takes on the responsibility of opening, closing, and defining the transaction mode. However, even this is sometimes unnecessary: when using Spring Data JPA, this configuration is already in place.

The Spring Data JPA repository base class marks all of its methods as read-only transactions by applying the annotation at the class level. The behavior of an individual method can then be changed simply by adding @Transactional at the method level.

Last, it’s also possible to define a read-only connection and change the autocommit property when configuring our data source. As we saw, this can further improve the application’s performance if we only need reads. The data source holds those configurations:

@Bean
public DataSource readOnlyDataSource() {
    HikariConfig config = new HikariConfig();
    config.setJdbcUrl("jdbc:mysql://localhost/baeldung?useUnicode=true&characterEncoding=UTF-8");
    config.setUsername("baeldung");
    config.setPassword("baeldung");
    config.setReadOnly(true);
    config.setAutoCommit(true);
    return new HikariDataSource(config);
}

However, this only makes sense in scenarios where most of our application’s resources execute a single query. Also, if we’re using Spring Data JPA, it’s necessary to disable the default transactions created by Spring. To do so, we only need to set the enableDefaultTransactions property to false:

@Configuration
@EnableJpaRepositories(enableDefaultTransactions = false)
@EnableTransactionManagement
public class Config {
    //Definition of data sources and other persistence related beans
}

From this moment on, we have complete control over, and responsibility for, adding @Transactional(readOnly=true) when necessary. Nonetheless, this isn’t the case for the majority of applications, so we shouldn’t change those configurations unless we’re sure that our application will profit from them.

4.3. Routing Statements

In a more realistic scenario, we could have two data sources, a writer and a read-only one. Then, we’d have to define which data source to use at the component level. This approach handles the read connections more efficiently and avoids the unnecessary commands otherwise used to ensure that the session is clean and has the appropriate setup.

There are multiple ways to reach this outcome, but we’ll first create a router data source class:


public class RoutingDS extends AbstractRoutingDataSource {

    public RoutingDS(DataSource writer, DataSource reader) {
        Map<Object, Object> dataSources = new HashMap<>();
        dataSources.put("writer", writer);
        dataSources.put("reader", reader);

        setTargetDataSources(dataSources);
    }

    @Override
    protected Object determineCurrentLookupKey() {
        return ReadOnlyContext.isReadOnly() ? "reader" : "writer";
    }
}
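The routing idea itself does not depend on Spring. As a rough, framework-free sketch (RoutingSketch and currentTarget are hypothetical names, and plain strings stand in for real DataSource instances), a router keeps a map of targets and re-evaluates a lookup key every time a target is requested:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class RoutingSketch<T> {

    private final Map<String, T> targets = new HashMap<>();
    private final Supplier<String> lookupKey;

    public RoutingSketch(T writer, T reader, Supplier<String> lookupKey) {
        targets.put("writer", writer);
        targets.put("reader", reader);
        this.lookupKey = lookupKey;
    }

    /** Resolves the target for the current call; the key is evaluated each time. */
    public T currentTarget() {
        String key = lookupKey.get();
        T target = targets.get(key);
        if (target == null) {
            throw new IllegalStateException("No target registered for key: " + key);
        }
        return target;
    }

    public static void main(String[] args) {
        boolean[] readOnly = { false }; // stand-in for a read-only context flag
        RoutingSketch<String> router = new RoutingSketch<>(
            "writer-pool", "reader-pool", () -> readOnly[0] ? "reader" : "writer");

        System.out.println(router.currentTarget()); // writer-pool
        readOnly[0] = true;
        System.out.println(router.currentTarget()); // reader-pool
    }
}
```

Because the key is resolved per request rather than once at startup, the same router object can serve both read-write and read-only callers.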

There’s a lot more to know about routing data sources. To sum up, in our case this class returns the appropriate data source when the application requests it. To do that, we use the ReadOnlyContext class, which holds the data source context at runtime:

public class ReadOnlyContext {

    private static final ThreadLocal<AtomicInteger> READ_ONLY_LEVEL = ThreadLocal.withInitial(() -> new AtomicInteger(0));

    //default constructor

    public static boolean isReadOnly() {
        return READ_ONLY_LEVEL.get()
            .get() > 0;
    }

    public static void enter() {
        READ_ONLY_LEVEL.get()
            .incrementAndGet();
    }

    public static void exit() {
        READ_ONLY_LEVEL.get()
            .decrementAndGet();
    }
}

Next, we need to define those data sources and register them in the Spring context. For this, we only need to use the RoutingDS class created previously:


//annotations mentioned previously
public class Config {
    //other beans...

    @Bean
    public DataSource routingDataSource() {
        return new RoutingDS(
          dataSource(false, false),
          dataSource(true, true)
        );
    }
    
    private DataSource dataSource(boolean readOnly, boolean isAutoCommit) {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:mysql://localhost/baeldung?useUnicode=true&characterEncoding=UTF-8");
        config.setUsername("baeldung");
        config.setPassword("baeldung");
        config.setReadOnly(readOnly);
        config.setAutoCommit(isAutoCommit);
        return new HikariDataSource(config);
    }

    // other beans...
}

Almost there. Now, let’s create an annotation to tell Spring when to wrap a component in a read-only context. For this, we’ll use the @ReaderDS annotation:

@Inherited
@Retention(RetentionPolicy.RUNTIME)
public @interface ReaderDS {
}

Last, we use AOP to wrap the component execution within the context:


@Aspect
@Component
public class ReadOnlyInterception {
    @Around("@annotation(com.baeldung.readonlytransactions.mysql.spring.ReaderDS)")
    public Object aroundMethod(ProceedingJoinPoint joinPoint) throws Throwable {
        try {
            ReadOnlyContext.enter();
            return joinPoint.proceed();
        } finally {
            ReadOnlyContext.exit();
        }
    }
}
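One reason the context can reasonably be a counter rather than a boolean flag is nesting: if an annotated method calls another annotated method, the inner exit must not switch the outer scope back to the writer. The stand-alone sketch below imitates the enter/exit pairing of the aspect above without Spring or AOP. ReadOnlyNestingSketch and its readOnly helper are hypothetical names used only for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class ReadOnlyNestingSketch {

    // One depth counter per thread, following the same idea as ReadOnlyContext.
    private static final ThreadLocal<AtomicInteger> DEPTH =
        ThreadLocal.withInitial(() -> new AtomicInteger(0));

    public static boolean isReadOnly() {
        return DEPTH.get().get() > 0;
    }

    /** Hypothetical helper: does what the aspect does around an annotated call. */
    public static <T> T readOnly(Supplier<T> body) {
        DEPTH.get().incrementAndGet();
        try {
            return body.get();
        } finally {
            DEPTH.get().decrementAndGet();
        }
    }

    public static void main(String[] args) {
        boolean afterInnerCall = readOnly(() -> {
            readOnly(() -> "inner query"); // depth goes 1 -> 2 -> 1
            return isReadOnly();           // still inside the outer scope
        });
        System.out.println(afterInnerCall); // true
        System.out.println(isReadOnly());   // false: the outer scope has exited
    }
}
```

With a plain boolean, the inner exit would have reset the flag and the remainder of the outer method would have been routed to the writer.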

Usually, we want to add the annotation at the highest level possible. Still, to keep it simple, we’ll add it at the repository layer, where there’s only a single query in the component:

public interface BookRepository extends JpaRepository<BookEntity, Long> {

    @ReaderDS
    @Query("Select t from BookEntity t where t.id = ?1")
    BookEntity get(Long id);
}

As we can see, this setup allows us to deal with read-only operations more efficiently by using fully read-only transactions and avoiding the session context switch. As a result, it can considerably increase our application’s throughput and responsiveness.

5. Conclusion

In this article, we looked at read-only transactions and their benefits. We also saw how the MySQL InnoDB engine deals with them and how to configure the main properties that affect our application’s transactions. Furthermore, we discussed the possibility of additional improvements through dedicated resources such as dedicated data sources. As usual, all code samples used in this article are available over on GitHub.