A Guide to Transactions Across Microservices

Last modified: December 14, 2017

1. Introduction

In this article, we’ll discuss options to implement a transaction across microservices.

We’ll also check out some alternatives to transactions in a distributed microservice scenario.

2. Avoiding Transactions Across Microservices

A distributed transaction is a very complex process with a lot of moving parts that can fail. Also, if these parts run on different machines or even in different data centers, the process of committing a transaction could become very long and unreliable.

This could seriously affect the user experience and overall system throughput. So one of the best ways to solve the problem of distributed transactions is to avoid them completely.

2.1. Example of Architecture Requiring Transactions

Usually, a microservice is designed in such a way as to be independent and useful on its own. It should be able to solve some atomic business task.

If we could split our system into such microservices, there’s a good chance we wouldn’t need to implement transactions between them at all.

For example, let’s consider a system of broadcast messaging between users.

The user microservice would be concerned with the user profile (creating a new user, editing profile data, etc.) with the following underlying domain class:

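import java.io.Serializable;
import java.time.Instant;
import javax.persistence.Basic;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;

// Domain class owned by the user microservice
@Entity
public class User implements Serializable {

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private long id;

    @Basic
    private String name;

    @Basic
    private String surname;

    // Time of the user's last message, shown in her profile
    @Basic
    private Instant lastMessageTime;
}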

The message microservice would be concerned with broadcasting. It encapsulates the entity Message and everything around it:

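import java.io.Serializable;
import java.time.Instant;
import javax.persistence.Basic;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;

// Domain class owned by the message microservice
@Entity
public class Message implements Serializable {

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private long id;

    // Plain id of the author; the User class itself isn't visible here
    @Basic
    private long userId;

    @Basic
    private String contents;

    @Basic
    private Instant messageTimestamp;
}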

Each microservice has its own database. Notice that we don’t refer to the entity User from the entity Message, as the user classes aren’t accessible from the message microservice. We refer to the user only by id.

Now the User entity contains the lastMessageTime field because we want to show the information about the last user activity time in her profile.

However, to add a new message to the user and update her lastMessageTime, we’d now have to implement a transaction across microservices.

2.2. Alternative Approach Without Transactions

We can alter our microservice architecture and remove the field lastMessageTime from the User entity.

Then we could display this time in the user profile by issuing a separate request to the messages microservice and finding the maximum messageTimestamp value for all messages of this user.

Admittedly, if the message microservice is under high load or even down, we won’t be able to show the time of the user’s last message in her profile.

But that could be more acceptable than failing to commit a distributed transaction to save a message just because the user microservice didn’t respond in time.
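To make this trade-off concrete, here’s a minimal sketch of such a lookup. The MessageView and LastMessageTimeLookup classes and the messageClient function are hypothetical names; the function merely stands in for whatever client we’d use to call the message microservice:

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical read-only view of a message, as the message microservice might return it
final class MessageView {
    final long userId;
    final Instant messageTimestamp;

    MessageView(long userId, Instant messageTimestamp) {
        this.userId = userId;
        this.messageTimestamp = messageTimestamp;
    }
}

// Hypothetical helper used when rendering the user profile
final class LastMessageTimeLookup {

    // Stands in for the remote call; it may throw if the message microservice is down
    private final Function<Long, List<MessageView>> messageClient;

    LastMessageTimeLookup(Function<Long, List<MessageView>> messageClient) {
        this.messageClient = messageClient;
    }

    // Returns the maximum messageTimestamp of the user's messages, or empty
    // if there are no messages or the message microservice can't be reached
    Optional<Instant> lastMessageTime(long userId) {
        try {
            return messageClient.apply(userId).stream()
                    .filter(m -> m.userId == userId)
                    .map(m -> m.messageTimestamp)
                    .max(Comparator.naturalOrder());
        } catch (RuntimeException serviceUnavailable) {
            // Degrade gracefully: the profile is shown without the last activity time
            return Optional.empty();
        }
    }
}
```

If the call fails, the profile simply renders without the last activity time, while saving new messages isn’t affected at all.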

There are, of course, more complex scenarios where we have to implement a business process across multiple microservices and can’t allow inconsistency between them.

3. Two-Phase Commit Protocol

The two-phase commit protocol (or 2PC) is a mechanism for implementing a transaction across different software components (multiple databases, message queues, etc.).

3.1. The Architecture of 2PC

One of the important participants in a distributed transaction is the transaction coordinator. The distributed transaction consists of two steps:

  • Prepare phase — during this phase, all participants of the transaction prepare for commit and notify the coordinator that they are ready to complete the transaction
  • Commit or Rollback phase — during this phase, either a commit or a rollback command is issued by the transaction coordinator to all participants
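
The two phases above can be sketched as follows. This is only an illustration under simplifying assumptions: the Participant interface and the InMemoryParticipant and TwoPhaseCoordinator classes are hypothetical, everything runs in a single process, and timeouts, crashes, and recovery logs are ignored:

```java
import java.util.List;

// Hypothetical participant contract; real resources expose a richer API
interface Participant {
    boolean prepare();   // vote: true means "ready to commit"
    void commit();
    void rollback();
}

// Toy participant that only records which state it ended up in
final class InMemoryParticipant implements Participant {
    private final boolean canCommit;
    String state = "INITIAL";

    InMemoryParticipant(boolean canCommit) {
        this.canCommit = canCommit;
    }

    public boolean prepare() {
        state = canCommit ? "PREPARED" : "FAILED";
        return canCommit;
    }

    public void commit() {
        state = "COMMITTED";
    }

    public void rollback() {
        state = "ROLLED_BACK";
    }
}

// Minimal in-memory coordinator
final class TwoPhaseCoordinator {

    // Returns true if the transaction was committed, false if it was rolled back
    static boolean execute(List<Participant> participants) {
        // Phase 1 (prepare): collect votes, stopping at the first "no"
        boolean allReady = true;
        for (Participant p : participants) {
            if (!safePrepare(p)) {
                allReady = false;
                break;
            }
        }
        // Phase 2: issue the same decision to all participants
        for (Participant p : participants) {
            if (allReady) {
                p.commit();
            } else {
                p.rollback();
            }
        }
        return allReady;
    }

    // A participant that fails while preparing counts as a "no" vote
    private static boolean safePrepare(Participant p) {
        try {
            return p.prepare();
        } catch (RuntimeException e) {
            return false;
        }
    }
}
```

Even in this toy version, every participant is contacted twice and nothing is final until the slowest one has answered.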

The problem with 2PC is that it is quite slow compared to the time a single microservice needs to complete an operation.

Coordinating the transaction between microservices, even if they are on the same network, can really slow the system down, so this approach isn’t usually used in a high load scenario.

3.2. XA Standard

The XA standard is a specification for conducting 2PC distributed transactions across supporting resources. Any JTA-compliant application server (JBoss, GlassFish, etc.) supports it out of the box.

The resources participating in a distributed transaction could be, for example, two databases of two different microservices.

However, to take advantage of this mechanism, the resources have to be deployed to a single JTA platform. This isn’t always feasible for a microservice architecture.

3.3. REST-AT Standard Draft

Another proposed standard is REST-AT, which has undergone some development by Red Hat but still hasn’t left the draft stage. It is, however, supported by the WildFly application server out of the box.

This standard allows using the application server as a transaction coordinator with a specific REST API for creating and joining distributed transactions.

The RESTful web services that wish to participate in the two-phase transaction also have to support a specific REST API.

Unfortunately, to bridge a distributed transaction to local resources of the microservice, we’d still have to either deploy these resources to a single JTA platform or solve a non-trivial task of writing this bridge ourselves.

4. Eventual Consistency and Compensation

By far, one of the most feasible models of handling consistency across microservices is eventual consistency.

This model doesn’t enforce distributed ACID transactions across microservices. Instead, it relies on mechanisms that ensure the system becomes consistent at some point in the future.

4.1. A Case for Eventual Consistency

For example, suppose we need to solve the following task:

  • register a user profile
  • do some automated background check that the user can actually access the system

The second task is to ensure, for example, that this user wasn’t banned from our servers for some reason.

But it could take time, and we’d like to extract it to a separate microservice. It wouldn’t be reasonable to keep the user waiting for so long just to know that she was registered successfully.

One way to solve it would be with a message-driven approach including compensation. Let’s consider the following architecture:

  • the user microservice tasked with registering a user profile
  • the validation microservice tasked with doing a background check
  • the messaging platform that supports persistent queues

The messaging platform could ensure that the messages sent by the microservices are persisted. They would then be delivered at a later time if the receiver isn’t currently available.

4.2. Happy Scenario

In this architecture, a happy scenario would be:

  • the user microservice registers a user, saving information about her in its local database
  • the user microservice marks this user with a flag. It could signify that this user hasn’t yet been validated and doesn’t have access to full system functionality
  • a confirmation of registration is sent to the user with a warning that not all functionality of the system is accessible right away
  • the user microservice sends a message to the validation microservice to do the background check of a user
  • the validation microservice runs the background check and sends a message to the user microservice with the results of the check
    • if the results are positive, the user microservice unblocks the user
    • if the results are negative, the user microservice deletes the user account

After we’ve gone through all these steps, the system should be in a consistent state. However, for some period of time, the user entity appeared to be in an incomplete state.

The last step, when the user microservice removes the invalid account, is a compensation phase.
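
The happy scenario, including the compensation step, can be sketched with in-memory queues standing in for the messaging platform. All names here (RegistrationFlow, Status, and the queue fields) are hypothetical, and both microservices are collapsed into one class purely to keep the example short:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Toy model of the flow; real microservices would run in separate processes
final class RegistrationFlow {

    enum Status { PENDING_VALIDATION, ACTIVE }

    // Local database of the user microservice
    final Map<String, Status> users = new HashMap<>();
    // Data owned by the validation microservice
    final Set<String> banned = new HashSet<>();
    // Queues standing in for a persistent messaging platform
    final Queue<String> validationRequests = new ArrayDeque<>();
    final Queue<String[]> validationResults = new ArrayDeque<>(); // {name, "OK" or "BANNED"}

    // User microservice: save locally, flag as not validated, request the check
    void register(String name) {
        users.put(name, Status.PENDING_VALIDATION);
        validationRequests.add(name);
    }

    // Validation microservice: consume requests and reply with the outcome
    void runValidation() {
        String name;
        while ((name = validationRequests.poll()) != null) {
            String outcome = banned.contains(name) ? "BANNED" : "OK";
            validationResults.add(new String[] {name, outcome});
        }
    }

    // User microservice: unblock on success, otherwise compensate
    void applyResults() {
        String[] result;
        while ((result = validationResults.poll()) != null) {
            if ("OK".equals(result[1])) {
                users.computeIfPresent(result[0], (k, v) -> Status.ACTIVE);
            } else {
                users.remove(result[0]); // compensation: delete the invalid account
            }
        }
    }
}
```

Between register() and applyResults(), a user stays in the PENDING_VALIDATION state, which corresponds to the temporarily incomplete state described above.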

4.3. Failure Scenarios

Now let’s consider some failure scenarios:

  • if the validation microservice is not accessible, then the messaging platform with its persistent queue functionality ensures that the validation microservice would receive this message at some later time
  • if the messaging platform fails, the user microservice tries to send the message again at some later time, for example, by scheduled batch-processing of all users that were not yet validated
  • if the validation microservice receives the message, validates the user but can’t send the answer back due to the messaging platform failure, the validation microservice also retries sending the message at some later time
  • if one of the messages got lost, or some other failure happened, the user microservice finds all non-validated users by scheduled batch-processing and sends requests for validation again

Even if some of the messages were issued multiple times, this wouldn’t affect the consistency of the data in the microservices’ databases.
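
This holds as long as the message handlers are idempotent, which is an assumption we have to satisfy ourselves. Here’s a minimal sketch of the retry job together with such handlers; the PendingUserStore class and its method names are hypothetical, and the sendValidationRequest callback stands in for the messaging platform:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical slice of the user microservice that tracks validation state
final class PendingUserStore {

    // userId -> whether the background check has passed
    private final Map<Long, Boolean> validated = new LinkedHashMap<>();

    void register(long userId) {
        validated.putIfAbsent(userId, false);
    }

    // Idempotent: a duplicate "passed" message leaves the state unchanged,
    // and a late one for a deleted user doesn't bring the account back
    void onValidationPassed(long userId) {
        validated.computeIfPresent(userId, (id, old) -> true);
    }

    // Idempotent: deleting an already deleted account is a no-op
    void onValidationFailed(long userId) {
        validated.remove(userId);
    }

    List<Long> findNotValidated() {
        List<Long> pending = new ArrayList<>();
        for (Map.Entry<Long, Boolean> e : validated.entrySet()) {
            if (!e.getValue()) {
                pending.add(e.getKey());
            }
        }
        return pending;
    }

    // Scheduled batch job: request validation again for every pending user.
    // Returns how many requests the messaging platform accepted.
    int resendPending(Consumer<Long> sendValidationRequest) {
        int sent = 0;
        for (Long userId : findNotValidated()) {
            try {
                sendValidationRequest.accept(userId);
                sent++;
            } catch (RuntimeException platformDown) {
                // Leave the user pending; the next scheduled run tries again
            }
        }
        return sent;
    }

    boolean exists(long userId) {
        return validated.containsKey(userId);
    }

    boolean isValidated(long userId) {
        return Boolean.TRUE.equals(validated.get(userId));
    }
}
```

Because the job re-reads the list of non-validated users on every run, a lost request or a lost answer only delays validation instead of leaving the data inconsistent.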

By carefully considering all possible failure scenarios, we can ensure that our system would satisfy the conditions of eventual consistency. At the same time, we wouldn’t need to deal with the costly distributed transactions.

But we have to be aware that ensuring eventual consistency is a complex task. It doesn’t have a single solution for all cases.

5. Conclusion

In this article, we’ve discussed some of the mechanisms for implementing transactions across microservices.

We’ve also explored some alternatives to using this style of transaction in the first place.