1. Introduction
In this tutorial, we’ll clarify the difference between GroupId and ConsumerId in Apache Kafka, which is important in understanding how to set up consumers correctly. In addition, we’ll touch on the difference between ClientId and ConsumerId and see how they are related to each other.
2. Consumer Groups
Before exploring the differences between identifier types in Apache Kafka, let’s understand Consumer Groups.
A Consumer Group consists of one or more consumers that work together to consume messages from one or more topics. Groups enable scalability, fault tolerance, and efficient parallel processing of messages in a distributed Kafka environment.
Crucially, each consumer within the group is responsible for processing only a subset of a topic's partitions, and each partition is assigned to exactly one consumer in the group at a time.
3. Understanding Identifiers
Next, let’s define at a high level all of the identifiers we’re considering in this tutorial:
- GroupId uniquely identifies a Consumer Group.
- ClientId is a logical name that a client sends to the server with each request. It identifies the requesting application and doesn't have to be unique.
- ConsumerId is assigned to individual consumers within a Consumer Group and is a combination of the client.id consumer property and the consumer’s unique identifier.
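As a rough sketch of where two of these identifiers come from, the snippet below collects consumer settings in a plain java.util.Properties object. The keys group.id and client.id are the standard consumer properties; the class name, the helper method, and the broker address are illustrative assumptions, and a real consumer would also need deserializer settings. Note that there is no property for the ConsumerId, since Kafka generates it when the consumer joins the group.

```java
import java.util.Properties;

// Hypothetical helper, for illustration only.
class ConsumerIdentifiersConfig {

    // Builds the identifier-related properties a Kafka consumer would be created with.
    static Properties consumerProperties(String groupId, String clientId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed local broker
        props.setProperty("group.id", groupId);   // GroupId: names the Consumer Group
        props.setProperty("client.id", clientId); // ClientId: logical name sent with requests
        // No ConsumerId entry: Kafka assigns it when the consumer joins the group.
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProperties("test-consumer-group", "neo-1");
        System.out.println("group.id  = " + props.getProperty("group.id"));
        System.out.println("client.id = " + props.getProperty("client.id"));
    }
}
```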
4. Purpose of Identifiers
Next, let’s understand the purpose of each identifier.
GroupId is central to the load-balancing mechanism, enabling the distribution of partitions among consumers. Consumer Groups manage the coordination, load balancing, and partition assignment among consumers within the same group. Kafka ensures that, within a group, each partition is assigned to only one consumer at any given time. If a consumer within the group fails, Kafka reassigns its partitions to the remaining consumers to maintain continuity of message processing.
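The effect of a consumer failure can be pictured with a small simulation. This is a minimal sketch, not Kafka's actual assignor: it assumes a simple round-robin strategy, and the class name, method name, and consumer names are all hypothetical. In a real cluster, the assignment is negotiated through the group coordinator.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of the Kafka API.
class GroupAssignmentSketch {

    // Distributes partitions 0..partitionCount-1 over the members in round-robin order,
    // so every partition has exactly one owner within the group.
    static Map<String, List<Integer>> assign(List<String> members, int partitionCount) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String member : members) {
            assignment.put(member, new ArrayList<>());
        }
        if (members.isEmpty()) {
            return assignment; // nobody left to own a partition
        }
        for (int partition = 0; partition < partitionCount; partition++) {
            String owner = members.get(partition % members.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        List<String> members = new ArrayList<>(List.of("consumer-a", "consumer-b", "consumer-c"));
        System.out.println("Before failure: " + assign(members, 6));

        // Simulate a failed consumer: recomputing spreads its partitions over the survivors.
        members.remove("consumer-b");
        System.out.println("After failure:  " + assign(members, 6));
    }
}
```

Recomputing the whole assignment keeps the sketch short. Kafka's built-in assignors differ in how many partitions they move during a rebalance.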
Kafka uses ConsumerIds to ensure that each consumer within the group is uniquely identifiable when interacting with the Kafka broker. This identifier, fully managed by Kafka, lets the group coordinator track group membership and partition assignments. Committed offsets, by contrast, are stored per group and partition, so processing progress survives when one consumer replaces another.
Lastly, ClientId tracks the source of requests, beyond just IP/port, by allowing the developer to configure a logical application name that will be included in server-side request logging. Because we have control over this value, we could create two separate clients with the same ClientId. However, in this case, the ConsumerId generated by Kafka will be different.
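To illustrate why two clients that share a ClientId still end up with different ConsumerIds, here is a minimal sketch that mimics the "<client.id>-<UUID>" shape visible in the logs later in this tutorial. The class and method names are hypothetical. The real value is generated by Kafka, not by application code, and its exact format may vary between Kafka versions and group protocols.

```java
import java.util.UUID;

// Hypothetical illustration of the identifier's shape; Kafka generates the real value.
class MemberIdSketch {

    // Combines the developer-chosen client.id with a random UUID.
    static String newMemberId(String clientId) {
        return clientId + "-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        // Two clients configured with the same client.id...
        String first = newMemberId("neo-1");
        String second = newMemberId("neo-1");

        // ...still receive distinct identifiers because of the random UUID part.
        System.out.println(first);
        System.out.println(second);
        System.out.println("Same id? " + first.equals(second));
    }
}
```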
5. Configuring GroupId and ConsumerId
5.1. Using Spring Kafka
Let’s define GroupId and ConsumerId for our consumers in Spring Kafka. We’ll achieve this by leveraging the @KafkaListener annotation:
@KafkaListener(
    topics = "${kafka.topic.name:test-topic}",
    clientIdPrefix = "neo",
    groupId = "${kafka.consumer.groupId:test-consumer-group}",
    concurrency = "4")
public void receive(@Payload String payload, Consumer<String, String> consumer) {
    // memberId() returns the ConsumerId that Kafka assigned to this consumer
    LOGGER.info("Consumer='{}' received payload='{}'", consumer.groupMetadata().memberId(), payload);
    this.payload = payload;
    latch.countDown();
}
Notice how we set the groupId property to an arbitrary value of our choice.
Additionally, we’ve set the clientIdPrefix property to contain a custom prefix. Let’s inspect application logs to verify that the ConsumerId contains this prefix:
c.b.s.kafka.groupId.MyKafkaConsumer : Consumer='neo-1-bae916e4-eacb-485a-9c58-bc22a0eb6187' received payload='Test 123...'
The value of ConsumerId, also known as memberId, follows a specific pattern. It starts with the clientIdPrefix, followed by a numeric suffix that Spring Kafka appends to keep the client.id of each concurrent consumer distinct, and finally, a UUID generated by Kafka.
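The pattern can be checked with a short parser. This is only a sketch that assumes the "<client.id>-<UUID>" layout seen in the log line above. The class name and regular expression are our own, and the format isn't guaranteed across Kafka versions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; assumes the member id ends with a 36-character UUID.
class MemberIdParser {

    // Everything before the trailing UUID is treated as the client.id.
    private static final Pattern MEMBER_ID = Pattern.compile(
        "^(.+)-([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})$");

    // Returns {clientId, uuid}, or throws if the value doesn't follow the assumed pattern.
    static String[] parse(String memberId) {
        Matcher matcher = MEMBER_ID.matcher(memberId);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Unexpected member id format: " + memberId);
        }
        return new String[] {matcher.group(1), matcher.group(2)};
    }

    public static void main(String[] args) {
        String[] parts = parse("neo-1-bae916e4-eacb-485a-9c58-bc22a0eb6187");
        System.out.println("client.id = " + parts[0]);
        System.out.println("uuid      = " + parts[1]);
    }
}
```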
5.2. Using Kafka CLI
We can also configure the GroupId and ConsumerId via CLI. We’ll work with the kafka-console-consumer.sh script. Let’s start a console consumer with group.id set to test-consumer-group and client.id property set to neo-<sequence_number>:
$ kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic Test --group test-consumer-group --consumer-property "client.id=neo-1"
In this case, we must ensure that each client is assigned a unique client.id. This behavior is different from Spring Kafka, where we set the clientIdPrefix and the framework adds a sequence number to it. If we describe the consumer group, we’ll see the ConsumerId generated by Kafka for each consumer:
$ kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group test-consumer-group --describe
GROUP               TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID                                HOST       CLIENT-ID
test-consumer-group Test  0         0              0              0   neo-1-975feb3f-9e5a-424b-9da3-c2ec3bc475d6 /127.0.0.1 neo-1
test-consumer-group Test  1         0              0              0   neo-1-975feb3f-9e5a-424b-9da3-c2ec3bc475d6 /127.0.0.1 neo-1
test-consumer-group Test  2         0              0              0   neo-1-975feb3f-9e5a-424b-9da3-c2ec3bc475d6 /127.0.0.1 neo-1
test-consumer-group Test  3         0              0              0   neo-1-975feb3f-9e5a-424b-9da3-c2ec3bc475d6 /127.0.0.1 neo-1
test-consumer-group Test  7         0              0              0   neo-3-09b8d4ee-5f03-4386-94b1-e068320b5e6a /127.0.0.1 neo-3
test-consumer-group Test  8         0              0              0   neo-3-09b8d4ee-5f03-4386-94b1-e068320b5e6a /127.0.0.1 neo-3
test-consumer-group Test  9         0              0              0   neo-3-09b8d4ee-5f03-4386-94b1-e068320b5e6a /127.0.0.1 neo-3
test-consumer-group Test  4         0              0              0   neo-2-6a39714e-4bdd-4ab8-bc8c-5463d78032ec /127.0.0.1 neo-2
test-consumer-group Test  5         0              0              0   neo-2-6a39714e-4bdd-4ab8-bc8c-5463d78032ec /127.0.0.1 neo-2
test-consumer-group Test  6         0              0              0   neo-2-6a39714e-4bdd-4ab8-bc8c-5463d78032ec /127.0.0.1 neo-2
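To see at a glance how the ten partitions are spread across the three consumers, we could summarize this output with a few lines of Java. The sketch below assumes the nine-column, whitespace-separated layout shown above, which may differ between Kafka versions. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for summarizing `kafka-consumer-groups.sh --describe` rows.
class DescribeOutputSketch {

    private static final int CONSUMER_ID_COLUMN = 6; // assumed position of CONSUMER-ID

    // Counts how many partitions are assigned to each ConsumerId.
    static Map<String, Integer> partitionsPerConsumer(List<String> rows) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String row : rows) {
            String[] columns = row.trim().split("\\s+");
            // Skip the header row and anything that doesn't have the expected nine columns.
            if (columns.length != 9 || columns[0].equals("GROUP")) {
                continue;
            }
            counts.merge(columns[CONSUMER_ID_COLUMN], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> rows = List.of(
            "GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID",
            "test-consumer-group Test 0 0 0 0 neo-1-975feb3f-9e5a-424b-9da3-c2ec3bc475d6 /127.0.0.1 neo-1",
            "test-consumer-group Test 4 0 0 0 neo-2-6a39714e-4bdd-4ab8-bc8c-5463d78032ec /127.0.0.1 neo-2",
            "test-consumer-group Test 5 0 0 0 neo-2-6a39714e-4bdd-4ab8-bc8c-5463d78032ec /127.0.0.1 neo-2");
        partitionsPerConsumer(rows).forEach((id, count) -> System.out.println(id + " -> " + count));
    }
}
```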
6. Summary
Let’s summarize the key differences between the three identifiers we’ve discussed:
| Dimension | GroupId | ConsumerId | ClientId |
|---|---|---|---|
| What does it identify? | Consumer Group | Individual consumer within a Consumer Group | Logical name of the client application sending requests |
| Where does its value come from? | Developers set the GroupId | Kafka generates the ConsumerId based on the client.id consumer property | Developers set the client.id consumer property |
| Is it unique? | If two consumer groups have the same GroupId, they are effectively one | Kafka ensures each consumer has a unique value | It doesn’t have to be unique. Two consumers can be given the same value for the client.id consumer property, depending on the use case |
7. Conclusion
In this article, we’ve looked at some of the key identifiers associated with Kafka consumers: GroupId, ClientId, and ConsumerId. We now understand their purpose and how to configure them.
As always, the complete source code is available over on GitHub.