A Guide to etcd

Last modified: February 24, 2024

1. Introduction

In the complex world of distributed systems, ensuring efficient data management is crucial. Distributed reliable key-value stores play a pivotal role in maintaining data consistency and scalability across distributed environments.

In this comprehensive tutorial, we’ll delve into etcd, an open-source distributed key-value store. We’ll explore its fundamental concepts, features, and use cases, and provide a hands-on quickstart guide. Finally, we’ll compare etcd with a couple of other distributed key-value stores to understand its strengths and unique offerings.

2. What Are Distributed Key-Value Stores?

Distributed key-value stores are a type of NoSQL database that stores data as key-value pairs that span multiple physical or virtual machines.

This distribution essentially enhances scalability, fault tolerance, and performance. Moreover, each piece of data (value) is associated with a unique identifier (key). This model is highly efficient for certain use cases, such as caching, configuration management, and fast data retrieval.

Apache ZooKeeper, Consul, and Redis are examples of systems that provide a reliable key-value store.

Distributed key-value stores serve as the backbone of many distributed systems, providing a simple yet powerful mechanism for storing and retrieving data.

Below are some key aspects of distributed key-value stores:

  • Simplicity: Basic data structure comprising key-value pairs, making it easy to understand and use for specific types of applications.
  • Scalability: These systems can efficiently handle growing amounts of data and increased load by distributing the workload across multiple nodes.
  • Reliability: They replicate data across nodes, so the data stays consistent and available even when individual nodes fail.
  • Performance: The key-value model provides fast and efficient access to data. Moreover, spreading the data across multiple nodes reduces the load on individual machines.
  • Distribution: Since the data is spread across multiple nodes, no single machine has to hold or serve the entire dataset.

Distributed key-value stores find applications in various scenarios, such as configuration management, caching, session storage, service discovery, leader election, etc.

3. What Is etcd?

etcd is a distributed, reliable key-value store for the most critical data of a distributed system. It’s a simple, secure, fast, and reliable key-value store designed for configuration management, service discovery, and coordination of distributed systems.

Developed by the CoreOS team and now a CNCF (Cloud Native Computing Foundation) project, etcd provides a reliable and distributed data store that enables the coordination of configurations and the discovery of services in dynamic and scalable environments.

etcd is developed in Go and internally uses the Raft consensus algorithm to manage a highly available replicated log.
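
Because Raft commits a write only after a majority of members has accepted it, the cluster size determines how many node failures etcd can survive. The following is a minimal sketch of that arithmetic; the class and method names are hypothetical and aren't part of etcd or jetcd:

```java
// Hypothetical helper, not part of etcd or jetcd: majority arithmetic for a Raft cluster.
class RaftQuorum {

    // Smallest number of members that forms a majority of the cluster.
    static int quorumSize(int members) {
        if (members < 1) {
            throw new IllegalArgumentException("a cluster needs at least one member");
        }
        return members / 2 + 1;
    }

    // Number of members that can fail while a majority is still reachable.
    static int failureTolerance(int members) {
        return members - quorumSize(members);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.printf("members=%d quorum=%d tolerated failures=%d%n",
                    n, quorumSize(n), failureTolerance(n));
        }
    }
}
```

Under this arithmetic, a three-member and a four-member cluster both tolerate a single failure, which is why odd cluster sizes are the usual choice.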

Many companies worldwide such as Baidu, Huawei, Salesforce, Ticketmaster, etc. use etcd in production. It’s frequently integrated with applications such as Kubernetes, Locksmith, Vulcand, Doorman, and many others.

etcd’s commitment to distributed consistency, high availability, and strong data integrity positions it as a foundational component of modern, scalable, and resilient applications.

4. Features of etcd

etcd’s feature set covers the building blocks that distributed systems need for configuration management, service discovery, and coordination in cloud-native environments. It’s also fast: in benchmarks, it has reached 10,000 writes/sec.

Let’s understand some of its key features:

  • HTTP/gRPC API: etcd exposes a gRPC API along with an HTTP/JSON gateway, so clients written in many programming languages can use it, and it integrates easily into different types of applications and frameworks.
  • Distributed Consistency: It maintains strong consistency in distributed setups, ensuring that all nodes in the cluster have a consistent view of the data.
  • High Availability: etcd is designed to be highly available, with automatic leader election and failover mechanisms. Thus, an etcd cluster remains operational even in the face of node failures, contributing to system resilience.
  • Watch Support: etcd supports strongly consistent watches, allowing applications to monitor changes to specific keys in real time.
  • Atomic Transactions: It supports atomic transactions, which let us group multiple key-value operations together and execute them as a single atomic unit, thus maintaining data consistency.
  • Lease Management: etcd introduces the concept of leases, which attach a time-to-live (TTL) to keys so that they're deleted automatically once the lease expires.
  • Role-Based Access Control (RBAC): It supports RBAC, allowing administrators to define roles and permissions for users and applications interacting with the cluster.
  • Snapshot and Backup: It provides mechanisms for creating snapshots of the cluster’s state and supports backup and restoration processes. Thus, it ensures disaster recovery and data durability.
  • Storage Backend: etcd v3 persists its data in bbolt, an embedded key-value store, and keeps a multi-version history of every key. The storage engine isn't swappable, so we tune storage through settings such as compaction rather than by choosing a different engine.
  • Integration with Kubernetes: etcd is a critical component in Kubernetes, serving as the primary datastore for configuration and state information. This makes etcd a core part of container orchestration, ensuring that the distributed systems can manage configurations and scale effectively.
  • etcdctl: It’s a command-line client tool designed for interacting with and managing an etcd cluster.
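
To make the HTTP side of the API more concrete: etcd's JSON gateway expects keys and values to be base64-encoded in the request body. The sketch below only builds the body for a put request; the class name is hypothetical, and the /v3/kv/put path mentioned in the comment assumes etcd 3.4 or later:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of etcd or jetcd.
class GatewayPutBody {

    // The JSON gateway carries keys and values as base64 strings.
    static String base64(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    // Builds the JSON body that could be sent with POST /v3/kv/put (path assumed for etcd 3.4+).
    static String putBody(String key, String value) {
        return "{\"key\": \"" + base64(key) + "\", \"value\": \"" + base64(value) + "\"}";
    }

    public static void main(String[] args) {
        System.out.println(putBody("mykey", "Hello, etcd!"));
    }
}
```

Client libraries that use gRPC handle this encoding for us, so it only matters when we call the HTTP gateway directly.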

5. Installation

Let’s see how to install and set up etcd to get it running. etcd is compatible with Linux distributions like Ubuntu and CentOS, as well as with Windows.

We can start by updating the package list on Ubuntu:

$ sudo apt update

Subsequently, we can install etcd:

$ sudo apt install etcd

Similarly, on CentOS, we first need to enable the EPEL repository and then install etcd:

$ sudo yum install epel-release
$ sudo yum install etcd

Alternatively, we can download the latest release archive from the official etcd GitHub releases page, or clone the repo to build etcd from source:

$ git clone -b v3.5.11 https://github.com/etcd-io/etcd.git

To clone the default development branch instead of a tagged release, we can omit the -b v3.5.11 flag.

If we downloaded a release archive rather than cloning, we can extract it and navigate to the extracted directory, which already contains prebuilt binaries:

$ tar xvf etcd-v3.5.11-linux-amd64.tar.gz
$ cd etcd-v3.5.11-linux-amd64

If we cloned the repo instead, we can run the build script from the root of the cloned etcd directory:

$ ./build.sh

The build script places the binaries under the bin directory. We then need to add the full path of the bin directory to our PATH:

$ export PATH="$PATH:`pwd`/bin"

Here, pwd is the UNIX command that gets us the full path name of the current directory. Finally, we can ensure that our PATH contains etcd by checking the version:

$ etcd --version

6. Configuration Using the Config File

We have multiple options to configure etcd. However, in this tutorial, we’ll create a configuration file with basic settings.

The etcd configuration file is a YAML file that contains settings and parameters used to configure the behavior of an etcd node. This file is essential for customizing various aspects of etcd, such as network settings, cluster information, authentication, and storage options. Let’s see an example:

# Example etcd-config.yml
# Node name, a unique identifier within the etcd cluster
name: node-1

# Data directory where etcd will store its data
data-dir: /var/lib/etcd/default.etcd

# Listen addresses for client communication (https, because TLS is configured below)
listen-client-urls: https://127.0.0.1:2379,https://<NODE-IP>:2379

# Advertise addresses for client communication
advertise-client-urls: https://<NODE-IP>:2379

# Listen addresses for peer communication
listen-peer-urls: https://<NODE-IP>:2380

# Advertise addresses for peer communication
initial-advertise-peer-urls: https://<NODE-IP>:2380

# Initial cluster configuration: one name=peer-URL entry per member
initial-cluster: node-1=https://<NODE-IP>:2380,node-2=https://<NODE-2-IP>:2380

# Unique token for the etcd cluster
initial-cluster-token: etcd-cluster-1

# Initial cluster state (new or existing)
initial-cluster-state: new

# Auth token type (simple or jwt); this selects a token format, not a shared secret
auth-token: simple

# Authentication and RBAC aren't switched on in this file; we enable them at
# runtime with "etcdctl auth enable" after creating a root user

# Enable automatic compaction of the etcd key-value store
auto-compaction-mode: periodic
auto-compaction-retention: "1h"

# Secure communication settings (TLS) for client traffic
client-transport-security:
  cert-file: /etc/etcd/server.crt
  key-file: /etc/etcd/server.key
  client-cert-auth: true
  trusted-ca-file: /etc/etcd/ca.crt

# Secure communication settings (TLS) for peer traffic
peer-transport-security:
  cert-file: /etc/etcd/peer.crt
  key-file: /etc/etcd/peer.key
  client-cert-auth: true
  trusted-ca-file: /etc/etcd/ca.crt

Let’s understand a few important notes about this configuration:

Adding TLS Certificates: secure configurations (client-transport-security and peer-transport-security) are optional but recommended for production deployments, providing encrypted communication.

Adding RBAC: Role-Based Access Control adds a layer of security by controlling access to etcd operations based on user roles and permissions. We switch it on at runtime by creating a root user and then running etcdctl auth enable.

Enabling auto-compaction: With periodic mode and a retention of one hour, etcd regularly compacts the revision history of keys that is older than one hour, which helps keep the size of the data store in check.

Finally, we should customize the configuration file based on our specific requirements and security considerations. After editing the file, we need to restart the etcd service for the changes to take effect.
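
The initial-cluster setting in the configuration above is a comma-separated list of name=peer-URL entries, and a node's own name and advertised peer URL should appear in it. As a rough illustration of that format, the sketch below parses such a string and checks for the local member. The class, the method names, and the 10.0.0.x addresses are hypothetical and aren't part of etcd:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of etcd or jetcd.
class InitialClusterCheck {

    // Parses "name=url,name=url" into member name -> list of peer URLs.
    static Map<String, List<String>> parse(String initialCluster) {
        Map<String, List<String>> members = new LinkedHashMap<>();
        for (String entry : initialCluster.split(",")) {
            String[] parts = entry.trim().split("=", 2);
            if (parts.length != 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
                throw new IllegalArgumentException("expected name=peer-url, got: " + entry);
            }
            // A member may be listed more than once if it has several peer URLs.
            members.computeIfAbsent(parts[0], k -> new ArrayList<>()).add(parts[1]);
        }
        return members;
    }

    // True when the node's name is listed together with the peer URL it advertises.
    static boolean listsLocalMember(String initialCluster, String name, String advertisedPeerUrl) {
        List<String> urls = parse(initialCluster).get(name);
        return urls != null && urls.contains(advertisedPeerUrl);
    }

    public static void main(String[] args) {
        String cluster = "node-1=https://10.0.0.1:2380,node-2=https://10.0.0.2:2380";
        System.out.println(parse(cluster));
        System.out.println(listsLocalMember(cluster, "node-1", "https://10.0.0.1:2380"));
        System.out.println(listsLocalMember(cluster, "node-3", "https://10.0.0.3:2380"));
    }
}
```

A check like this can catch a mistyped node name or peer URL before we start the node.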

7. Starting and Interacting With etcd

We can start etcd with the specified configuration using the following command:

$ ./etcd --config-file=etcd-config.yml

Further, we can interact with etcd using the etcdctl command-line tool that’s designed for interacting with and managing an etcd cluster. It facilitates administrators and developers in executing various operations on an etcd cluster directly from the command line.

Let’s understand with a few examples:

We can set a key-value pair as:

$ etcdctl put mykey "Hello, etcd!"

Here, mykey is the key, and “Hello, etcd!” is the corresponding value. Subsequently, we can retrieve the value of mykey as:

$ etcdctl get mykey
mykey
Hello, etcd!

To watch changes to mykey, we can simply do:

$ etcdctl watch mykey

Watching a key in etcd allows us to receive real-time notifications about changes to the key, whether the value is modified or the key is deleted. Watch events provide details about the nature of the change, enabling applications to react dynamically to the updates in the etcd key-value store.

It’s important to note that watching a key doesn’t prevent it from being deleted. Watches are mechanisms for observing changes, not for controlling or restricting them.

Finally, we can use the following command to check the health of the etcd cluster:

$ etcdctl endpoint health

If we’re working with a secured etcd cluster, we may need to provide additional authentication and security options when checking the health, such as the --cacert, --cert, and --key flags that point to the certificate and key files.

8. Code Example

To interact with etcd using Java, we can use a Java client library like jetcd or etcd4j. In our example, we’ll use jetcd since it’s the official Java client for etcd v3.

jetcd requires Java 11 or above. It supports all key-based etcd requests and offers SSL security. Moreover, it allows the definition of multiple connection URLs and provides both synchronous and asynchronous APIs, giving us flexibility in choosing the programming model that best fits our application.

We can add the jetcd-core dependency to our project as:

<dependency>
    <groupId>io.etcd</groupId>
    <artifactId>jetcd-core</artifactId>
    <version>0.7.7</version>
</dependency>

Now, let’s see a basic example demonstrating the put, retrieve, and delete operations using jetcd:

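import io.etcd.jetcd.ByteSequence;
import io.etcd.jetcd.Client;
import io.etcd.jetcd.KV;
import io.etcd.jetcd.kv.GetResponse;

import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;

public class JetcdExample {
    public static void main(String[] args) {
        String etcdEndpoint = "http://localhost:2379";
        ByteSequence key = ByteSequence.from("/mykey".getBytes(StandardCharsets.UTF_8));
        ByteSequence value = ByteSequence.from("Hello, etcd!".getBytes(StandardCharsets.UTF_8));

        // The client is closed automatically at the end of the try block
        try (Client client = Client.builder().endpoints(etcdEndpoint).build()) {
            KV kvClient = client.getKVClient();

            // Put a key-value pair and wait for the write to complete
            kvClient.put(key, value).get();

            // Retrieve the value using CompletableFuture
            CompletableFuture<GetResponse> getFuture = kvClient.get(key);
            GetResponse response = getFuture.get();

            // Print every value returned for the key
            response.getKvs().forEach(kv ->
                System.out.println(kv.getValue().toString(StandardCharsets.UTF_8)));

            // Delete the key
            kvClient.delete(key).get();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}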

9. Comparison With Apache Zookeeper and Consul

As distributed system tools, etcd, Apache ZooKeeper, and Consul are designed to manage configuration, coordinate nodes, and provide a reliable foundation for building distributed applications. However, they differ significantly in their design philosophies, architecture, and use cases:

 
| Feature/Aspect | etcd | Apache ZooKeeper | Consul |
|---|---|---|---|
| Consensus Algorithm | Raft | Zab (ZooKeeper Atomic Broadcast) | Raft |
| Data Model | Key-value store | Hierarchy of znodes | Key-value store |
| Use Cases | Cloud-native, Kubernetes | Various distributed systems | Service discovery, networking |
| Consistency Model | Strong consistency | Strong consistency | Consistent, eventually consistent |
| Security Features | TLS support, AuthN, and AuthZ | Limited built-in security | ACLs, TLS, token-based access |
| Leadership Election | Leader election is inherent in Raft consensus. Nodes participate in elections for leader selection. | Centralized leader election through the Zab protocol. Nodes elect a leader that coordinates operations. | Raft-based leadership election. Each Consul server participates in the Raft consensus algorithm for leader election. |
| Leader Characteristics | The leader holds authority for making decisions and coordinating the cluster. | The leader manages the distributed system’s state and coordinates actions. | The leader is responsible for cluster coordination and decision-making. |
| Performance | Generally good | Good, used in large deployments | High-performance, scalable |
| Integration with Ecosystem | Integrates with CNCF projects | Integrated with Apache projects | Integrates with HashiCorp stack |
| Monitoring & Observability | etcd metrics, Prometheus support | Limited built-in monitoring | Integrated metrics, Prometheus |
| Configuration Management | Configuration API | Used for configuration in Hadoop, Kafka, etc. | Dynamic configuration management |
| Service Discovery | Limited | Used as part of distributed systems | Core feature, DNS-based discovery |
| Commercial Support | Limited | Commercial support available | Enterprise and open-source offerings |
| Ease of Use | Known for simplicity | Can be more complex | Easy to use and configure |
| License | Apache License 2.0 | Apache License 2.0 | MPL 2.0 (BSL 1.1 for releases after August 2023) |

Choosing between etcd, Apache ZooKeeper, and Consul depends on specific project needs.

etcd, with its simplicity and Cloud Native Computing Foundation (CNCF) support, suits cloud-native environments like Kubernetes. Apache ZooKeeper, a robust choice for large-scale deployments, offers strong consistency but comes with added complexity. On the other hand, Consul, known for simplicity and effective service discovery, integrates seamlessly with the HashiCorp stack.

Security, ease of use, and integration requirements play pivotal roles in the decision-making process. Each tool has its strengths, so it's crucial to make an informed selection based on the features and use cases we need.

10. Conclusion

In this article, we’ve explored etcd comprehensively, discussing its foundational concepts, critical features, and practical applications. The quick start guide will help us set up etcd quickly and interact with it programmatically. Additionally, the comparison with other distributed key-value stores highlights the unique strengths of etcd, making it a reliable choice for various distributed system scenarios.

Understanding distributed reliable key-value stores, the criticality of data in distributed systems, and the capabilities of etcd will help us make informed decisions when designing and implementing distributed applications. Finally, as the backbone of many distributed systems, etcd’s simplicity, consistency, and high availability make it a valuable tool for developers navigating the complexities of distributed environments.

As always, the source code accompanying the article is available over on GitHub.
