1. Overview
Apache Geode is a distributed in-memory data grid supporting caching and data computation.
In this tutorial, we’ll cover Geode’s key concepts and run through some code samples using its Java client.
2. Setup
First, we need to download and install Apache Geode and set the gfsh environment. To do this, we can follow the instructions in Geode’s official guide.
Second, this tutorial will create some filesystem artifacts, so we can isolate them by creating a temporary directory and launching everything from there.
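For instance, on a Unix-like system we might set up such a scratch directory like this (the directory name here is just an illustration):

```shell
# Create an isolated working directory for Geode's filesystem artifacts
mkdir -p ~/geode-tutorial
cd ~/geode-tutorial
```

The gfsh commands that follow are then run from this directory.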
2.1. Installation and Configuration
From our temporary directory, we need to start a Locator instance:
gfsh> start locator --name=locator --bind-address=localhost
Locators are responsible for the coordination between different members of a Geode Cluster, which we can further administer over JMX.
Next, let’s start a Server instance to host one or more data Regions:
gfsh> start server --name=server1 --server-port=0
We set the --server-port option to 0 so that Geode picks any available port; if we leave it out, the server uses the default port 40404. A Server is a configurable member of the Cluster that runs as a long-lived process and is responsible for managing data Regions.
And finally, we need a Region:
gfsh> create region --name=baeldung --type=REPLICATE
The Region is ultimately where we will store our data.
2.2. Verification
Let’s make sure that we have everything working before we go any further.
First, let’s check whether we have our Server and our Locator:
gfsh> list members
Name | Id
------- | ----------------------------------------------------------
server1 | 192.168.0.105(server1:6119)<v1>:1024
locator | 127.0.0.1(locator:5996:locator)<ec><v0>:1024 [Coordinator]
And next, that we have our Region:
gfsh> describe region --name=baeldung
..........................................................
Name : baeldung
Data Policy : replicate
Hosting Members : server1
Non-Default Attributes Shared By Hosting Members
Type | Name | Value
------ | ----------- | ---------------
Region | data-policy | REPLICATE
| size | 0
| scope | distributed-ack
Also, we should have some directories on the file system under our temporary directory called “locator” and “server1”.
With this output, we know that we’re ready to move on.
3. Maven Dependency
Now that we have a running Geode, let’s start looking at the client code.
To work with Geode in our Java code, we’re going to need to add the Apache Geode Java client library to our pom:
<dependency>
    <groupId>org.apache.geode</groupId>
    <artifactId>geode-core</artifactId>
    <version>1.6.0</version>
</dependency>
Let’s begin by simply storing and retrieving some data in a couple of regions.
4. Simple Storage and Retrieval
Let’s demonstrate how to store single values, batches of values as well as custom objects.
To start storing data in our “baeldung” region, let’s connect to it using the locator:
@Before
public void connect() {
    this.cache = new ClientCacheFactory()
      .addPoolLocator("localhost", 10334)
      .create();
    this.region = cache.<String, String>
      createClientRegionFactory(ClientRegionShortcut.CACHING_PROXY)
      .create("baeldung");
}
4.1. Saving Single Values
Now, we can simply store and retrieve data in our region:
@Test
public void whenSendMessageToRegion_thenMessageSavedSuccessfully() {
    this.region.put("A", "Hello");
    this.region.put("B", "Baeldung");

    assertEquals("Hello", region.get("A"));
    assertEquals("Baeldung", region.get("B"));
}
4.2. Saving Multiple Values at Once
We can also save multiple values at once, say when trying to reduce network latency:
@Test
public void whenPutMultipleValuesAtOnce_thenValuesSavedSuccessfully() {
    Supplier<Stream<String>> keys = () -> Stream.of("A", "B", "C", "D", "E");
    Map<String, String> values = keys.get()
      .collect(Collectors.toMap(Function.identity(), String::toLowerCase));

    this.region.putAll(values);

    keys.get()
      .forEach(k -> assertEquals(k.toLowerCase(), this.region.get(k)));
}
4.3. Saving Custom Objects
Strings are useful, but sooner rather than later we’ll need to store custom objects.
Let’s imagine that we have a customer record we want to store using the following key type:
public class CustomerKey implements Serializable {
    private long id;
    private String country;

    // getters and setters
    // equals and hashcode
}
And the following value type:
public class Customer implements Serializable {
    private CustomerKey key;
    private String firstName;
    private String lastName;
    private Integer age;

    // getters and setters
}
There are a couple of extra steps needed to store these:
First, they should implement Serializable. While this isn’t a strict requirement, by making them Serializable, Geode can store them more robustly.
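To see what Java serialization buys us, here's a standalone sketch that round-trips an object the way Geode would when shipping it between client and server. The Person class is a simplified stand-in, since the article's Customer relies on elided getters and setters:

```java
import java.io.*;

// Simplified stand-in for the article's value type, just to show the round trip
class Person implements Serializable {
    private static final long serialVersionUID = 1L;
    final String firstName;
    Person(String firstName) { this.firstName = firstName; }
}

public class SerializationCheck {

    // Serialize and immediately deserialize an object
    static Person roundTrip(Person original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Person) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(new Person("William")).firstName);
    }
}
```

Declaring a serialVersionUID, as above, is a good habit for any class whose serialized form may cross process boundaries.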
Second, they need to be on our application’s classpath as well as the classpath of our Geode Server.
To get them to the server’s classpath, let’s package them up, say using mvn clean package.
And then we can reference the resulting jar in a new start server command:
gfsh> stop server --name=server1
gfsh> start server --name=server1 --classpath=../lib/apache-geode-1.0-SNAPSHOT.jar --server-port=0
Again, we have to run these commands from the temporary directory.
Finally, let’s create a new Region named “baeldung-customers” on the Server using the same command we used for creating the “baeldung” region:
gfsh> create region --name=baeldung-customers --type=REPLICATE
In the code, we’ll reach out to the locator as before, specifying the custom type:
@Before
public void connect() {
    // ... connect through the locator

    this.customerRegion = this.cache.<CustomerKey, Customer>
      createClientRegionFactory(ClientRegionShortcut.CACHING_PROXY)
      .create("baeldung-customers");
}
And then we can store our customer as before:
@Test
public void whenPutCustomKey_thenValuesSavedSuccessfully() {
    CustomerKey key = new CustomerKey(123);
    Customer customer = new Customer(key, "William", "Russell", 35);

    this.customerRegion.put(key, customer);

    Customer storedCustomer = this.customerRegion.get(key);
    assertEquals("William", storedCustomer.getFirstName());
    assertEquals("Russell", storedCustomer.getLastName());
}
5. Region Types
For most environments, we’ll have more than one copy or more than one partition of our region, depending on our read and write throughput requirements.
So far, we’ve used in-memory replicated regions. Let’s take a closer look.
5.1. Replicated Region
As the name suggests, a Replicated Region maintains copies of its data on more than one Server. Let’s test this.
From the gfsh console in the working directory, let’s add one more Server named server2 to the cluster:
gfsh> start server --name=server2 --classpath=../lib/apache-geode-1.0-SNAPSHOT.jar --server-port=0
Remember that when we created "baeldung", we used --type=REPLICATE. Because of this, Geode will automatically replicate our data to the new server.
Let’s verify this by stopping server1:
gfsh> stop server --name=server1
And then, let's execute a quick query on the "baeldung" region.
If the data was replicated successfully, we’ll get results back:
gfsh> query --query='select e.key from /baeldung.entries e'
Result : true
Limit : 100
Rows : 5
Result
------
C
B
A
E
D
So, it looks like the replication succeeded!
Adding a replica to our region improves data availability. And, because more than one server can respond to queries, we’ll get higher read throughput as well.
But, what if they both crash? Since these are in-memory regions, the data would be lost. To guard against this, we can instead use --type=REPLICATE_PERSISTENT, which also stores the data on disk while replicating.
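As a sketch, creating such a persistent replicated region looks just like before, only with a different type (the region name here is illustrative):

```
gfsh> create region --name=baeldung-durable --type=REPLICATE_PERSISTENT
```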
5.2. Partitioned Region
With larger datasets, we can better scale the system by configuring Geode to split a region up into separate partitions, or buckets.
Let’s create one partitioned Region named “baeldung-partitioned”:
gfsh> create region --name=baeldung-partitioned --type=PARTITION
Add some data:
gfsh> put --region=baeldung-partitioned --key="1" --value="one"
gfsh> put --region=baeldung-partitioned --key="2" --value="two"
gfsh> put --region=baeldung-partitioned --key="3" --value="three"
And quickly verify:
gfsh> query --query='select e.key, e.value from /baeldung-partitioned.entries e'
Result : true
Limit : 100
Rows : 3
key | value
--- | -----
2 | two
1 | one
3 | three
Then, to validate that the data got partitioned, let’s stop server1 again and re-query:
gfsh> stop server --name=server1
gfsh> query --query='select e.key, e.value from /baeldung-partitioned.entries e'
Result : true
Limit : 100
Rows : 1
key | value
--- | -----
2 | two
We only got some of the entries back this time: the surviving server hosts only its own partitions of the data, so the entries held in server1's partitions were lost when it stopped.
But what if we need both partitioning and redundancy? Geode also supports a number of other types. The following three are handy:
- PARTITION_REDUNDANT partitions and replicates our data across different members of the cluster
- PARTITION_PERSISTENT partitions the data like PARTITION, but to disk, and
- PARTITION_REDUNDANT_PERSISTENT gives us all three behaviors.
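For example, a partitioned region that also keeps a redundant copy of each bucket could be created like this (the region name is illustrative; the PARTITION_REDUNDANT shortcut keeps one redundant copy by default):

```
gfsh> create region --name=baeldung-partitioned-redundant --type=PARTITION_REDUNDANT
```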
6. Object Query Language
Geode also supports Object Query Language, or OQL, which can be more powerful than a simple key lookup. It’s a bit like SQL.
For this example, let's use the "baeldung-customers" region we built earlier.
If we add a couple more customers:
Map<CustomerKey, Customer> data = new HashMap<>();
data.put(new CustomerKey(1), new Customer("Gheorge", "Manuc", 36));
data.put(new CustomerKey(2), new Customer("Allan", "McDowell", 43));
this.customerRegion.putAll(data);
Then we can use QueryService to find customers whose first name is “Allan”:
QueryService queryService = this.cache.getQueryService();
String query =
  "select * from /baeldung-customers c where c.firstName = 'Allan'";
SelectResults<Customer> results =
  (SelectResults<Customer>) queryService.newQuery(query).execute();

assertEquals(1, results.size());
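OQL also supports bind parameters ($1, $2, and so on), which keeps values out of the query string itself. A sketch of the same lookup, assuming the running cluster and region from the setup above:

```
String query =
  "select * from /baeldung-customers c where c.firstName = $1";
SelectResults<Customer> results = (SelectResults<Customer>)
  queryService.newQuery(query).execute(new Object[] { "Allan" });
```

Query.execute accepts the parameter values positionally, matching them to $1, $2, and so on.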
7. Function
One of the more powerful notions of in-memory data grids is the idea of “taking the computations to the data”.
Simply put, since Geode is pure Java, it's easy for us to send not only data but also the logic to perform on that data.
This might remind us of the idea of SQL extensions like PL-SQL or Transact-SQL.
7.1. Defining a Function
To define a unit of work for Geode to do, we implement Geode’s Function interface.
For example, let’s imagine we need to change all the customer’s names to upper case.
Instead of querying the data and having our application do the work, we can just implement Function:
public class UpperCaseNames implements Function<Boolean> {

    @Override
    public void execute(FunctionContext<Boolean> context) {
        RegionFunctionContext regionContext = (RegionFunctionContext) context;
        Region<CustomerKey, Customer> region = regionContext.getDataSet();

        for (Map.Entry<CustomerKey, Customer> entry : region.entrySet()) {
            Customer customer = entry.getValue();
            customer.setFirstName(customer.getFirstName().toUpperCase());
        }
        context.getResultSender().lastResult(true);
    }

    @Override
    public String getId() {
        return getClass().getName();
    }
}
Note that getId must return a unique value, so the class name is typically a good pick.
The FunctionContext contains all our region data, and so we can do a more sophisticated query out of it, or, as we’ve done here, mutate it.
And Function has plenty more power than this, so check out the official manual, especially the getResultSender method.
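As one illustration of that extra power, a Function can also receive caller-supplied arguments through FunctionContext.getArguments(). The following is a hypothetical sketch, not part of the article's sample code:

```
// Hypothetical Function that counts customers with a given last name;
// the caller supplies the name via Execution.setArguments(...)
public class CountByLastName implements Function<Integer> {

    @Override
    public void execute(FunctionContext<Integer> context) {
        String lastName = (String) context.getArguments();
        Region<CustomerKey, Customer> region =
          ((RegionFunctionContext) context).getDataSet();

        int matches = 0;
        for (Customer customer : region.values()) {
            if (lastName.equals(customer.getLastName())) {
                matches++;
            }
        }
        context.getResultSender().lastResult(matches);
    }

    @Override
    public String getId() {
        return getClass().getName();
    }
}
```

On the client side, we'd pass the argument by calling setArguments("Manuc") on the Execution before execute.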
7.2. Deploying Function
We need to make Geode aware of our function to be able to run it. Like we did with our custom data types, we’ll package the jar.
But this time, we can just use the deploy command:
gfsh> deploy --jar=./lib/apache-geode-1.0-SNAPSHOT.jar
7.3. Executing Function
Now, we can execute the Function from the application using the FunctionService:
@Test
public void whenExecuteUppercaseNames_thenCustomerNamesAreUppercased() {
    Execution execution = FunctionService.onRegion(this.customerRegion);
    execution.execute(UpperCaseNames.class.getName());

    Customer customer = this.customerRegion.get(new CustomerKey(1));
    assertEquals("GHEORGE", customer.getFirstName());
}
8. Conclusion
In this article, we learned the basic concepts of the Apache Geode ecosystem. We looked at simple gets and puts with standard and custom types, replicated and partitioned Regions, and OQL and Function support.
And as always, all these samples are available over on GitHub.