1. Introduction
Apache Ignite is an open-source, memory-centric distributed platform. We can use it as a database, a caching system, or for in-memory data processing.
The platform uses memory as a storage layer and therefore offers impressive performance. Simply put, this is one of the fastest atomic data processing platforms currently in production use.
2. Installation and Setup
To begin, check out the getting started page for initial setup and installation instructions.
The Maven dependencies for the application we are going to build:
<dependency>
    <groupId>org.apache.ignite</groupId>
    <artifactId>ignite-core</artifactId>
    <version>${ignite.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.ignite</groupId>
    <artifactId>ignite-indexing</artifactId>
    <version>${ignite.version}</version>
</dependency>
ignite-core is the only mandatory dependency for the project. Since we also want to interact with SQL, we include ignite-indexing as well. ${ignite.version} is the latest version of Apache Ignite.
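In the pom, ${ignite.version} would typically be declared as a Maven property; the version number below is purely illustrative:

<properties>
    <ignite.version>2.7.0</ignite.version>
</properties>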
As the last step, we start the Ignite node:
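A minimal way to do that from code (the Ignition API is covered in more detail in the Lifecycle section below):

Ignite ignite = Ignition.start();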
Ignite node started OK (id=53c77dea)
Topology snapshot [ver=1, servers=1, clients=0, CPUs=4, offheap=1.2GB, heap=1.0GB]
Data Regions Configured:
^-- default [initSize=256.0 MiB, maxSize=1.2 GiB, persistenceEnabled=false]
The console output above shows that we’re ready to go.
3. Memory Architecture
The platform is based on the durable memory architecture. This makes it possible to store and process data both on disk and in memory. It increases performance by using the cluster's RAM resources effectively.
The data in memory and on disk has the same binary representation. This means no additional conversion of the data is needed while moving from one layer to another.
The durable memory splits into fixed-size blocks called pages. Pages are stored outside the Java heap and organized in RAM. Each page has a unique identifier: FullPageId.
Ignite interacts with pages using the PageMemory abstraction.
It helps to read and write a page, as well as to allocate a page id. Inside the memory, Ignite associates pages with memory buffers.
4. Memory Pages
A Page can have the following states:
- Unloaded – no page buffer loaded in memory
- Clear – the page buffer is loaded and synchronized with the data on disk
- Dirty – the page buffer holds data that differs from the data on disk
- Dirty in checkpoint – another modification starts before the first one persists to disk. Here a checkpoint starts, and PageMemory keeps two memory buffers for each page.
Durable memory allocates a local memory segment called a data region. By default, it has a capacity of 20% of the cluster memory. A multiple-region configuration makes it possible to keep the usable data in memory.
The maximum capacity of a region is a memory segment. It's either a physical memory or a continuous byte array.
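As a sketch, here's how a custom data region could be configured programmatically; the region name and sizes are illustrative, not prescribed by the article:

// Define a custom data region (name and sizes are illustrative)
DataRegionConfiguration regionCfg = new DataRegionConfiguration();
regionCfg.setName("customRegion");
regionCfg.setInitialSize(256L * 1024 * 1024); // 256 MiB
regionCfg.setMaxSize(1024L * 1024 * 1024);    // 1 GiB

// Register the region and start a node that uses it
DataStorageConfiguration storageCfg = new DataStorageConfiguration();
storageCfg.setDataRegionConfigurations(regionCfg);

IgniteConfiguration cfg = new IgniteConfiguration();
cfg.setDataStorageConfiguration(storageCfg);
Ignite ignite = Ignition.start(cfg);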
To avoid memory fragmentation, a single page holds multiple key-value entries. Every new entry is added to the most optimal page. If the key-value pair size exceeds the maximum capacity of the page, Ignite stores the data in more than one page. The same logic applies to updating the data.
SQL and cache indexes are stored in structures known as B+ Trees. Cache keys are ordered by their key values.
5. Lifecycle
Each Ignite node runs on a single JVM instance. However, it's possible to configure multiple Ignite nodes to run in a single JVM process.
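For instance, here's a sketch of starting two nodes inside one JVM by giving each a distinct instance name (the names are illustrative):

// Each node needs a unique instance name to coexist in one JVM
IgniteConfiguration cfg1 = new IgniteConfiguration();
cfg1.setIgniteInstanceName("node-1"); // illustrative name
Ignite node1 = Ignition.start(cfg1);

IgniteConfiguration cfg2 = new IgniteConfiguration();
cfg2.setIgniteInstanceName("node-2"); // illustrative name
Ignite node2 = Ignition.start(cfg2);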
Let’s go through the lifecycle event types:
- BEFORE_NODE_START – before the Ignite node startup
- AFTER_NODE_START – fires just after the Ignite node starts
- BEFORE_NODE_STOP – before initiating the node stop
- AFTER_NODE_STOP – after the Ignite node stops
To start a default Ignite node:
Ignite ignite = Ignition.start();
Or from a configuration file:
Ignite ignite = Ignition.start("config/example-cache.xml");
In case we need more control over the initialization process, there is another way, with the help of the LifecycleBean interface:
public class CustomLifecycleBean implements LifecycleBean {

    @Override
    public void onLifecycleEvent(LifecycleEventType lifecycleEventType)
      throws IgniteException {
        if (lifecycleEventType == LifecycleEventType.AFTER_NODE_START) {
            // ...
        }
    }
}
Here, we can use the lifecycle event types to perform actions before or after the node starts/stops.
For that purpose, we pass the configuration instance with the CustomLifecycleBean to the start method:
IgniteConfiguration configuration = new IgniteConfiguration();
configuration.setLifecycleBeans(new CustomLifecycleBean());
Ignite ignite = Ignition.start(configuration);
6. In-Memory Data Grid
The Ignite data grid is a distributed key-value storage, very similar to a partitioned HashMap. It scales horizontally: the more cluster nodes we add, the more data is cached or stored in memory.
It can provide a significant performance improvement to third-party software, such as NoSQL and RDBMS databases, as an additional layer for caching.
6.1. Caching Support
The data access API is based on the JCache (JSR 107) specification.
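A cache can also be created with an explicit CacheConfiguration instead of a template; a minimal sketch (the cache mode and backup count are illustrative choices):

CacheConfiguration<Integer, Employee> cacheCfg
  = new CacheConfiguration<>("baeldungCache");
cacheCfg.setCacheMode(CacheMode.PARTITIONED); // spread entries across the cluster
cacheCfg.setBackups(1);                       // keep one backup copy of each entry
IgniteCache<Integer, Employee> cache = ignite.getOrCreateCache(cacheCfg);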
As an example, let’s create a cache using a template configuration:
IgniteCache<Integer, Employee> cache = ignite.getOrCreateCache(
  "baeldungCache");
Let's see what's happening here in more detail. First, Ignite finds the memory region where the cache is stored.
Then, the B+ tree index page is located based on the key's hash code. If the index exists, the data page of the corresponding key is located.
When the index is NULL, the platform creates a new data entry by using the given key.
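The snippets in this section assume a minimal Employee class along these lines; its exact shape isn't shown in the article, so this is an inferred sketch:

public class Employee {

    private Integer id;
    private String name;
    private boolean isEmployed;

    public Employee(Integer id, String name, boolean isEmployed) {
        this.id = id;
        this.name = name;
        this.isEmployed = isEmployed;
    }

    // getters and setters, e.g. getId() and setEmployed(boolean)
}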
Next, let’s add some Employee objects:
cache.put(1, new Employee(1, "John", true));
cache.put(2, new Employee(2, "Anna", false));
cache.put(3, new Employee(3, "George", true));
Again, the durable memory will look for the memory region where the cache belongs. Based on the cache key, the index page will be located in a B+ tree structure.
When the index page doesn’t exist, a new one is requested and added to the tree.
Next, a data page is assigned to the index page.
To read the employee from the cache, we just use the key value:
Employee employee = cache.get(1);
6.2. Streaming Support
In-memory data streaming provides an alternative approach for disk- and file-system-based data processing applications. The streaming API splits the high-load data flow into multiple stages and routes them for processing.
We can modify our example and stream the data from a file. First, we define a data streamer:
IgniteDataStreamer<Integer, Employee> streamer = ignite
.dataStreamer(cache.getName());
Next, we can register a stream transformer to mark the received employees as employed:
streamer.receiver(StreamTransformer.from((e, arg) -> {
    Employee employee = e.getValue();
    employee.setEmployed(true);
    e.setValue(employee);
    return employee;
}));
As a final step, we iterate over the employees.txt file lines and convert them into Java objects:
Path path = Paths.get(IgniteStream.class.getResource("employees.txt")
  .toURI());
Gson gson = new Gson();
Files.lines(path)
  .map(line -> gson.fromJson(line, Employee.class))
  .forEach(employee -> streamer.addData(employee.getId(), employee));
The call to streamer.addData() puts the employee objects into the stream.
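One caveat worth noting: IgniteDataStreamer buffers entries internally, so it should be flushed or closed to guarantee everything reaches the cache. A try-with-resources sketch:

// The streamer is AutoCloseable; close() flushes any remaining buffered entries
try (IgniteDataStreamer<Integer, Employee> s = ignite.dataStreamer(cache.getName())) {
    s.allowOverwrite(true); // permit updates to existing keys (off by default)
    // ... addData() calls as above ...
}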
7. SQL Support
The platform provides a memory-centric, fault-tolerant SQL database.
We can connect either with the pure SQL API or with JDBC. The SQL syntax here is ANSI-99, so all the standard aggregation functions in queries, as well as DML and DDL operations, are supported.
7.1. JDBC
To make this more practical, let's create a table of employees and add some data to it.
For that purpose, we first register a JDBC driver and open a connection:
Class.forName("org.apache.ignite.IgniteJdbcThinDriver");
Connection conn = DriverManager.getConnection("jdbc:ignite:thin://127.0.0.1/");
With the help of a standard DDL command, we create the Employee table:
sql.executeUpdate("CREATE TABLE Employee (" +
" id LONG PRIMARY KEY, name VARCHAR, isEmployed tinyint(1)) " +
" WITH \"template=replicated\"");
After the WITH keyword, we can set the cache configuration template. Here we use REPLICATED. By default, the template mode is PARTITIONED. To specify the number of copies of the data, we can also specify the BACKUPS parameter here, which is 0 by default.
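For example, assuming the same WITH syntax, a partitioned table with one backup copy could be created like this (the Department table is purely illustrative):

sql.executeUpdate("CREATE TABLE Department (" +
  " id LONG PRIMARY KEY, name VARCHAR) " +
  " WITH \"template=partitioned,backups=1\"");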
Then, let's add some data by using an INSERT DML statement:
PreparedStatement insert = conn.prepareStatement(
  "INSERT INTO Employee (id, name, isEmployed) VALUES (?, ?, ?)");
insert.setLong(1, 1);
insert.setString(2, "James");
insert.setBoolean(3, true);
insert.executeUpdate();
// add the rest
Afterward, we select the records:
ResultSet rs = sql.executeQuery("SELECT e.name, e.isEmployed "
  + " FROM Employee e "
  + " WHERE e.isEmployed = TRUE ");
7.2. Query the Objects
It's also possible to perform a query over Java objects stored in the cache. Ignite treats each Java object as a separate SQL record:
IgniteCache<Integer, Employee> cache = ignite.cache("baeldungCache");

SqlFieldsQuery sql = new SqlFieldsQuery(
  "select name from Employee where isEmployed = TRUE");

QueryCursor<List<?>> cursor = cache.query(sql);

for (List<?> row : cursor) {
    // do something with the row
}
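For SQL queries over Java objects to work, the queryable fields typically need to be annotated, and the cache configured with its indexed types; here's a sketch of what that could look like on the assumed Employee class:

public class Employee {

    @QuerySqlField(index = true)
    private Integer id;

    @QuerySqlField
    private String name;

    @QuerySqlField
    private boolean isEmployed;

    // constructor, getters and setters as before
}

// and when creating the cache (sketch):
// cacheCfg.setIndexedTypes(Integer.class, Employee.class);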
8. Summary
In this tutorial, we had a quick look at the Apache Ignite project. This guide highlighted the advantages of the platform over similar products, such as performance gains, durability, and lightweight APIs.
As a result, we learned how to use the SQL language and the Java API to store, retrieve, and stream data inside the persistent or in-memory grid.
As usual, the complete code for this article is available over on GitHub.