Guide to MicroStream

Last modified: September 23, 2022

1. Overview

MicroStream is an object graph persistence engine built for the JVM. We can use it for storing Java object graphs and restoring them in memory. Using a custom serialization concept, MicroStream enables us to store any Java type and to load the entire object graph, partial subgraphs, or single objects.

In this tutorial, we’ll first look at the reasons for developing such an object graph persistence engine. Then, we’ll compare this approach to traditional relational databases and standard Java serialization. We’ll see how to create an object graph storage and use it to persist, load, and delete data.

Finally, we’ll query the data using our local system memory and plain Java APIs.

2. Object-Relational Mismatch

Let’s start by looking at the motivation for developing MicroStream. In most Java projects, we require some kind of database storage.

However, Java and popular relational or NoSQL databases use different data structures. Therefore, we need a way to map Java objects to the database structure and vice versa. This mapping requires both programming effort and execution time. For example, in a relational database, we can use entities that map to tables and properties that map to columns.

To load data from a database, we would often need to execute complex multi-table SQL queries. Although object-relational mapping frameworks such as Hibernate help developers bridge this gap, in many complex scenarios, the framework-generated queries are not fully optimized.

MicroStream looks to solve this data structure mismatch by using the same structure for in-memory operations as for persisting data.

3. Using JVM as Storage

MicroStream uses the JVM as its storage to achieve fast, in-memory data processing with pure Java. Instead of using storage separated from the JVM, it provides us with a modern, native data storage library.

3.1. Database Management Systems

MicroStream is a persistence engine, not a database management system (DBMS). Some standard DBMS features like user management, connection management, and session handling have been left out by design.

Instead, MicroStream focuses on providing us with an easy way to store and restore our application data.

3.2. Java Serialization

MicroStream uses a custom serialization concept, purpose-built to provide a more performant alternative to a legacy DBMS.

It doesn’t use Java’s built-in serialization due to several limitations:

  • Only complete object graphs can be stored and restored
  • Inefficiency in terms of storage size and performance
  • The manual effort required when changing class structures

On the other hand, the custom MicroStream data store can:

  • Persist, load or update object graphs partially and on-demand
  • Efficiently handle storage size and performance
  • Handle changing class structures by mapping data via internal heuristics or a user-defined mapping strategy
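
To make the first limitation of built-in serialization concrete, here is a minimal sketch that uses only the JDK's ObjectOutputStream and ObjectInputStream. The Library and Book classes and the serialize()/deserialize() helpers are hypothetical names chosen for illustration and aren't part of MicroStream:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class JavaSerializationDemo {

    // Hypothetical domain classes; every type in the graph must implement Serializable
    static class Book implements Serializable {
        private static final long serialVersionUID = 1L;
        final String title;

        Book(String title) {
            this.title = title;
        }
    }

    static class Library implements Serializable {
        private static final long serialVersionUID = 1L;
        final List<Book> books = new ArrayList<>();
    }

    // Writes the whole graph reachable from the library into a byte array
    static byte[] serialize(Library library) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(library);
        }
        return bytes.toByteArray();
    }

    // Reads the whole graph back; we can't ask the stream for a single book only
    static Library deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Library) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Library library = new Library();
        library.books.add(new Book("Book A"));
        library.books.add(new Book("Book B"));

        Library restored = deserialize(serialize(library));
        System.out.println(restored.books.size());
    }
}
```

Adding a single book to the library means writing the complete graph again, which is the behavior the partial, on-demand approach is meant to avoid.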

4. Object Graph Storage

MicroStream tries to simplify software development by using only one data structure with one data model.

Object instances are stored as a byte stream and references between them are mapped with unique identifiers. Therefore, an object graph can be stored in a simple and quick way. In addition, it can be loaded either wholly or partially.
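
The idea of replacing references with unique identifiers can be sketched in plain Java. This is only a conceptual illustration: the Node class, the flatten() method, and the text record format are assumptions made for this example and don't reflect MicroStream's actual binary format:

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ObjectIdSketch {

    // Hypothetical graph node: a value plus references to other nodes
    static class Node {
        final String value;
        final List<Node> references = new ArrayList<>();

        Node(String value) {
            this.value = value;
        }
    }

    // Flattens the graph reachable from root into one record per object.
    // Each record holds the node's value followed by the ids it references.
    static Map<Long, String> flatten(Node root) {
        Map<Node, Long> ids = new IdentityHashMap<>();
        List<Node> visitOrder = new ArrayList<>();
        register(root, ids, visitOrder);

        Map<Long, String> records = new LinkedHashMap<>();
        for (Node node : visitOrder) {
            StringBuilder record = new StringBuilder(node.value);
            for (Node reference : node.references) {
                record.append(" ->").append(ids.get(reference));
            }
            records.put(ids.get(node), record.toString());
        }
        return records;
    }

    // Depth-first walk that gives every node exactly one id, even if it is shared
    private static void register(Node node, Map<Node, Long> ids, List<Node> visitOrder) {
        if (ids.containsKey(node)) {
            return;
        }
        ids.put(node, (long) (ids.size() + 1));
        visitOrder.add(node);
        for (Node reference : node.references) {
            register(reference, ids, visitOrder);
        }
    }

    public static void main(String[] args) {
        Node author = new Node("author");
        Node bookA = new Node("bookA");
        Node bookB = new Node("bookB");
        bookA.references.add(author);
        bookB.references.add(author);
        Node library = new Node("library");
        library.references.add(bookA);
        library.references.add(bookB);

        flatten(library).forEach((id, record) -> System.out.println(id + ": " + record));
    }
}
```

Because each record refers to other objects only by id, a single record can be written or read without touching the rest of the graph, which is what makes partial storing and loading possible.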

4.1. Dependencies

Before we can start storing object graphs using MicroStream, we’ll need to add two dependencies:

<dependency>
    <groupId>one.microstream</groupId>
    <artifactId>microstream-storage-embedded</artifactId>
    <version>07.00.00-MS-GA</version>
</dependency>
<dependency>
    <groupId>one.microstream</groupId>
    <artifactId>microstream-storage-embedded-configuration</artifactId>
    <version>07.00.00-MS-GA</version>
</dependency>

4.2. Root Instance

When using object graph storage, our entire database is accessed starting at a root instance. This instance is called the root object of an object graph that gets persisted by MicroStream.

Object graph instances, including the root instance, can be of any Java type. Therefore, a simple String instance can be registered as the object graph’s root:

EmbeddedStorageManager storageManager = EmbeddedStorage.start(directory);
storageManager.setRoot("baeldung-demo");
storageManager.storeRoot();

However, as this root instance contains no children, our String instance comprises our entire database. Therefore, we would usually need to define a custom root type specific to our application:

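import java.util.ArrayList;
import java.util.List;

public class RootInstance {

    private final String name;
    private final List<Book> books;

    public RootInstance(String name) {
        this.name = name;
        // starts empty; books are added and stored later
        this.books = new ArrayList<>();
    }

    // standard getters, hashCode and equals
}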

We can register a root instance using a custom type in a similar way, by calling the setRoot() and storeRoot() methods:

EmbeddedStorageManager storageManager = EmbeddedStorage.start(directory);
storageManager.setRoot(new RootInstance("baeldung-demo"));
storageManager.storeRoot();

For now, our books list will be empty, but with our custom root, we’ll be able to store book instances later on:

RootInstance rootInstance = (RootInstance) storageManager.root();
assertThat(rootInstance.getName()).isEqualTo("baeldung-demo");
assertThat(rootInstance.getBooks()).isEmpty();
storageManager.shutdown();

We should note that once our application has finished working with the storage, it’s recommended to call the shutdown() method for safety.

5. Manipulating Data

Let’s check how we can perform standard CRUD operations via our object graph persisted by MicroStream.

5.1. Storing

When storing new instances, we need to make sure to call the store() method on the correct object. The correct object is the owner of the newly created instances, which in our example is a list:

RootInstance rootInstance = (RootInstance) storageManager.root();
List<Book> books = rootInstance.getBooks();
books.addAll(booksToStore);
storageManager.store(books);
assertThat(books).hasSize(2);

Storing a new object also stores all instances referenced by this object. In addition, executing the store() method guarantees that the data has been physically written to the underlying storage layer, usually a file system.

5.2. Eager Loading

Loading data with MicroStream can be done in two ways, eager and lazy. Eager loading is the default way of loading objects from a stored object graph. If an already existing database is found during startup, then all objects of a stored object graph are loaded into memory.

After starting an EmbeddedStorageManager instance, we can load the data by getting the root instance of our object graph:

EmbeddedStorageManager storageManager = EmbeddedStorage.start(directory);
if (storageManager.root() == null) {
    RootInstance rootInstance = new RootInstance("baeldung-demo");
    storageManager.setRoot(rootInstance);
    storageManager.storeRoot();
} else {
    RootInstance rootInstance = (RootInstance) storageManager.root();
    // Use existing root loaded from storage
}

A null root instance indicates that no database exists yet in the underlying storage.

5.3. Lazy Loading

When we’re dealing with large amounts of data, loading all data directly into the memory at the start might not be a viable option. Therefore, MicroStream also supports lazy loading by wrapping an instance into a Lazy field.

Lazy is a simple wrapper class, similar to the JDK’s WeakReference. Its instances internally hold an identifier and a reference to the actual instance:

private final Lazy<List<Book>> books;

A new ArrayList wrapped in a Lazy can be instantiated using the static Lazy.Reference() method:

books = Lazy.Reference(new ArrayList<>());

Just as with WeakReference, to get the actual instance, we need to call a simple get() method:

public List<Book> getBooks() {
    return Lazy.get(books);
}

The get() method call will reload the data when it’s needed, without developers having to deal with any low-level database identifiers.
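
To illustrate the idea of a wrapper that holds an identifier and loads its target on demand, here is a minimal plain-Java sketch. The LazyRef class and its loader function are hypothetical stand-ins written for this example; they are not MicroStream's Lazy implementation and only mimic the behavior described above:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

class LazyReferenceSketch {

    // Hypothetical stand-in for a lazy reference; NOT MicroStream's Lazy class
    static class LazyRef<T> {
        private final long objectId;
        private final LongFunction<T> loader;
        private T subject; // null until loaded, or after clear()

        LazyRef(long objectId, LongFunction<T> loader) {
            this.objectId = objectId;
            this.loader = loader;
        }

        // Loads the referenced instance on first access and keeps it in memory
        synchronized T get() {
            if (subject == null) {
                subject = loader.apply(objectId);
            }
            return subject;
        }

        // Drops the in-memory instance; the id is kept so it can be reloaded
        synchronized void clear() {
            subject = null;
        }

        synchronized boolean isLoaded() {
            return subject != null;
        }
    }

    public static void main(String[] args) {
        // A map stands in for the underlying storage in this sketch
        Map<Long, String> storage = new HashMap<>();
        storage.put(42L, "books payload");

        LazyRef<String> ref = new LazyRef<>(42L, id -> storage.get(id));
        System.out.println(ref.isLoaded());
        System.out.println(ref.get());
        ref.clear();
        System.out.println(ref.isLoaded());
    }
}
```

The caller only ever sees get(); the identifier and the loading step stay hidden inside the wrapper.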

5.4. Deleting

Deleting data with MicroStream does not require performing explicit deletion actions. Instead, we just need to clear any references to the object in our object graph and store those changes:

List<Book> books = rootInstance.getBooks();
books.remove(1);
storageManager.store(books);

We should note that the deleted data is not immediately erased from the storage. Rather, a background housekeeping process runs a scheduled cleanup.
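
One way to picture such a cleanup is a reachability check: anything that can no longer be reached from the root is eligible for removal. The following plain-Java sketch assumes that model; the class, the id-to-references map, and the sweep() method are hypothetical and say nothing about how MicroStream schedules or implements its housekeeping:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilityCleanupSketch {

    // Returns the ids reachable from rootId by following stored references
    static Set<Long> reachable(Map<Long, List<Long>> references, long rootId) {
        Set<Long> marked = new HashSet<>();
        Deque<Long> pending = new ArrayDeque<>();
        pending.push(rootId);
        while (!pending.isEmpty()) {
            long id = pending.pop();
            if (marked.add(id)) {
                for (Long referenced : references.getOrDefault(id, List.of())) {
                    pending.push(referenced);
                }
            }
        }
        return marked;
    }

    // Removes every record that is no longer reachable from the root
    static void sweep(Map<Long, List<Long>> references, long rootId) {
        references.keySet().retainAll(reachable(references, rootId));
    }

    public static void main(String[] args) {
        // 1 = root, 2 = books list, 3 and 4 = books
        Map<Long, List<Long>> references = new HashMap<>();
        references.put(1L, List.of(2L));
        references.put(2L, List.of(3L)); // book 4 was removed from the list
        references.put(3L, List.of());
        references.put(4L, List.of());

        sweep(references, 1L);
        System.out.println(references.keySet());
    }
}
```

In this model, removing the reference and storing the owner is all the application does; the record for the orphaned book disappears whenever the sweep next runs.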

6. Query System

Unlike a standard DBMS, MicroStream queries do not operate on the storage directly but run on data in our local system memory. Therefore, there’s no need to learn any special query languages, as all operations are done with plain Java.

A common approach may be to use Streams with standard Java collections:

List<Book> booksFrom1998 = rootInstance.getBooks().stream()
    .filter(book -> book.getYear() == 1998)
    .collect(Collectors.toList());

Given that queries run in memory, memory consumption might be high, but queries can run quickly.
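
Since the data is held in ordinary Java collections, projections and aggregations work the same way as the filter above. Here is a small self-contained sketch; the Book class with its title and author fields is an assumption made for this example, as the article only shows getYear():

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class InMemoryQueryDemo {

    // Hypothetical Book type; only getYear() appears in the article's query
    static class Book {
        private final String title;
        private final String author;
        private final int year;

        Book(String title, String author, int year) {
            this.title = title;
            this.author = author;
            this.year = year;
        }

        String getTitle() { return title; }
        String getAuthor() { return author; }
        int getYear() { return year; }
    }

    // Sorted titles of all books published in the given year
    static List<String> titlesFrom(List<Book> books, int year) {
        return books.stream()
            .filter(book -> book.getYear() == year)
            .map(Book::getTitle)
            .sorted()
            .collect(Collectors.toList());
    }

    // Number of books per author, similar to a GROUP BY with COUNT in SQL
    static Map<String, Long> countByAuthor(List<Book> books) {
        return books.stream()
            .collect(Collectors.groupingBy(Book::getAuthor, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Book> books = List.of(
            new Book("Book A", "Author X", 1998),
            new Book("Book B", "Author Y", 2005),
            new Book("Book C", "Author X", 1998));

        System.out.println(titlesFrom(books, 1998));
        System.out.println(countByAuthor(books));
    }
}
```

In a MicroStream application, the books list would come from the root instance instead of being built inline.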

The data storing and loading process can be parallelized by using multiple threads. At the moment, horizontal scaling is not possible, but MicroStream announced they are currently developing an object-graph replication approach. This would enable clustering and data replication over multiple nodes in the future.

7. Conclusion

In this article, we explored MicroStream, an object graph persistence engine for the JVM. We learned how MicroStream solves the object-relational data structure mismatch by applying the same structure for in-memory operations and data persistence.

We explored how to create object graphs using custom root instances. Also, we saw how to store, delete, and load data using the eager and lazy loading approaches. Finally, we looked at MicroStream’s query system based on in-memory operations with plain Java.

As always, the complete source code is available over on GitHub.
