Storing PostgreSQL JSONB Using Spring Boot and JPA

Last modified: February 2, 2024

1. Overview

This tutorial will provide us with a comprehensive understanding of storing JSON data in a PostgreSQL JSONB column.

We’ll quickly review how to handle a JSON value stored in a variable character (VARCHAR) database column using JPA. After that, we’ll compare the VARCHAR and JSONB types and look at the additional features JSONB offers. Finally, we’ll see how to map the JSONB type in JPA.

2. VARCHAR Mapping

In this section, we’ll explore how to convert a JSON value in VARCHAR type to a custom Java POJO using AttributeConverter.

The purpose of an AttributeConverter is to convert between an entity attribute value in its Java data type and the corresponding value in the database column.

2.1. Maven Dependency

To create an AttributeConverter, we have to include the Spring Data JPA dependency in the pom.xml:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-jpa</artifactId>
    <version>2.7.18</version>
</dependency>

2.2. Table Definition

Let’s illustrate this concept with a simple example using the following database table definition:

CREATE TABLE student (
    student_id VARCHAR(8) PRIMARY KEY,
    admit_year VARCHAR(4),
    address VARCHAR(500)
);

The student table has three fields, and we’re expecting the address column to store JSON values with the following structure:

{
  "postCode": "TW9 2SF",
  "city": "London"
}

2.3. Entity Class

To handle this, we’ll create a corresponding POJO class to represent the address data in Java:

public class Address {
    private String postCode;

    private String city;

    // constructor, getters and setters
}

Next, we’ll create an entity class, StudentEntity, and map it to the student table we created earlier:

@Entity
@Table(name = "student")
public class StudentEntity {
    @Id
    @Column(name = "student_id", length = 8)
    private String id;

    @Column(name = "admit_year", length = 4)
    private String admitYear;

    @Convert(converter = AddressAttributeConverter.class)
    @Column(name = "address", length = 500)
    private Address address;

    // constructor, getters and setters
}

We’ll annotate the address field with @Convert and apply AddressAttributeConverter to convert the Address instance into its JSON representation.

2.4. AttributeConverter

We map the address field in the entity class to a VARCHAR type in the database. However, JPA cannot perform the conversion between the custom Java type and the VARCHAR type automatically. AttributeConverter comes in to bridge this gap by providing a mechanism to handle the conversion process.

We use AttributeConverter to persist a custom Java data type to a database column. It’s mandatory to define two conversion methods for every AttributeConverter implementation. One converts the Java data type to its corresponding database data type, while the other converts the database data type to the Java data type:

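import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import javax.persistence.AttributeConverter;
import javax.persistence.Converter;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@Converter
public class AddressAttributeConverter implements AttributeConverter<Address, String> {
    private static final Logger log = LoggerFactory.getLogger(AddressAttributeConverter.class);
    private static final ObjectMapper objectMapper = new ObjectMapper();

    @Override
    public String convertToDatabaseColumn(Address address) {
        if (address == null) {
            return null; // keep SQL NULL instead of storing the JSON literal "null"
        }
        try {
            return objectMapper.writeValueAsString(address);
        } catch (JsonProcessingException jpe) {
            log.warn("Cannot convert Address into JSON", jpe);
            return null;
        }
    }

    @Override
    public Address convertToEntityAttribute(String value) {
        if (value == null) {
            return null; // a NULL column maps to a null attribute
        }
        try {
            return objectMapper.readValue(value, Address.class);
        } catch (JsonProcessingException e) {
            log.warn("Cannot convert JSON into Address", e);
            return null;
        }
    }
}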

convertToDatabaseColumn() is responsible for converting an entity field value to the corresponding database column value, whereas convertToEntityAttribute() is responsible for converting a database column value to the corresponding entity field value.

2.5. Test Case

Now, let’s create a test case to persist a Student instance in the database:

@Test
void whenSaveAStudentEntityAndFindById_thenTheRecordIsPresentInDb() {
    String studentId = "23876371";
    String postCode = "KT6 7BB";

    Address address = new Address(postCode, "London");
    StudentEntity studentEntity = StudentEntity.builder()
      .id(studentId)
      .admitYear("2023")
      .address(address)
      .build();

    studentRepository.save(studentEntity);

    // Look the entity up again by its ID and verify the stored address
    Optional<StudentEntity> studentEntityOptional = studentRepository.findById(studentId);
    assertThat(studentEntityOptional).isPresent();

    studentEntity = studentEntityOptional.get();
    assertThat(studentEntity.getId()).isEqualTo(studentId);
    assertThat(studentEntity.getAddress().getPostCode()).isEqualTo(postCode);
}

When we run the test, JPA triggers the following insert SQL:

Hibernate: 
    insert 
    into
        "public"
        ."student_str" ("address", "admit_year", "student_id") 
    values
        (?, ?, ?)
binding parameter [1] as [VARCHAR] - [{"postCode":"KT6 7BB","city":"London"}]
binding parameter [2] as [VARCHAR] - [2023]
binding parameter [3] as [VARCHAR] - [23876371]

We can see that the AddressAttributeConverter has converted our Address instance into a JSON string, which is bound as the first parameter with the VARCHAR type.

3. JSONB Over VARCHAR

We have explored the conversion where we have JSON data stored in the VARCHAR column. Now, let’s change the column definition of address from VARCHAR to JSONB:

CREATE TABLE student (
    student_id VARCHAR(8) PRIMARY KEY,
    admit_year VARCHAR(4),
    address jsonb
);

A common question arises when we explore the JSONB data type: what’s the benefit of using JSONB rather than VARCHAR to store JSON in PostgreSQL, since JSON is essentially a string?

JSONB is a dedicated data type for processing JSON data in PostgreSQL. This type stores data in a decomposed binary format, which adds a bit of overhead when storing JSON due to the additional conversion.

In return, it provides additional features compared to VARCHAR that make JSONB a more favorable choice for storing JSON data in PostgreSQL.

3.1. Validation

The JSONB type validates the stored value, ensuring that the column always holds valid JSON. PostgreSQL rejects any attempt to insert or update a row with an invalid JSON value.

To demonstrate this, we can consider an insert SQL query with an invalid JSON value for the address column, in which the postCode and city values lack their closing double quotes:

INSERT INTO student(student_id, admit_year, address) 
VALUES ('23134572', '2022', '{"postCode": "E4 8ST, "city":"London}');

The execution of this query in PostgreSQL results in a validation error indicating the JSON isn’t valid:

SQL Error: ERROR: invalid input syntax for type json
  Detail: Token "city" is invalid.
  Position: 83
  Where: JSON data, line 1: {"postCode": "E4 8ST, "city...
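
From application code, this rejection surfaces as a SQLException. The following is a minimal plain-JDBC sketch and not part of the original example: the class and method names are hypothetical, it assumes the student table defined above, and it assumes PostgreSQL reports malformed JSON with SQLSTATE 22P02 (invalid_text_representation), a code that is also used for other malformed literals.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper, for illustration only.
final class JsonbValidationProbe {
    // SQLSTATE for invalid_text_representation (assumed to cover malformed JSON input)
    static final String INVALID_TEXT_REPRESENTATION = "22P02";

    static boolean isInvalidTextRepresentation(SQLException e) {
        return INVALID_TEXT_REPRESENTATION.equals(e.getSQLState());
    }

    // Returns true if the row was stored, false if PostgreSQL rejected the value as malformed.
    static boolean tryInsert(Connection connection, String id, String admitYear, String json)
            throws SQLException {
        String sql = "INSERT INTO student (student_id, admit_year, address) VALUES (?, ?, ?::jsonb)";
        try (PreparedStatement ps = connection.prepareStatement(sql)) {
            ps.setString(1, id);
            ps.setString(2, admitYear);
            ps.setString(3, json); // cast to jsonb on the server, where validation happens
            ps.executeUpdate();
            return true;
        } catch (SQLException e) {
            if (isInvalidTextRepresentation(e)) {
                return false;
            }
            throw e; // anything else is a different kind of failure
        }
    }
}
```

Because the check happens in the database, it applies no matter which client or framework writes to the column.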

3.2. Querying

PostgreSQL lets us query by attributes inside a JSON column directly in SQL; for example, the ->> operator extracts an attribute as text. Since such operators are PostgreSQL-specific, we rely on JPA’s support for native queries. In Spring Data, we can define a custom query method that finds a list of students by postcode:

@Repository
public interface StudentRepository extends CrudRepository<StudentEntity, String> {
    @Query(value = "SELECT * FROM student WHERE address->>'postCode' = :postCode", nativeQuery = true)
    List<StudentEntity> findByAddressPostCode(@Param("postCode") String postCode);
}

This is a native SQL query that selects all rows of the student table whose address JSON attribute postCode equals the provided parameter.

3.3. Indexing

JSONB supports JSON data indexing. This gives JSONB a significant advantage when we have to query the data by keys or attributes in the JSON column.

Various types of indexes can be applied to a JSONB column, including GIN, HASH, and BTREE. GIN is suitable for indexing complex data structures, including arrays and JSON. HASH is suitable when we only need the equality operator =. BTREE allows efficient queries when we deal with range operators such as < and >=.

For example, we could create the following index if we always need to retrieve data according to the postCode attribute in the address column:

CREATE INDEX idx_postcode ON student USING HASH((address->>'postCode'));
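
To make the three index types more concrete, the sketch below collects one illustrative DDL statement per type and runs them over plain JDBC. The class, the index names, and the choice of expression to index are assumptions made for illustration; only the student table and its address column come from the example above.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
final class AddressIndexDdl {
    // GIN over the whole document: serves containment queries such as
    // address @> '{"city": "London"}'
    static final String GIN =
        "CREATE INDEX idx_address_gin ON student USING GIN (address)";

    // HASH over one attribute extracted as text: equality comparisons only
    static final String HASH =
        "CREATE INDEX idx_postcode_hash ON student USING HASH ((address->>'postCode'))";

    // BTREE over the same expression: equality plus range comparisons
    static final String BTREE =
        "CREATE INDEX idx_postcode_btree ON student USING BTREE ((address->>'postCode'))";

    static List<String> all() {
        return Arrays.asList(GIN, HASH, BTREE);
    }

    // Executes every statement on the given connection.
    static void createAll(Connection connection) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            for (String ddl : all()) {
                statement.execute(ddl);
            }
        }
    }
}
```

PostgreSQL only considers an expression index when the query uses the same expression, which is why the HASH and BTREE variants index address->>'postCode', the expression used by the native query in section 3.2.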

4. JSONB Mapping

We cannot apply the same AttributeConverter when the database column is defined as JSONB. If we attempt to, our application throws the following error upon start-up:

org.postgresql.util.PSQLException: ERROR: column "address" is of type jsonb but expression is of type character varying

This is the case even if we change the AttributeConverter class definition to use Object as the converted column value instead of String:

@Converter 
public class AddressAttributeConverter implements AttributeConverter<Address, Object> {
    // 2 conversion methods implementation
}

Our application complains about the unsupported type:

org.postgresql.util.PSQLException: Unsupported Types value: 1,943,105,171

This indicates that JPA doesn’t support JSONB type natively. However, our underlying JPA implementation, Hibernate, does support JSON custom types that allow us to map a complex type to a Java class.

4.1. Maven Dependency

In practice, we have to define a custom type for the JSONB conversion. However, we don’t have to reinvent the wheel, because the Hypersistence Utilities library already provides one.

Hypersistence Utilities is a general-purpose utility library for Hibernate. Among its features, it provides JSON column type mappings for different databases such as PostgreSQL and Oracle. Thus, we can simply include this additional dependency in the pom.xml:

<dependency>
    <groupId>io.hypersistence</groupId>
    <artifactId>hypersistence-utils-hibernate-55</artifactId>
    <version>3.7.0</version>
</dependency>

4.2. Updated Entity Class

Hypersistence Utilities defines different custom types that are database-dependent. In PostgreSQL, we’ll use the JsonBinaryType class for the JSONB column type. In our entity class, we define the custom type using Hibernate’s @TypeDef annotation and then apply the defined type to the address field via @Type:

@Entity
@Table(name = "student")
@TypeDef(name = "jsonb", typeClass = JsonBinaryType.class)
public class StudentEntity {
    @Id
    @Column(name = "student_id", length = 8)
    private String id;

    @Column(name = "admit_year", length = 4)
    private String admitYear;

    @Type(type = "jsonb")
    @Column(name = "address", columnDefinition = "jsonb")
    private Address address;

    // getters and setters
}

Because we use @Type here, we no longer need to apply the AttributeConverter to the address field. The custom type from Hypersistence Utilities handles the conversion for us, which makes our code neater. Note, however, that this mapping targets Hibernate 5: Hibernate 6 removed @TypeDef and changed @Type to reference the type class directly.

4.3. Test Case

After all these changes, let’s run the Student persistence test case again:

Hibernate: 
    insert 
    into
        "public"
        ."student" ("address", "admit_year", "student_id") 
    values
        (?, ?, ?)
binding parameter [1] as [OTHER] - [Address(postCode=KT6 7BB, city=London)]
binding parameter [2] as [VARCHAR] - [2023]
binding parameter [3] as [VARCHAR] - [23876371]

We can see that JPA triggers the same insert SQL as before, except that the first parameter is bound as OTHER instead of VARCHAR. This indicates that Hibernate binds the parameter as a JSONB type this time.
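
For intuition about what the OTHER binding means, here is roughly the equivalent in plain JDBC. This is a hedged sketch rather than what Hypersistence Utilities literally executes: the class and method names are hypothetical, and it assumes the PostgreSQL JDBC driver, which sends a String bound with Types.OTHER without a declared type so that the server can resolve it against the jsonb column.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical helper, for illustration only.
final class JsonbBinding {
    static final String INSERT_SQL =
        "INSERT INTO student (student_id, admit_year, address) VALUES (?, ?, ?)";

    // Binds the three parameters of INSERT_SQL on an already prepared statement.
    static void bind(PreparedStatement ps, String id, String admitYear, String addressJson)
            throws SQLException {
        ps.setString(1, id);
        ps.setString(2, admitYear);
        // setString() here would declare the value as character varying and trigger
        // the "column is of type jsonb" error; Types.OTHER leaves the type open.
        ps.setObject(3, addressJson, Types.OTHER);
    }
}
```

A caller would prepare JsonbBinding.INSERT_SQL on a PostgreSQL connection, call bind(), and then executeUpdate().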

5. Conclusion

In this article, we learned how to store and manage JSON data in PostgreSQL using Spring Boot and JPA.

It addressed the mapping of JSON value to VARCHAR type and JSONB type. It also highlighted the significance of JSONB in enforcing JSON validation and facilitating querying and indexing.

As always, the sample code is available over on GitHub.
