1. Overview
In Spring Boot applications, we’re often tasked with presenting tabular data to clients in chunks of 20 or 50 rows at a time. Pagination is a common practice for returning a fraction of a large dataset. However, there are scenarios where we need to obtain the entire result at once.
In this tutorial, we’ll first revisit how to retrieve paginated data using Spring Boot. Next, we’ll explore how to retrieve all results from one database table at once through the pagination API. Finally, we’ll dive into a more complex scenario that retrieves data with relationships.
2. Repository
Repository is a Spring Data interface that provides a data access abstraction. Depending on the Repository subinterface we choose, the abstraction provides a predefined set of database operations.
We don’t need to write code for standard database operations such as select, save, and delete. All we need to do is create an interface for our entity and have it extend the chosen Repository subinterface.
At runtime, Spring Data creates a proxy implementation that handles method invocations for our repository. When we invoke a method on the Repository interface, Spring Data derives the query dynamically from the method and its parameters.
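To make the proxy idea more concrete, here is a minimal sketch built on the JDK's java.lang.reflect.Proxy. It is only an illustration of the general technique, not Spring Data's actual implementation. The NameRepository interface, the createProxy helper, and the in-memory list standing in for a table are hypothetical names introduced for this example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class RepositoryProxySketch {

    // Hypothetical repository interface: note that no class implements it.
    interface NameRepository {
        List<String> findAll();
        long count();
    }

    // Builds a JDK dynamic proxy that answers calls by inspecting the method name.
    // The "rows" list is an assumption standing in for a database table.
    static NameRepository createProxy(List<String> rows) {
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            switch (method.getName()) {
                case "findAll":
                    return new ArrayList<>(rows);
                case "count":
                    return (long) rows.size();
                default:
                    throw new UnsupportedOperationException(method.getName());
            }
        };
        return (NameRepository) Proxy.newProxyInstance(
            NameRepository.class.getClassLoader(),
            new Class<?>[] { NameRepository.class },
            handler);
    }

    public static void main(String[] args) {
        NameRepository repository = createProxy(List.of("Alice", "Bob"));
        System.out.println(repository.findAll());
        System.out.println(repository.count());
    }
}
```

The caller only ever sees the interface, and the handler decides at invocation time what each method does. Conceptually, this is what allows a repository interface to work without a hand-written implementation class.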
There are three common Repository subinterfaces defined in Spring Data:
- CrudRepository – The most fundamental Repository interface provided by Spring Data. It provides CRUD (Create, Read, Update, and Delete) entity operations
- PagingAndSortingRepository – It adds methods that support paginated access and result sorting. Before Spring Data 3.0, it extended the CrudRepository interface; since 3.0, the two interfaces are independent
- JpaRepository – It combines the CRUD and the paging and sorting operations and introduces JPA-specific operations such as saving and flushing an entity and deleting entities in a batch
3. Fetching Paged Data
Let’s start with a simple scenario that obtains data from a database using pagination. We first create a Student entity class:
@Entity
@Table(name = "student")
public class Student {

    @Id
    @Column(name = "student_id")
    private String id;

    @Column(name = "first_name")
    private String firstName;

    @Column(name = "last_name")
    private String lastName;
}
Next, we’ll create a StudentRepository for retrieving Student entities from the database. The JpaRepository interface already provides the findAll(Pageable pageable) method, so we don’t need to define any additional methods as long as we only want to retrieve data in pages without filtering on a field:
public interface StudentRepository extends JpaRepository<Student, String> {
}
We can get the first page of Student records, with 10 rows per page, by invoking findAll(Pageable) on StudentRepository. We build the Pageable with PageRequest.of(): the first argument is the zero-indexed page number, while the second is the number of records fetched per page:
Pageable pageable = PageRequest.of(0, 10);
Page<Student> studentPage = studentRepository.findAll(pageable);
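To illustrate what the zero-indexed page number and the page size mean, the following plain Java sketch cuts a page out of an in-memory list in the same way an OFFSET/LIMIT query would. It is a simplified model under stated assumptions, not Spring Data's code. The PageSliceSketch class and its slice and totalPages helpers are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class PageSliceSketch {

    // Returns the rows that belong to a zero-indexed page.
    static <T> List<T> slice(List<T> rows, int pageNumber, int pageSize) {
        long offset = (long) pageNumber * pageSize; // long avoids int overflow
        if (offset >= rows.size()) {
            return new ArrayList<>(); // page lies beyond the last row
        }
        int from = (int) offset;
        int to = (int) Math.min(offset + pageSize, rows.size());
        return new ArrayList<>(rows.subList(from, to));
    }

    // Number of pages needed to hold totalRows rows (ceiling division).
    static int totalPages(long totalRows, int pageSize) {
        return (int) ((totalRows + pageSize - 1) / pageSize);
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            rows.add(i);
        }
        System.out.println(slice(rows, 0, 10)); // first page: rows 1 to 10
        System.out.println(slice(rows, 2, 10)); // last page: rows 21 to 25
        System.out.println(totalPages(rows.size(), 10));
    }
}
```

With 25 rows and a page size of 10, page 0 holds the first ten rows and page 2 holds the remaining five, so the last page can be shorter than the requested size.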
Often, we have to return a paged result that is sorted by a specific field. In such cases, we provide a Sort instance when we create the Pageable instance. The following example sorts the page results by the id field of Student in ascending order:
Sort sort = Sort.by(Sort.Direction.ASC, "id");
Pageable pageable = PageRequest.of(0, 10).withSort(sort);
Page<Student> studentPage = studentRepository.findAll(pageable);
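One point worth spelling out is that the sort applies to the whole result set before the page is cut, just as ORDER BY is evaluated before LIMIT in SQL. The sketch below models this with an in-memory list of ids. The SortedPageSketch class and the sortedPage helper are hypothetical, and the code is an illustration rather than what Spring Data executes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SortedPageSketch {

    // Sorts the full row set first, then cuts the requested zero-indexed page.
    static List<String> sortedPage(List<String> ids, int pageNumber, int pageSize) {
        List<String> sorted = new ArrayList<>(ids);
        sorted.sort(Comparator.naturalOrder()); // ascending order
        int from = Math.min(pageNumber * pageSize, sorted.size());
        int to = Math.min(from + pageSize, sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> ids = List.of("s3", "s1", "s4", "s2");
        System.out.println(sortedPage(ids, 0, 2));
        System.out.println(sortedPage(ids, 1, 2));
    }
}
```

Because the ordering is global, consecutive pages never overlap and together cover every row exactly once, as long as the sort key is unique.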
4. Fetching All Data
A common question is: what if we want to retrieve all data at once? Do we need to call findAll() instead? The answer is no. The Pageable interface defines a static method unpaged(), which returns a predefined Pageable instance that carries no pagination information. We fetch all data by calling findAll(Pageable) with that instance:
Page<Student> studentPage = studentRepository.findAll(Pageable.unpaged());
If we need to sort the results, from Spring Boot 3.2 onward we can pass a Sort instance as an argument to the unpaged() method. For example, suppose we would like to sort the results by the lastName field in ascending order:
Sort sort = Sort.by(Sort.Direction.ASC, "lastName");
Page<Student> studentPage = studentRepository.findAll(Pageable.unpaged(sort));
However, achieving the same is a bit trickier in versions below 3.2, as unpaged() doesn’t accept any argument. Instead, we have to create a PageRequest with the maximum page size and the Sort parameter:
Pageable pageable = PageRequest.of(0, Integer.MAX_VALUE).withSort(sort);
Page<Student> studentPage = studentRepository.findAll(pageable);
5. Fetching Data With Relationships
We often define relationships between entities in an object-relational mapping (ORM) framework. ORM frameworks such as JPA help developers quickly model entities and relationships and reduce the need to write SQL queries by hand.
However, data retrieval can cause problems if we don’t thoroughly understand how it works underneath. We must be careful when retrieving a collection of results from an entity with relationships, as this can hurt performance, especially when fetching all data.
5.1. N+1 Problem
Let’s look at an example to illustrate the issue. Consider our Student entity with an additional many-to-one mapping:
@Entity
@Table(name = "student")
public class Student {

    @Id
    @Column(name = "student_id")
    private String id;

    @Column(name = "first_name")
    private String firstName;

    @Column(name = "last_name")
    private String lastName;

    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "school_id", referencedColumnName = "school_id")
    private School school;

    // getters and setters
}
Every Student is now associated with a School, and we define the School entity as:
@Entity
@Table(name = "school")
public class School {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "school_id")
    private Integer id;

    private String name;

    // getters and setters
}
Now, we would like to retrieve all Student records from the database and investigate the actual number of SQL queries issued by JPA. Hypersistence Utilities is a database utility library that provides the assertSelectCount() method to identify the number of select queries executed. Let’s include its Maven dependency in our pom.xml file:
<dependency>
    <groupId>io.hypersistence</groupId>
    <artifactId>hypersistence-utils-hibernate-62</artifactId>
    <version>3.7.0</version>
</dependency>
Now, we create a test case to retrieve all Student records:
@Test
public void whenGetStudentsWithSchool_thenMultipleSelectQueriesAreExecuted() {
    Page<Student> studentPage = studentRepository.findAll(Pageable.unpaged());
    List<StudentWithSchoolNameDTO> list = studentPage.get()
      .map(student -> modelMapper.map(student, StudentWithSchoolNameDTO.class))
      .collect(Collectors.toList());
    // one select for the students, plus one per student for the lazily loaded school
    assertSelectCount(studentPage.getContent().size() + 1);
}
In a complete application, we don’t want to expose our internal entities to clients. In practice, we map the internal entity to an external DTO and return that to the client. In this example, we use ModelMapper to convert Student to StudentWithSchoolNameDTO, which contains all fields from Student and the name field from School:
public class StudentWithSchoolNameDTO {
    private String id;
    private String firstName;
    private String lastName;
    private String schoolName;

    // constructor, getters and setters
}
Let’s observe the Hibernate log after executing the test case:
Hibernate: select studentent0_.student_id as student_1_1_, studentent0_.first_name as first_na2_1_, studentent0_.last_name as last_nam3_1_, studentent0_.school_id as school_i4_1_ from student studentent0_
Hibernate: select schoolenti0_.school_id as school_i1_0_0_, schoolenti0_.name as name2_0_0_ from school schoolenti0_ where schoolenti0_.school_id=?
Hibernate: select schoolenti0_.school_id as school_i1_0_0_, schoolenti0_.name as name2_0_0_ from school schoolenti0_ where schoolenti0_.school_id=?
...
Suppose we’ve retrieved N Student records from the database. Rather than executing only a single select query on the Student table, JPA executes up to N additional queries on the School table to fetch the associated record for each Student.
This behavior emerges during the conversion by ModelMapper when it attempts to read the school field in the Student instance. This issue in object-relational mapping performance is known as the N+1 problem.
It’s worth mentioning that JPA doesn’t always issue exactly N queries on the School table when fetching N Student records. The actual count depends on the data: the first-level cache of the persistence context ensures that a School instance that has already been loaded isn’t fetched from the database again.
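The data dependence can be modeled with a small simulation. The sketch below counts one select for the student list plus one select per school that isn't cached yet, using a map as a stand-in for the first-level cache. It is a simplified model under these assumptions and not how Hibernate is implemented. SelectCountSketch and countSelects are hypothetical names.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SelectCountSketch {

    // Counts the selects a lazy many-to-one traversal would need:
    // one for the student list, plus one per school not yet cached.
    static int countSelects(List<Integer> schoolIdPerStudent) {
        int selects = 1; // initial query on the student table
        Map<Integer, String> firstLevelCache = new HashMap<>();
        for (Integer schoolId : schoolIdPerStudent) {
            if (!firstLevelCache.containsKey(schoolId)) {
                selects++; // simulated "select ... from school where school_id=?"
                firstLevelCache.put(schoolId, "School " + schoolId);
            }
        }
        return selects;
    }

    public static void main(String[] args) {
        // Three students in three different schools
        System.out.println(countSelects(List.of(1, 2, 3)));
        // Three students who all attend the same school
        System.out.println(countSelects(List.of(1, 1, 1)));
    }
}
```

In this model, three students in three different schools cost four selects, while three students sharing one school cost only two. The worst case of N+1 therefore occurs when every student references a distinct school.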
5.2. Avoid Fetching Relationships
When returning a DTO to the client, it’s not always necessary to include all fields of the entity class. Usually, we only need a subset of them. To avoid triggering additional queries for the entity’s associations, we should extract only the essential fields.
In our example, we can create a dedicated DTO class that includes only fields from the Student table. JPA won’t execute any additional query on School if we don’t access the school field:
public class StudentDTO {
    private String id;
    private String firstName;
    private String lastName;

    // constructor, getters and setters
}
This approach assumes the association fetch type defined on the entity class we’re querying is set to perform a lazy fetch of the associated entity:
@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "school_id", referencedColumnName = "school_id")
private School school;
It’s important to note that if the fetch attribute is set to FetchType.EAGER, JPA will execute the additional queries as soon as it fetches the Student records, even if we never access the school field afterward.
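The difference between the two fetch types can be sketched in plain Java by modeling the association as a Supplier. This is a conceptual illustration only and doesn't reflect how JPA providers implement lazy loading. The LazyAssociationSketch class, the QueryCounter type, and the school name are hypothetical.

```java
import java.util.function.Supplier;

class LazyAssociationSketch {

    // Hypothetical counter standing in for the number of school selects issued.
    static final class QueryCounter {
        int schoolSelects;
    }

    // Simulates the query that loads a school row.
    static String loadSchoolName(QueryCounter counter) {
        counter.schoolSelects++;
        return "Example High School";
    }

    // LAZY: nothing is loaded until get() is called, and then only once.
    static Supplier<String> lazySchoolName(QueryCounter counter) {
        return new Supplier<String>() {
            private String value;
            private boolean loaded;

            @Override
            public String get() {
                if (!loaded) {
                    value = loadSchoolName(counter);
                    loaded = true;
                }
                return value;
            }
        };
    }

    // EAGER: the school is loaded immediately, whether or not it is read later.
    static Supplier<String> eagerSchoolName(QueryCounter counter) {
        String value = loadSchoolName(counter);
        return () -> value;
    }

    public static void main(String[] args) {
        QueryCounter lazyCounter = new QueryCounter();
        lazySchoolName(lazyCounter); // association is never accessed
        System.out.println("lazy selects: " + lazyCounter.schoolSelects);

        QueryCounter eagerCounter = new QueryCounter();
        eagerSchoolName(eagerCounter); // association is never accessed either
        System.out.println("eager selects: " + eagerCounter.schoolSelects);
    }
}
```

In this model, the lazy variant issues no school select when the association is never read, whereas the eager variant always pays for one. This is why the DTO approach depends on the association being mapped as lazy.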
5.3. Custom Query
When the DTO does need a field from School, we can define a custom query that instructs JPA to perform a fetch join, retrieving the associated School entities eagerly in the initial Student query:
public interface StudentRepository extends JpaRepository<Student, String> {

    @Query(value = "SELECT stu FROM Student stu LEFT JOIN FETCH stu.school",
      countQuery = "SELECT COUNT(stu) FROM Student stu")
    Page<Student> findAll(Pageable pageable);
}
Upon executing the same test case, we can observe from the Hibernate log that only one query, joining the Student and School tables, is now executed:
Hibernate: select s1_0.student_id,s1_0.first_name,s1_0.last_name,s2_0.school_id,s2_0.name
from student s1_0 left join school s2_0 on s2_0.school_id=s1_0.school_id
5.4. Entity Graph
A neater solution is to use the @EntityGraph annotation. JPA uses this annotation to determine which associated entities should be eagerly fetched. This optimizes retrieval performance by fetching entities in a single query rather than executing an additional query for each association.
Let’s look at an ad-hoc entity graph example that defines attributePaths to instruct JPA to fetch the School association when querying the Student records:
public interface StudentRepository extends JpaRepository<Student, String> {

    @EntityGraph(attributePaths = "school")
    Page<Student> findAll(Pageable pageable);
}
There’s an alternative way to define an entity graph by placing the @NamedEntityGraph annotation on the Student entity:
@Entity
@Table(name = "student")
@NamedEntityGraph(name = "Student.school", attributeNodes = @NamedAttributeNode("school"))
public class Student {

    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "school_id", referencedColumnName = "school_id")
    private School school;

    // Other fields, getters and setters
}
Subsequently, we add the annotation @EntityGraph to the StudentRepository findAll() method and refer to the named entity graph we defined in the Student class:
public interface StudentRepository extends JpaRepository<Student, String> {

    @EntityGraph(value = "Student.school")
    Page<Student> findAll(Pageable pageable);
}
Upon executing the test case, we’ll see that JPA executes the same join query as in the custom query approach:
Hibernate: select s1_0.student_id,s1_0.first_name,s1_0.last_name,s2_0.school_id,s2_0.name
from student s1_0 left join school s2_0 on s2_0.school_id=s1_0.school_id
6. Conclusion
In this article, we’ve learned how to paginate and sort our query results in Spring Boot, including retrieval of partial data and full data. We also learned some efficient data retrieval practices in Spring Boot, particularly when dealing with relationships.
As usual, the sample code is available over on GitHub.