Spring Data MongoDB: Projections and Aggregations

Last modified: January 28, 2017

1. Overview

Spring Data MongoDB provides simple high-level abstractions over MongoDB's native query language. In this article, we will explore its support for projections and the aggregation framework.

If you’re new to this topic, refer to our introductory article Introduction to Spring Data MongoDB.

2. Projection

In MongoDB, Projections are a way to fetch only the required fields of a document from a database. This reduces the amount of data that has to be transferred from database server to client and hence increases performance.

With Spring Data MongoDB, projections can be used both with MongoTemplate and MongoRepository.

Before we move further, let’s look at the data model we will be using:

@Document
public class User {
    @Id
    private String id;
    private String name;
    private Integer age;
    
    // standard getters and setters
}

2.1. Projections Using MongoTemplate

The include() and exclude() methods on the Field class are used to include and exclude fields respectively:

Query query = new Query();
query.fields().include("name").exclude("id");
List<User> john = mongoTemplate.find(query, User.class);

These methods can be chained together to include or exclude multiple fields. The field marked as @Id (_id in the database) is always fetched unless explicitly excluded.
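To make the rule about _id concrete, here is a minimal in-memory sketch of an inclusion projection. It does not use Spring Data or the MongoDB driver: the class ProjectionSketch, the helper includeFields() and the sample user are all hypothetical names made up for this illustration, and the document is modeled as a plain Map.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class ProjectionSketch {

    // Hypothetical helper (not a Spring Data or MongoDB API): keeps the listed
    // fields, and keeps _id as well unless it is explicitly excluded.
    public static Map<String, Object> includeFields(
            Map<String, Object> document, Set<String> included, boolean excludeId) {
        Map<String, Object> projected = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : document.entrySet()) {
            String field = entry.getKey();
            boolean isId = field.equals("_id");
            if ((isId && !excludeId) || (!isId && included.contains(field))) {
                projected.put(field, entry.getValue());
            }
        }
        return projected;
    }

    // Made-up sample document standing in for a stored User
    public static Map<String, Object> sampleUser() {
        Map<String, Object> user = new LinkedHashMap<>();
        user.put("_id", "u1");
        user.put("name", "John");
        user.put("age", 30);
        return user;
    }

    public static void main(String[] args) {
        Set<String> nameOnly = Collections.singleton("name");
        // Comparable to include("name"): _id is still returned
        System.out.println(includeFields(sampleUser(), nameOnly, false));
        // Comparable to include("name").exclude("id"): only name is returned
        System.out.println(includeFields(sampleUser(), nameOnly, true));
    }
}
```

The real projection is evaluated by the database server, so this sketch only mirrors the observable result, not how MongoDB implements it.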

When records are fetched with a projection, the excluded fields are simply not populated in the model class instance. Fields of reference types, including the wrapper classes, are left as null, while fields of primitive types keep their Java default values.

For example, String, Integer and Boolean would be null, int would be 0 and boolean would be false.

Thus, in the above example, for a user named John the name field would be John, while id and age would both be null, because age is declared as Integer.
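The values seen for excluded fields follow from plain Java field initialization, which can be checked without any database. The sketch below uses a hypothetical Sample class (not part of the article's model) whose fields are never assigned, which is the situation a field is in when the projection leaves it out.

```java
public class UnsetFieldDefaults {

    // Hypothetical model used only for this illustration
    public static class Sample {
        public String name;       // reference type
        public Integer boxedAge;  // wrapper type, also a reference
        public int primitiveAge;  // primitive
        public boolean active;    // primitive
    }

    public static void main(String[] args) {
        Sample sample = new Sample(); // no field is ever assigned

        System.out.println(sample.name);         // null
        System.out.println(sample.boxedAge);     // null
        System.out.println(sample.primitiveAge); // 0
        System.out.println(sample.active);       // false
    }
}
```

Declaring a field with its wrapper type therefore makes it possible to tell "not fetched" (null) apart from a genuine 0 or false.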

2.2. Projections Using MongoRepository

With a MongoRepository, the fields attribute of the @Query annotation defines the projection in JSON format:

@Query(value="{}", fields="{name : 1, _id : 0}")
List<User> findNameAndExcludeId();

The result would be the same as with MongoTemplate. The value="{}" denotes no filters, and hence all the documents will be fetched.

3. Aggregation

Aggregation in MongoDB was built to process data and return computed results. Data is processed in stages and the output of one stage is provided as input to the next stage. This ability to apply transformations and do computations on data in stages makes aggregation a very powerful tool for analytics.

Spring Data MongoDB provides an abstraction for native aggregation queries through three classes: Aggregation, which wraps an aggregation query; AggregationOperation, which wraps an individual pipeline stage; and AggregationResults, which is the container for the result produced by the aggregation.

To perform an aggregation, first create the pipeline stages using the static builder methods on the Aggregation class, then create an instance of Aggregation using its newAggregation() method, and finally run the aggregation using MongoTemplate:

MatchOperation matchStage = Aggregation.match(new Criteria("foo").is("bar"));
ProjectionOperation projectStage = Aggregation.project("foo", "bar.baz");
        
Aggregation aggregation 
  = Aggregation.newAggregation(matchStage, projectStage);

AggregationResults<OutType> output 
  = mongoTemplate.aggregate(aggregation, "foobar", OutType.class);

Please note that both MatchOperation and ProjectionOperation implement AggregationOperation. There are similar implementations for the other pipeline stages. OutType is the data model for the expected output.

Now, we will look at a few examples and their explanations to cover the major aggregation stages and operators.

The dataset we will be using in this article lists details about all the zip codes in the US; it can be downloaded from the MongoDB repository.

Let’s look at a sample document after importing it into a collection called zips in the test database.

{
    "_id" : "01001",
    "city" : "AGAWAM",
    "loc" : [
        -72.622739,
        42.070206
    ],
    "pop" : 15338,
    "state" : "MA"
}

For the sake of simplicity and to make code concise, in the next code snippets, we will assume that all the static methods of Aggregation class are statically imported.

3.1. Get All the States With a Population Greater Than 10 Million, Ordered by Population Descending

Here we will have three pipeline stages:

  1. $group stage summing up the population of all zip codes
  2. $match stage to keep only the states with a population over 10 million
  3. $sort stage to sort all the documents in descending order of population

The expected output will have a field _id as state and a field statePop with the total state population. Let’s create a data model for this and run the aggregation:

public class StatePopulation {
 
    @Id
    private String state;
    private Integer statePop;
 
    // standard getters and setters
}

The @Id annotation will map the _id field from output to state in the model:

GroupOperation groupByStateAndSumPop = group("state")
  .sum("pop").as("statePop");
MatchOperation filterStates = match(new Criteria("statePop").gt(10000000));
SortOperation sortByPopDesc = sort(Sort.by(Direction.DESC, "statePop"));

Aggregation aggregation = newAggregation(
  groupByStateAndSumPop, filterStates, sortByPopDesc);
AggregationResults<StatePopulation> result = mongoTemplate.aggregate(
  aggregation, "zips", StatePopulation.class);

The AggregationResults class implements Iterable and hence we can iterate over it and print the results.

If the output data model is not known, the standard MongoDB class Document can be used.
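To see what the three stages of this section compute, the same logic can be mimicked in memory with Java streams. This is only an illustrative analogue, not how MongoDB or Spring Data executes the pipeline; the Zip class, the state codes, the population figures and the threshold of 20 are all made up for the sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class StatePopulationSketch {

    // Hypothetical in-memory stand-in for a document of the zips collection
    public static class Zip {
        final String state;
        final int pop;

        public Zip(String state, int pop) {
            this.state = state;
            this.pop = pop;
        }
    }

    // Made-up sample data; the values are not taken from the real dataset
    public static List<Zip> sampleZips() {
        return Arrays.asList(
            new Zip("AA", 10), new Zip("AA", 20),
            new Zip("BB", 5),
            new Zip("CC", 40), new Zip("CC", 15));
    }

    public static List<Map.Entry<String, Integer>> statesAbove(List<Zip> zips, int threshold) {
        // Analogue of $group: sum pop per state
        Map<String, Integer> statePop = zips.stream()
            .collect(Collectors.groupingBy(z -> z.state, Collectors.summingInt(z -> z.pop)));

        return statePop.entrySet().stream()
            // Analogue of $match: keep states above the threshold
            .filter(e -> e.getValue() > threshold)
            // Analogue of $sort: descending by total population
            .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(statesAbove(sampleZips(), 20)); // [CC=55, AA=30]
    }
}
```

Note that the order of the stages matters: the filter has to run after the grouping, because the total per state does not exist before it.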

3.2. Get Smallest State by Average City Population

For this problem, we will need four stages:

  1. $group to sum the total population of each city
  2. $group to calculate average population of each state
  3. $sort stage to order states by their average city population in ascending order
  4. $limit to get the first state with lowest average city population

Although it’s not strictly required, we will use an additional $project stage to reformat the document as per our StatePopulation data model.

GroupOperation sumTotalCityPop = group("state", "city")
  .sum("pop").as("cityPop");
GroupOperation averageStatePop = group("_id.state")
  .avg("cityPop").as("avgCityPop");
SortOperation sortByAvgPopAsc = sort(Sort.by(Direction.ASC, "avgCityPop"));
LimitOperation limitToOnlyFirstDoc = limit(1);
ProjectionOperation projectToMatchModel = project()
  .andExpression("_id").as("state")
  .andExpression("avgCityPop").as("statePop");

Aggregation aggregation = newAggregation(
  sumTotalCityPop, averageStatePop, sortByAvgPopAsc,
  limitToOnlyFirstDoc, projectToMatchModel);

AggregationResults<StatePopulation> result = mongoTemplate
  .aggregate(aggregation, "zips", StatePopulation.class);
StatePopulation smallestState = result.getUniqueMappedResult();

In this example, we already know that there will be only one document in the result since we limit the number of output documents to 1 in the last stage. As such, we can invoke getUniqueMappedResult() to get the required StatePopulation instance.

Another thing to notice is that, instead of relying on the @Id annotation to map _id to state, we have explicitly done it in the projection stage.
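The reason for two $group stages is that a city can span several zip codes, so city totals have to be built before they can be averaged per state. The following in-memory sketch with Java streams illustrates that logic under stated assumptions: the Zip class, the state and city names and the figures are hypothetical, and the code is an analogue of the pipeline rather than the way MongoDB runs it.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class SmallestStateSketch {

    // Hypothetical in-memory stand-in for a document of the zips collection
    public static class Zip {
        final String state;
        final String city;
        final int pop;

        public Zip(String state, String city, int pop) {
            this.state = state;
            this.city = city;
            this.pop = pop;
        }
    }

    // Made-up sample data; the values are not taken from the real dataset
    public static List<Zip> sampleZips() {
        return Arrays.asList(
            new Zip("AA", "X", 10), new Zip("AA", "X", 20), new Zip("AA", "Y", 50),
            new Zip("BB", "Z", 5), new Zip("BB", "Z", 15), new Zip("BB", "W", 40));
    }

    public static Map<String, Double> averageCityPop(List<Zip> zips) {
        // Analogue of the first $group: total population per (state, city)
        Map<String, Map<String, Integer>> cityPop = zips.stream().collect(
            Collectors.groupingBy((Zip z) -> z.state,
                Collectors.groupingBy((Zip z) -> z.city,
                    Collectors.summingInt((Zip z) -> z.pop))));

        // Analogue of the second $group: average of the city totals per state
        Map<String, Double> avgByState = new HashMap<>();
        for (Map.Entry<String, Map<String, Integer>> e : cityPop.entrySet()) {
            double avg = e.getValue().values().stream()
                .mapToInt(Integer::intValue).average().orElse(0.0);
            avgByState.put(e.getKey(), avg);
        }
        return avgByState;
    }

    public static String smallestState(List<Zip> zips) {
        return averageCityPop(zips).entrySet().stream()
            .sorted(Map.Entry.comparingByValue()) // analogue of $sort ascending
            .limit(1)                             // analogue of $limit 1
            .map(Map.Entry::getKey)
            .findFirst()
            .orElse(null);                        // null when there is no data
    }

    public static void main(String[] args) {
        System.out.println(smallestState(sampleZips())); // BB
    }
}
```

With the sample data, state AA has city totals of 30 and 50 (average 40.0) and state BB has 20 and 40 (average 30.0), so BB is returned.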

3.3. Get the State With Maximum and Minimum Zip Codes

For this example, we need three stages:

  1. $group to count the number of zip codes for each state
  2. $sort to order the states by the number of zip codes
  3. $group to find the state with max and min zip codes using $first and $last operators

GroupOperation sumZips = group("state").count().as("zipCount");
SortOperation sortByCount = sort(Direction.ASC, "zipCount");
GroupOperation groupFirstAndLast = group().first("_id").as("minZipState")
  .first("zipCount").as("minZipCount").last("_id").as("maxZipState")
  .last("zipCount").as("maxZipCount");

Aggregation aggregation = newAggregation(sumZips, sortByCount, groupFirstAndLast);

AggregationResults<Document> result = mongoTemplate
  .aggregate(aggregation, "zips", Document.class);
Document document = result.getUniqueMappedResult();

Here we have not used any model class; instead, we used the Document class that ships with the MongoDB driver.
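The trick in this pipeline is that, once the states are sorted by zip count, the first and last entries are the minimum and maximum. A minimal in-memory analogue with Java streams is sketched below; the class name, the state codes and the input list are made up, and the method assumes a non-empty input.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ZipCountSketch {

    // Takes one state code per zip code document (hypothetical input shape).
    // Assumes the list is not empty.
    public static Map<String, Object> minAndMax(List<String> zipStates) {
        // Analogue of $group with count(): number of zip codes per state
        Map<String, Long> counts = zipStates.stream()
            .collect(Collectors.groupingBy(s -> s, Collectors.counting()));

        // Analogue of $sort ascending by zipCount
        List<Map.Entry<String, Long>> sorted = counts.entrySet().stream()
            .sorted(Map.Entry.comparingByValue())
            .collect(Collectors.toList());

        // Analogue of the final $group with $first and $last
        Map.Entry<String, Long> first = sorted.get(0);
        Map.Entry<String, Long> last = sorted.get(sorted.size() - 1);

        Map<String, Object> result = new LinkedHashMap<>();
        result.put("minZipState", first.getKey());
        result.put("minZipCount", first.getValue());
        result.put("maxZipState", last.getKey());
        result.put("maxZipCount", last.getValue());
        return result;
    }

    public static void main(String[] args) {
        // Made-up input: AA has 3 zip codes, BB has 1, CC has 2
        List<String> zipStates = Arrays.asList("AA", "AA", "AA", "BB", "CC", "CC");
        System.out.println(minAndMax(zipStates));
        // {minZipState=BB, minZipCount=1, maxZipState=AA, maxZipCount=3}
    }
}
```

The keys of the returned map match the field names produced by the final group stage, which is also what the resulting Document would expose.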

4. Conclusion

In this article, we learned how to fetch specified fields of a document in MongoDB using projections in Spring Data MongoDB.

We also learned about the MongoDB aggregation framework support in Spring Data. We covered the major aggregation stages (group, project, sort, limit, and match) and looked at some examples of their practical application. The complete source code is available over on GitHub.
