MongoDB Atlas Search Using the Java Driver and Spring Data

Last modified: November 15, 2023


1. Introduction


In this tutorial, we’ll learn how to use Atlas Search functionalities using the Java MongoDB driver API. By the end, we’ll have a grasp on creating queries, paginating results, and retrieving meta-information. Also, we’ll cover refining results with filters, adjusting result scores, and selecting specific fields to be displayed.


2. Scenario and Setup


MongoDB Atlas has a free forever cluster that we can use to test all features. To showcase Atlas Search functionalities, we’ll only need a service class. We’ll connect to our collection using MongoTemplate.


2.1. Dependencies


First, to connect to MongoDB, we’ll need spring-boot-starter-data-mongodb:


<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-mongodb</artifactId>
    <version>3.1.2</version>
</dependency>
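With this starter on the classpath, Spring Boot can auto-configure the MongoTemplate we inject later from a single connection property. A minimal sketch, assuming a hypothetical Atlas cluster (the user, password, and host below are placeholders, not values from this tutorial):

```properties
# application.properties - placeholder Atlas connection string
spring.data.mongodb.uri=mongodb+srv://user:password@cluster0.example.mongodb.net/sample_mflix?retryWrites=true&w=majority
```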

2.2. Sample Dataset


Throughout this tutorial, we’ll use the movies collection from MongoDB Atlas’s sample_mflix sample dataset to simplify examples. It contains data about movies since the 1900s, which will help us showcase the filtering capabilities of Atlas Search.


2.3. Creating an Index With Dynamic Mapping


For Atlas Search to work, we need indexes. These can be static or dynamic. A static index is helpful for fine-tuning, while a dynamic one is an excellent general-purpose solution. So, let’s start with a dynamic index.


There are a few ways to create search indexes (including programmatically); we’ll use the Atlas UI. There, we can do this by accessing Search from the menu, selecting our cluster, then clicking Go to Atlas Search:


[Image: Creating an index]

After clicking on Create Search Index, we’ll choose the JSON Editor to create our index, then click Next:


[Image: JSON editor]

Finally, on the next screen, we choose our target collection, a name for our index, and input our index definition:


{
    "mappings": {
        "dynamic": true
    }
}

We’ll use the name idx-queries for this index throughout this tutorial. Note that if we name our index default, we don’t need to specify its name when creating queries. Most importantly, dynamic mappings are a simple choice for more flexible, frequently changing schemas.


By setting mappings.dynamic to true, Atlas Search automatically indexes all dynamically indexable and supported field types in a document. While dynamic mappings provide convenience, especially when the schema is unknown, they tend to consume more disk space and might be less efficient compared to static ones.


2.4. Our Movie Search Service


We’ll base our examples on a service class containing some search queries for our movies, extracting interesting information from them. We’ll slowly build them up to more complex queries:


@Service
public class MovieAtlasSearchService {

    private final MongoCollection<Document> collection;

    public MovieAtlasSearchService(MongoTemplate mongoTemplate) {
        MongoDatabase database = mongoTemplate.getDb();
        this.collection = database.getCollection("movies");
    }

    // ...
}

All we need is a reference to our collection for future methods.


3. Constructing a Query


Atlas Search queries are created via pipeline stages, represented by a List<Bson>. The most essential stage is Aggregates.search(), which receives a SearchOperator and, optionally, a SearchOptions object. Since we called our index idx-queries instead of default, we must include its name with SearchOptions.searchOptions().index(). Otherwise, we’ll get no errors and no results.


Many search operators are available to define how we want to conduct our query. In this example, we’ll find movies by keywords using SearchOperator.text(), which performs a full-text search. We’ll use it to search the contents of the fullplot field with SearchPath.fieldPath(). We’ll omit static imports for readability:


public Collection<Document> moviesByKeywords(String keywords) {
    List<Bson> pipeline = Arrays.asList(
        search(
          text(
            fieldPath("fullplot"), keywords
          ),
          searchOptions()
            .index("idx-queries")
        ),
        project(fields(
          excludeId(),
          include("title", "year", "fullplot", "imdb.rating")
        ))
    );

    return collection.aggregate(pipeline)
      .into(new ArrayList<>());
}

Also, the second stage in our pipeline is Aggregates.project(), which represents a projection. If not specified, our query results will include all the fields in our documents. But we can set it and choose which fields we want (or don’t want) to appear in our results. Note that specifying a field for inclusion implicitly excludes all other fields except the _id field. So, in this case, we’re excluding the _id field and passing a list of the fields we want. Note we can also specify nested fields, like imdb.rating.


To execute the pipeline, we call aggregate() on our collection. This returns an object we can use to iterate on results. Finally, for simplicity, we call into() to iterate over results and add them to a collection, which we return. Note that a big enough collection can exhaust the memory in our JVM. We’ll see how to eliminate this concern by paginating our results later on.


Most importantly, pipeline stage order matters. We’ll get an error if we put the project() stage before search().

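For reference, the stages those builders produce correspond roughly to the following aggregation pipeline (a hand-written sketch of the generated stages, not output captured from the driver):

```json
[
  {
    "$search": {
      "index": "idx-queries",
      "text": { "query": "space cowboy", "path": "fullplot" }
    }
  },
  {
    "$project": { "_id": 0, "title": 1, "year": 1, "fullplot": 1, "imdb.rating": 1 }
  }
]
```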

Let’s take a look at the first two results of calling moviesByKeywords("space cowboy") on our service:


[
    {
        "title": "Battle Beyond the Stars",
        "fullplot": "Shad, a young farmer, assembles a band of diverse mercenaries in outer space to defend his peaceful planet from the evil tyrant Sador and his armada of aggressors. Among the mercenaries are Space Cowboy, a spacegoing truck driver from Earth; Gelt, a wealthy but experienced assassin looking for a place to hide; and Saint-Exmin, a Valkyrie warrior looking to prove herself in battle.",
        "year": 1980,
        "imdb": {
            "rating": 5.4
        }
    },
    {
        "title": "The Nickel Ride",
        "fullplot": "Small-time criminal Cooper manages several warehouses in Los Angeles that the mob use to stash their stolen goods. Known as \"the key man\" for the key chain he always keeps on his person that can unlock all the warehouses. Cooper is assigned by the local syndicate to negotiate a deal for a new warehouse because the mob has run out of storage space. However, Cooper's superior Carl gets nervous and decides to have cocky cowboy button man Turner keep an eye on Cooper.",
        "year": 1974,
        "imdb": {
            "rating": 6.7
        }
    },
    (...)
]

3.1. Combining Search Operators


It’s possible to combine search operators using SearchOperator.compound(). In this example, we’ll use it to include must and should clauses. A must clause contains one or more conditions for matching documents. On the other hand, a should clause contains one or more conditions that we’d prefer our results to include.


This alters the score so the documents that meet these conditions appear first:


public Collection<Document> late90sMovies(String keywords) {
    List<Bson> pipeline = asList(
        search(
          compound()
            .must(asList(
              numberRange(
                fieldPath("year"))
                .gteLt(1995, 2000)
            ))
            .should(asList(
              text(
                fieldPath("fullplot"), keywords
              )
            )),
          searchOptions()
            .index("idx-queries")
        ),
        project(fields(
          excludeId(),
          include("title", "year", "fullplot", "imdb.rating")
        ))
    );

    return collection.aggregate(pipeline)
      .into(new ArrayList<>());
}

We kept the same searchOptions() and projected fields from our first query. But, this time, we moved text() to a should clause because we want the keywords to represent a preference, not a requirement.


Then, we created a must clause, including SearchOperator.numberRange(), to only show movies from 1995 to 2000 (exclusive) by restricting the values on the year field. This way, we only return movies from that era.

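Put together, the compound operator above should translate to roughly this $search stage (a sketch; the numberRange() builder maps to the range operator in the JSON syntax):

```json
{
  "$search": {
    "index": "idx-queries",
    "compound": {
      "must": [
        { "range": { "path": "year", "gte": 1995, "lt": 2000 } }
      ],
      "should": [
        { "text": { "query": "hacker assassin", "path": "fullplot" } }
      ]
    }
  }
}
```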

Let’s see the first two results for hacker assassin:


[
    {
        "title": "Assassins",
        "fullplot": "Robert Rath is a seasoned hitman who just wants out of the business with no back talk. But, as things go, it ain't so easy. A younger, peppier assassin named Bain is having a field day trying to kill said older assassin. Rath teams up with a computer hacker named Electra to defeat the obsessed Bain.",
        "year": 1995,
        "imdb": {
            "rating": 6.3
        }
    },
    {
        "fullplot": "Thomas A. Anderson is a man living two lives. By day he is an average computer programmer and by night a hacker known as Neo. Neo has always questioned his reality, but the truth is far beyond his imagination. Neo finds himself targeted by the police when he is contacted by Morpheus, a legendary computer hacker branded a terrorist by the government. Morpheus awakens Neo to the real world, a ravaged wasteland where most of humanity have been captured by a race of machines that live off of the humans' body heat and electrochemical energy and who imprison their minds within an artificial reality known as the Matrix. As a rebel against the machines, Neo must return to the Matrix and confront the agents: super-powerful computer programs devoted to snuffing out Neo and the entire human rebellion.",
        "imdb": {
            "rating": 8.7
        },
        "year": 1999,
        "title": "The Matrix"
    },
    (...)
]

4. Scoring the Result Set


When we query documents with search(), the results appear in order of relevance. This relevance is based on the calculated score, from highest to lowest. This time, we’ll modify late90sMovies() to receive a SearchScore modifier to boost the relevance of the plot keywords in our should clause:


public Collection<Document> late90sMovies(String keywords, SearchScore modifier) {
    List<Bson> pipeline = asList(
        search(
          compound()
            .must(asList(
              numberRange(
                fieldPath("year"))
                .gteLt(1995, 2000)
            ))
            .should(asList(
              text(
                fieldPath("fullplot"), keywords
              )
              .score(modifier)
            )),
          searchOptions()
            .index("idx-queries")
        ),
        project(fields(
          excludeId(),
          include("title", "year", "fullplot", "imdb.rating"),
          metaSearchScore("score")
        ))
    );

    return collection.aggregate(pipeline)
      .into(new ArrayList<>());
}

Also, we include metaSearchScore("score") in our fields list to see the score for each document in our results. For example, we can now multiply the relevance of our "should" clause by the value of the imdb.votes field like this:


late90sMovies(
  "hacker assassin", 
  SearchScore.boost(fieldPath("imdb.votes"))
)
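In the generated $search stage, this modifier should appear as a score option on the text operator, roughly (a sketch of the equivalent JSON syntax):

```json
{
  "text": {
    "query": "hacker assassin",
    "path": "fullplot",
    "score": { "boost": { "path": "imdb.votes" } }
  }
}
```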

And this time, we can see that The Matrix comes first, thanks to the boost:


[
    {
        "fullplot": "Thomas A. Anderson is a man living two lives (...)",
        "imdb": {
            "rating": 8.7
        },
        "year": 1999,
        "title": "The Matrix",
        "score": 3967210.0
    },
    {
        "fullplot": "(...) Bond also squares off against Xenia Onatopp, an assassin who uses pleasure as her ultimate weapon.",
        "imdb": {
            "rating": 7.2
        },
        "year": 1995,
        "title": "GoldenEye",
        "score": 462604.46875
    },
    (...)
]

4.1. Using a Score Function


We can achieve greater control by using a function to alter the score of our results. Let’s pass a function to our method that adds the value of the year field to the natural score. This way, newer movies end up with a higher score:


late90sMovies(keywords, function(
  addExpression(asList(
    pathExpression(
      fieldPath("year"))
      .undefined(1), 
    relevanceExpression()
  ))
));

That code starts with a SearchScore.function(), which is a SearchScoreExpression.addExpression() since we want an add operation. Then, since we want to add a value from a field, we use a SearchScoreExpression.pathExpression() and specify the field we want: year. Also, we call undefined() to determine a fallback value for year in case it’s missing. In the end, we call relevanceExpression() to return the document’s relevance score, which is added to the value of year.

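In JSON terms, that expression tree should come out roughly as the following score option (a sketch of the equivalent Atlas Search syntax, not captured driver output):

```json
{
  "score": {
    "function": {
      "add": [
        { "path": { "value": "year", "undefined": 1 } },
        { "score": "relevance" }
      ]
    }
  }
}
```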

When we execute that, we’ll see “The Matrix” now appears first, along with its new score:


[
    {
        "fullplot": "Thomas A. Anderson is a man living two lives (...)",
        "imdb": {
            "rating": 8.7
        },
        "year": 1999,
        "title": "The Matrix",
        "score": 2003.67138671875
    },
    {
        "title": "Assassins",
        "fullplot": "Robert Rath is a seasoned hitman (...)",
        "year": 1995,
        "imdb": {
            "rating": 6.3
        },
        "score": 2003.476806640625
    },
    (...)
]

That’s useful for defining what should have greater weight when scoring our results.


5. Getting Total Rows Count From Metadata


If we need to get the total number of results in a query, we can use Aggregates.searchMeta() instead of search() to retrieve metadata information only. With this method, no documents are returned. So, we’ll use it to count the number of movies from the late 90s that also contain our keywords.


For meaningful filtering, we’ll also include the keywords in our must clause:


public Document countLate90sMovies(String keywords) {
    List<Bson> pipeline = asList(
        searchMeta(
          compound()
            .must(asList(
              numberRange(
                fieldPath("year"))
                .gteLt(1995, 2000),
              text(
                fieldPath("fullplot"), keywords
              )
            )),
          searchOptions()
            .index("idx-queries")
            .count(total())
        )
    );

    return collection.aggregate(pipeline)
      .first();
}

This time, searchOptions() includes a call to SearchOptions.count(SearchCount.total()), which ensures we get an exact total count (instead of a lower bound, which is faster depending on the collection size). Also, since we expect a single object in the results, we call first() on aggregate().

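The resulting stage should look roughly like this $searchMeta stage in JSON (a sketch, not output captured from the driver):

```json
{
  "$searchMeta": {
    "index": "idx-queries",
    "compound": {
      "must": [
        { "range": { "path": "year", "gte": 1995, "lt": 2000 } },
        { "text": { "query": "hacker assassin", "path": "fullplot" } }
      ]
    },
    "count": { "type": "total" }
  }
}
```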

Finally, let’s see what is returned for countLate90sMovies("hacker assassin"):


{
    "count": {
        "total": 14
    }
}

This is useful for getting information about our collection without including documents in our results.


6. Faceting on Results


In MongoDB Atlas Search, a facet query is a feature that allows retrieving aggregated and categorized information about our search results. It helps us analyze and summarize data based on different criteria, providing insights into the distribution of search results.


Also, it enables grouping search results into different categories or buckets and retrieving counts or additional information about each category. This helps answer questions like “How many documents match a specific category?” or “What are the most common values for a certain field within the results?”


6.1. Creating a Static Index


In our first example, we’ll create a facet query to give us information about genres from movies since the 1900s and how these relate. We’ll need an index with facet types, which we can’t have when using dynamic indexes.


So, let’s start by creating a new search index in our collection, which we’ll call idx-facets. Note that we’ll keep dynamic as true so we can still query the fields that are not explicitly defined:


{
  "mappings": {
    "dynamic": true,
    "fields": {
      "genres": [
        {
          "type": "stringFacet"
        },
        {
          "type": "string"
        }
      ],
      "year": [
        {
          "type": "numberFacet"
        },
        {
          "type": "number"
        }
      ]
    }
  }
}

We kept dynamic mappings enabled and then selected the fields we were interested in for indexing faceted information. Since we also want to use filters in our query, for each field, we specify an index of a standard type (like string) and one of a faceted type (like stringFacet).


6.2. Running a Facet Query


Creating a facet query involves using searchMeta() and starting a SearchCollector.facet() method to include our facets and an operator for filtering results. When defining the facets, we have to choose a name and use a SearchFacet method that corresponds to the type of index we created. In our case, we define a stringFacet() and a numberFacet():


public Document genresThroughTheDecades(String genre) {
    List<Bson> pipeline = asList(
      searchMeta(
        facet(
          text(
            fieldPath("genres"), genre
          ), 
          asList(
            stringFacet("genresFacet", 
              fieldPath("genres")
            ).numBuckets(5),
            numberFacet("yearFacet", 
              fieldPath("year"), 
              asList(1900, 1930, 1960, 1990, 2020)
            )
          )
        ),
        searchOptions()
          .index("idx-facets")
      )
    );

    return collection.aggregate(pipeline)
      .first();
}

We filter movies with a specific genre with the text() operator. Since films generally contain multiple genres, the stringFacet() will also show five (specified by numBuckets()) related genres ranked by frequency. For the numberFacet(), we must set the boundaries separating our aggregated results. We need at least two, with the last one being exclusive.

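For comparison, the facet collector above corresponds roughly to this JSON form of the $searchMeta stage (a sketch based on the Atlas Search facet syntax):

```json
{
  "$searchMeta": {
    "index": "idx-facets",
    "facet": {
      "operator": { "text": { "query": "horror", "path": "genres" } },
      "facets": {
        "genresFacet": { "type": "string", "path": "genres", "numBuckets": 5 },
        "yearFacet": { "type": "number", "path": "year", "boundaries": [1900, 1930, 1960, 1990, 2020] }
      }
    }
  }
}
```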

Finally, we return only the first result. Let’s see what it looks like if we filter by the “horror” genre:


{
    "count": {
        "lowerBound": 1703
    },
    "facet": {
        "genresFacet": {
            "buckets": [
                {
                    "_id": "Horror",
                    "count": 1703
                },
                {
                    "_id": "Thriller",
                    "count": 595
                },
                {
                    "_id": "Drama",
                    "count": 395
                },
                {
                    "_id": "Mystery",
                    "count": 315
                },
                {
                    "_id": "Comedy",
                    "count": 274
                }
            ]
        },
        "yearFacet": {
            "buckets": [
                {
                    "_id": 1900,
                    "count": 5
                },
                {
                    "_id": 1930,
                    "count": 47
                },
                {
                    "_id": 1960,
                    "count": 409
                },
                {
                    "_id": 1990,
                    "count": 1242
                }
            ]
        }
    }
}

Since we didn’t specify a total count, we get a lower bound count, followed by our facet names and their respective buckets.


6.3. Including a Facet Stage to Paginate Results


Let’s return to our late90sMovies() method and include a $facet stage in our pipeline. We’ll use it for pagination and a total rows count. The search() and project() stages will remain unmodified:


public Document late90sMovies(int skip, int limit, String keywords) {
    List<Bson> pipeline = asList(
        search(
          // ...
        ),
        project(fields(
          // ...
        )),
        facet(
          new Facet("rows",
            skip(skip),
            limit(limit)
          ),
          new Facet("totalRows",
            replaceWith("$$SEARCH_META"),
            limit(1)
          )
        )
    );

    return collection.aggregate(pipeline)
      .first();
}

We start by calling Aggregates.facet(), which receives one or more facets. Then, we instantiate a Facet to include skip() and limit() from the Aggregates class. While skip() defines our offset, limit() will restrict the number of documents retrieved. Note that we can name our facets anything we like.


Also, we call replaceWith("$$SEARCH_META") to get metadata info in this field. Most importantly, so that our metadata information is not repeated for each result, we include a limit(1). Finally, when our query has metadata, the result becomes a single document instead of an array, so we only return the first result.

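As a sketch, with illustrative skip and limit values, the added stage should look roughly like:

```json
{
  "$facet": {
    "rows": [ { "$skip": 10 }, { "$limit": 5 } ],
    "totalRows": [ { "$replaceWith": "$$SEARCH_META" }, { "$limit": 1 } ]
  }
}
```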

7. Conclusion


In this article, we saw how MongoDB Atlas Search provides developers with a versatile and potent toolset. Integrating it with the Java MongoDB driver API can enhance search functionalities, data aggregation, and result customization. Our hands-on examples have aimed to provide a practical understanding of its capabilities. Whether implementing a simple search or seeking intricate data analytics, Atlas Search is an invaluable tool in the MongoDB ecosystem.


Remember to leverage the power of indexes, facets, and dynamic mappings to make our data work for us. As always, the source code is available over on GitHub.
