Introduction to Spring Data Elasticsearch

Last modified: February 8, 2016

1. Overview

In this tutorial, we’ll explore the basics of Spring Data Elasticsearch in a code-focused and practical manner.

We’ll learn how to index, search, and query Elasticsearch in a Spring application using Spring Data Elasticsearch. Spring Data Elasticsearch is a Spring module that implements Spring Data, thus offering a way to interact with the popular open-source, Lucene-based search engine.

While Elasticsearch can work without a strictly defined schema, it’s still common practice to design one and create mappings that specify the type of data we expect in certain fields. When a document is indexed, its fields are processed according to their types. For example, a text field will be tokenized and filtered according to mapping rules. We can also create filters and tokenizers of our own.

For the sake of simplicity, we’ll use a docker image for our Elasticsearch instance, though any Elasticsearch instance listening on port 9200 will do.

We’ll start by firing up our Elasticsearch instance:

docker run -d --name es762 -p 9200:9200 -e "discovery.type=single-node" elasticsearch:7.6.2

2. Spring Data

Spring Data helps avoid boilerplate code. For example, if we define a repository interface that extends the ElasticsearchRepository interface that Spring Data Elasticsearch provides, CRUD operations for the corresponding document class will become available by default.

Additionally, method implementations will be generated for us when we simply declare methods with names in a predefined format. There’s no need to write an implementation of the repository interface.

The Baeldung guides on Spring Data provide the essentials to get started on this topic.

2.1. Maven Dependency

Spring Data Elasticsearch provides a Java API for the search engine. In order to use it, we need to add a new dependency to the pom.xml:

<dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-elasticsearch</artifactId>
    <version>4.0.0.RELEASE</version>
</dependency>

2.2. Defining Repository Interfaces

In order to define new repositories, we’ll extend one of the provided repository interfaces, replacing the generic types with our actual document and primary key types.

It’s important to note that ElasticsearchRepository extends from PagingAndSortingRepository. This allows built-in support for pagination and sorting.

In our example, we’ll use the paging feature in our custom search methods:

public interface ArticleRepository extends ElasticsearchRepository<Article, String> {

    Page<Article> findByAuthorsName(String name, Pageable pageable);

    @Query("{\"bool\": {\"must\": [{\"match\": {\"authors.name\": \"?0\"}}]}}")
    Page<Article> findByAuthorsNameUsingCustomQuery(String name, Pageable pageable);
}

With the findByAuthorsName method, the repository proxy will create an implementation based on the method name. The resolution algorithm will determine that it needs to access the authors property, and then search the name property of each item.
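
To make the name resolution more concrete, here is a minimal plain-Java sketch of the idea. It's a simplified illustration, not Spring Data's actual parser: the class DerivedQuerySketch and the method toPropertyPath are hypothetical names, and the real algorithm also checks each segment against the entity's properties and understands keywords such as And and Or.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration only; this is NOT Spring Data's real parsing code.
class DerivedQuerySketch {

    // Turns a method name such as "findByAuthorsName" into the property path "authors.name".
    static String toPropertyPath(String methodName) {
        String prefix = "findBy";
        if (!methodName.startsWith(prefix)) {
            throw new IllegalArgumentException("Expected a findBy... method name");
        }
        String subject = methodName.substring(prefix.length());

        // Split before every upper-case letter: "AuthorsName" -> ["Authors", "Name"].
        List<String> segments = new ArrayList<>();
        for (String part : subject.split("(?=[A-Z])")) {
            if (!part.isEmpty()) {
                // Lower-case the first letter to get a Java property name.
                segments.add(Character.toLowerCase(part.charAt(0)) + part.substring(1));
            }
        }
        return String.join(".", segments);
    }

    public static void main(String[] args) {
        System.out.println(toPropertyPath("findByAuthorsName")); // prints "authors.name"
    }
}
```

Because this sketch splits on every capital letter, it would mishandle a camel-case property such as firstName. That's exactly why the real resolution has to consult the entity class.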

The second method, findByAuthorsNameUsingCustomQuery, uses a custom Elasticsearch boolean query defined using the @Query annotation, which requires strict matching between the author’s name and the provided name argument.

2.3. Java Configuration

When configuring Elasticsearch in our Java application, we need to define how we connect to the Elasticsearch instance. For that, we’ll use a RestHighLevelClient, which the Elasticsearch dependency offers:

@Configuration
@EnableElasticsearchRepositories(basePackages = "com.baeldung.spring.data.es.repository")
@ComponentScan(basePackages = { "com.baeldung.spring.data.es.service" })
public class Config {

    @Bean
    public RestHighLevelClient client() {
        ClientConfiguration clientConfiguration 
            = ClientConfiguration.builder()
                .connectedTo("localhost:9200")
                .build();

        return RestClients.create(clientConfiguration).rest();
    }

    @Bean
    public ElasticsearchOperations elasticsearchTemplate() {
        return new ElasticsearchRestTemplate(client());
    }
}

We’re using a standard Spring @Enable-style annotation. @EnableElasticsearchRepositories will make Spring Data Elasticsearch scan the provided package for Spring Data repositories.

In order to communicate with our Elasticsearch server, we’ll use a simple RestHighLevelClient. While Elasticsearch provides multiple types of clients, using the RestHighLevelClient is a good way to future-proof the communication with the server.

Finally, we’ll set up an ElasticsearchOperations bean to execute operations on our server. In this case, we instantiate an ElasticsearchRestTemplate.

3. Mappings

We use mappings to define a schema for our documents. By defining a schema for our documents, we protect them from undesired outcomes, such as mapping to an unwanted type.

Our entity is a simple document, Article, where the id is of the type String. We’ll also specify that such documents must be stored in an index named blog within the article type.

@Document(indexName = "blog", type = "article")
public class Article {

    @Id
    private String id;
    
    private String title;
    
    @Field(type = FieldType.Nested, includeInParent = true)
    private List<Author> authors;
    
    // standard getters and setters
}

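Older Elasticsearch versions allowed an index to contain several types, which could be used to implement hierarchies. Mapping types have been deprecated since Elasticsearch 7, so an index such as blog effectively holds a single type.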

We’ll mark the authors field as FieldType.Nested. This allows us to define the Author class separately, but still have the individual instances of author embedded in an Article document when it’s indexed in Elasticsearch.
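
The tutorial doesn't show the Author class itself. A minimal sketch could look like the following, assuming the class only carries the name property that the repository queries refer to as authors.name, plus the single-argument constructor used when we index documents. Everything beyond the class name and the name property is an assumption.

```java
// Hypothetical Author class, inferred from how the tutorial uses it.
class Author {

    private String name;

    // No-argument constructor, commonly needed when documents are mapped back to objects.
    public Author() {
    }

    public Author(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}
```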

4. Indexing Documents

Spring Data Elasticsearch generally auto-creates indexes based on the entities in the project. However, we can also create an index programmatically via the client template:

elasticsearchTemplate.indexOps(Article.class).create();

Then we can add documents to the index:

Article article = new Article("Spring Data Elasticsearch");
article.setAuthors(asList(new Author("John Smith"), new Author("John Doe")));
articleRepository.save(article);

5. Querying

5.1. Method Name-Based Query

When we use the method name-based query, we write methods that define the query we want to perform. During the setup, Spring Data will parse the method signature and create the queries accordingly:

String nameToFind = "John Smith";
Page<Article> articleByAuthorName
  = articleRepository.findByAuthorsName(nameToFind, PageRequest.of(0, 10));

By calling findByAuthorsName with a PageRequest object, we’ll obtain the first page of results (page numbering is zero-based), with that page containing at most 10 articles. The page object also provides the total number of hits for the query, along with other handy pagination information.
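
The arithmetic behind zero-based paging can be sketched in plain Java. This is only an illustration with hypothetical names (PagingSketch, offset, totalPages) and a made-up hit count. It doesn't use any Spring types.

```java
// Plain-Java illustration of the arithmetic behind a page request; no Spring types involved.
class PagingSketch {

    // Index of the first hit on a zero-based page.
    static long offset(int page, int size) {
        return (long) page * size;
    }

    // Number of pages needed to hold totalHits results (rounds up).
    static long totalPages(long totalHits, int size) {
        return (totalHits + size - 1) / size;
    }

    public static void main(String[] args) {
        long totalHits = 23; // hypothetical hit count
        System.out.println(offset(0, 10));             // prints 0: the first page starts at the first hit
        System.out.println(offset(2, 10));             // prints 20
        System.out.println(totalPages(totalHits, 10)); // prints 3
    }
}
```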

5.2. A Custom Query

There are a couple of ways to define custom queries for Spring Data Elasticsearch repositories. One way is to use the @Query annotation, as demonstrated in section 2.2.

Another option is to use the query builder to create our custom query.

If we want to search for articles that have the word “data” in the title, we can just create a NativeSearchQueryBuilder with a Filter on the title:

Query searchQuery = new NativeSearchQueryBuilder()
   .withFilter(regexpQuery("title", ".*data.*"))
   .build();
SearchHits<Article> articles = 
   elasticsearchTemplate.search(searchQuery, Article.class, IndexCoordinates.of("blog"));
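
A regexp query is evaluated against the individual terms stored in the index rather than against the whole title, and the pattern has to match an entire term. The following plain-Java sketch imitates that behavior under loud assumptions: it approximates analysis by lowercasing and splitting on whitespace, and it uses java.util.regex instead of Lucene's regular expression syntax. RegexpFilterSketch and anyTermMatches are hypothetical names.

```java
import java.util.Arrays;
import java.util.Locale;

// Rough imitation of a term-level regexp match; not how Elasticsearch is implemented.
class RegexpFilterSketch {

    // True if at least one lower-cased, whitespace-separated term fully matches the pattern.
    static boolean anyTermMatches(String title, String pattern) {
        return Arrays.stream(title.toLowerCase(Locale.ROOT).split("\\s+"))
                .anyMatch(term -> term.matches(pattern)); // matches() requires a full-term match
    }

    public static void main(String[] args) {
        // "Data" is lower-cased to the term "data", so the pattern matches.
        System.out.println(anyTermMatches("Spring Data Elasticsearch", ".*data.*"));           // prints true
        System.out.println(anyTermMatches("Getting started with Search Engines", ".*data.*")); // prints false
    }
}
```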

6. Updating and Deleting

In order to update a document, we must first retrieve it:

String articleTitle = "Spring Data Elasticsearch";
Query searchQuery = new NativeSearchQueryBuilder()
  .withQuery(matchQuery("title", articleTitle).minimumShouldMatch("75%"))
  .build();

SearchHits<Article> articles = 
   elasticsearchTemplate.search(searchQuery, Article.class, IndexCoordinates.of("blog"));
Article article = articles.getSearchHit(0).getContent();
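
The minimumShouldMatch("75%") setting determines how many of the analyzed title terms must match. According to the Elasticsearch documentation, the number computed from a percentage is rounded down. As a small arithmetic sketch (the class and method names are hypothetical, and we assume the title is analyzed into three terms):

```java
// Arithmetic illustration of a positive minimum_should_match percentage.
class MinimumShouldMatchSketch {

    // Number of query terms that must match; integer division rounds down.
    static int requiredTerms(int termCount, int percent) {
        return termCount * percent / 100;
    }

    public static void main(String[] args) {
        // "Spring Data Elasticsearch" -> assumed 3 terms; 75% of 3 is 2.25, rounded down to 2.
        System.out.println(requiredTerms(3, 75)); // prints 2
    }
}
```

So under these assumptions, a title containing any two of the three terms would still be a hit.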

Then we can make changes to the document by editing the content of the object using its accessors:

article.setTitle("Getting started with Search Engines");
articleRepository.save(article);

As for deleting, there are several options. We can retrieve the document and delete it using the delete method:

articleRepository.delete(article);

We can also delete it by id once we know it:

articleRepository.deleteById("article_id");

It’s also possible to declare custom deleteBy methods in the repository interface and make use of the bulk delete feature offered by Elasticsearch:

articleRepository.deleteByTitle("title");

7. Conclusion

In this article, we explored how to connect and make use of Spring Data Elasticsearch. We discussed how to query, update, and delete documents. Finally, we learned how to create custom queries if what’s offered by Spring Data Elasticsearch doesn’t fit our needs.

As usual, the source code used throughout this article can be found over on GitHub.
