Introduction to Hibernate Search

1. Overview


In this article, we’ll discuss the basics of Hibernate Search, how to configure it, and we’ll implement some simple queries.


2. Basics of Hibernate Search


Whenever we have to implement full-text search functionality, using tools we’re already well-versed in is always a plus.


If we’re already using Hibernate and JPA for ORM, we’re only one step away from Hibernate Search.

Hibernate Search integrates Apache Lucene, a high-performance and extensible full-text search-engine library written in Java. This combines the power of Lucene with the simplicity of Hibernate and JPA.

Simply put, we just have to add a few annotations to our domain classes, and the tool will take care of things like database/index synchronization.


Hibernate Search also provides an Elasticsearch integration; however, as it’s still in an experimental stage, we’ll focus on Lucene here.

3. Configurations


3.1. Maven Dependencies


Before getting started, we first need to add the necessary dependencies to our pom.xml:
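A minimal sketch of the Hibernate Search dependency, assuming Hibernate Search 5.x (the release line that the annotations and properties used in this article belong to). The hibernate-search.version property is a placeholder and not a value taken from this article; it should be set to a release compatible with the Hibernate ORM version in use:

```xml
<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-search-orm</artifactId>
    <!-- placeholder property (assumption): define it in the <properties> section -->
    <version>${hibernate-search.version}</version>
</dependency>
```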



For the sake of simplicity, we’ll use H2 as our database:
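A corresponding sketch for the H2 dependency. As above, the h2.version property is a placeholder (an assumption, not a value from this article):

```xml
<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <!-- placeholder property (assumption): define it in the <properties> section -->
    <version>${h2.version}</version>
</dependency>
```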



3.2. Configurations


We also have to specify where Lucene should store the index.


This can be done via the property hibernate.search.default.directory_provider.


We’ll choose filesystem, which is the most straightforward option for our use case. More options are listed in the official documentation. The filesystem-master/filesystem-slave and infinispan providers are noteworthy for clustered applications, where the index has to be synchronized between nodes.


We also have to define a default base directory where indexes will be stored:

hibernate.search.default.directory_provider = filesystem
hibernate.search.default.indexBase = /data/index/default

4. The Model Classes


After the configuration, we’re now ready to specify our model.


On top of the JPA annotations @Entity and @Table, we have to add an @Indexed annotation. It tells Hibernate Search that the entity Product shall be indexed.

After that, we have to define the required attributes as searchable by adding a @Field annotation:


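import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;
import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Indexed;
import org.hibernate.search.annotations.TermVector;

@Entity
@Indexed
@Table(name = "product")
public class Product {

    @Id
    private int id;

    @Field(termVector = TermVector.YES)
    private String productName;

    @Field(termVector = TermVector.YES)
    private String description;

    // indexed as well, so that it can be used in range queries
    @Field
    private int memory;

    // getters, setters, and constructors
}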

The termVector = TermVector.YES attribute will be required for the “More Like This” query later.

5. Building the Lucene Index


Before starting the actual queries, we have to trigger Lucene to build the index initially:


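FullTextEntityManager fullTextEntityManager 
  = Search.getFullTextEntityManager(entityManager);
// runs the mass indexer and blocks until done (may throw InterruptedException)
fullTextEntityManager.createIndexer().startAndWait();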

After this initial build, Hibernate Search will take care of keeping the index up to date, i.e. we can create, modify, and delete entities via the EntityManager as usual.

Note: we have to make sure that entities are fully committed to the database before they can be discovered and indexed by Lucene. (This is also the reason why the initial test data import in our example code comes in a dedicated JUnit test case annotated with @Commit.)


6. Building and Executing Queries


Now, we’re ready to create our first query.


In the following section, we’ll show the general workflow for preparing and executing a query.


After that, we’ll create some example queries for the most important query types.


6.1. General Workflow for Creating and Executing a Query


Preparing and executing a query generally consists of four steps.


In step 1, we have to get a JPA FullTextEntityManager and from that a QueryBuilder:

FullTextEntityManager fullTextEntityManager 
  = Search.getFullTextEntityManager(entityManager);

QueryBuilder queryBuilder = fullTextEntityManager.getSearchFactory() 
  .buildQueryBuilder()
  .forEntity(Product.class)
  .get();

In step 2, we’ll create a Lucene query via the Hibernate Search query DSL:

org.apache.lucene.search.Query query = queryBuilder
  .keyword()
  .onField("productName")
  .matching("iPhone") // illustrative search term
  .createQuery();

In step 3, we’ll wrap the Lucene query into a Hibernate query:

org.hibernate.search.jpa.FullTextQuery jpaQuery
  = fullTextEntityManager.createFullTextQuery(query, Product.class);

Finally, in step 4 we’ll execute the query:


List<Product> results = jpaQuery.getResultList();

Note: by default, Lucene sorts the results by relevance.


Steps 1, 3 and 4 are the same for all query types.


In the following, we’ll focus on step 2, i.e. how to create different types of queries.


6.2. Keyword Queries


The most basic use-case is searching for a specific word.


We already did this in the previous section:


Query keywordQuery = queryBuilder
  .keyword()
  .onField("productName")
  .matching("iPhone") // illustrative search term
  .createQuery();

Here, keyword() specifies that we are looking for one specific word, onField() tells Lucene where to look and matching() what to look for.


6.3. Fuzzy Queries

Fuzzy queries work like keyword queries, except that we can define a limit of “fuzziness” up to which Lucene accepts two terms as matching.

With withEditDistanceUpTo(), we can define how much a term may deviate from the other. It can be set to 0, 1, or 2, with 2 being the default (note: this limitation comes from Lucene’s implementation).


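With withPrefixLength(), we can define the length of the prefix that is excluded from the fuzziness and therefore has to match exactly:


Query fuzzyQuery = queryBuilder
  .keyword().fuzzy()
  .withEditDistanceUpTo(2).withPrefixLength(0)
  .onField("productName").matching("iPhaen") // illustrative misspelled term
  .createQuery();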
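
To make the two parameters more concrete, here is a plain-Java sketch that is independent of Hibernate Search and Lucene. It models "edit distance up to N" with the classic Levenshtein distance and treats the first prefixLength characters as a part that must match exactly. The class and method names are hypothetical, and Lucene's real implementation differs in details (for instance in how swapped neighboring characters are counted), so this is only an illustration of the idea:

```java
final class FuzzyMatchSketch {

    /** Levenshtein distance: minimum number of single-character inserts, deletes, and substitutions. */
    static int editDistance(String a, String b) {
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j; // building the first j characters of b from "" takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            current[0] = i; // reducing the first i characters of a to "" takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int substitution = previous[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int insertion = current[j - 1] + 1;
                int deletion = previous[j] + 1;
                current[j] = Math.min(substitution, Math.min(insertion, deletion));
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    /** True if the prefix matches exactly and the remainders are within maxEdits of each other. */
    static boolean fuzzyMatches(String searched, String indexed, int maxEdits, int prefixLength) {
        if (searched.length() < prefixLength || indexed.length() < prefixLength) {
            return searched.equals(indexed); // too short to carry the prefix: only an exact match counts
        }
        if (!searched.regionMatches(0, indexed, 0, prefixLength)) {
            return false; // the protected prefix differs
        }
        return editDistance(searched.substring(prefixLength), indexed.substring(prefixLength)) <= maxEdits;
    }

    public static void main(String[] args) {
        System.out.println(fuzzyMatches("aple", "apple", 2, 0));  // true: one missing character
        System.out.println(fuzzyMatches("bpple", "apple", 2, 1)); // false: first character is protected
    }
}
```

A longer prefix narrows the set of candidate terms, which is why it is commonly used to keep fuzzy matching selective.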

6.4. Wildcard Queries


Hibernate Search also enables us to execute wildcard queries, i.e. queries for which a part of a word is unknown.


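For this, we can use “?” for a single character and “*” for any character sequence:


Query wildcardQuery = queryBuilder
  .keyword().wildcard()
  .onField("productName").matching("iph*") // illustrative pattern
  .createQuery();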
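
The meaning of the two placeholders can be illustrated with a small plain-Java sketch that translates a wildcard pattern into a regular expression. This is not how Lucene evaluates wildcard queries internally; the class and method names are hypothetical and only serve to show which terms a pattern would accept:

```java
import java.util.regex.Pattern;

final class WildcardSketch {

    /** Translates "?" to "any single character" and "*" to "any character sequence". */
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '?') {
                regex.append('.');
            } else if (c == '*') {
                regex.append(".*");
            } else {
                regex.append(Pattern.quote(String.valueOf(c))); // every other character is literal
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** True if the whole term is covered by the wildcard pattern. */
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("iph*", "iphone"));    // true
        System.out.println(matches("iphon?", "iphones")); // false: "?" stands for exactly one character
    }
}
```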

6.5. Phrase Queries


If we want to search for more than one word, we can use phrase queries. We can look for either exact or approximate sentences, using phrase() and, if necessary, withSlop(). The slop factor defines the number of other words permitted in the sentence:


Query phraseQuery = queryBuilder
  .phrase()
  .withSlop(1) // illustrative slop value
  .onField("description")
  .sentence("with wireless charging")
  .createQuery();
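
The following plain-Java sketch illustrates the "other words permitted in the sentence" reading of the slop factor. It is a deliberate simplification with hypothetical names: Lucene's actual slop is defined on term positions and is more permissive (it can, for example, tolerate some reordering), which this sketch does not model:

```java
import java.util.Locale;

final class PhraseSlopSketch {

    /** True if the phrase words occur in order with at most `slop` extra words in between. */
    static boolean matchesPhrase(String text, String phrase, int slop) {
        String[] tokens = text.toLowerCase(Locale.ROOT).trim().split("\\s+");
        String[] words = phrase.toLowerCase(Locale.ROOT).trim().split("\\s+");
        for (int start = 0; start < tokens.length; start++) {
            if (!tokens[start].equals(words[0])) {
                continue; // a match has to begin with the first phrase word
            }
            int position = start;
            int matched = 1;
            while (matched < words.length) {
                position++;
                // skip ahead to the earliest occurrence of the next phrase word
                while (position < tokens.length && !tokens[position].equals(words[matched])) {
                    position++;
                }
                if (position == tokens.length) {
                    break; // ran out of text before completing the phrase
                }
                matched++;
            }
            int extraWords = (position - start + 1) - words.length;
            if (matched == words.length && extraWords <= slop) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String text = "smartphone with fast wireless charging";
        System.out.println(matchesPhrase(text, "with wireless charging", 0)); // false: "fast" is in the way
        System.out.println(matchesPhrase(text, "with wireless charging", 1)); // true: one extra word allowed
    }
}
```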

6.6. Simple Query String Queries


With the previous query types, we had to specify the query type explicitly.


If we want to give more power to the user, we can use simple query string queries, which let users define their own queries at runtime.


The following query types are supported:


  • boolean (AND using “+”, OR using “|”, NOT using “-”)
  • prefix (prefix*)
  • phrase (“some phrase”)
  • precedence (using parentheses)
  • fuzzy (fuzy~2)
  • near operator for phrase queries (“some phrase”~3)

The following example would combine fuzzy, phrase and boolean queries:


Query simpleQueryStringQuery = queryBuilder
  .simpleQueryString()
  .onFields("productName", "description")
  .matching("Aple~2 + \"iPhone X\" + (256 | 128)")
  .createQuery();

6.7. Range Queries


Range queries search for a value between given boundaries. They can be applied to numbers, dates, timestamps, and strings:


Query rangeQuery = queryBuilder
  .range()
  .onField("memory")
  .from(64).to(256) // illustrative boundaries
  .createQuery();

6.8. More Like This Queries


Our last query type is the “More Like This” query. For this, we provide an entity, and Hibernate Search returns a list of similar entities, each with a similarity score.

As mentioned before, the termVector = TermVector.YES attribute in our model class is required for this case: it tells Lucene to store the frequency for each term during indexing.

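Based on this, the similarity will be calculated at query execution time:


Query moreLikeThisQuery = queryBuilder
  .moreLikeThis()
  .comparingField("productName").andField("description")
  .toEntity(entity) // entity: the Product to compare against
  .createQuery();
List<Object[]> results = (List<Object[]>) fullTextEntityManager
  .createFullTextQuery(moreLikeThisQuery, Product.class)
  .setProjection(ProjectionConstants.THIS, ProjectionConstants.SCORE)
  .getResultList();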
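
To see why per-term frequencies are needed for a similarity score, consider the following plain-Java sketch. It compares two texts by the cosine similarity of their term-frequency vectors. This is not Lucene's actual "More Like This" algorithm, and all names are hypothetical; it only illustrates how stored term frequencies can be turned into a score between 0 and 1:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class SimilaritySketch {

    /** Counts how often each lower-cased word occurs in the text. */
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> frequencies = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                frequencies.merge(token, 1, Integer::sum);
            }
        }
        return frequencies;
    }

    /** Cosine similarity of two term-frequency vectors: 0 for no shared terms, close to 1 for identical texts. */
    static double cosineSimilarity(Map<String, Integer> a, Map<String, Integer> b) {
        double dotProduct = 0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            dotProduct += entry.getValue() * b.getOrDefault(entry.getKey(), 0);
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0 || normB == 0) ? 0 : dotProduct / (normA * normB);
    }

    private static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0;
        for (int count : vector.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    public static void main(String[] args) {
        Map<String, Integer> first = termFrequencies("Apple iPhone with wireless charging");
        Map<String, Integer> second = termFrequencies("Apple watch with wireless charging");
        System.out.println(cosineSimilarity(first, second));
    }
}
```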

6.9. Searching More Than One Field


Until now, we’ve only created queries that search a single attribute, using onField().


Depending on the use case, we can also search two or more attributes:


Query luceneQuery = queryBuilder
  .keyword()
  .onFields("productName", "description")
  .matching(text) // text: the search term to look for
  .createQuery();

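Moreover, we can specify each attribute to be searched separately, e.g. if we want to define a boost for one attribute:


Query moreLikeThisQuery = queryBuilder
  .moreLikeThis()
  .comparingField("productName").boostedTo(10f)
  .andField("description").boostedTo(1f)
  .toEntity(entity).createQuery(); // boost values are illustrative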

6.10. Combining Queries


Finally, Hibernate Search also supports combining queries using various strategies:

  • SHOULD: the query should contain the matching elements of the subquery
  • MUST: the query must contain the matching elements of the subquery
  • MUST NOT: the query must not contain the matching elements of the subquery

These strategies are similar to the boolean operators AND, OR, and NOT. However, the names differ to emphasize that they also have an impact on relevance.

For example, a SHOULD between two queries is similar to boolean OR: if one of the two queries has a match, this match will be returned.


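However, if both queries match, the match will have a higher relevance than if only one query matches:


Query combinedQuery = queryBuilder
  .bool()
  .should(queryBuilder.keyword() // illustrative first subquery
    .onField("productName").matching("apple").createQuery())
  .should(queryBuilder.phrase()
    .onField("description").sentence("face id").createQuery())
  .createQuery();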
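
The interplay of the three strategies and relevance can be sketched in plain Java. The toy scoring below simply counts matching MUST and SHOULD clauses, whereas Lucene computes real relevance scores per clause; the class, the enum, and the scoring rule are assumptions made purely for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalDouble;
import java.util.function.Predicate;

final class BooleanCombinationSketch {

    enum Occur { MUST, SHOULD, MUST_NOT }

    static final class Clause {
        final Occur occur;
        final Predicate<String> matcher;

        Clause(Occur occur, Predicate<String> matcher) {
            this.occur = occur;
            this.matcher = matcher;
        }
    }

    /** Returns a toy score for a matching document, or empty if the document is excluded. */
    static OptionalDouble score(String document, List<Clause> clauses) {
        double score = 0;
        for (Clause clause : clauses) {
            boolean hit = clause.matcher.test(document);
            switch (clause.occur) {
                case MUST:
                    if (!hit) return OptionalDouble.empty(); // a required clause is missing
                    score += 1;
                    break;
                case MUST_NOT:
                    if (hit) return OptionalDouble.empty(); // a forbidden clause matched
                    break;
                case SHOULD:
                    if (hit) score += 1; // optional, but raises the relevance
                    break;
            }
        }
        // without any matching MUST or SHOULD clause there is nothing to return
        return score > 0 ? OptionalDouble.of(score) : OptionalDouble.empty();
    }

    /** Two SHOULD clauses, mirroring the "OR with relevance" behavior described above. */
    static List<Clause> appleOrFaceId() {
        return Arrays.asList(
            new Clause(Occur.SHOULD, doc -> doc.contains("apple")),
            new Clause(Occur.SHOULD, doc -> doc.contains("face id")));
    }

    public static void main(String[] args) {
        System.out.println(score("apple phone with face id", appleOrFaceId())); // OptionalDouble[2.0]
        System.out.println(score("apple phone", appleOrFaceId()));              // OptionalDouble[1.0]
        System.out.println(score("samsung phone", appleOrFaceId()));            // OptionalDouble.empty
    }
}
```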

7. Conclusion


In this article, we discussed the basics of Hibernate Search and showed how to implement the most important query types. More advanced topics can be found in the official documentation.


