An Advanced Tagging Implementation with JPA

Last modified: March 11, 2018

1. Overview

Tagging is a Design Pattern that allows us to perform advanced filtering and sorting on our data. This article is a continuation of a Simple Tagging Implementation with JPA.

Therefore, we’ll pick up where that article left off and cover advanced use cases for Tagging.

2. Endorsed Tags

Probably the best-known advanced tagging implementation is the Endorsement Tag. We can see this pattern on sites like LinkedIn.

Essentially, the tag is a combination of a string name and a numerical value. We can then use the number to represent how many times the tag has been voted for or “endorsed”.

Here’s an example of how to create this kind of tag:

@Embeddable
public class SkillTag {
    private String name;
    private int value;

    // constructors, getters, setters
}

To use this tag, we simply add a List of them to our data object:

@ElementCollection
private List<SkillTag> skillTags = new ArrayList<>();

We mentioned in the previous article that the @ElementCollection annotation automatically creates a one-to-many mapping for us.

This is a model use-case for this relationship. Because each tag has personalized data associated with the entity it’s stored on, we can’t save space with a many-to-many storage mechanism.

Later in the article, we’ll cover an example of when many-to-many makes sense.

Because we’ve embedded the skill tag into our original entity, we can query on it just like any other attribute.

Here’s an example query looking for any student with more than a certain number of endorsements:

@Query(
  "SELECT s FROM Student s JOIN s.skillTags t WHERE t.name = LOWER(:tagName) AND t.value > :tagValue")
List<Student> retrieveByNameFilterByMinimumSkillTag(
  @Param("tagName") String tagName, @Param("tagValue") int tagValue);

Next, let’s look at an example of how to use this:

Student student = new Student(1, "Will");
SkillTag skill1 = new SkillTag("java", 5);
student.setSkillTags(Arrays.asList(skill1));
studentRepository.save(student);

Student student2 = new Student(2, "Joe");
SkillTag skill2 = new SkillTag("java", 1);
student2.setSkillTags(Arrays.asList(skill2));
studentRepository.save(student2);

List<Student> students = 
  studentRepository.retrieveByNameFilterByMinimumSkillTag("java", 3);
assertEquals("size incorrect", 1, students.size());

Now we can search either for the presence of a tag or for a minimum number of endorsements on that tag.

Consequently, we can combine this with other query parameters to create a variety of complex queries.
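
The counter on an endorsed tag only becomes useful once something increments it. As a minimal sketch (not part of the original example code), the helper below endorses a skill in a plain list of tags: it bumps the value if the tag already exists and adds it with a value of 1 otherwise. The `EndorsementSketch` class, the `endorse` method, and the lower-casing of stored names are assumptions made for illustration, and the JPA annotations are left out so the snippet runs on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class EndorsementSketch {

    // Plain-Java stand-in for the embeddable SkillTag (JPA annotations omitted).
    static class SkillTag {
        private final String name;
        private int value;

        SkillTag(String name, int value) {
            this.name = name;
            this.value = value;
        }

        String getName() { return name; }
        int getValue() { return value; }
        void setValue(int value) { this.value = value; }
    }

    // Hypothetical helper: adds one endorsement for the named skill.
    // Names are stored in lower case (an assumption), so a stored name
    // can be compared directly with LOWER(:tagName) in the query.
    static void endorse(List<SkillTag> tags, String skillName) {
        String normalized = skillName.toLowerCase(Locale.ROOT);
        for (SkillTag tag : tags) {
            if (tag.getName().equals(normalized)) {
                tag.setValue(tag.getValue() + 1);
                return;
            }
        }
        tags.add(new SkillTag(normalized, 1));
    }

    public static void main(String[] args) {
        List<SkillTag> tags = new ArrayList<>();
        endorse(tags, "Java");
        endorse(tags, "java");
        endorse(tags, "SQL");
        for (SkillTag tag : tags) {
            System.out.println(tag.getName() + " = " + tag.getValue());
        }
    }
}
```

In a JPA setting the same logic would run against the student's `skillTags` list before saving the entity.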

3. Location Tags

Another popular tagging implementation is the Location Tag. We can use a Location Tag in two primary ways.

First of all, it can be used to tag a geophysical location.

Also, it can be used to tag a location in media such as a photo or video. The implementation of the model is nearly identical in all of these cases.

Here’s an example of tagging a photo:

@Embeddable
public class LocationTag {
    private String name;
    private int xPos;
    private int yPos;

    // constructors, getters, setters
}

The most noteworthy aspect of Location Tags is how difficult it is to perform a Geolocation Filter using just a database. If we need to search within geographic bounds, a better approach is loading the model into a Search Engine (like Elasticsearch) which has built-in support for geolocations.

Therefore, we should focus on filtering by the tag name for these location tags.

The query is going to look similar to our simple tagging implementation from the previous article:

@Query("SELECT s FROM Student s JOIN s.locationTags t WHERE t.name = LOWER(:tag)")
List<Student> retrieveByLocationTag(@Param("tag") String tag);

The example of using location tags will also look familiar:

Student student = new Student(0, "Steve");
student.setLocationTags(Arrays.asList(new LocationTag("here", 0, 0)));
studentRepository.save(student);

Student student2 = studentRepository.retrieveByLocationTag("here").get(0);
assertEquals("name incorrect", "Steve", student2.getName());

If Elasticsearch is out of the question and we still need to search on geographic bounds, using simple geometric shapes will make the query criteria much more readable.

Finding whether a point lies within a circle or a rectangle is straightforward, so we’ll leave it as an exercise for the reader.
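
As one possible take on that exercise, the sketch below checks whether a tag's coordinates fall inside an axis-aligned rectangle or a circle. It assumes flat x/y coordinates like the `xPos` and `yPos` fields of `LocationTag` (real latitude/longitude values would need a proper geographic distance formula), and the `Bounds` class and its method names are hypothetical.

```java
public class Bounds {

    // True if (x, y) lies inside or on the edge of the rectangle
    // spanned by the corners (minX, minY) and (maxX, maxY).
    static boolean inRectangle(int x, int y, int minX, int minY, int maxX, int maxY) {
        return x >= minX && x <= maxX && y >= minY && y <= maxY;
    }

    // True if (x, y) lies inside or on the edge of the circle with the
    // given centre and radius. Comparing squared distances avoids the
    // square root; long arithmetic keeps typical coordinates from
    // overflowing int.
    static boolean inCircle(int x, int y, int centerX, int centerY, int radius) {
        long dx = (long) x - centerX;
        long dy = (long) y - centerY;
        return dx * dx + dy * dy <= (long) radius * radius;
    }

    public static void main(String[] args) {
        System.out.println(inRectangle(3, 4, 0, 0, 10, 10));
        System.out.println(inCircle(3, 4, 0, 0, 5));
        System.out.println(inCircle(4, 4, 0, 0, 5));
    }
}
```

In a query, the rectangle check maps naturally onto range conditions on `xPos` and `yPos`, which is part of why simple shapes keep the criteria readable.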

4. Key-Value Tags

Sometimes, we need to store tags that are slightly more complicated. We might want to tag an entity with a small set of keys, each of which can hold a wide variety of values.

For instance, we could tag a student with a department tag and set its value to Computer Science. Each student will have the department key, but they could all have different values associated with it.

The implementation will look similar to the Endorsed Tags above:

@Embeddable
public class KVTag {
    private String key;
    private String value;

    // constructors, getters and setters
}

We can add it to our model like this:

@ElementCollection
private List<KVTag> kvTags = new ArrayList<>();

Now we can add a new query to our repository:

@Query("SELECT s FROM Student s JOIN s.kvTags t WHERE t.key = LOWER(:key)")
List<Student> retrieveByKeyTag(@Param("key") String key);

We can also quickly add a query to search by value or by both key and value. This gives us additional flexibility in how we search our data.
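
As a sketch of what those extra queries might look like, the repository methods below follow the same shape as `retrieveByKeyTag`. The method names `retrieveByValueTag` and `retrieveByKeyValueTag`, the `StudentRepository` declaration with an `Integer` ID, and the lower-casing of parameters (which assumes keys and values are stored in lower case) are illustrative assumptions rather than code from the original project. The snippet needs Spring Data JPA on the classpath.

```java
import java.util.List;

import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;

// Student is the entity from this article; the Integer ID type is an assumption.
public interface StudentRepository extends JpaRepository<Student, Integer> {

    // Hypothetical: match on the tag value only.
    @Query("SELECT s FROM Student s JOIN s.kvTags t WHERE t.value = LOWER(:value)")
    List<Student> retrieveByValueTag(@Param("value") String value);

    // Hypothetical: match on both the key and the value of the same tag.
    @Query("SELECT s FROM Student s JOIN s.kvTags t "
      + "WHERE t.key = LOWER(:key) AND t.value = LOWER(:value)")
    List<Student> retrieveByKeyValueTag(
      @Param("key") String key, @Param("value") String value);
}
```

Because both conditions in the second query refer to the same joined tag `t`, a student only matches when a single tag carries that key and that value together.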

Let’s test this out and verify it all works:

@Test
public void givenStudentWithKVTags_whenSave_thenGetByTagOk(){
    Student student = new Student(0, "John");
    student.setKVTags(Arrays.asList(new KVTag("department", "computer science")));
    studentRepository.save(student);

    Student student2 = new Student(1, "James");
    student2.setKVTags(Arrays.asList(new KVTag("department", "humanities")));
    studentRepository.save(student2);

    List<Student> students = studentRepository.retrieveByKeyTag("department");
 
    assertEquals("size incorrect", 2, students.size());
}

Following this pattern, we can design even more complicated nested objects and use them to tag our data if we need to.

Most use cases can be met with the advanced implementations we have talked about today, but the option is there to go as complicated as needed.

5. Reimplementing Tagging

Finally, we’re going to explore one last area of tagging. So far, we’ve seen how to use the @ElementCollection annotation to make adding tags to our model easy. While it’s simple to use, it has a pretty significant trade-off. The one-to-many implementation under the hood can lead to a lot of duplicated data in our data store.

To save space, we need to create another table that will join our Student entities to our Tag entities. Luckily, Spring Data JPA will do most of the heavy lifting for us.

We’re going to reimplement our Student and Tag entities to see how this is done.

5.1. Define Entities

First of all, we need to recreate our models. We’ll start with a ManyStudent model:

@Entity
public class ManyStudent {

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private int id;
    private String name;

    @ManyToMany(cascade = CascadeType.ALL)
    @JoinTable(name = "manystudent_manytags",
      joinColumns = @JoinColumn(name = "manystudent_id", 
      referencedColumnName = "id"),
      inverseJoinColumns = @JoinColumn(name = "manytag_id", 
      referencedColumnName = "id"))
    private Set<ManyTag> manyTags = new HashSet<>();

    // constructors, getters and setters
}

There are a couple of things to notice here.

First, we’re generating our ID, so the table linkages are easier to manage internally.

Next, we’re using the @ManyToMany annotation to tell JPA we want a linkage between the two classes.

Finally, we use the @JoinTable annotation to set up our actual join table.

Now we can move on to our new tag model which we’ll call ManyTag:

@Entity
public class ManyTag {

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private int id;
    private String name;

    @ManyToMany(mappedBy = "manyTags")
    private Set<ManyStudent> students = new HashSet<>();

    // constructors, getters, setters
}

Because we’ve already set up our join table in the student model, all we have to worry about is setting up the reference inside this model.

We use the mappedBy attribute to tell JPA that this side of the relationship is backed by the join table we defined on the manyTags field of ManyStudent.
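
Since only the owning side (`ManyStudent.manyTags`) is written to the join table, it can help to keep both sides of the association in sync in memory. The sketch below shows one common way to do that with a helper method. It uses plain-Java stand-ins for the two entities (annotations and IDs omitted so it runs on its own), and the `addTag` helper is a hypothetical addition, not part of the model shown above.

```java
import java.util.HashSet;
import java.util.Set;

public class BidirectionalSketch {

    // Stand-in for the ManyTag entity (inverse side of the relationship).
    static class ManyTag {
        final String name;
        final Set<ManyStudent> students = new HashSet<>();

        ManyTag(String name) { this.name = name; }
    }

    // Stand-in for the ManyStudent entity (owning side of the relationship).
    static class ManyStudent {
        final String name;
        final Set<ManyTag> manyTags = new HashSet<>();

        ManyStudent(String name) { this.name = name; }

        // Hypothetical helper: updates both sides so the object graph
        // stays consistent.
        void addTag(ManyTag tag) {
            manyTags.add(tag);
            tag.students.add(this);
        }
    }

    public static void main(String[] args) {
        ManyTag fullTime = new ManyTag("full time");
        ManyStudent john = new ManyStudent("John");
        john.addTag(fullTime);

        System.out.println(john.manyTags.contains(fullTime));
        System.out.println(fullTime.students.contains(john));
    }
}
```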

5.2. Define Repositories

In addition to the models, we also need to set up two repositories: one for each entity. We’ll let Spring Data do all the heavy lifting here:

// The ID type parameter matches the entity's int id field
public interface ManyTagRepository extends JpaRepository<ManyTag, Integer> {
}

Since we don’t need to search on just tags currently, we can leave the repository class empty.

Our student repository is only slightly more complicated:

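// The ID type parameter matches the entity's int id field
public interface ManyStudentRepository extends JpaRepository<ManyStudent, Integer> {
    // Derived query: students that have a tag with the given name
    List<ManyStudent> findByManyTags_Name(String name);
}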

Again, we’re letting Spring Data auto-generate the queries for us.

5.3. Testing

Finally, let’s see what this all looks like in a test:

@Test
public void givenStudentWithManyTags_whenSave_thenGetByTagOk() {
    ManyTag tag = new ManyTag("full time");
    manyTagRepository.save(tag);

    ManyStudent student = new ManyStudent("John");
    student.setManyTags(Collections.singleton(tag));
    manyStudentRepository.save(student);

    List<ManyStudent> students = manyStudentRepository
      .findByManyTags_Name("full time");
 
    assertEquals("size incorrect", 1, students.size());
}

The flexibility added by storing the tags in a separate searchable table far outweighs the minor amount of complexity that is added to the code.

This also allows us to reduce the total number of tags we store in the system by removing duplicate tags.
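
That saving only materialises if each tag name is created once and then reused. One way to picture this is a find-or-create lookup: the plain-Java sketch below keeps an in-memory map from tag name to tag instance, standing in for a lookup against the tag table. The `TagRegistry` class and its `findOrCreate` method are hypothetical names, and a real implementation would query the tag repository instead of a map.

```java
import java.util.HashMap;
import java.util.Map;

public class TagRegistry {

    // Stand-in for the ManyTag entity (annotations and ID omitted).
    static class ManyTag {
        final String name;

        ManyTag(String name) { this.name = name; }
    }

    // In-memory substitute for the tag table, keyed by tag name.
    private final Map<String, ManyTag> tagsByName = new HashMap<>();

    // Returns the existing tag for this name, creating it only on first use.
    ManyTag findOrCreate(String name) {
        return tagsByName.computeIfAbsent(name, ManyTag::new);
    }

    int size() {
        return tagsByName.size();
    }

    public static void main(String[] args) {
        TagRegistry registry = new TagRegistry();
        ManyTag first = registry.findOrCreate("full time");
        ManyTag second = registry.findOrCreate("full time");
        registry.findOrCreate("part time");

        System.out.println(first == second);
        System.out.println(registry.size());
    }
}
```

Every student tagged "full time" then points at the same tag, so the name is stored once and only the join table grows.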

However, many-to-many isn’t optimized for cases where we want to store state information specific to the entity along with the tag.

6. Conclusion

This article picked up where the previous one left off.

First of all, we introduced several advanced models that are useful when designing a tagging implementation.

Finally, we re-examined the implementation of tagging from the last article in the context of a many-to-many mapping.

To see working examples of what we talked about today, please check out the code on GitHub.
