Introduction to PCollections

Last modified: August 1, 2017

1. Overview

In this article, we will be looking at PCollections, a Java library providing persistent, immutable collections.

Persistent data structures (collections) can't be modified in place: an update operation returns a new object containing the result instead. Such collections are not only immutable but also persistent, which means that after a modification, previous versions of the collection remain available and unchanged.
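
To make the idea of persistence concrete, here's a minimal sketch written with the JDK only. The class PersistentStackSketch and its methods are hypothetical names chosen for this illustration; they are not part of PCollections and don't reflect how the library is implemented. The sketch only shows that an "update" can hand back a new version while every older version stays intact:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal persistent stack, written only to illustrate the idea.
// It is not taken from the PCollections source.
final class PersistentStackSketch<E> {
    private final E head;                        // unused in the empty stack
    private final PersistentStackSketch<E> tail; // previous version, never modified
    private final int size;

    private PersistentStackSketch(E head, PersistentStackSketch<E> tail, int size) {
        this.head = head;
        this.tail = tail;
        this.size = size;
    }

    static <E> PersistentStackSketch<E> empty() {
        return new PersistentStackSketch<E>(null, null, 0);
    }

    // "Updates" by returning a new stack; the receiver stays exactly as it was.
    PersistentStackSketch<E> plus(E element) {
        return new PersistentStackSketch<E>(element, this, size + 1);
    }

    int size() {
        return size;
    }

    // Copies the elements, top first, into a java.util.List for inspection.
    List<E> toList() {
        List<E> out = new ArrayList<>(size);
        for (PersistentStackSketch<E> s = this; s.size > 0; s = s.tail) {
            out.add(s.head);
        }
        return out;
    }

    public static void main(String[] args) {
        PersistentStackSketch<String> v0 = PersistentStackSketch.empty();
        PersistentStackSketch<String> v1 = v0.plus("a");
        PersistentStackSketch<String> v2 = v1.plus("b");

        System.out.println(v0.toList()); // []
        System.out.println(v1.toList()); // [a]
        System.out.println(v2.toList()); // [b, a]
    }
}
```

All three versions (v0, v1 and v2) remain usable after the later calls to plus(), which is the property the term "persistent" refers to.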

PCollections is analogous to and compatible with the Java Collections framework.

2. Dependencies

To use PCollections in our project, let's add the following dependency to our pom.xml:

<dependency>
    <groupId>org.pcollections</groupId>
    <artifactId>pcollections</artifactId>
    <version>2.1.2</version>
</dependency>

If our project is Gradle based, we can add the same artifact to our build.gradle file:

// Gradle 3.4+; older builds used the now-removed 'compile' configuration
implementation 'org.pcollections:pcollections:2.1.2'

The latest version can be found on Maven Central.

3. Map Structure (HashPMap)

HashPMap is a persistent map data structure. It is the analog of java.util.HashMap and stores non-null key-value pairs.

We can instantiate HashPMap by using convenient static methods in HashTreePMap. These static methods return a HashPMap instance that is backed by an IntTreePMap.

The static empty() method of the HashTreePMap class creates an empty HashPMap that has no elements – just like using the default constructor of java.util.HashMap:

HashPMap<String, String> pmap = HashTreePMap.empty();

There are two other static methods that we can use to create a HashPMap. The singleton() method creates a HashPMap with only one entry:

HashPMap<String, String> pmap1 = HashTreePMap.singleton("key1", "value1");
assertEquals(1, pmap1.size());

The from() method creates a HashPMap from an existing java.util.HashMap instance (and other java.util.Map implementations):

Map<String, String> map = new HashMap<>();
map.put("mkey1", "mval1");
map.put("mkey2", "mval2");

HashPMap<String, String> pmap2 = HashTreePMap.from(map);
assertEquals(2, pmap2.size());

Although HashPMap inherits some of the methods from java.util.AbstractMap and java.util.Map, it has methods that are unique to it.

The minus() method removes a single entry from the map, while the minusAll() method removes multiple entries. There are also the plus() and plusAll() methods, which add single and multiple entries respectively:

HashPMap<String, String> pmap = HashTreePMap.empty();
// plus() returns a new map; pmap itself stays empty
HashPMap<String, String> pmap0 = pmap.plus("key1", "value1");

Map<String, String> map = new HashMap<>();
map.put("key2", "val2");
map.put("key3", "val3");
HashPMap<String, String> pmap1 = pmap0.plusAll(map);

HashPMap<String, String> pmap2 = pmap1.minus("key1");

HashPMap<String, String> pmap3 = pmap2.minusAll(map.keySet());

assertEquals(1, pmap0.size());
assertEquals(3, pmap1.size());
assertFalse(pmap2.containsKey("key1"));
assertEquals(0, pmap3.size());

Note that calling put() on pmap throws an UnsupportedOperationException. Because PCollections objects are persistent and immutable, every modifying operation returns a new instance (here, a new HashPMap) instead.
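
As a rough illustration of this contract, the sketch below builds a tiny immutable map on top of java.util.AbstractMap, whose default put() always throws UnsupportedOperationException. FrozenMapSketch and its plus() method are hypothetical names invented for this example, and the full copy made in plus() is a simplification for brevity; this is one way to get the behaviour, not a description of how HashPMap works internally:

```java
import java.util.AbstractMap;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for an immutable map; not the HashPMap implementation.
final class FrozenMapSketch<K, V> extends AbstractMap<K, V> {
    private final Map<K, V> backing;

    private FrozenMapSketch(Map<K, V> backing) {
        this.backing = Collections.unmodifiableMap(backing);
    }

    static <K, V> FrozenMapSketch<K, V> empty() {
        return new FrozenMapSketch<K, V>(new HashMap<K, V>());
    }

    // The only method AbstractMap requires; the returned view is read-only.
    @Override
    public Set<Map.Entry<K, V>> entrySet() {
        return backing.entrySet();
    }

    // Copy-then-add: returns a new map and leaves this one untouched.
    // (A full O(n) copy is an assumption made to keep the sketch short.)
    FrozenMapSketch<K, V> plus(K key, V value) {
        Map<K, V> copy = new HashMap<>(backing);
        copy.put(key, value);
        return new FrozenMapSketch<K, V>(copy);
    }

    // put() is deliberately not overridden: AbstractMap.put always throws
    // UnsupportedOperationException.

    public static void main(String[] args) {
        FrozenMapSketch<String, String> m0 = FrozenMapSketch.empty();
        FrozenMapSketch<String, String> m1 = m0.plus("key1", "value1");

        System.out.println(m0.size()); // 0
        System.out.println(m1.size()); // 1
        try {
            m1.put("key2", "value2");
        } catch (UnsupportedOperationException e) {
            System.out.println("put() is not supported");
        }
    }
}
```

The mutating method inherited from java.util.Map fails loudly, while the additional plus() method is the supported way to obtain an updated version.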

Let’s move on to look at other data structures.

4. List Structure (TreePVector and ConsPStack)

TreePVector is a persistent analog of java.util.ArrayList while ConsPStack is the analog of java.util.LinkedList. TreePVector and ConsPStack have convenient static methods for creating new instances – just like HashPMap.

The empty() method creates an empty TreePVector, while the singleton() method creates a TreePVector with only one element. There’s also the from() method that can be used to create an instance of TreePVector from any java.util.Collection.

ConsPStack has static methods with the same names that serve the same purpose.

TreePVector provides the minus() and minusAll() methods for removing elements, and the plus() and plusAll() methods for adding them.

The with() method replaces the element at a specified index, and the subList() method returns a range of elements from the collection.

These methods are available in ConsPStack as well.

Let’s consider the following code snippet that exemplifies the methods mentioned above:

TreePVector<String> pVector = TreePVector.empty();

TreePVector<String> pV1 = pVector.plus("e1");
TreePVector<String> pV2 = pV1.plusAll(Arrays.asList("e2", "e3", "e4"));
assertEquals(1, pV1.size());
assertEquals(4, pV2.size());

TreePVector<String> pV3 = pV2.minus("e1");
TreePVector<String> pV4 = pV3.minusAll(Arrays.asList("e2", "e3", "e4"));
assertEquals(3, pV3.size());
assertEquals(0, pV4.size());

// subList(0, 2) covers indices 0 and 1; the end index is exclusive
TreePVector<String> pSub = pV2.subList(0, 2);
assertTrue(pSub.contains("e1") && pSub.contains("e2"));

TreePVector<String> pVW = (TreePVector<String>) pV2.with(0, "e10");
assertEquals("e10", pVW.get(0));

In the code snippet above, pSub is another TreePVector object and is independent of pV2. As can be observed, pV2 was not changed by the subList() operation; rather, a new TreePVector object was created containing the elements of pV2 from index 0 (inclusive) to index 2 (exclusive).

This is what is meant by immutability and it is what happens with all modifying methods of PCollections.
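
Returning a new object doesn't have to mean copying every element. As a sketch of one common technique (path copying on a cons-style list), the hypothetical ConsListSketch class below implements a with() operation that copies only the nodes in front of the replaced index and shares the rest with the original. The class and its method names are assumptions made for this illustration, written with the JDK only, and are not the PCollections source:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical cons-style persistent list; not taken from PCollections.
final class ConsListSketch<E> {
    final E first;
    final ConsListSketch<E> rest; // null marks the end of the list

    ConsListSketch(E first, ConsListSketch<E> rest) {
        this.first = first;
        this.rest = rest;
    }

    // Builds a list from at least one element, preserving the given order.
    @SafeVarargs
    static <E> ConsListSketch<E> of(E... elements) {
        if (elements.length == 0) {
            throw new IllegalArgumentException("at least one element is required");
        }
        ConsListSketch<E> list = null;
        for (int i = elements.length - 1; i >= 0; i--) {
            list = new ConsListSketch<>(elements[i], list);
        }
        return list;
    }

    // Returns a new list with the element at index replaced.
    // Nodes before index are copied; nodes after it are shared, not copied.
    ConsListSketch<E> with(int index, E value) {
        if (index < 0) {
            throw new IndexOutOfBoundsException("negative index: " + index);
        }
        if (index == 0) {
            return new ConsListSketch<>(value, rest);
        }
        if (rest == null) {
            throw new IndexOutOfBoundsException("index past the end of the list");
        }
        return new ConsListSketch<>(first, rest.with(index - 1, value));
    }

    // Copies the elements into a java.util.List for inspection.
    List<E> toList() {
        List<E> out = new ArrayList<>();
        for (ConsListSketch<E> node = this; node != null; node = node.rest) {
            out.add(node.first);
        }
        return out;
    }

    public static void main(String[] args) {
        ConsListSketch<String> v1 = ConsListSketch.of("e1", "e2", "e3");
        ConsListSketch<String> v2 = v1.with(1, "e20");

        System.out.println(v1.toList());                  // [e1, e2, e3]
        System.out.println(v2.toList());                  // [e1, e20, e3]
        System.out.println(v1.rest.rest == v2.rest.rest); // true
    }
}
```

The original list v1 is untouched, and the node holding "e3" is the same object in both versions, so the cost of with() depends on the index rather than on the size of the whole list.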

5. Set Structure (MapPSet)

MapPSet is a persistent, map-backed analog of java.util.HashSet. It can be conveniently instantiated with the static methods of HashTreePSet: empty(), from() and singleton(). They work the same way as in the previous examples.

MapPSet has plus(), plusAll(), minus() and minusAll() methods for manipulating set data. Furthermore, it inherits methods from java.util.Set, java.util.AbstractCollection and java.util.AbstractSet:

MapPSet<String> pSet = HashTreePSet.<String>empty()
  .plusAll(Arrays.asList("e1", "e2", "e3", "e4"));
assertEquals(4, pSet.size());

MapPSet<String> pSet1 = pSet.minus("e4");
assertFalse(pSet1.contains("e4"));

Finally, there’s also OrderedPSet – which maintains the insertion order of elements just like java.util.LinkedHashSet.

6. Conclusion

In this quick tutorial, we explored PCollections, a library of persistent data structures that are analogous to the core collections available in Java. The PCollections Javadoc provides more insight into the intricacies of the library.

And, as always, the complete code can be found over on GitHub.
