1. Introduction
InputStream is a common abstract class used for processing data. The data can originate from very different sources, but using the class lets us abstract away the origin and process the data independently of any specific source.
However, when we write tests, we actually need to provide a concrete implementation. In this tutorial, we’ll learn which of the available implementations to choose and when it’s better to write our own.
2. InputStream Interface Basics
Before we jump into writing our own code, it helps to understand a little about how the InputStream API is built. Fortunately, it’s pretty straightforward. InputStream is an abstract class with a single abstract method, read(), so that is the only method a simple implementation has to provide. It takes no parameters and returns the next byte of the stream as an int in the range 0 to 255. If the stream has ended, it returns -1, signaling us to stop processing.
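To make the read() contract concrete, here is a minimal sketch that drains a stream one byte at a time. The class and method names (ReadContractDemo, countAndSum) are made up for illustration; only InputStream and ByteArrayInputStream come from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadContractDemo {

    // Reads the stream one byte at a time and returns { byte count, sum of values }.
    // Every value returned by read() is in the range 0..255; -1 is reserved for end of stream.
    static int[] countAndSum(InputStream in) throws IOException {
        int count = 0;
        int sum = 0;
        int next;
        while ((next = in.read()) != -1) {
            count++;
            sum += next;
        }
        return new int[] { count, sum };
    }

    public static void main(String[] args) throws IOException {
        // Three bytes: 1, 2 and 0xFF. The last one is reported as 255, not as -1.
        InputStream in = new ByteArrayInputStream(new byte[] { 1, 2, (byte) 0xFF });
        int[] result = countAndSum(in);
        System.out.println(result[0] + " bytes, sum " + result[1]); // 3 bytes, sum 258
    }
}
```

Because the return type is int rather than byte, the value -1 can never collide with a real data byte.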
2.1. Test Case
In this tutorial, we’ll test one method that processes text messages in the form of an InputStream and returns the number of processed bytes. We’ll then assert that the correct number of bytes was read:
int bytesCount = processInputStream(someInputStream);
assertThat(bytesCount).isEqualTo(expectedNumberOfBytes);
What the processInputStream() method does internally is less relevant here, so we’re just using a very simple implementation:
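import java.io.IOException;
import java.io.InputStream;

public class MockingInputStreamUnitTest {

    // Counts the bytes in the stream by reading until read() signals the end with -1.
    int processInputStream(InputStream inputStream) throws IOException {
        int count = 0;
        while (inputStream.read() != -1) {
            count++;
        }
        return count;
    }
}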
2.2. Using the Naive Implementation
To better understand how InputStream works, we’ll write a simple implementation with a hardcoded message. Apart from the message, our implementation will keep an index pointing to the byte of the message we should read next. Every time the read method is invoked, we’ll return one byte from the message and then increment the index.
Before we do that, we also need to check whether we’ve already read all the bytes from the message. If so, we need to return -1:
public class MockingInputStreamUnitTest {

    @Test
    public void givenSimpleImplementation_shouldProcessInputStream() throws IOException {
        int byteCount = processInputStream(new InputStream() {
            private final byte[] msg = "Hello World".getBytes();
            private int index = 0;

            @Override
            public int read() {
                // All bytes consumed: signal the end of the stream.
                if (index >= msg.length) {
                    return -1;
                }
                // Mask so the signed byte maps to the 0-255 range that read() must return.
                return msg[index++] & 0xFF;
            }
        });
        assertThat(byteCount).isEqualTo(11);
    }
}
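One detail matters for any hand-written read(): Java bytes are signed, so a raw byte of 0xFF is the value -1, the same as the end-of-stream marker. The sketch below illustrates what happens with and without masking. The names SignedByteDemo, streamOf, and count are hypothetical helpers written for this illustration.

```java
import java.io.IOException;
import java.io.InputStream;

class SignedByteDemo {

    // Hypothetical helper: wraps a byte array in a minimal InputStream.
    // When mask is false, the raw signed byte is returned, which is the bug being shown.
    static InputStream streamOf(byte[] data, boolean mask) {
        return new InputStream() {
            private int index = 0;

            @Override
            public int read() {
                if (index >= data.length) {
                    return -1;
                }
                byte b = data[index++];
                return mask ? (b & 0xFF) : b;
            }
        };
    }

    // Counts bytes until read() returns -1.
    static int count(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = { 'a', (byte) 0xFF, 'b' };
        System.out.println(count(streamOf(data, true)));  // 3
        System.out.println(count(streamOf(data, false))); // 1: the 0xFF byte looks like end of stream
    }
}
```

For a pure ASCII message such as "Hello World" both variants behave the same, which is why this kind of bug tends to surface only with binary or non-ASCII data.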
3. Using ByteArrayInputStream
If we’re sure that the whole data payload fits into memory, the simplest choice is ByteArrayInputStream. We pass an array of bytes to the constructor, and the stream then iterates through it byte by byte, much like the example from the previous section:
String msg = "Hello World";
int bytesCount = processInputStream(new ByteArrayInputStream(msg.getBytes()));
assertThat(bytesCount).isEqualTo(11);
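When the test message is not plain ASCII, the expected byte count depends on the encoding. The following sketch assumes UTF-8 and passes the charset explicitly instead of relying on the platform default. ByteArrayStreamDemo and its methods are illustrative names, and countBytes simply mirrors the counting loop used in this tutorial.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ByteArrayStreamDemo {

    // Same counting loop as processInputStream() in this tutorial.
    static int countBytes(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    // Encodes the text with an explicit charset so the byte count
    // does not depend on the platform default encoding.
    static int countUtf8Bytes(String text) throws IOException {
        try (InputStream in = new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8))) {
            return countBytes(in);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countUtf8Bytes("Hello World")); // 11
        System.out.println(countUtf8Bytes("h\u00e9llo"));  // 6: the accented letter takes two bytes in UTF-8
    }
}
```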
4. Using FileInputStream
If we can save our data as a file, we can also load it through a FileInputStream. The advantage of this approach is that the data won’t be loaded into memory as a whole but rather read from the disk when needed. If we place the file in the resources folder, we can use the convenient getResourceAsStream method to create an InputStream directly from a classpath location in one line of code:
InputStream inputStream = MockingInputStreamUnitTest.class.getResourceAsStream("/mockinginputstreams/msg.txt");
int bytesCount = processInputStream(inputStream);
assertThat(bytesCount).isEqualTo(11);
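Note that in this example, the actual implementation of the InputStream isn’t a plain FileInputStream. The concrete class depends on the class loader and on where the resource lives; for a regular file on the classpath, it’s typically a BufferedInputStream wrapping a FileInputStream. As the name suggests, a buffered stream reads bigger chunks of data and stores them in a buffer, which limits the number of reads from the disk.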
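If the test data isn’t on the classpath, a FileInputStream can be opened on a regular file instead. The sketch below is one possible way to do that under the assumption that a temporary file is acceptable for the test; FileStreamDemo and countBytesViaTempFile are hypothetical names, and countBytes repeats the counting loop from this tutorial.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FileStreamDemo {

    // Same counting loop as processInputStream() in this tutorial.
    static int countBytes(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    // Writes the message to a temporary file, reads it back through a
    // FileInputStream, and removes the file afterwards.
    static int countBytesViaTempFile(String message) throws IOException {
        Path file = Files.createTempFile("msg", ".txt");
        try {
            Files.write(file, message.getBytes(StandardCharsets.UTF_8));
            // try-with-resources closes the stream and releases the file handle.
            try (InputStream in = new FileInputStream(file.toFile())) {
                return countBytes(in);
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countBytesViaTempFile("Hello World")); // 11
    }
}
```

When using getResourceAsStream instead, keep in mind that it returns null if the resource can’t be found, so a null check before processing gives a clearer failure than a NullPointerException.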
5. Generating Data On the Fly
Sometimes we want to test whether our system works properly with a large amount of data. We could just use a big file loaded from disk, but that approach has some serious drawbacks. Not only is it a potential waste of space, but version control systems like Git also don’t handle big binary files well. Fortunately, we don’t need to have all the data beforehand. Instead, we can generate it on the fly.
To achieve that, we need to implement our own InputStream. Let’s start by defining the fields and the constructor:
public class GeneratingInputStream extends InputStream {
    private final int desiredSize;
    private final byte[] seed;
    private int actualSize = 0;

    public GeneratingInputStream(int desiredSize, String seed) {
        this.desiredSize = desiredSize;
        this.seed = seed.getBytes();
    }
}
The “desiredSize” variable will tell us when we should stop generating data. The “seed” variable will be a chunk of data that will be repeated. Finally, the “actualSize” variable will help us track how many bytes we have returned. We need it because we don’t actually save any data. We only return the “current” byte.
Using the variables we defined, we can implement the read method:
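    @Override
    public int read() {
        // Stop once the requested number of bytes has been produced.
        if (actualSize >= desiredSize) {
            return -1;
        }
        // Cycle through the seed; the mask keeps the result in the 0-255 range.
        return seed[actualSize++ % seed.length] & 0xFF;
    }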
First, we check whether we’ve reached the desired size. If we have, we return -1 so the stream’s consumer knows to stop reading. If we haven’t, we return one byte from the seed. To determine which byte it should be, we use the modulo operator to get the remainder of dividing the number of bytes generated so far by the length of the seed.
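Putting the pieces together, a complete version of the generating stream might look like the sketch below. The class is renamed GeneratingInputStreamSketch to mark it as an illustration rather than the tutorial’s exact code. The empty-seed check and the explicit UTF-8 charset are assumptions added here, and readAll is a hypothetical helper that only works for ASCII seeds.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class GeneratingInputStreamSketch extends InputStream {
    private final int desiredSize;
    private final byte[] seed;
    private int actualSize = 0;

    GeneratingInputStreamSketch(int desiredSize, String seed) {
        if (seed.isEmpty()) {
            // Assumption: reject an empty seed, since the modulo in read() would divide by zero.
            throw new IllegalArgumentException("seed must not be empty");
        }
        this.desiredSize = desiredSize;
        this.seed = seed.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public int read() {
        if (actualSize >= desiredSize) {
            return -1;
        }
        // Cycle through the seed; the mask keeps the value in the 0..255 range.
        return seed[actualSize++ % seed.length] & 0xFF;
    }
}

class GeneratingInputStreamDemo {

    // Hypothetical helper: collects the stream into a String, treating each byte as an ASCII character.
    static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int next;
        while ((next = in.read()) != -1) {
            sb.append((char) next);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Indices 0..7 modulo 3 select a, b, c, a, b, c, a, b.
        System.out.println(readAll(new GeneratingInputStreamSketch(8, "abc"))); // abcabcab
    }
}
```

The stream holds only the seed and a counter, so generating 10,000 bytes or several hundred megabytes costs the same amount of memory.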
6. Summary
In this tutorial, we looked into how we can deal with InputStreams in tests. We learned how the class is built and what implementations we can use for various scenarios. Finally, we learned how to write our own implementation to generate data on the fly.
As always, the code examples are available over on GitHub.