Read Excel Cell Value Rather Than Formula With Apache POI

Last modified: January 4, 2020

1. Introduction

When reading an Excel file in Java, we usually want to read the values of cells to perform some computation or generate a report. However, we may encounter one or more cells that contain formulas rather than raw data values. So, how do we get at the actual data values of those cells?

In this tutorial, we’re going to look at different ways to read Excel cell values – rather than the formula that is calculating the cell values – with the Apache POI Java library.

There are two ways to solve this problem:

  • Fetch the last cached value for the cell
  • Evaluate the formula at runtime to get the cell value

2. Maven Dependency

We need to add the following dependency in our pom.xml file for Apache POI:

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>5.2.0</version>
</dependency>

The latest version of poi-ooxml can be downloaded from Maven Central.

3. Fetch the Last Cached Value

When a formula calculates a cell’s value, Excel stores two things for that cell: the formula itself and a cached value. The cached value holds the result of the formula’s last evaluation.

The idea is to fetch this cached value and treat it as the cell value. The cached value is not guaranteed to be current, for example when a program changed the workbook without recalculating its formulas. However, when we’re working with a file that was saved by Excel and hasn’t been modified since, the cached value should match the cell value.

Let’s see how to fetch the last cached value for a cell:

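import java.io.FileInputStream;
import java.io.IOException;

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.util.CellAddress;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class CachedValueExample {
    public static void main(String[] args) throws IOException {
        // try-with-resources closes both the stream and the workbook
        try (FileInputStream inputStream = new FileInputStream("temp.xlsx");
             Workbook workbook = new XSSFWorkbook(inputStream)) {
            Sheet sheet = workbook.getSheetAt(0);

            CellAddress cellAddress = new CellAddress("C2");
            Row row = sheet.getRow(cellAddress.getRow());
            // getRow and getCell return null for rows and cells that were never created
            Cell cell = (row == null) ? null : row.getCell(cellAddress.getColumn());

            if (cell != null && cell.getCellType() == CellType.FORMULA) {
                // the cached result type tells us which getter is safe to call
                switch (cell.getCachedFormulaResultType()) {
                    case BOOLEAN:
                        System.out.println(cell.getBooleanCellValue());
                        break;
                    case NUMERIC:
                        System.out.println(cell.getNumericCellValue());
                        break;
                    case STRING:
                        System.out.println(cell.getRichStringCellValue());
                        break;
                }
            }
        }
    }
}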
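
The switch above does not handle formulas whose cached result is an error, such as a division by zero. As a minimal sketch, the helper cachedValue below returns the cached result as a plain Java object and maps cached errors to their Excel text through POI's FormulaError. The class name, the method names, and the sample formulas are our own assumptions for illustration and are not part of Apache POI. The demo builds a workbook in memory and evaluates it once so that cached values exist, which stands in for a file saved by Excel.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.usermodel.FormulaError;
import org.apache.poi.ss.usermodel.FormulaEvaluator;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class CachedValueReader {

    // Hypothetical helper: returns the cached result of a formula cell,
    // or the plain value of a non-formula cell, as a Java object.
    static Object cachedValue(Cell cell) {
        if (cell == null) {
            return null; // the cell was never created
        }
        CellType type = cell.getCellType() == CellType.FORMULA
          ? cell.getCachedFormulaResultType()
          : cell.getCellType();
        switch (type) {
            case BOOLEAN:
                return cell.getBooleanCellValue();
            case NUMERIC:
                return cell.getNumericCellValue();
            case STRING:
                return cell.getStringCellValue();
            case ERROR:
                // translate the error code into Excel's text, e.g. #DIV/0!
                return FormulaError.forInt(cell.getErrorCellValue()).getString();
            default:
                return null; // blank cells
        }
    }

    // Builds a one-row workbook in memory, evaluates it once to populate
    // the cached values, and reads them back for B1 and C1.
    static Object[] demo() {
        try (Workbook workbook = new XSSFWorkbook()) {
            Row row = workbook.createSheet().createRow(0);
            row.createCell(0).setCellValue(10);
            Cell product = row.createCell(1);
            product.setCellFormula("A1*2");
            Cell broken = row.createCell(2);
            broken.setCellFormula("1/0");

            FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();
            evaluator.evaluateAll();
            return new Object[] { cachedValue(product), cachedValue(broken) };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (Object value : demo()) {
            System.out.println(value);
        }
    }
}
```

Returning Object keeps the sketch short. In real code, we'd more likely return a typed result or handle each case where the value is used.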

4. Evaluate the Formula to Get the Cell Value

Apache POI provides the FormulaEvaluator interface, which enables us to calculate the results of formulas in Excel sheets.

So, we can use FormulaEvaluator to calculate the cell value directly at runtime. It provides a method called evaluateFormulaCell, which evaluates the formula in the given Cell, stores the result in the cell as its new cached value while keeping the formula, and returns a CellType that represents the data type of the result.

Let’s see this approach in action:

// existing Workbook setup

FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator(); 

// existing Sheet, Row, and Cell setup

if (cell.getCellType() == CellType.FORMULA) {
    switch (evaluator.evaluateFormulaCell(cell)) {
        case BOOLEAN:
            System.out.println(cell.getBooleanCellValue());
            break;
        case NUMERIC:
            System.out.println(cell.getNumericCellValue());
            break;
        case STRING:
            System.out.println(cell.getStringCellValue());
            break;
    }
}
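
If we'd rather not overwrite the cached value stored in the cell, FormulaEvaluator also offers evaluate, which returns the result as a CellValue and leaves the cell untouched. The following is a minimal sketch under our own assumptions: the class name, the demo method, and the sample formula are made up for illustration, and the workbook is created in memory rather than read from temp.xlsx.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.usermodel.CellValue;
import org.apache.poi.ss.usermodel.FormulaEvaluator;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class EvaluateWithoutCaching {

    // Returns the freshly evaluated value of B1, followed by the value
    // still stored in the cell after the evaluation.
    static double[] demo() {
        try (Workbook workbook = new XSSFWorkbook()) {
            Row row = workbook.createSheet().createRow(0);
            row.createCell(0).setCellValue(10);
            Cell formulaCell = row.createCell(1);
            formulaCell.setCellFormula("A1*2");

            FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();

            // evaluate() computes the result but does not write it back to the cell
            CellValue result = evaluator.evaluate(formulaCell);
            if (result.getCellType() != CellType.NUMERIC) {
                throw new IllegalStateException("unexpected result type: " + result.getCellType());
            }

            // the cell still holds whatever cached value it had before
            double storedInCell = formulaCell.getNumericCellValue();
            return new double[] { result.getNumberValue(), storedInCell };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        double[] values = demo();
        System.out.println("evaluated: " + values[0]);
        System.out.println("stored in cell: " + values[1]);
    }
}
```

CellValue has matching getters for the other result types, namely getBooleanValue and getStringValue.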

5. Which Approach to Choose

The simple difference between the two approaches here is that the first method uses the last cached value, and the second method evaluates the formula at runtime.

If we’re working with an Excel file that is already saved and we’re not going to make changes to that spreadsheet at runtime, then the cached value approach is better as we don’t have to evaluate the formula.

However, if we know that we’re going to make frequent changes at runtime, then it’s better to evaluate the formula at runtime to fetch the cell value.
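
To make the trade-off concrete, here is a small sketch of how a cached value goes stale once we change an input cell at runtime. It assumes a workbook built in memory with a hypothetical formula A1*2 in cell B1, and the class and method names are ours. Because the evaluator keeps its own cache of intermediate results, the sketch clears that cache before evaluating again.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.FormulaEvaluator;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class StaleCacheDemo {

    // Returns the value read from B1 at three points: after the first
    // evaluation, after A1 changed, and after re-evaluating the formula.
    static double[] demo() {
        try (Workbook workbook = new XSSFWorkbook()) {
            Row row = workbook.createSheet().createRow(0);
            Cell input = row.createCell(0);
            input.setCellValue(10);
            Cell doubled = row.createCell(1);
            doubled.setCellFormula("A1*2");

            FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();
            evaluator.evaluateFormulaCell(doubled);
            double initial = doubled.getNumericCellValue();

            // changing A1 does not recalculate B1; the cell keeps its cached value
            input.setCellValue(50);
            double stale = doubled.getNumericCellValue();

            // drop the evaluator's own cache, then evaluate again
            evaluator.clearAllCachedResultValues();
            evaluator.evaluateFormulaCell(doubled);
            double fresh = doubled.getNumericCellValue();

            return new double[] { initial, stale, fresh };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        double[] values = demo();
        System.out.println("after first evaluation: " + values[0]); // 20.0
        System.out.println("after changing A1:      " + values[1]); // 20.0 (stale)
        System.out.println("after re-evaluation:    " + values[2]); // 100.0
    }
}
```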

6. Conclusion

In this quick article, we saw two ways to get the value of an Excel cell rather than the formula that calculates it.

The complete source code for this article is available over on GitHub.
