1. Introduction
In this tutorial, we’ll discuss the algorithm and code for converting the infix notation of a mathematical expression to a postfix notation.
2. Expressions in Java
A programming language like Java allows us to define and work with different mathematical expressions. An expression can be written through a combination of variables, constants, and operators.
Some common types of expressions in Java include arithmetic and logical expressions.
2.1. Arithmetic Expressions
Arithmetic expressions use operators such as addition (+), subtraction (-), multiplication (*), division (/), and modulus (%). Combined with variables or constants, these operators produce an arithmetic result:
int x = 100;
int y = 50;
int sum = x + y;
int prod = x * y;
int remainder = x % y;
2.2. Logical Expressions
Logical expressions employ logical operators in place of the arithmetic operations used earlier. The most common logical operators include the logical AND, OR, NOT, and XOR:
boolean andResult = (true && false); // Logical AND: false
boolean orResult = (true || false); // Logical OR: true
boolean notResult = !false; // Logical NOT: true
boolean xorResult = (true ^ false); // Logical XOR: true
Relational expressions are used mostly in comparison-based logic and produce boolean values true or false:
int x = 10;
int y = 8;
boolean bigger = x > y; // true
3. Notations
There are different possible ways of writing a mathematical expression. These are called notations, and they change based on the placement of the operators and the operands.
3.1. Infix Notation
In infix notation, the operator sits between its operands. This is the most common way of writing expressions:
int sum = (a + b) + (c * d);
Note that infix expressions can be ambiguous on their own: the order of evaluation depends on operator precedence. Parentheses are commonly used in infix notation to enforce a particular order.
3.2. Prefix Notation
In prefix notation, also known as Polish notation, each operator precedes its operands:
int result = * + a b - c d;
3.3. Postfix Notation
In postfix notation, also known as Reverse Polish Notation, each operator comes after its operands:
int result = a b + c d - *;
Note that the prefix and postfix forms of an expression are unambiguous: the order of operations is fixed by the position of the operators, so neither precedence rules nor parentheses are needed. For the same reason, they are efficient to evaluate.
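To illustrate why postfix is convenient to evaluate, here is a minimal sketch of a stack-based evaluator. It is not part of the original solution: the class and method names are hypothetical, and it assumes single-digit operands and only the four basic arithmetic operators.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PostfixEvaluatorSketch {

    // Evaluates a postfix expression made of single-digit operands
    // and the operators + - * / (assumptions made for this sketch).
    static int evaluate(String postfix) {
        Deque<Integer> operands = new ArrayDeque<>();
        for (char ch : postfix.toCharArray()) {
            if (Character.isDigit(ch)) {
                operands.push(ch - '0');
            } else {
                // The right operand is on top because it was pushed last
                int right = operands.pop();
                int left = operands.pop();
                switch (ch) {
                    case '+': operands.push(left + right); break;
                    case '-': operands.push(left - right); break;
                    case '*': operands.push(left * right); break;
                    case '/': operands.push(left / right); break;
                    default: throw new IllegalArgumentException("Unknown operator: " + ch);
                }
            }
        }
        return operands.pop();
    }

    public static void main(String[] args) {
        // (1 + 2) * (5 - 3) written in postfix is 12+53-*
        System.out.println(evaluate("12+53-*")); // prints 6
    }
}
```

The evaluator never looks at precedence or parentheses: a single left-to-right pass with one stack is enough.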
4. Problem Statement
Now that we have reviewed the basics of the different notations of mathematical expressions, let’s move on to the problem statement.
Given an Infix expression as an input, we should write an algorithm that converts it and returns the Postfix or the Reverse Polish Notation of the same expression.
Let’s understand through an example:
Input: (a+b)*(c-d)
Output: ab+cd-*
Input: a+b*(c^d-e)^(f+g*h)-i
Output: abcd^e-fgh*+^*+i-
In both examples, the input is an infix expression in which each operator sits between its two operands, and the output is the equivalent postfix expression. We can assume that the input is always a valid infix expression, so we don't need to validate it.
5. Solution
Let’s build our solution by breaking down the problem into smaller steps.
5.1. Operators and Operands
The input will be a string representation of the infix expression. Before we implement the conversion logic, we need a way to distinguish operators from operands.
Based on the input examples, operands can be lowercase or uppercase English letters:
private boolean isOperand(char ch) {
    return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
}
In addition to these operands, the input can contain the two parenthesis characters, ( and ), and five operators: +, -, *, /, and ^.
5.2. Precedence and Associativity
We should also define the precedence of each operator we might encounter in the input and assign it an integer score. The ^ (exponentiation) operator has the highest precedence, followed by * (multiplication) and / (division), which share the same precedence. Finally, the + (addition) and - (subtraction) operators have the lowest precedence.
Let’s write a method to mimic the above logic:
int getPrecedenceScore(char ch) {
    switch (ch) {
        case '^':
            return 3;
        case '*':
        case '/':
            return 2;
        case '+':
        case '-':
            return 1;
    }
    // Any other character, such as a parenthesis, is not an operator
    return -1;
}
When unparenthesized operators of the same precedence appear next to each other, associativity decides how they are grouped. Grouping is generally left to right. The only exception among our operators is exponentiation, which groups right to left:
char associativity(char ch) {
    if (ch == '^') {
        return 'R';
    }
    return 'L';
}
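To see why the grouping direction matters, the following sketch evaluates the same operand sequences under both groupings. Java has no exponentiation operator (^ is bitwise XOR in Java), so the sketch uses a hypothetical helper, power, that is not part of the article's code.

```java
class AssociativityDemo {

    // Integer exponentiation by repeated multiplication (small inputs only)
    static long power(long base, long exponent) {
        long result = 1;
        for (long i = 0; i < exponent; i++) {
            result *= base;
        }
        return result;
    }

    public static void main(String[] args) {
        // Left to right: 8 - 3 - 2 is read as (8 - 3) - 2
        System.out.println((8 - 3) - 2); // prints 3
        System.out.println(8 - (3 - 2)); // prints 7, the right-to-left grouping

        // Right to left: 2 ^ 3 ^ 2 is read as 2 ^ (3 ^ 2)
        System.out.println(power(2, power(3, 2))); // prints 512
        System.out.println(power(power(2, 3), 2)); // prints 64, the left-to-right grouping
    }
}
```

Because the two groupings give different values, the converter has to preserve the intended one when it orders operators in the postfix output.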
6. Conversion Algorithm
Each operator in an infix expression refers to the operands surrounding it. In contrast, for a postfix one, each operator refers to the two operands that come before it in the input String.
For an expression with multiple infix operations, the expressions within the innermost parentheses must be converted to postfix first. We can then treat each converted group as a single operand of the enclosing operation. We repeat this, successively eliminating parentheses, until the entire expression is converted. Within nested parentheses, the pair that was opened last is the first one to be resolved.
This Last In First Out behavior is suggestive of the use of the Stack data structure.
6.1. The Stack and Precedence Condition
We’ll use a Stack to keep track of our operators. However, we need to define a rule to determine which operator has to be added to the postfix expression and which operator needs to be kept in the stack for the future.
If the current symbol is an operator, we need to decide whether it can go straight onto the stack or whether operators already on the stack must first be moved to the postfix expression. If the stack is empty, which is the case when we encounter the first operator, we simply push the current operator onto the stack.
On the other hand, if the stack is not empty, we compare the current operator's precedence with that of the operator at the top of the stack. If the current operator has a higher precedence, we push it onto the stack. For example, encountering * after + results in pushing * onto the stack above +. We also push when the precedence scores are equal and the operator is right-associative, as with ^. Otherwise, when the current operator has a lower precedence, or an equal precedence with the default left-to-right associativity, we first pop the top of the stack into the postfix expression.
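We can condense this into a method that returns true when the operator at the top of the stack must be popped before the current operator is pushed: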
boolean operatorPrecedenceCondition(String infix, int i, Stack<Character> stack) {
    char current = infix.charAt(i);
    char top = stack.peek();
    // Pop when the top binds tighter, or equally tight with left-to-right grouping
    return getPrecedenceScore(current) < getPrecedenceScore(top)
      || (getPrecedenceScore(current) == getPrecedenceScore(top)
        && associativity(current) == 'L');
}
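As a quick sanity check of this condition, the sketch below applies it to a few operator pairs. The two-argument shouldPop method is a hypothetical simplification of operatorPrecedenceCondition that takes the current operator and the top of the stack directly; the helper methods are repeated so that the example is self-contained.

```java
class PopConditionSketch {

    static int getPrecedenceScore(char ch) {
        switch (ch) {
            case '^': return 3;
            case '*': case '/': return 2;
            case '+': case '-': return 1;
            default: return -1; // not an operator, e.g. '('
        }
    }

    static char associativity(char ch) {
        return ch == '^' ? 'R' : 'L';
    }

    // True when 'top' must be popped before 'current' is pushed
    static boolean shouldPop(char current, char top) {
        return getPrecedenceScore(current) < getPrecedenceScore(top)
          || (getPrecedenceScore(current) == getPrecedenceScore(top)
            && associativity(current) == 'L');
    }

    public static void main(String[] args) {
        System.out.println(shouldPop('+', '*')); // true: * binds tighter than +
        System.out.println(shouldPop('*', '+')); // false: * is pushed above +
        System.out.println(shouldPop('-', '+')); // true: equal precedence, left to right
        System.out.println(shouldPop('^', '^')); // false: ^ groups right to left
        System.out.println(shouldPop('+', '(')); // false: a parenthesis is never popped here
    }
}
```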
6.2. Scanning the Infix Expression and Converting
Now that we have the precedence condition set up, let's discuss how to scan the infix expression step by step and convert it correctly.
If the current character is an operand, we append it to the postfix result. If the current character is an operator, we use the comparison logic discussed above: we pop operators from the stack into the result for as long as the condition holds, and then push the current operator. Finally, when we finish scanning the input, we pop everything left on the stack into the postfix expression:
String infixToPostfix(String infix) {
    StringBuilder result = new StringBuilder();
    Stack<Character> stack = new Stack<>();
    for (int i = 0; i < infix.length(); i++) {
        char ch = infix.charAt(i);
        if (isOperand(ch)) {
            // Operands go straight to the output
            result.append(ch);
        } else {
            // Emit stacked operators that must be applied before the current one
            while (!stack.isEmpty() && (operatorPrecedenceCondition(infix, i, stack))) {
                result.append(stack.pop());
            }
            stack.push(ch);
        }
    }
    // Flush the operators that are still waiting on the stack
    while (!stack.isEmpty()) {
        result.append(stack.pop());
    }
    return result.toString();
}
6.3. Example Dry Run
Let's understand the algorithm using the example a+b*c-d. The first character we encounter, a, is an operand, so we can append it to the result immediately. The + operator, however, cannot be added to the result before we have processed its second operand, which starts at b. As we need to keep the + operator for later, we push it onto our stack:
Symbol Result Stack
a a []
+ a [+]
b ab [+]
When we encounter the operand b, we append it to the postfix result, which is now ab. We cannot pop the + operator from the stack just yet because the next symbol in our input is the * operator. As we mentioned in the previous section, * has a higher precedence than +, which is at the top of the stack. Therefore, this new symbol is pushed on top of the stack:
Symbol Result Stack
a a []
+ a [+]
b ab [+]
* ab [+,*]
We continue scanning the input and encounter the next operand, c, which we append to the result. The next symbol, -, has a lower precedence than *, so we pop operators from the stack and append them to the postfix expression until the stack is empty or the operator at the top no longer satisfies the pop condition. This pops both * and +, since + has the same precedence as - and is left-associative. The expression is now abc*+. The current operator, -, is then pushed onto the stack:
Symbol Result Stack
a a []
+ a [+]
b ab [+]
* ab [+,*]
c abc [+,*]
- abc*+ [-]
d abc*+d [-]
(end) abc*+d- []
Finally, we append the last operand, d, to the postfix expression and pop the remaining operator off the stack. The resulting postfix expression is abc*+d-.
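The dry run above can be reproduced with a small tracing sketch. It is an illustration only: the class DryRunTrace and its methods are hypothetical names, the pop condition is inlined in a two-argument form, and parentheses are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Stack;

class DryRunTrace {

    static int precedence(char op) {
        switch (op) {
            case '^': return 3;
            case '*': case '/': return 2;
            case '+': case '-': return 1;
            default: return -1;
        }
    }

    // Pop on lower precedence, or on equal precedence unless the operator is ^
    static boolean shouldPop(char current, char top) {
        return precedence(current) < precedence(top)
          || (precedence(current) == precedence(top) && current != '^');
    }

    // Converts a parenthesis-free infix string and records one row per scanned symbol
    static List<String> trace(String infix) {
        List<String> rows = new ArrayList<>();
        StringBuilder result = new StringBuilder();
        Stack<Character> stack = new Stack<>();
        for (char ch : infix.toCharArray()) {
            boolean isOperand = (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
            if (isOperand) {
                result.append(ch);
            } else {
                while (!stack.isEmpty() && shouldPop(ch, stack.peek())) {
                    result.append(stack.pop());
                }
                stack.push(ch);
            }
            rows.add(ch + "  " + result + "  " + stack);
        }
        // Flush the remaining operators once the input is exhausted
        while (!stack.isEmpty()) {
            result.append(stack.pop());
        }
        rows.add("end  " + result + "  " + stack);
        return rows;
    }

    public static void main(String[] args) {
        // Each row shows: symbol, result so far, stack contents
        for (String row : trace("a+b*c-d")) {
            System.out.println(row);
        }
    }
}
```

The last row printed for a+b*c-d shows the result abc*+d- with an empty stack, matching the table above.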
6.4. Conversion Algorithm With Parentheses
While the above algorithm is correct for expressions without parentheses, infix expressions use parentheses to override operator precedence. Therefore, we need to handle parentheses in the input string and modify the algorithm accordingly.
When we scan an opening parenthesis, (, we push it onto the stack. When we encounter a closing parenthesis, we pop operators off the stack into the postfix expression until we reach the matching opening parenthesis, which we then discard.
Let's rewrite the code to handle parentheses:
String infixToPostfix(String infix) {
    StringBuilder result = new StringBuilder();
    Stack<Character> stack = new Stack<>();
    for (int i = 0; i < infix.length(); i++) {
        char ch = infix.charAt(i);
        if (isOperand(ch)) {
            result.append(ch);
        } else if (ch == '(') {
            // An opening parenthesis marks the start of a group
            stack.push(ch);
        } else if (ch == ')') {
            // Emit every operator of the group, then discard the matching '('
            while (!stack.isEmpty() && stack.peek() != '(') {
                result.append(stack.pop());
            }
            stack.pop();
        } else {
            // '(' has a score of -1, so this loop never pops past it
            while (!stack.isEmpty() && (operatorPrecedenceCondition(infix, i, stack))) {
                result.append(stack.pop());
            }
            stack.push(ch);
        }
    }
    while (!stack.isEmpty()) {
        result.append(stack.pop());
    }
    return result.toString();
}
Let's take the first example from the problem statement, which uses parentheses, and do a dry run:
Input: (a+b)*(c-d)
Symbol Result Stack
( [(]
a a [(]
+ a [(, +]
b ab [(, +]
) ab+ []
* ab+ [*]
( ab+ [*, (]
c ab+c [*, (]
- ab+c [*, (, -]
d ab+cd [*, (, -]
) ab+cd- [*]
(end) ab+cd-* []
Notice how the placement of the parentheses changes both the order of evaluation and the corresponding postfix expression.
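Putting the pieces together, here is a self-contained sketch that can be run against the examples from the problem statement. The class name InfixToPostfixSketch and the two-argument shouldPop are hypothetical. Skipping whitespace is an extra assumption of this sketch so that inputs written with spaces also work; it is not part of the algorithm described above.

```java
import java.util.Stack;

class InfixToPostfixSketch {

    static boolean isOperand(char ch) {
        return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
    }

    static int getPrecedenceScore(char ch) {
        switch (ch) {
            case '^': return 3;
            case '*': case '/': return 2;
            case '+': case '-': return 1;
            default: return -1; // parentheses and anything else
        }
    }

    // Pop on lower precedence, or on equal precedence unless the operator is ^
    static boolean shouldPop(char current, char top) {
        return getPrecedenceScore(current) < getPrecedenceScore(top)
          || (getPrecedenceScore(current) == getPrecedenceScore(top) && current != '^');
    }

    static String infixToPostfix(String infix) {
        StringBuilder result = new StringBuilder();
        Stack<Character> stack = new Stack<>();
        for (char ch : infix.toCharArray()) {
            if (Character.isWhitespace(ch)) {
                continue; // assumption of this sketch: spaces are ignored
            }
            if (isOperand(ch)) {
                result.append(ch);
            } else if (ch == '(') {
                stack.push(ch);
            } else if (ch == ')') {
                while (!stack.isEmpty() && stack.peek() != '(') {
                    result.append(stack.pop());
                }
                stack.pop(); // discard the matching '(' (input is assumed valid)
            } else {
                while (!stack.isEmpty() && shouldPop(ch, stack.peek())) {
                    result.append(stack.pop());
                }
                stack.push(ch);
            }
        }
        while (!stack.isEmpty()) {
            result.append(stack.pop());
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(infixToPostfix("(a+b)*(c-d)"));            // prints ab+cd-*
        System.out.println(infixToPostfix("a+b*(c^d-e)^(f+g*h)-i")); // prints abcd^e-fgh*+^*+i-
        System.out.println(infixToPostfix("a^b^c"));                  // prints abc^^
        System.out.println(infixToPostfix("a-b-c"));                  // prints ab-c-
    }
}
```

The last two calls show the effect of associativity: the right-associative ^ operators stay on the stack until the end, while each left-associative - is emitted as soon as the next one arrives.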
7. Additional Thoughts
While infix expressions are more common in written text, postfix notation has several benefits. Postfix expressions don't need parentheses because the sequence of operands and operators already fixes the order of operations. Furthermore, the ambiguity that operator precedence and associativity can cause in an infix expression disappears in its postfix form.
This also makes postfix a common choice inside computer programs, especially in programming language implementations.
8. Conclusion
In this article, we discussed the infix, prefix, and postfix notations of mathematical expressions. We focused on the algorithm for converting an infix expression to a postfix expression and walked through a few examples.
As usual, the source code from this article can be found over on GitHub.