TokenType: the token type enum
public enum TokenType {
    INTEGER,
    PLUS,
    EOF
}
Token: the lexical unit
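import lombok.Data;

// Lombok's @Data generates the getters, setters, equals/hashCode and toString.
@Data
public class Token {
    private TokenType type;
    private Object value;

    public Token(TokenType type, Object value) {
        this.type = type;
        this.value = value;
    }

    // For tokens that carry no value, such as EOF.
    public Token(TokenType type) {
        this.type = type;
    }
}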
Lexer
public class Lexer {
    private String text;            // program text being scanned
    private Integer position;       // index of the current character
    private Character currentChar;  // current character, or null at end of input

    // Returns the token at the current position and moves past it.
    public Token getNextToken() {
        if (this.currentChar == null) {
            return new Token(TokenType.EOF);
        } else if (Character.isDigit(this.currentChar)) {
            Token token = new Token(TokenType.INTEGER, Character.getNumericValue(this.currentChar));
            this.advance();
            return token;
        } else if (this.currentChar == '+') {
            Token token = new Token(TokenType.PLUS, "+");
            this.advance();
            return token;
        } else {
            this.error("Unknown token");
        }
        // Never reached because error() always throws; needed so the method compiles.
        return new Token(TokenType.EOF);
    }

    // Moves one character forward; currentChar becomes null past the end of the text.
    public void advance() {
        this.position += 1;
        if (this.position <= this.text.length() - 1) {
            this.currentChar = text.charAt(this.position);
        } else {
            this.currentChar = null;
        }
    }

    public void error(String msg) {
        throw new RuntimeException(msg);
    }

    public Lexer(String text) {
        this.text = text;
        this.position = 0;
        // Guard against empty input, where charAt(0) would throw.
        this.currentChar = text.isEmpty() ? null : text.charAt(this.position);
    }
}
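- Variable text: stores the program text. Variables position and currentChar: the lexer scans the text sequentially, one character at a time, so position records the current index and currentChar holds the character at that index.
- Function advance: moves one character forward after each character is scanned; once the end of the text is passed, currentChar becomes null.
- Function error: throws an exception when an unknown character is scanned.
- Function getNextToken: builds the token for the current character according to the branch logic, then advances.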
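To make the position/currentChar bookkeeping concrete, here is a minimal sketch that traces the cursor over an input string. The class name CursorTrace and the method trace are hypothetical, introduced only for illustration; the loop reuses the same bounds check as advance, and a null currentChar marks the end of the input.

```java
import java.util.ArrayList;
import java.util.List;

public class CursorTrace {
    // Hypothetical helper: records "position:char" for every step of the scan.
    public static List<String> trace(String text) {
        List<String> steps = new ArrayList<>();
        int position = 0;
        // null means "no more characters", as in the Lexer above.
        Character currentChar = text.isEmpty() ? null : text.charAt(position);
        while (currentChar != null) {
            steps.add(position + ":" + currentChar);
            // Same step as advance(): move forward, then check the bounds.
            position += 1;
            currentChar = position <= text.length() - 1 ? text.charAt(position) : null;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(trace("1+1")); // prints [0:1, 1:+, 2:1]
    }
}
```

Each entry corresponds to one call of getNextToken followed by advance; the loop ends at the point where the lexer would return EOF.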
Test:
public static void main(String[] args) {
    Lexer lexer = new Lexer("1+1");
    Token token = lexer.getNextToken();
    // Print tokens until the lexer reports the end of the input.
    while (token.getType() != TokenType.EOF) {
        System.out.println(token);
        token = lexer.getNextToken();
    }
}
Test output:
Token(type=INTEGER, value=1)
Token(type=PLUS, value=+)
Token(type=INTEGER, value=1)
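The run above only exercises the digit and plus branches. As a rough sketch of the error branch, the following condensed scanner applies the same three-way decision as getNextToken and shows what happens when an unsupported character such as - appears. The class ErrorPathDemo, the method scan, the string form of the tokens and the wording of the exception message are all assumptions made for this example, not part of the lexer above.

```java
import java.util.ArrayList;
import java.util.List;

public class ErrorPathDemo {
    // Hypothetical condensed scanner with the same branches as getNextToken():
    // digit -> INTEGER, '+' -> PLUS, anything else -> RuntimeException.
    public static List<String> scan(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isDigit(c)) {
                tokens.add("INTEGER(" + Character.getNumericValue(c) + ")");
            } else if (c == '+') {
                tokens.add("PLUS(+)");
            } else {
                throw new RuntimeException("Unknown character '" + c + "' at position " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("1+1"));
        try {
            scan("1-1"); // '-' is not a supported token
        } catch (RuntimeException e) {
            System.out.println("error: " + e.getMessage());
        }
    }
}
```

Including the offending character and its position in the message is a design choice of this sketch; it makes a failed scan easier to locate than a fixed message would.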