In the previous section, we converted the string into an array of Tokens and stored it in ListLexer. So what does a Token look like?
/*
* This file is part of the ByteFerry/Rql-Parser package.
*
* (c) BardoQi <67158925@qq.com>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
declare(strict_types=1);
namespace ByteFerry\RqlParser\Lexer;
use ByteFerry\RqlParser\Exceptions\ParseException;
/**
* Class Token
*
* @package ByteFerry\RqlParser
*/
class Token
{
/**
* The symbol type, i.e. one of the constants defined in the Symbols class
* @var int
*/
protected $type = 0;
/**
* The content string of the symbol itself
* @var string
*/
protected $symbol = '';
/**
* The lexer type of the next Token
* @var int
*/
protected $next_type = -1;
/**
* The lexer type of the previous Token
* @var int
*/
protected $previous_type = -1;
/**
* Nesting level, used for syntax validation
* @var int
*/
protected $level = 0;
/**
* @param $type
* @param $symbol
* @param int $previous_type
* Static factory method for creating a Token, so calling code does not need `new`
* @return static
*/
public static function from($type,$symbol,$previous_type = -1)
{
/**
* ensure the syntax of the rql with simple ABNF definition
* We defined a set of ABNF rules and use them here to check that the syntax is valid
*/
if(-1 !== $previous_type){
if(!in_array($type,Symbols::$rules[$previous_type])){
throw new ParseException('Syntax error in Node of ' .$symbol);
}
}
$instance = new static(); // initialize the new instance below
$instance->type = $type;
$instance->symbol = $symbol;
$instance->previous_type = $previous_type;
return $instance;
}
/**
* @param $previousType
* RQL has one kind of data with no function form, the array, so it gets special handling here
* @return \ByteFerry\RqlParser\Lexer\Token
*/
public static function makeArrayToken($previousType){
$instance = new static();
$instance->type = Symbols::T_WORD;
$instance->symbol = 'arr';
$instance->previous_type = $previousType;
return $instance;
}
/**
* @param $level
*
* @return void
*/
public function setLevel($level){
$this->level = $level;
}
/**
* @param $type
*
* @return void
*/
public function setNextType($type)
{
$this->next_type = $type;
}
/**
* @return int
*/
public function getType()
{
return $this->type;
}
/**
* @param $type
*
* @return void
*/
public function setPrevType($type){
$this->previous_type=$type;
}
/**
* @return string
*/
public function getSymbol()
{
return $this->symbol;
}
/**
* @return bool
*/
public function isClose(){
return ($this->type === Symbols::T_CLOSE_PARENTHESIS);
}
/**
* @return int
*/
public function getPrevType(){
return $this->previous_type;
}
/**
* @return bool
*/
public function isPunctuation(){
return !( ($this->type === Symbols::T_WORD)
|| ($this->type === Symbols::T_STRING)
);
}
}
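As we can see, the Token class has many one-line methods. Object-oriented design is one reason for this. Many beginners do not make use of one-line methods, which leaves some of their functions excessively long.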
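To make the rule check in Token::from() more concrete, here is a minimal sketch of how such a "previous type => allowed next types" table could work. The Symbols class is not shown in this article, so the class name SketchSymbols, the constant values, and the contents of the RULES table are all assumptions made up for illustration; only the shape of the check (an in_array lookup keyed by the previous type, skipped when the previous type is -1) follows the code above.

```php
<?php
// Hypothetical stand-in for the Symbols class; values and rules are invented.
final class SketchSymbols
{
    const T_WORD = 1;
    const T_OPEN_PARENTHESIS = 2;
    const T_CLOSE_PARENTHESIS = 3;
    const T_COMMA = 4;

    // previous type => token types that may follow it (assumed, for illustration only)
    const RULES = [
        self::T_WORD => [self::T_OPEN_PARENTHESIS, self::T_CLOSE_PARENTHESIS, self::T_COMMA],
        self::T_OPEN_PARENTHESIS => [self::T_WORD, self::T_OPEN_PARENTHESIS, self::T_CLOSE_PARENTHESIS],
        self::T_CLOSE_PARENTHESIS => [self::T_CLOSE_PARENTHESIS, self::T_COMMA],
        self::T_COMMA => [self::T_WORD, self::T_OPEN_PARENTHESIS],
    ];

    // Same shape as the check in Token::from(): -1 means "no previous token".
    public static function isAllowed(int $type, int $previousType): bool
    {
        if (-1 === $previousType) {
            return true;
        }
        return in_array($type, self::RULES[$previousType] ?? [], true);
    }
}

// "(" directly after a word, as in "eq(": allowed by the assumed table
var_dump(SketchSymbols::isAllowed(SketchSymbols::T_OPEN_PARENTHESIS, SketchSymbols::T_WORD)); // bool(true)
// "," directly after ",": rejected by the assumed table
var_dump(SketchSymbols::isAllowed(SketchSymbols::T_COMMA, SketchSymbols::T_COMMA)); // bool(false)
```

In the real class a failed lookup throws a ParseException instead of returning false; the boolean return here just keeps the sketch easy to try out.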
Next, let's look at the ListLexer class.
/*
* This file is part of the ByteFerry/Rql-Parser package.
*
* (c) BardoQi <67158925@qq.com>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
declare(strict_types=1);
namespace ByteFerry\RqlParser\Lexer;
use ByteFerry\RqlParser\Abstracts\BaseObject;
use ByteFerry\RqlParser\Exceptions\ParseException;
/**
* Class ListLexer
*
* @package ByteFerry\RqlParser\ListLexer
*/
class ListLexer extends BaseObject
{
/**
* The token array is stored here
* @var array
*/
protected $items = [];
/**
* @var int
*/
protected $level = 0;
/**
* @var int
*/
protected $position = 0;
/**
* @param $token
*
* @return void
*/
public function addItem(Token $token){
if($token->getType() === Symbols::T_OPEN_PARENTHESIS){
/**
* for < ,( > that is the array operator,
* we'd insert a node 'arr'
* A node after a comma with no function name, i.e. an opening parenthesis directly after the comma, must be an array node
*/
if($token->getPrevType()===Symbols::T_COMMA){
$this->items[$this->position++] = Token::makeArrayToken(Symbols::T_COMMA); // so we use Token::makeArrayToken here
}
$token->setPrevType(Symbols::T_WORD); // the synthetic arr function also needs the previous node type set to T_WORD
$this->level++; // and go one level deeper
}
if($token->getType() === Symbols::T_CLOSE_PARENTHESIS){
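$this->level--; // on a closing parenthesis, go one level up. (If the parentheses match, level ends at 0; that is the core of this validation algorithm. Why that holds is left as an exercise.)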
}
$token->setLevel($this->level);
$this->items[$this->position++] = $token;
}
/**
* @param $type
* Sets the NextType of the previous node
* @return void
*/
public function setNextType($type){
if(isset($this->items[$this->position-2])){ // assuming this is called right after addItem(), position already points past the new token, so position - 2 is the token before it
$this->items[$this->position-2]->setNextType($type);
}
}
/**
* @return mixed
*/
public function current(){
return $this->items[$this->position];
}
/**
* @return bool|mixed
* Consumes Tokens; this is the key function
*/
public function consume(){
/**
* if got the end we must return;
*/
if($this->isEnd()){
return false; // we have reached the end
}
/**
* get the next token
*/
$token = $this->items[++$this->position]; // fetch the next token
/**
* we only consume the word or string.
* Only word or string tokens are consumed, which is why the token's isPunctuation() is called
*/
for(; $token->isPunctuation() && !$this->isEnd(); $token = $this->items[++$this->position]){
/**
* if we meet the close flag we must return.
*/
if($token->isClose()){
return $token;
}
}
return $token;
}
/**
* @return mixed
*/
public function rewind()
{
$this->position = 0;
return $this->items[$this->position];
}
/**
*
* @return int
*/
public function getNextIndex()
{
return ++$this->position;
}
/**
* @return mixed
*/
public function isClose(){
return $this->items[$this->position]->isClose();
}
/**
* @return bool
*/
public function isEnd(){
return $this->position+1 >= count($this->items);
}
/**
* @return int
*/
public function getLevel(){
return $this->level;
}
}
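The level bookkeeping in addItem() can be tried out in isolation. The following is a minimal sketch, not the library's code: it works on raw characters instead of Token objects, the function name isBalanced is hypothetical, and the early return when the level drops below zero is an extra check that does not appear in the ListLexer code shown above.

```php
<?php
// Simplified stand-in for the level counter in ListLexer::addItem().
// Assumption: parentheses are counted on raw characters, not on Token objects.
function isBalanced(string $rql): bool
{
    $level = 0;
    foreach (str_split($rql) as $char) {
        if ('(' === $char) {
            $level++; // one level deeper
        } elseif (')' === $char) {
            $level--; // one level up
            if ($level < 0) {
                return false; // a ")" appeared before its "(" (extra check added in this sketch)
            }
        }
    }
    // Every "(" adds one and every ")" removes one, so matched pairs cancel out to 0.
    return 0 === $level;
}

var_dump(isBalanced('and(eq(a,1),gt(b,2))')); // bool(true)
var_dump(isBalanced('and(eq(a,1),gt(b,2)'));  // bool(false)
```

A final level of 0 only says that the counts of opening and closing parentheses are equal, which is why the sketch also rejects input such as `)(` where the level dips below zero along the way.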
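The one-line methods in the Token class simplify the code in ListLexer. Likewise, ListLexer's own one-line methods simplify the code in consume(), so it ends up with far fewer lines.
That concludes the lexing part. Next comes the abstract syntax tree part, starting with NodeVisitor.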
<?php
/*
* This file is part of the ByteFerry/Rql-Parser package.
*
* (c) BardoQi <67158925@qq.com>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
declare(strict_types=1);
namespace ByteFerry\RqlParser\AstBuilder;
use ByteFerry\RqlParser\Exceptions\ParseException;
use ByteFerry\RqlParser\Lexer\Symbols;
/**
* Class NodeVisitor
*
* @package ByteFerry\RqlParser\Ast
*/
class NodeVisitor
{
/**
* @param $name
*
* @return mixed
*/
protected static function fromAlias($name)
{
return Symbols::$type_alias[$name]??$name;
}
/**
* @param $operator
*
* @return mixed
*/
protected static function getNodeType($operator)
{
return Symbols::$type_mappings[$operator]??null;
}
/**
* @param $node_type
*
* @return mixed|null
*/
protected static function getClass($node_type)
{
return Symbols::$class_mapping[$node_type]??Symbols::$class_mapping['N_CONSTANT'];
}
/**
* @param $symbol
*
* @return \ByteFerry\RqlParser\AstBuilder\NodeInterface;
*/
public static function visit($symbol){
$operator = self::fromAlias($symbol);
$node_type = self::getNodeType($operator);
$node_class = self::getClass($node_type);
if(null === $node_class){
throw new ParseException('Node class of ' .$node_type.' not found!');
}
return $node_class::of($operator,$symbol);
}
}
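The lookup chain in visit() (alias, then node type, then node class, then a static factory call) can be sketched on its own. Everything below is hypothetical: the class names SketchVisitor, SketchConstantNode and SketchPredicateNode, and the contents of the three tables, are invented for illustration, because the Symbols mappings and the real node classes are not shown in this article. Unlike the original, the sketch falls back to 'N_CONSTANT' directly when no node type is found, instead of passing null on to the class lookup.

```php
<?php
// Hypothetical node classes; the real ones are not shown in this article.
class SketchConstantNode
{
    public $operator;
    public $symbol;

    // Static factory in the style of the of() call used by visit()
    public static function of($operator, $symbol)
    {
        $node = new static();
        $node->operator = $operator;
        $node->symbol = $symbol;
        return $node;
    }
}

class SketchPredicateNode extends SketchConstantNode
{
}

class SketchVisitor
{
    // All three tables are invented for illustration.
    private static $typeAlias = ['equal' => 'eq'];
    private static $typeMappings = ['eq' => 'N_PREDICATE'];
    private static $classMapping = [
        'N_PREDICATE' => SketchPredicateNode::class,
        'N_CONSTANT' => SketchConstantNode::class,
    ];

    public static function visit($symbol)
    {
        $operator = self::$typeAlias[$symbol] ?? $symbol;                 // resolve alias
        $nodeType = self::$typeMappings[$operator] ?? 'N_CONSTANT';       // operator => node type
        $class = self::$classMapping[$nodeType] ?? self::$classMapping['N_CONSTANT'];
        return $class::of($operator, $symbol);                            // build the node
    }
}

echo get_class(SketchVisitor::visit('equal')), "\n"; // SketchPredicateNode
echo get_class(SketchVisitor::visit('42')), "\n";    // SketchConstantNode
```

Nothing here traverses a tree or dispatches on node types the way a visitor would; the method is a table-driven factory lookup.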
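The NodeVisitor code is quite simple. At this point we can see that it is not really the visitor pattern: it just takes a symbol and returns a node instance, nothing more. Next, we need to understand what is in the abstract syntax tree itself. (To be continued)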