复习向,Crafting Interpreters   中介绍的两种方法
 
表达式定义 简单起见,算数表达式只包含以下元素:
整数 
括号 
一元运算符:- 
二元运算符:+ - * / 
 
比如-1 + 2*(1 + 3 - 2)
Lexer 第一步将输入表达式分成小单位的token,因为表达式内容较少,只用产出如下组件:
表达式-112 + 2*(1 + 3 - 42) 产出的token序列为-, 112, +, 2, *, (, 1, +, 3, -, 42, ), None
此外增加peek功能,方便后续解析,Lexer如下
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 class  Lexer :     def  __init__ (self, expr: str  ):         self.expr_str = expr         self.peeked = None          self.tokens = self.tokens_iter()     def  tokens_iter (self ):         num = None          for  ch in  self.expr_str:             if  ch.isdigit():                 if  num:                     num = 10  * num + int (ch)                 else :                     num = int (ch)             else :                 if  num is  not  None :                                         yield  num                      num = None                  if  ch != ' ' :                                         yield  ch          if  num:             yield  num     def  next (self ):         """consume next token and return it         None means end of the token stream         """          if  self.peeked is  not  None :             ret = self.peeked             self.peeked = None          else :             try :                 ret = next (self.tokens)             except  StopIteration:                 return  None          return  ret     def  peek (self ):         """peek the next token without consuming it          """          if  self.peeked is  None :             self.peeked = self.next ()         return  self.peeked
 
EvalBase 写一个求值的基类,把Lexer组合进去,复用一点代码
class  Eval :     def  __init__ (self, expr: str  ):         self.lexer = Lexer(expr)     def  next (self ):         return  self.lexer.next ()     def  peek (self ):         return  self.lexer.peek()
 
递归下降 
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 class  Descent (Eval ):     def  primary (self ):         token = self.next ()         if  isinstance (token, int ):             return  token         elif  token == '(' :             ret = self.expr()             assert  ')'  == self.next (), 'unbalanced parentheses'              return  ret         else :             raise  ValueError(f'unrecognized token {token!r} ' )     def  unary (self ):         token = self.peek()         if  token == '-' :             self.next ()             return  - self.unary()         else :             return  self.primary()     def  factor (self ):         left = self.unary()         while  self.peek() in  ('*' , '/' ):             op = self.next ()             if  op == '*' :                 left *= self.unary()             elif  op == '/' :                 left //= self.unary()             else :                 raise  ValueError(f'unrecognized operator {op!r} ' )         return  left     def  expr (self ):         left = self.factor()         while  self.peek() in  ('+' , '-' ):             op = self.next ()             if  op == '+' :                 left += self.factor()             elif  op == '-' :                 left -= self.factor()             else :                 raise  ValueError(f'unrecognized operator {op!r} ' )         return  left     def  eval (self ):         return  self.expr()
 
Pratt方法 
代码方面,首先记录运算符的优先级,而递归下降中的类似
from  enum import  Enumclass  Prec (Enum ):       Null = auto()     Term = auto()          Factor = auto()        Unary = auto()         Primary = auto()       def  next (self ):         """return the next level precedence         Prec.Primary will return itself"""          return  Prec(min (self.value + 1 , self.Primary.value))     def  __le__ (self, other: 'Prec'  ):         return  self.value <= other.value
 
新建一个Pratt类,主要构建出根据token进行分发的函数表格。
加入Prec信息和prec_of,是因为合并了一些逻辑,比如binary的参数token可能横跨两个优先级,需要识别出以继续收集。事实上,如果没有合并,每个收集函数,对自己该继续收集什么层级的表达式,都是清楚的,比如之后会看到的unary直接调用parse_with(Prec::Unary),这也是-在表中只记录它在binary中的优先级的原因。 
默认返回的优先级是Null,所以最后的中止token为None,可以用来跳出所有中缀表达式的调用,结束解析 
stack用来求值 
 
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 @dataclass class  DispatchEntry :     prefix: MethodWrapperType = None      infix: MethodWrapperType = None      prec: Prec = Prec.Null EmptyEntry = DispatchEntry()class  Pratt (Eval ):     def  __init__ (self, expr: str  ):         super ().__init__(expr)         self.stack = []         self.index_map = {             'num' : DispatchEntry(self.num, None , Prec.Null),             '+' : DispatchEntry(None , self.binary, Prec.Term),             '-' : DispatchEntry(self.unary, self.binary, Prec.Term),             '*' : DispatchEntry(None , self.binary, Prec.Factor),             '/' : DispatchEntry(None , self.binary, Prec.Factor),             '(' : DispatchEntry(self.group, None , Prec.Null)         }     def  fetch_entry (self, token: str  ):         if  token and  isinstance (token, int ):             token = 'num'          return  self.index_map.get(token, EmptyEntry)     def  prefix_fn_call (self, token: str  ):         if  fn := self.fetch_entry(token).prefix:             return  fn(token)         raise  ValueError(f'unkown token to dispatch prefix: {token!r} ' )     def  infix_fn_call (self, token: str  ):         if  fn := self.fetch_entry(token).infix:             return  fn(token)         raise  ValueError(f'unkown token to dispatch infix: {token!r} ' )     def  prec_of (self, token: str  ):         return  self.fetch_entry(token).prec
 
然后加入被分配的收集函数
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39      def  parse_with (self, prec: Prec ):         token = self.next ()         self.prefix_fn_call(token)         while  prec <= self.prec_of(self.peek()):             token = self.next ()             self.infix_fn_call(token)     def  unary (self, token: str  ):         self.parse_with(Prec.Unary)         if  token == '-' :             self.stack[-1 ] = -self.stack[-1 ]     def  binary (self, token ):         self.parse_with(self.prec_of(token).next ())         right = self.stack.pop()         left = self.stack.pop()         if  token == '+' :             self.stack.append(left + right)         elif  token == '-' :             self.stack.append(left - right)         elif  token == '*' :             self.stack.append(left * right)         elif  token == '/' :             self.stack.append(left // right)     def  group (self, token ):         self.expr()         assert  ')'  == self.next (), 'unbalanced parentheses'      def  num (self, token: int  ):         self.stack.append(token)     def  expr (self ):         self.parse_with(Prec.Null.next ())     def  eval (self ):         self.expr()         return  self.stack[-1 ]
 
画了一个调用栈来表示解析过程,paser_with都用主色表示,它所引发的prefix调用都用浅色表示,所引发的infix调用使用深色。
结论 
递归下降中的规则从上往下,优先级上升,问题规模下降 
Pratt方法的核心是parse_with(prec :Prec) { 前缀表达式 (中缀表达式)*}将下一个表示式的值入栈,该值的求解过程中使用的操作数,都大于等于prec优先级 
Pratt中的prec_of出现是因为收集函数的逻辑合并 
Pratt方法中的最小逻辑用来跳出中缀匹配