Chapter 3: Lexical Analysis (2)
2017-1-30, Compiler Principles

Consider the grammar

  stmt → if expr then stmt
       | if expr then stmt else stmt
       | ε
  expr → term relop term
       | term
  term → id
       | number

Lexical rules

  digit  → [0-9]
  digits → digit+
  number → digits (. digits)? (E (+|-)? digits)?
  letter → [A-Za-z]
  id     → letter (letter | digit)*
  if     → if
  then   → then
  else   → else
  relop  → < | <= | = | <> | > | >=

Filtering whitespace

- Scan away blank, newline, and tab.
- blank, newline, and tab are abstract symbols used to express the ASCII characters of the same names.

  ws → (blank | tab | newline)+

Lexemes, tokens, and attribute values

  Lexemes    Token name    Attribute value
  Any ws     -             -
  if         if            -
  then       then          -
  else       else          -
  Any id     id            pointer to table entry
  Any num    num           pointer to table entry
  <          relop         LT
  <=         relop         LE
  =          relop         EQ
  <>         relop         NE
  >          relop         GT
  >=         relop         GE

Note: each token has a unique token identifier that defines its category of lexemes.

Transition Diagrams

- A transition diagram is one way of representing a regular expression.
- Each transition diagram has:
  - States: represented by circles
  - Actions: represented by arrows between states
  - Start state: beginning of a pattern (marked by an arrowhead)
  - Final state(s): end of a pattern (concentric circles)
- The transition diagram is assumed to be deterministic: there is never a need to choose between two different actions.

Fig 3-11 Transition diagram for >= and >

  from    on       to
  start            0
  0       >        6
  6       =        7
  6       other    8*

- The * means the forward pointer must be retracted by one character.
- In state 8 we have accepted ">" and have read one other character, which must be unread.

Fig 3-12 Transition diagram for relational operators

  from    on       to    action
  start            0
  0       <        1
  1       =        2     return(relop, LE)
  1       >        3     return(relop, NE)
  1       other    4*    return(relop, LT)
  0       =        5     return(relop, EQ)
  0       >        6
  6       =        7     return(relop, GE)
  6       other    8*    return(relop, GT)

Fig 3-13 Transition diagram for identifiers and reserved words

  from    on                 to     action
  start                      9
  9       letter             10
  10      letter or digit    10
  10      other              11*    return(gettoken(), install_id())

- return(gettoken(), install_id()) returns the token and its attribute value, lexical_value.
- install_id() first obtains the lexeme, then operates on the symbol table (lookup and insertion).
- gettoken() looks the lexeme up in the symbol table; if it is a keyword, it returns the corresponding token, otherwise it returns the token type id.

Fig 3-14 Transition diagram for unsigned numbers
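
Looking back at Figs 3-12 and 3-13, one way to see how a transition diagram turns into code is to hand-code each state as a branch. The following is only a minimal sketch under several assumptions: the names `get_relop`, `get_id`, and `tokenize` are hypothetical, the symbol table is modeled as a plain dict whose values stand in for the "pointer to table entry", reserved words are kept in a separate set instead of being pre-installed in the symbol table, and the retraction marked by * is modeled by peeking at the next character without consuming it. Numbers (Fig 3-14) are left out.

```python
import string

LETTERS = set(string.ascii_letters)   # letter -> [A-Za-z]
DIGITS = set(string.digits)           # digit  -> [0-9]
WHITESPACE = {" ", "\t", "\n"}        # blank, tab, newline (the ws pattern)
KEYWORDS = {"if", "then", "else"}     # assumption: kept apart from the symbol table


def get_relop(src, pos):
    """Walk Fig 3-12 from state 0. Return (attribute, next_pos) or None."""
    c = src[pos] if pos < len(src) else ""
    nxt = src[pos + 1] if pos + 1 < len(src) else ""
    if c == "<":                      # state 1
        if nxt == "=":
            return "LE", pos + 2      # state 2
        if nxt == ">":
            return "NE", pos + 2      # state 3
        return "LT", pos + 1          # state 4*: the 'other' character stays unread
    if c == "=":
        return "EQ", pos + 1          # state 5
    if c == ">":                      # state 6
        if nxt == "=":
            return "GE", pos + 2      # state 7
        return "GT", pos + 1          # state 8*: the 'other' character stays unread
    return None


def get_id(src, pos, symtab):
    """Walk Fig 3-13 (states 9 to 11). Return (token, attribute, next_pos) or None."""
    if pos >= len(src) or src[pos] not in LETTERS:
        return None
    end = pos + 1                     # state 10: loop on letter or digit
    while end < len(src) and (src[end] in LETTERS or src[end] in DIGITS):
        end += 1
    lexeme = src[pos:end]             # state 11*: src[end] stays unread
    if lexeme in KEYWORDS:            # role of gettoken(): keyword or id
        return lexeme, None, end
    # Role of install_id(): look the lexeme up, insert it if it is new.
    index = symtab.setdefault(lexeme, len(symtab))
    return "id", index, end


def tokenize(src):
    """Return (tokens, symtab) for a string of keywords, identifiers and relops."""
    symtab, tokens, pos = {}, [], 0
    while pos < len(src):
        if src[pos] in WHITESPACE:    # filter whitespace, no token is produced
            pos += 1
            continue
        hit = get_relop(src, pos)
        if hit:
            attr, pos = hit
            tokens.append(("relop", attr))
            continue
        hit = get_id(src, pos, symtab)
        if hit:
            token, attr, pos = hit
            tokens.append((token, attr))
            continue
        raise ValueError(f"no diagram accepts {src[pos]!r} at offset {pos}")
    return tokens, symtab
```

For example, `tokenize("x>y")` yields `[("id", 0), ("relop", "GT"), ("id", 1)]`: after reading `>` the sketch peeks at `y`, sees that it is neither `=` nor part of the operator, and leaves it for the identifier diagram, which is the effect of the * on state 8.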