Calculator Description

This version of the calculator is substantially more complex than previous versions. Major changes include control constructs such as if-else and while. In addition a syntax tree is constructed during parsing. After parsing we walk the syntax tree to produce output. Three versions of the tree walk routine are supplied:

To make things more concrete, here is a sample program,

x = 0;
while (x < 3) {
    print x;
    x = x + 1;
}

with output for an interpretive version,

0
1
2

a compiler version,

    push    0
    pop     x
L000:
    push    x
    push    3
    compLT
    jz      L001
    push    x
    print
    push    x
    push    1
    add
    pop     x
    jmp     L000
L001:

and a version that generates a syntax tree.

Graph 0:

    [=]
     |
   |----|
   |    |
 id(X) c(0)

Graph 1:

               while
                 |
     |----------------|
     |                |
    [<]              [;]
     |                |
   |----|     |----------|
   |    |     |          |
 id(X) c(3) print       [=]
              |          |
              |     |-------|
              |     |       |
            id(X) id(X)    [+]
                            |
                          |----|
                          |    |
                        id(X) c(1)

The include file contains declarations for the syntax tree and symbol table. Symbol table (sym) allows for single-character variable names. A node in the syntax tree may hold a constant (conNodeType), an identifier (idNodeType), or an internal node with an operator (oprNodeType). A union encapsulates all three variants and nodeType.type is used to determine which structure we have.

The lex input file contains patterns for VARIABLE and INTEGER tokens. In addition, tokens are defined for 2-character operators such as EQ and NE. Single-character operators are simply returned as themselves.

The yacc input file defines YYSTYPE, the type of yylval, as

%union {
    int iValue;            /* integer value */
    char sIndex;           /* symbol table index */
    nodeType *nPtr;        /* node pointer */
};

This causes the following to be generated in y.tab.h:

typedef union {
    int iValue;            /* integer value */
    char sIndex;           /* symbol table index */
    nodeType *nPtr;        /* node pointer */
} YYSTYPE;
extern YYSTYPE yylval;

Constants, variables, and nodes can be represented by yylval in the parser’s value stack. A more accurate representation of decimal integers is given below. This is similar to C/C++ where integers that begin with 0 are classified as octal.

0           {
                 yylval.iValue = atoi(yytext);
                 return INTEGER;
            }


[1-9][0-9]* {
                 yylval.iValue = atoi(yytext);
                 return INTEGER;
            }

Notice the type definitions

%token <iValue> INTEGER
%type <nPtr> expr

This binds expr to nPtr, and INTEGER to iValue in the YYSTYPE union. This is required so that yacc can generate the correct code. For example, the rule

expr: INTEGER { $$ = con($1); }

should generate the following code. Note that yyvsp[0] addresses the top of the value stack, or the value associated with INTEGER.

yylval.nPtr = con(yyvsp[0].iValue);

The unary minus operator is given higher priority than binary operators as follows:

%left GE LE EQ NE '>' '<'
%left '+' '-'
%left '*' '/'
%nonassoc UMINUS

The %nonassoc indicates no associativity is implied. It is frequently used in conjunction with %prec to specify precedence of a rule. Thus, we have

expr: '-' expr %prec UMINUS { $$ = node(UMINUS, 1, $2); }

indicating that the precedence of the rule is the same as the precedence of token UMINUS. And UMINUS (as defined above) has higher precedence than the other operators. A similar technique is used to remove ambiguity associated with the if-else statement (see If-Else Ambiguity).

The syntax tree is constructed bottom-up allocating the leaf nodes when variables and integers are reduced. When operators are encountered a node is allocated and pointers to previously allocated nodes are entered as operands.

After the tree is built function ex is called to do a depth-first walk of the syntax tree. This results in operators being applied in the order that they were encountered during parsing. Three versions of ex, an interpretive version a compiler version, and a version that generates a syntax tree, are included.