Assignment Two, CS351 S2001

Designing a Parser's API

Objective

In this assignment you will design an API for a simple parser, motivate your design, and document the design. The documentation will include some sample programs that use the API.

Specifications

The basic idea is to write a system that takes a specification for a small ``expression language'' and gives the program the ability to read from a stream and parse it up into the elements of this language, represented in a form suitable for further processing by the program that uses the parser. The range of languages allowed is quite limited: they are simple expression languages, with no precedence ie only left-to-write association of operators.

Grammar

The expression language consists of two sorts of tokens: operators, which are one character long, and variables, which are sequences of characters. The characters allowed in variables (called constituent characters) are disjoint from those allowed in operators. Parenthesis '(' and ')' can be used for grouping. Each operator is either unary (prefix) or binary (infix), and these two classes of operator are disjoint.

Order

The expression "(var1 opA var2 opB var3)" is equivalent to "(var1 opA (var2 opB var3))". Whitespace can occur anywhere except inside a variable. Unary operators bind ``tighter'' than binary ones, so "(op1 var1 op2 var2)" where op1 is unary and op2 is binary is identical to "((op1 var1) op2 var2)".

Which characters are constituents, and the identity and nature of the operators, can change from run to run - so your API must have a way of getting this information from the program using it.

Expressions can be thought of as representing trees whose leaves are "variables" and whose internal nodes are tagged with operators. There would be two kinds of internal nodes: ones with two children which are tagged with binary operators, and ones with one child which are tagged with unary operators.

You will need to be represent the expressions that are read in a fashion that allows the program using the library to examine and manipulate them sensibly. Hint: define a separate API for objects representing expressions and portions thereof.

Here is a formal grammar, in BNF.

EXPR := VAR | EXPR OP2 EXPR | OP1 EXPR | '(' EXPR ')'
OP1 := ... list of possible single chars, eg: '-' | '~'
OP2 := ... list of possible single chars, eg: '+' | '*' | '/'
VAR := CONSTITUENT+
CONSTITUENT := ... list of constituent chars, typically: '1' | '2' | 'a' 
Note that the characters '(', ')', constituent characters, and OP1 characters, and OP2 characters are disjoint. Ie if '-' in a unary operators then '-' is not a binary operator, or a constituent character.

What you should turn in

format

All the below should be ASCII files, handed in by running the cs351-handin program on a directory containing only the files you wish to hand in. The documentation can be html format if you'd like. Please do not use any binary format files. The file parser.h should be legal C++ code. None of the member functions should have bodies, ie none of them should be inline. The bodies will be defined in the companion parser.cc file you will write later.

Design Constraints and Motivation

This explains how you thought about the problem, and gives the criteria you thought the library needs to satisfy. It is motivation for the exact decisions explained below. Eg if the program using the library first has to do some setup before it can do real work, the reason you thought this was reasonable and the reasoning behind why you put the division between the various stages where you did should be explained here.

file: parser.h

This formally sets out the classes and member functions in your API.

Documentation with Examples

This documents what all the (public) stuff in parser.h is for. After reading this, a competent programmer should be able to write a program that uses a library that meets your API.


Due date: 2pm Mon Feb 12.