Remove the LAYOUT matching from tokenRegex: layout is not part of the
token and should be removed beforehand. Add an auxiliary function that
tests whether a token that matches ID really is a non-keyword.
Implement the remaining expressions.
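
A minimal sketch of what such an auxiliary check could look like. It uses
java.util.regex rather than the automaton library of the actual code, the
class and method names are hypothetical, and the keyword set is an
assumption: only "begin" is named in this log, "end" is added purely for
illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

final class IdCheck {
    // ID ::= [a-z] [a-z0-9]*  (the ID definition quoted elsewhere in this log)
    private static final Pattern ID = Pattern.compile("[a-z][a-z0-9]*");

    // Assumption: "begin" is the only keyword named in this log;
    // "end" is a hypothetical second keyword for illustration.
    private static final Set<String> KEYWORDS =
            new HashSet<>(Arrays.asList("begin", "end"));

    /** Returns true if the token has the shape of an ID and is not a keyword. */
    static boolean isNonKeywordId(String token) {
        return ID.matcher(token).matches() && !KEYWORDS.contains(token);
    }
}
```

With this check, "beginning" is still accepted as an ID because only the
exact keyword is rejected.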
|
|
Remove keywordRegex; it is no longer used. To do: make sure that ID does
not match keywords, and implement parseExp.
|
|
|
|
Add a next() function that returns the next scanned token and match() to
consume the token from input. next() takes an automaton such that you
can match "begin" in the beginning, but any other string (including
"begin" prefixes) for ID.
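
The next()/match() split can be sketched as follows. This is an
illustration under assumptions, not the committed code: it takes a
java.util.regex.Pattern where the real next() takes an automaton, the class
name TokenScanner is hypothetical, and skipping whitespace inside next() is
an assumed stand-in for layout handling.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TokenScanner {
    private final String input;
    private int pos = 0;       // start of the unconsumed input
    private String lookahead;  // token found by the last next() call, or null

    TokenScanner(String input) {
        this.input = input;
    }

    /**
     * Returns the token at the current position that the given pattern
     * accepts, or null if there is none. Does not consume the token.
     */
    String next(Pattern expected) {
        // Assumption: whitespace is layout and is skipped before scanning.
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) {
            pos++;
        }
        Matcher m = expected.matcher(input).region(pos, input.length());
        lookahead = m.lookingAt() ? m.group() : null;
        return lookahead;
    }

    /** Consumes the token found by the preceding next() call. */
    void match() {
        if (lookahead == null) {
            throw new IllegalStateException("no token to consume");
        }
        pos += lookahead.length();
        lookahead = null;
    }
}
```

Because the caller chooses the pattern, the same text can be scanned as the
keyword "begin" in one parser state and as an ID in another.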
|
|
NAT and ID regexes were taken from RegexTest.java. Tests are added
before the implementation (test-driven development).
|
|
In the original change, the EXP in PROD-EXP should have been PROD-EXP;
similarly, the EXP in SINGLE-EXP should have been SINGLE-EXP. Otherwise
operator precedence is not applied correctly (consider 1+2*3: it should
be interpreted as 1+(2*3), not (1+2)*3).
Anyway, this change removes left-recursion as described in 2.12.1.
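
To make the precedence point concrete, here is a small evaluator for a
grammar without left-recursion. It is a sketch, not the grammar from the
commit: the EXP-TAIL and PROD-TAIL non-terminals are hypothetical names for
the helper rules, SINGLE-EXP is reduced to NAT only, and input is assumed
to contain no layout.

```java
final class PrecedenceEval {
    private final String s;
    private int pos = 0;

    private PrecedenceEval(String s) {
        this.s = s;
    }

    /** Evaluates an expression built from naturals, '+' and '*'. */
    static int eval(String input) {
        PrecedenceEval p = new PrecedenceEval(input);
        int value = p.exp();
        if (p.pos != input.length()) {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return value;
    }

    // EXP ::= PROD-EXP EXP-TAIL
    private int exp() {
        return expTail(prodExp());
    }

    // EXP-TAIL ::= "+" PROD-EXP EXP-TAIL | (empty)
    private int expTail(int left) {
        if (peek() != '+') {
            return left;
        }
        pos++;
        return expTail(left + prodExp());
    }

    // PROD-EXP ::= SINGLE-EXP PROD-TAIL
    private int prodExp() {
        return prodTail(singleExp());
    }

    // PROD-TAIL ::= "*" SINGLE-EXP PROD-TAIL | (empty)
    private int prodTail(int left) {
        if (peek() != '*') {
            return left;
        }
        pos++;
        return prodTail(left * singleExp());
    }

    // SINGLE-EXP ::= NAT  (simplified for this sketch)
    private int singleExp() {
        int start = pos;
        while (pos < s.length() && Character.isDigit(s.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("NAT expected at " + pos);
        }
        return Integer.parseInt(s.substring(start, pos));
    }

    private char peek() {
        return pos < s.length() ? s.charAt(pos) : '\0';
    }
}
```

Since the operands of "+" are PROD-EXP and not EXP, PrecedenceEval.eval("1+2*3")
groups the product first and yields 7 rather than 9.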
|
|
Needs more verification, but the idea is that the expressions with the
lowest precedence (+) should come first in the parse tree
(implying that non-terminals with the highest precedence should be just
above a terminal).
Left-associativity is achieved by the trick in section 2.3.1 of
Intro2CD, which results in the left operand being recursed instead of the
right one.
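
One common way to get this tree shape is sketched below; whether it is
the same trick as in section 2.3.1 of Intro2CD is not verified here. The
class name is hypothetical and the parse tree is rendered as a fully
parenthesised string so the nesting is visible.

```java
final class LeftAssocTree {
    private final String s;
    private int pos = 0;

    private LeftAssocTree(String s) {
        this.s = s;
    }

    /** Returns the parse tree of input as a fully parenthesised string. */
    static String parse(String input) {
        LeftAssocTree p = new LeftAssocTree(input);
        String tree = p.exp();
        if (p.pos != input.length()) {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return tree;
    }

    // Lowest precedence (+) is parsed first, so it ends up nearest the root.
    private String exp() {
        String left = prodExp();
        while (peek() == '+') {
            pos++;
            // The tree built so far becomes the LEFT operand of the new node.
            left = "(" + left + "+" + prodExp() + ")";
        }
        return left;
    }

    // Higher precedence (*) sits just above the terminals.
    private String prodExp() {
        String left = nat();
        while (peek() == '*') {
            pos++;
            left = "(" + left + "*" + nat() + ")";
        }
        return left;
    }

    private String nat() {
        int start = pos;
        while (pos < s.length() && Character.isDigit(s.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("NAT expected at " + pos);
        }
        return s.substring(start, pos);
    }

    private char peek() {
        return pos < s.length() ? s.charAt(pos) : '\0';
    }
}
```

LeftAssocTree.parse("1+2+3") gives "((1+2)+3)", which shows the left operand
growing, and LeftAssocTree.parse("1+2*3") gives "(1+(2*3))".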
|
|
|
|
|
|
I guess that this satisfies "A description of the grammar resulting from
the transformation to remove EBNF constructs". It does not mention
anything about the left-associativity and priority rules; that is
probably the next step.
|
|
This grammar looks like SDF[0]. Explanation of some constructs:
- "start-symbol" refers to the start symbol (a non-terminal).
- Lines below "context-free syntax" refer to the non-terminals.
- Lines below "lexical syntax" refer to the terminals.
- "context-free priorities" provides disambiguation constructs.
Note that the "EXP ::= " part really belongs to each individual operand,
so it means something like "A > B > C" (where A, B, C are the unary
minus, multiplication and plus expressions respectively).
I have no idea why Mark van den Brand chose to use multiple
"context-free syntax" and "lexical syntax" groups (if that is
appropriate terminology); as far as I can see, all such blocks can be
merged into one group. Maybe he wanted to emphasize the difference
between the left-hand side names?
[0]: http://releases.strategoxt.org/strategoxt-manual/unstable/manual/chunk-chapter/tutorial-sdf.html
|
|
|
|
|
|
|
|
Also remove tests with the letters "s" and "Id" in them; these are not
interesting to test. Add some more tests to catch capital "E" and reject
a lone "e7".
|
|
|
|
|
|
Signed-off-by: andrea <andrea.evangelista.1989@gmail.com>
|
|
|
|
|
|
|
|
Do not look for longest match, but for the exact match. Documented at
http://www.brics.dk/automaton/doc/dk/brics/automaton/RunAutomaton.html#run%28java.lang.String%29
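
The difference between an exact match and a prefix match can be
illustrated with java.util.regex, used here as a stand-in for the automaton
library: matches() tests the whole string, lookingAt() accepts a matching
prefix. The NAT pattern and the class name are assumptions made for this
sketch; the real NAT regex lives in RegexTest.java.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ExactMatch {
    // Hypothetical NAT definition, for illustration only.
    private static final Pattern NAT = Pattern.compile("[0-9]+");

    /** True only if the whole string is a NAT (exact match). */
    static boolean isExactly(String s) {
        return NAT.matcher(s).matches();
    }

    /** Length of the NAT prefix of s, or -1 if s does not start with a NAT. */
    static int prefixLength(String s) {
        Matcher m = NAT.matcher(s);
        return m.lookingAt() ? m.end() : -1;
    }
}
```

For "123abc" the exact test fails while the prefix test reports a match of
length 3, which is why a test that should reject such input needs the exact
variant.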
|
|
Also ensure that "classes" is the default target
|
|
...such that the tests can be reviewed based upon the generated output
(as required for the assignment).
|
|
ID ::= [a-z] [a-z0-9]*
|
|
I don't know ant well enough to write a build.xml, so let's just use a
Makefile for now.
|
|
Unzipped from OASE (2IMP20), excluding the bin/ folder.
size: 158596 bytes
sha256: 179253d2bfec243bc5294372ce0ad1c5d82a30ef32ad108f6c9fd0aee73fe4df assignment1.zip
|