ParsingExpressionGrammar

Last edit June 27, 2010

Parsing expression grammars (PEGs) are an analytic grammar formalism.

They feature the following operators:

They are:

However, they come with their share of problems:

PEGs can't directly handle left-recursion (not much of a problem in practice)
PEGs suffer from so called "prefix capturing": "a*a" will never match any input
- (this is what you get for defining away ambiguity)
PEGs are bad for specifying languages: just guess what language "S <- aSa / aa" parses
- for the lazy: A string of "a"s with a length which is a power of two!
PEGs can't generate strings, only parse them (due to lookahead operators)
- it seems to me that one could 'enforce' discovery of whatever one looked ahead to find
- Generating "(&A)B" would involve finding the intersection of A and B. This is an undecidable problem: If it would be possible to find the intersection of two PEGs we would not have issues detecting prefix capture. Prefix-capture is present in "A/B" iff the language generated by "(&A)B" is not empty. See [1] for more information.