ECOOP 2026 Pearl
A Simple Recipe
for Decent Recursive
Descent Parsers
Luyu Cheng and Lionel Parreaux
Keep handwritten parsing direct.
Make accepted syntax visible again.
1Token-local rules
2Recursive descent control flow
3Generated diagrams and tables

Two Routes Into Parsing

Theory-first route
1
Context-free grammars
2
Automata and algorithms
3
LL and LR families
Tool-first route
1
Parser generators
2
Parser combinators
3
Working parsers quickly
The missing middle: a quick path to an inspectable parser for a complete language.

The Missing Middle

Recursive descent remains worth learning. It gives a concrete model: consume, choose, build syntax.
Most tutorials stop at expressions. Full languages need statements, declarations, blocks, patterns, and types.
Pratt’s idea is broader than precedence. Tokens and rules can carry parsing behavior.
The recipe is a disciplined realization. Keep recursive descent as execution, with modular, extensible, visible rules.
This paper presents
A tutorial recipe for writing flexible recursive-descent parsers.
In a modern statically typed language, using Pratt parsing as the running example.
LanguageMLscript
Targeta simple language based on Caml Light

Revisiting Pratt Parsing

Binding Power
Placeholder for the revised explanation of left and right binding power.
tokens A+B*C
binding powers 3456
Precedence: * binds tighter than +.
tokens A+B+C
binding powers 3434
Associativity: + groups to the left.

Binding Power Demo

The Pratt-Shaped Loop

fun expr(bp: Int): Tree =
let acc = atom()
exprCont(acc, bp)
fun exprCont(acc, bp) = if peek is
Some(op) and op.lbp > bp then
let rhs = expr(op.rbp)
exprCont(apply(op, acc, rhs), bp)
_ then acc
Loop Steps
1
Parse one atomic phrase.
2
Test whether the next token can continue it.
3
Recurse under that threshold.

Trace One Pratt Loop

Loop body
fun expr(bp: Int): Tree =
let acc = atom()
exprCont(acc, bp)
fun exprCont(acc, bp) = if peek is
Some(op) and op.lbp > bp then
let op = consume()
let rhs = expr(op.rbp)
exprCont(apply(op, acc, rhs), bp)
_ then acc
Input Stream
A+B*Cend
ready
State
acc
A
next
+
bp
0
Call Stack

Expression Parsing Is Not Enough

Original TDOP
print value + increment
The association problem is between neighboring argument positions.
log value base radix
A delimiter documents the role of the following argument.
clear item, item, item
Variable-length argument lists still fit the token-local view.
Whole-Language Syntax
if condition then consequence else alternative
A rule spans several keywords and named subterms.
match subject with pattern -> branch | pattern -> branch
Branches repeat and each branch carries its own syntax.
let rec name argument = body in rest
Declarations and expressions compose in the same parser.
Keep the control flow general and define what counts as a continuation.

From Operators to Rules

Classic Pratt Table
token lbp rbp continuation
+ 3 4 left + expr(4)
* 5 6 left * expr(6)
apply 7 8 left expr(8)
The table maps one token to one continuation action.
Rule Representation
Rule[A]
an array of choices
End(value)
finish the rule and return the value
Keyword(kw, rest)
consume a keyword and continue with rest
Ref(kind, process, bp, rest)
parse a referenced syntax kind, process it, then continue
let ifThenElse = keyword(_if) of
term(name: "condition") of
keyword(_then) of
term(name: "consequent") of
end
keyword(_else) of
term(name: "alternative") of end
Rules describe parseable fragments, not only symbolic operators.

One Rule, Two Views

Source rule
let ifThen =
keyword(_if) of
term(name: "condition") of
keyword(_then) of
term(name: "consequent") of
end
let ifThenElse =
keyword(_if) of
term(name: "condition") of
keyword(_then) of
term(name: "consequent") of
keyword(_else) of
term(name: "alternative") of
end
let printFormat =
keyword("print") of
term(name: "x") of
keyword("format") of
stringLiteral(name: "f") of
end
term of
// Tree.Application: <expr> <expr>
term of
process: (argument, _) => callee =>
Tree.Application(
callee, argument :: Nil)
name: "application argument"
bp: Keyword.appPrec
The parser interprets the rule; the talk, docs, and users see the same rule as a diagram.
Rendered shape

Parsing Terms and Types

Generic machinery executes reified rules with parseRule and dispatches syntax kinds through parseKind.
Terms and types both follow “parse an atom, then greedily extend it.”
Keywords, applications, and ascriptions are delegated to rules.
typeExpr and typeExprCont follow the same pattern, but run over typeRule.
With Infix, the remaining loop becomes a reified continuation.

Two Functions Interpret Rules

Functions in the parser
fun parseKind(kind: Str, bp: Int): Tree
fun parseRule(bp: Int, rule: Rule[Tree])
val afterRef:
Option[[Str, Rule[Option[Tree -> A]]]]
1
parseKind retrieves a named rule or handles built-in syntax kinds.
2
parseRule consumes a Rule and returns a tree.
3
afterRef drives continuations that reference the same syntax kind.
parseRule, with details omitted
fun parseRule(bp: Int, rule: Rule[Tree]) =
if peek is
Some(Token.Identifier(name)) and
Keyword.all |> MutMap.get(name) is Some(keyword) and
rule.keywordChoices |> Map.get(name) is
Some(rest) then
consume
parseRule(bp, rest)
Some(other) and
rule.refChoice is
Some(Ref(kind, f, ebp', rest)) and
let ebp = ebp' |> Option.getOrElse(Keyword.max)
ebp > bp and parseKind(kind, bp) as acc is
¬(Tree.Error | Tree.Empty) and
parseRule(bp, rest) as tree is
Tree.Error then tree
else f(acc, tree)
Some(other) and rule.endChoice is
Some(value) then value
Some(other) then
consume
Tree.Error of None, "unexpected token " + other

Conclusion

Pratt parsing can be a recipe for an inspectable parser, not just an operator table.
Keep Pratt's control flow, but move syntax into first-class Rule and Choice data.
parseKind and parseRule interpret those rules under a binding-power threshold.
The same rules parse, support user extension, and render as syntax diagrams.
Caml Light mostly fits the recipe; ambiguous top-level let still exposes where manual parsing remains.
For MLscript, the goal is to reduce parser-combinator opacity and ad-hoc recursive-descent special cases.