The Parser

How tokens become an AST with Pratt parsing.

Source: src/parser/expr/, stmt/, control.rs, attributes.rs, ast/, mod.rs

The parser takes the token stream from the lexer and builds an Abstract Syntax Tree (AST) — a tree structure that represents the program’s meaning, not just its text.

What is an AST?

An AST strips away syntactic noise (parentheses, semicolons, braces) and captures the structure of the program:

echo 1 + 2 * 3;

The tokens are flat: Echo, Int(1), Plus, Int(2), Star, Int(3), Semicolon. But the AST is a tree:

Echo
 └── BinaryOp(Add)
      ├── IntLiteral(1)
      └── BinaryOp(Mul)
           ├── IntLiteral(2)
           └── IntLiteral(3)

The tree encodes that 2 * 3 happens before + 1: operator precedence is baked into the structure. The parser is responsible for getting this right.
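As a rough illustration, the tree above can be built and evaluated by hand. This is a minimal sketch with hypothetical, cut-down node types (the real Expr has many more variants and carries spans); `example_tree` and `eval` are names invented for this example.

```rust
// Hypothetical, simplified node types; not elephc's actual definitions.
#[derive(Debug, Clone, PartialEq)]
enum BinOp {
    Add,
    Mul,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    IntLiteral(i64),
    BinaryOp { left: Box<Expr>, op: BinOp, right: Box<Expr> },
}

/// Builds the tree for `1 + 2 * 3` by hand, mirroring the diagram above.
fn example_tree() -> Expr {
    Expr::BinaryOp {
        left: Box::new(Expr::IntLiteral(1)),
        op: BinOp::Add,
        right: Box::new(Expr::BinaryOp {
            left: Box::new(Expr::IntLiteral(2)),
            op: BinOp::Mul,
            right: Box::new(Expr::IntLiteral(3)),
        }),
    }
}

/// Evaluates children first, so the nested Mul is computed before the Add.
fn eval(expr: &Expr) -> i64 {
    match expr {
        Expr::IntLiteral(n) => *n,
        Expr::BinaryOp { left, op, right } => {
            let (l, r) = (eval(left), eval(right));
            match op {
                BinOp::Add => l + r,
                BinOp::Mul => l * r,
            }
        }
    }
}
```

Because the multiplication sits deeper in the tree, any bottom-up walk handles it first without consulting a precedence table again.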

The AST types

File: src/parser/ast/

The AST has two main node types:

Expressions (Expr)

Things that have a value:

| Variant | Example | Notes |
|---|---|---|
| IntLiteral(i64) | 42 | |
| FloatLiteral(f64) | 3.14 | |
| StringLiteral(String) | "hello" | Escapes already resolved by lexer |
| BoolLiteral(bool) | true, false | |
| Null | null | |
| Variable(String) | $x | Name without $ |
| BinaryOp { left, op, right } | $a + $b | See operator table below |
| InstanceOf { value, target } | $obj instanceof User, $obj instanceof $className | Class/interface runtime type check. The target is either a named class/interface target or a dynamic target expression. |
| Negate(Expr) | -$x | Unary minus |
| Not(Expr) | !$x | Logical NOT |
| BitNot(Expr) | ~$x | Bitwise NOT (complement) |
| Throw(Expr) | throw new Exception("boom") | Throw expression node used both in statements and expression positions such as ?? or ternaries |
| Print(Expr) | print $x | PHP print expression. It writes the operand and returns 1; statement-form print $x; is represented as ExprStmt(Print(...)). |
| NullCoalesce { value, default } | $x ?? $y | Returns $x if non-null, otherwise $y |
| Pipe { value, callable } | $x \|> trim(…) | Pipe operator. Built as a dedicated node rather than a generic BinaryOp. |
| Assignment { target, value, result_target, prelude, conditional_value_temp } | $x = 1, $arr[$i] ??= "fallback" | Assignment expression. Complex targets can carry prelude statements and synthetic temporaries so side effects are evaluated once while the assignment still returns the assigned value. |
| PreIncrement(String) | ++$i | Returns new value |
| PostIncrement(String) | $i++ | Returns old value |
| PreDecrement(String) | --$i | |
| PostDecrement(String) | $i-- | |
| FunctionCall { name, args } | strlen($s), Tools\fmt($s), \strlen($s) | Parsed as a structured name so later phases can resolve namespace aliases and fully-qualified names |
| Yield { key, value } | yield, yield $v, yield $k => $v | Yield expression inside a generator body. The parser keeps optional key/value expressions; later checker/codegen turns the enclosing function or closure into a Generator state machine. |
| YieldFrom(Expr) | yield from inner() | Contextual yield from delegation. The lexer leaves from as an identifier and the parser recognizes it only immediately after yield. |
| ArrayLiteral(Vec<Expr>) | [1, 2, 3], [...$arr, 4] | Indexed array; elements may include Spread expressions |
| ArrayLiteralAssoc(Vec<(Expr, Expr)>) | ["a" => 1] | Associative array |
| Match { subject, arms, default } | match($x) { 1, 2 => "low", 3 => "high" } | Match expression (returns a value). arms is Vec<(Vec<Expr>, Expr)>, so each arm can have multiple comma-separated patterns before =>, and default is optional (Option<Box<Expr>>) |
| ArrayAccess { array, index } | $arr[0], $str[-1] | Same AST node is used for indexed arrays, associative-array lookups, and string indexing |
| Ternary { condition, then_expr, else_expr } | $a ? $b : $c | |
| ShortTernary { value, default } | $a ?: $fallback | PHP short ternary / Elvis form. Codegen evaluates value once, returns it if truthy, otherwise returns default. |
| ErrorSuppress(Expr) | @file_get_contents("missing.txt") | PHP error-control prefix expression. Codegen wraps the operand in a runtime warning-suppression scope. |
| Cast { target, expr } | (int)$x | |
| Closure { params, variadic, return_type, body, is_arrow, is_static, captures, capture_refs } | function(int $x = 1) use ($y, &$z): string { ... }, fn(int $x): int => $x * 2, or static function(): int { ... } | Anonymous function / arrow function. params is Vec<(String, Option<TypeExpr>, Option<Expr>, bool)>: name, declared type, default, is_ref. variadic is an optional parameter name. return_type stores the optional declared closure / arrow return TypeExpr. captures stores by-value captures and capture_refs stores use (&$var) captures. Arrow functions are still represented as Closure, parse with is_arrow = true, and do not carry explicit use (...) captures in the AST. is_static is set when the closure is prefixed with the static keyword (PHP static function () {} / static fn () => ...); the type checker rejects any reference to $this inside a static closure. |
| NamedArg { name, value } | foo(name: "Alice") | Named call argument. The parser preserves source order; later phases validate names against the declared parameter list and normalize known-signature calls for ABI lowering. |
| ClosureCall { var, args } | $fn(1, 2) | Calling a closure stored in a variable |
| ExprCall { callee, args } | $arr[0](1, 2) | Calling the result of an expression (e.g., array access returning a callable) |
| Spread(Expr) | ...$arr | Spread/unpack operator: expands an array into individual arguments or elements |
| ConstRef(Name) | MAX_RETRIES, Config\PORT, \App\Config\PORT | Reference to a user-defined constant |
| NewObject { class_name, args } | new Point(1, 2), new App\Model\User() | Object instantiation |
| NewScopedObject { receiver, args } | new self(), new static(), new parent() | Object instantiation against a static receiver. Distinct from NewObject (which carries a fixed Name) so codegen can honour late static binding for static. |
| PropertyAccess { object, property } | $p->x | Property access via -> |
| DynamicPropertyAccess { object, property } | $p->{$name} | Dynamic property access where the property name is an expression. Dynamic method calls are intentionally rejected. |
| NullsafePropertyAccess { object, property } | $p?->x | Nullsafe property access via ?-> |
| NullsafeDynamicPropertyAccess { object, property } | $p?->{$name} | Nullsafe dynamic property access. If the receiver is null, the property expression and the rest of the chain are skipped. |
| StaticPropertyAccess { receiver, property } | Point::$count, self::$count, parent::$count, static::$count | Class-scoped property access via ::, where receiver is a named class, Self_, Static, or Parent |
| MethodCall { object, method, args } | $p->move(1, 2) | Instance method call |
| NullsafeMethodCall { object, method, args } | $p?->move(1, 2) | Nullsafe instance method call; PHP rejects ?->method(...) closure creation, so elephc reports Cannot combine nullsafe operator with Closure creation for that form |
| StaticMethodCall { receiver, method, args } | Point::origin(), self::boot(), parent::boot(), static::boot() | Static-style call via ::, where receiver is a named class, Self_, Static, or Parent |
| FirstClassCallable(CallableTarget) | strlen(...), Tools\fmt(...), Math::twice(...) | PHP-style first-class callable syntax; the target is preserved structurally instead of being parsed as a call |
| This | $this | Reference to the current object inside a method |
| PtrCast { target_type, expr } | ptr_cast<Point>($p) | Pointer-tag cast parsed specially after ptr_cast<T> |
| BufferNew { element_type, len } | buffer_new<int>(256) | Compiler extension for contiguous hot-path buffers |
| MagicConstant(MagicConstant) | __DIR__, __CLASS__ | Parsed from case-insensitive magic-constant tokens. __LINE__ is lowered immediately to IntLiteral; the remaining magic constants are lowered by src/magic_constants.rs before type checking. |
| ClassConstant { receiver } | MyClass::class, \App\C::class, self::class, parent::class, static::class | The PHP ::class reflection literal. Codegen lowers it to a string literal carrying the fully-qualified class name. static::class follows late static binding. |
| ScopedConstantAccess { receiver, name } | MyClass::LIMIT, self::DEFAULT_SIZE | User-declared class constant access through ::; later phases resolve the receiver and constant metadata. |

Statements (Stmt)

Things that do something:

Each Stmt also carries a source span and an attributes list. The list is populated only for declaration statements that can legally be decorated with PHP attributes; attributes before non-declaration statements are rejected during parsing.

| Variant | Example | Notes |
|---|---|---|
| Echo(Expr) | echo $x; | Multi-argument echo $a, $b; lowers to a Synthetic sequence of Echo statements |
| Assign { name, value } | $x = 42; | |
| If { condition, then_body, elseif_clauses, else_body } | if (...) { } elseif (...) { } else { } | |
| While { condition, body } | while (...) { } | |
| DoWhile { body, condition } | do { } while (...); | |
| For { init, condition, update, body } | for (...; ...; ...) { } | init, condition, and update are all optional, so for (;;) { } is valid |
| Foreach { array, key_var, value_var, value_by_ref, body } | foreach ($arr as $v) { }, foreach ($arr as $k => $v) { }, or foreach ($arr as &$v) { } | |
| Switch { subject, cases, default } | switch ($x) { case 1: ...; default: ... } | |
| ArrayAssign { array, index, value } | $arr[0] = 5; | |
| NestedArrayAssign { target, value } | $arr[0][1] = 5;, $obj->items[0] = 5; | |
| ArrayPush { array, value } | $arr[] = 5; | |
| TypedAssign { type_expr, name, value } | int $x = 42;, buffer<int> $xs = buffer_new<int>(8); | |
| FunctionDecl { name, params, variadic, return_type, body } | function foo(int $a, &$b, string $c = "x"): string { } | params is Vec<(String, Option<TypeExpr>, Option<Expr>, bool)> where the tuple stores name, declared type, default value, and is_ref (pass by reference). variadic is Option<String> for variadic parameters (...$args) and return_type is an optional declared TypeExpr |
| FunctionVariantGroup { name, variants } | | Internal resolver metadata for include-loaded hidden function implementations behind one public name |
| FunctionVariantMark { name, variant } | | Internal include-body marker that activates the hidden function variant loaded at that runtime include point |
| Return(Option<Expr>) | return $x; or return; | |
| Break(usize) | break;, break 2; | |
| Continue(usize) | continue;, continue 2; | |
| Include { path, once, required } | include 'file.php'; | |
| IncludeOnceMark { label } | | Internal resolver lowering for regular include / require, marking the resolved file as loaded for later *_once guards |
| IncludeOnceGuard { label, body } | | Internal resolver lowering for include_once / require_once; codegen checks a per-file flag before emitting the guarded body |
| Throw(Expr) | throw new Exception("boom"); | |
| Synthetic(Vec<Stmt>) | | Internal lowering only; a source construct that has already been expanded into one or more ordinary statements before final codegen |
| Try { try_body, catches, finally_body } | try { ... } catch (Exception $e) { ... } finally { ... } | |
| ConstDecl { name, value } | const MAX = 100; | |
| IfDef { symbol, then_body, else_body } | ifdef DEBUG { ... } else { ... } | |
| NamespaceDecl { name: Option<Name> } | namespace App\Core;, namespace; | |
| NamespaceBlock { name: Option<Name>, body } | namespace App\Core { ... }, namespace { ... } | |
| UseDecl { imports } | use App\Lib\Tool;, use function App\fn as helper;, use Vendor\Pkg\{Thing, Other as Alias}; | |
| ListUnpack { vars, value } | [$a, $b] = [1, 2]; | For simple local positional destructuring; skipped, keyed, nested, and non-local destructuring patterns lower to Synthetic assignment statements |
| Global { vars } | global $x, $y; | Declares variables as referencing global storage |
| StaticVar { name, init } | static $count = 0; | Declares a variable that persists across function calls |
| ClassDecl { name, extends, implements, is_abstract, is_final, is_readonly_class, trait_uses, properties, constants, methods } | final readonly class Point extends Shape implements Named { use NamedTrait; ... } | |
| EnumDecl { name, backing_type, cases } | enum Status: int { case Ok = 1; case Err = 2; } | |
| PackedClassDecl { name, fields } | packed class Vec2 { public float $x; public float $y; } | |
| InterfaceDecl { name, extends, properties, methods, constants } | interface Named extends Stringable { public string $name { get; } public function name(): string; } | |
| TraitDecl { name, trait_uses, properties, constants, methods } | trait Named { public const KIND = "name"; ... } | |
| PropertyAssign { object, property, value } | $p->x = 10; | |
| StaticPropertyAssign { receiver, property, value } | Counter::$count = 10;, self::$count = 10; | |
| StaticPropertyArrayPush { receiver, property, value } | Counter::$items[] = 10;, self::$items[] = 10; | |
| StaticPropertyArrayAssign { receiver, property, index, value } | Counter::$items[0] = 10;, self::$items[0] = 10; | |
| PropertyArrayPush { object, property, value } | $p->items[] = 10; | |
| PropertyArrayAssign { object, property, index, value } | $p->items[0] = 10; | |
| ExternFunctionDecl { name, params, return_type, library } | extern function foo(int $x): int; or entries inside extern "lib" { ... } | params is Vec<ExternParam>, where each ExternParam stores { name, c_type }, and return_type is a CType |
| ExternClassDecl { name, fields } | extern class Point { public int $x; } | |
| ExternGlobalDecl { name, c_type } | extern global ptr $environ; | The declared type is a C-facing CType, not a PhpType |
| ExprStmt(Expr) | my_func(); | Expression used as statement |

Constructor property promotion is normalized during class-body parsing. A parameter such as public int $id in __construct becomes a ClassProperty plus a synthetic leading PropertyAssign statement equivalent to $this->id = $id;. Parameter defaults stay on the constructor signature rather than ClassProperty.default, matching PHP’s distinction between promoted parameter defaults and property defaults. By-reference promoted parameters preserve a by_ref flag on the generated property so codegen can bind the property slot to the referenced argument or to a heap reference cell when a default value is used. Later passes otherwise see ordinary properties and ordinary constructor assignments.
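The normalization described above might look roughly like the following. This is a sketch under stated assumptions: `Param`, `normalize_promotion`, and the two-variant `Stmt` are hypothetical stand-ins, and defaults, types, and the by_ref flag are left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Visibility {
    Public,
    Protected,
    Private,
}

// Hypothetical constructor parameter: `promoted` is Some(..) when the
// parameter carries a visibility modifier such as `public int $id`.
#[derive(Debug, Clone, PartialEq)]
struct Param {
    name: String,
    promoted: Option<Visibility>,
}

#[derive(Debug, PartialEq)]
struct ClassProperty {
    name: String,
    visibility: Visibility,
}

#[derive(Debug, PartialEq)]
enum Stmt {
    // Stands for `$this-><property> = $<value_var>;`
    PropertyAssign { property: String, value_var: String },
    // Placeholder for any statement the user wrote in the constructor body.
    Echo(String),
}

/// Turns promoted parameters into properties plus leading assignments.
fn normalize_promotion(params: &[Param], body: Vec<Stmt>) -> (Vec<ClassProperty>, Vec<Stmt>) {
    let mut properties = Vec::new();
    let mut new_body = Vec::new();
    for p in params {
        if let Some(visibility) = p.promoted {
            properties.push(ClassProperty { name: p.name.clone(), visibility });
            new_body.push(Stmt::PropertyAssign {
                property: p.name.clone(),
                value_var: p.name.clone(),
            });
        }
    }
    // Synthetic assignments come first, then the original body.
    new_body.extend(body);
    (properties, new_body)
}
```

Putting the synthetic assignments ahead of the user's statements means the rest of the constructor body already sees initialized properties.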

Statement dispatch

At statement level, parsing is split between parser/mod.rs and the stmt/ submodules:

  • parse() in mod.rs special-cases extern so one extern "lib" { ... } block can expand into multiple AST statements.
  • Everything else flows through stmt::parse_stmt(), which selects the parser entry point from the current token.

| Current token | Parse as |
|---|---|
| Class / Abstract Class / Final Class / Readonly Class / combined class modifiers | Class declaration |
| Enum | Enum declaration |
| Packed | Packed-class declaration |
| Interface | Interface declaration |
| Trait | Trait declaration |
| Function | Function declaration |
| Namespace | Namespace declaration |
| Use | Namespace import declaration |
| Return | Return statement |
| Throw | Throw statement |
| Echo | Echo statement |
| Print | Generic expression statement containing Print(...) |
| If / While / Do / For / Foreach / Switch / Try | Control-flow statement |
| Const / Global / Static | Declaration-like statement |
| Variable / This / Identifier / Backslash / Self_ / Parent / Static::... | Assignment, property write, call, or generic expression statement |

This is intentionally narrower than full PHP statement syntax. In the current subset, expression statements only enter through the token arms handled by stmt::parse_stmt() above; starting a statement with tokens such as match, new, fn, a literal, (, or a unary operator still produces an “unexpected token at statement position” parser error unless that construct appears inside another statement form.
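A token-driven dispatch of this kind can be sketched as a single match. The token set and the `Entry` names below are hypothetical and cover only a handful of the arms listed above; the error text follows the message quoted in the previous paragraph.

```rust
// Hypothetical token and entry-point names; the real dispatch has many more arms.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Echo,
    Return,
    If,
    While,
    Variable,
    Match,
    New,
    LParen,
}

#[derive(Debug, PartialEq)]
enum Entry {
    EchoStmt,
    ReturnStmt,
    ControlFlow,
    AssignmentOrExprStmt,
}

/// Picks a statement parser from the current token alone.
fn dispatch(current: Token) -> Result<Entry, String> {
    match current {
        Token::Echo => Ok(Entry::EchoStmt),
        Token::Return => Ok(Entry::ReturnStmt),
        Token::If | Token::While => Ok(Entry::ControlFlow),
        Token::Variable => Ok(Entry::AssignmentOrExprStmt),
        // No statement arm exists for these, so they cannot start a statement
        // in this subset even though they are valid inside expressions.
        Token::Match | Token::New | Token::LParen => {
            Err(format!("unexpected token at statement position: {:?}", current))
        }
    }
}
```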

Error recovery

The parser no longer stops at the first syntax error. It performs conservative synchronization at statement and block boundaries, so one malformed statement does not necessarily prevent later statements from being parsed and reported.

Current recovery behavior is intentionally simple:

  • top-level parsing can skip forward to the next plausible statement boundary after a syntax error
  • block parsing ({ ... }) can resynchronize on ;, }, and EOF
  • the parser still prefers correctness over aggressive recovery, so heavily malformed input may still collapse into fewer diagnostics than an IDE-style parser would produce
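
Block-level resynchronization on ;, }, and EOF could be sketched as below. The `synchronize` function and its token type are hypothetical, and the choice to consume the ; while leaving } and EOF for the caller is an assumption of this sketch, not something the source states.

```rust
// Hypothetical token type for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Semicolon,
    RBrace,
    Other(&'static str),
    Eof,
}

/// Skips tokens after a syntax error inside a block and returns the
/// position where statement parsing should resume.
fn synchronize(tokens: &[Token], mut pos: usize) -> usize {
    while pos < tokens.len() {
        match tokens[pos] {
            // `;` ends the broken statement: resume right after it.
            Token::Semicolon => return pos + 1,
            // `}` and EOF belong to the enclosing block: leave them for the caller.
            Token::RBrace | Token::Eof => return pos,
            _ => pos += 1,
        }
    }
    pos
}
```

Stopping at the first boundary keeps recovery conservative: the parser discards at most one statement's worth of tokens per error.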

Binary operators (BinOp)

Add  Sub  Mul  Div  Mod  Pow  Concat
Eq  NotEq  StrictEq  StrictNotEq  Lt  Gt  LtEq  GtEq  Spaceship
And  Or  Xor
BitAnd  BitOr  BitXor  ShiftLeft  ShiftRight
NullCoalesce

instanceof is represented as ExprKind::InstanceOf rather than BinOp because PHP has special RHS grammar: named targets (User, self, parent, static) are resolved like class names, while variable/property/array targets and parenthesized expressions are evaluated dynamically.

Type expressions (TypeExpr)

Parsed type annotations use TypeExpr before the checker resolves them into PhpType values:

Int  Float  Bool  Str  Void  Never  Iterable
Ptr(Option<Name>)  Buffer(Box<TypeExpr>)  Named(Name)
Nullable(Box<TypeExpr>)  Union(Vec<TypeExpr>)

Iterable represents PHP’s iterable pseudo-type in parameter, return, property, and typed-local annotations. Nullable shorthand (?T) and explicit unions (T|U) are represented separately so the checker can reject invalid forms such as ?T|U and normalize accepted declarations.
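
Keeping Nullable and Union as separate variants makes the ?T|U check a plain structural match. The sketch below assumes that ?T|U would reach the checker as a Union containing a Nullable member; that representation, the reduced variant set, and the `validate` function are all assumptions for illustration.

```rust
// Cut-down TypeExpr with only the variants this sketch needs.
#[derive(Debug, Clone, PartialEq)]
enum TypeExpr {
    Int,
    Str,
    Nullable(Box<TypeExpr>),
    Union(Vec<TypeExpr>),
}

/// Hypothetical checker-side validation: `?T` may not be a union member.
fn validate(ty: &TypeExpr) -> Result<(), String> {
    match ty {
        TypeExpr::Union(members) => {
            for member in members {
                if matches!(member, TypeExpr::Nullable(_)) {
                    return Err("nullable shorthand cannot be combined with a union".to_string());
                }
                validate(member)?;
            }
            Ok(())
        }
        TypeExpr::Nullable(inner) => validate(inner),
        TypeExpr::Int | TypeExpr::Str => Ok(()),
    }
}
```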

ClassDecl uses several supporting types:

| Type | Fields | Description |
|---|---|---|
| Visibility | Public, Protected, Private | Enum for property/method visibility |
| Attribute | name, args, span | A PHP 8 attribute entry from a #[...] group. The parser validates names and optional argument expressions. Class, method, and property names plus supported literal args feed class_attribute_names(), class_attribute_args(), class_get_attributes(), and the supported Reflection getAttributes() APIs; parameter reflection is not implemented yet. |
| AttributeGroup | attributes, span | One bracketed attribute group. Declaration sites can carry one or more groups. |
| EnumCaseDecl | name, value, span, attributes | A backed or unit enum case declaration, with declaration-level attributes preserved in the AST. |
| ClassConst | name, visibility, is_final, value, span, attributes | A class, interface, or trait constant declaration. |
| ClassProperty | name, visibility, type_expr, hooks, readonly, is_final, is_static, is_abstract, by_ref, default, span, attributes | A property declaration inside a class, trait, or interface, optionally carrying a parsed property type declaration, hook contract, static-property marker, by-reference promotion marker, or declaration-level attributes |
| ClassMethod | name, visibility, is_static, is_abstract, is_final, has_body, params, variadic, return_type, body, span, attributes | A method declaration inside a class, trait, or interface |
| CatchClause | exception_types, variable, body | A catch arm. exception_types supports both single-type and PHP-style multi-catch (TypeA \| TypeB) |
| StaticReceiver | Named(Name), Self_, Static, Parent | Left-hand side of ClassName::method(), self::method(), static::method(), and parent::method() |
| TraitUse | trait_names, adaptations, span | A use TraitA, TraitB { ... } clause inside a class or trait body |
| TraitAdaptation | Alias { trait_name: Option<Name>, method, alias: Option<String>, visibility: Option<Visibility> }, InsteadOf { trait_name: Option<Name>, method, instead_of: Vec<Name> } | PHP-style trait conflict resolution and aliasing |
| UseItem / UseKind | kind, name, alias | Namespace import entries for use, use function, use const, and group-use declarations |
| CallableTarget | Function(Name), StaticMethod { receiver, method }, Method { object, method } | Structured target of first-class callable syntax such as foo(...) or Cls::bar(...) |

Every AST node carries a Span (line + column) from the source, so error messages in later phases can point to the right location.

The Pratt parser

File: src/parser/expr/

Parsing expressions with operators is the hardest part. Consider:

1 + 2 * 3 ** 4

This should parse as 1 + (2 * (3 ** 4)) because ** binds tighter than *, which binds tighter than +. And ** is right-associative (2 ** 3 ** 4 = 2 ** (3 ** 4)), while + and * are left-associative.

elephc uses a Pratt parser (also called a top-down operator precedence parser) to handle this. The key idea: every operator has a binding power, a pair of numbers (left, right) that determines how tightly it grabs its operands.

Binding power table

Operator          Left BP    Right BP    Associativity
─────────────────────────────────────────────────────
or                  1          2         left
xor                 3          4         left
and                 5          6         left
assignment          7          6         RIGHT (variable targets)
? : / ?:            7          7         right-ish ternary parse
??                  9          8         RIGHT (null coalescing)
||                 11         12         left
&&                 13         14         left
|  (bitwise OR)    15         16         left
^  (bitwise XOR)   17         18         left
&  (bitwise AND)   19         20         left
== != === !==      21         22         left
< > <= >= <=>      23         24         left
|>                 24         25         left (dedicated Pipe node)
<< >>              25         26         left
.  (concat)        27         28         left
+ -                29         30         left
* / %              31         32         left
instanceof         35         special    left, named-or-dynamic RHS
unary (- ! ~)          35                prefix
**                 37         36         RIGHT (r < l)

Left-associative operators have right_bp > left_bp. This means 1 + 2 + 3 parses as (1 + 2) + 3.

Right-associative operators have right_bp < left_bp. This means 2 ** 3 ** 4 parses as 2 ** (3 ** 4).

For ??, the Pratt table still uses BinOp::NullCoalesce to assign binding power, but the parser builds a dedicated ExprKind::NullCoalesce { value, default } node rather than a generic BinaryOp.

For instanceof, the Pratt loop handles the keyword at expression level and then parses either a class/interface target name or a dynamic target expression. Its binding power matches PHP’s behavior where !$obj instanceof User parses as !($obj instanceof User).

For |>, the Pratt loop handles Token::PipeArrow before the generic BinOp table and builds ExprKind::Pipe { value, callable }. The binding power (24, 25) places it below concatenation, shifts, and arithmetic, but above comparisons, ??, ternary, logical operators, and assignment. This matches PHP 8.5 and keeps pipe-specific validation, such as requiring parenthesized arrow-function targets, out of generic binary-operator lowering.

The word-form logical operators (and, xor, or) have PHP’s lower precedence. The symbolic && and || continue to bind more tightly.

The full ternary form builds ExprKind::Ternary. The omitted-middle form expr ?: fallback builds ExprKind::ShortTernary so later phases can preserve PHP’s single-evaluation rule for the left-hand expression.

Assignment expressions build ExprKind::Assignment { target, value, result_target, prelude, conditional_value_temp }. Their binding power matches PHP’s low-precedence assignment slot, so $x = true and false parses as ($x = true) and false, while $x = $y = 1 remains right-associative. Standalone variable assignment statements still lower to StmtKind::Assign unless a lower-precedence word logical operator requires the whole statement to be represented as an expression statement.

For non-local expression targets, the parser emits hidden assignment prelude statements when a receiver, index, or RHS must be evaluated exactly once before the final write. The lowered target is the write target, result_target is the expression read after the write, and prelude contains temporary assignments such as the captured result of idx() in $items[idx()] = value() or the RHS value in $items[$i] = ($i = 1). This keeps codegen on the normal assignment paths while preserving PHP’s evaluation order for plain and compound assignment expressions. For ??=, conditional_value_temp reserves a hidden temporary that codegen fills only in the null branch, preserving PHP’s conditional RHS evaluation for targets such as $items[$i] ??= ($i = 1).

The algorithm

parse_expr_bp(min_bp):
    1. Parse prefix (literal, variable, unary op, parenthesized expr, ...)
       → this is the "left" node

    2. Loop:
       a. Look at the next token — is it an infix operator?
       b. Get its (left_bp, right_bp)
       c. If left_bp < min_bp → stop (operator doesn't bind tight enough)
       d. Consume the operator
       e. Parse right side: parse_expr_bp(right_bp)
       f. Build BinaryOp(left, op, right) → this becomes the new "left"
       g. Continue loop

    3. The `?` arm handles both full ternary (`? :`) and short ternary (`?:`)

    Return left
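
The loop above can be sketched in Rust as follows. This is a minimal illustration rather than elephc's code: the `Token`, `Expr`, `Parser`, and `parse` names are hypothetical, only +, -, *, ** and unary minus are covered, and the binding powers are copied from the table earlier in this section.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Int(i64),
    Plus,
    Minus,
    Star,
    StarStar,
}

#[derive(Debug, PartialEq)]
enum Expr {
    Int(i64),
    Negate(Box<Expr>),
    Binary(Box<Expr>, Token, Box<Expr>),
}

/// (left_bp, right_bp) per the table; None means "not an infix operator".
fn infix_bp(tok: Token) -> Option<(u8, u8)> {
    match tok {
        Token::Plus | Token::Minus => Some((29, 30)),
        Token::Star => Some((31, 32)),
        Token::StarStar => Some((37, 36)), // right-associative: right < left
        Token::Int(_) => None,
    }
}

struct Parser {
    tokens: Vec<Token>,
    pos: usize,
}

impl Parser {
    fn parse_expr_bp(&mut self, min_bp: u8) -> Result<Expr, String> {
        // Step 1: prefix (a literal, or unary minus at bp 35).
        let first = self.tokens.get(self.pos).copied();
        self.pos += 1;
        let mut left = match first {
            Some(Token::Int(n)) => Expr::Int(n),
            Some(Token::Minus) => Expr::Negate(Box::new(self.parse_expr_bp(35)?)),
            other => return Err(format!("unexpected token in prefix position: {:?}", other)),
        };

        // Step 2: infix loop.
        while let Some(op) = self.tokens.get(self.pos).copied() {
            let (left_bp, right_bp) = match infix_bp(op) {
                Some(bp) => bp,
                None => break,
            };
            if left_bp < min_bp {
                break; // operator does not bind tightly enough
            }
            self.pos += 1; // consume the operator
            let right = self.parse_expr_bp(right_bp)?;
            left = Expr::Binary(Box::new(left), op, Box::new(right));
        }
        Ok(left)
    }
}

/// Parses a whole token list and rejects trailing tokens.
fn parse(tokens: Vec<Token>) -> Result<Expr, String> {
    let mut parser = Parser { tokens, pos: 0 };
    let expr = parser.parse_expr_bp(0)?;
    if parser.pos < parser.tokens.len() {
        return Err(format!("unexpected trailing token: {:?}", parser.tokens[parser.pos]));
    }
    Ok(expr)
}
```

With these numbers, 2 ** 3 ** 4 nests to the right because the recursive call for the right operand uses 36, which the second ** (left power 37) still clears, while 1 - 2 - 3 nests to the left because the second - (left power 29) fails the check against 30.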

Walkthrough: 1 + 2 * 3

parse_expr_bp(0):
  prefix → IntLiteral(1)

  loop iteration 1:
    next token: +  → (left_bp=29, right_bp=30)
    29 >= 0? yes → consume +
    parse_expr_bp(30):
      prefix → IntLiteral(2)
      loop iteration:
        next token: *  → (left_bp=31, right_bp=32)
        31 >= 30? yes → consume *
        parse_expr_bp(32):
          prefix → IntLiteral(3)
          loop: no more operators
          return IntLiteral(3)
        build: Mul(Int(2), Int(3))
      loop: no more operators
      return Mul(Int(2), Int(3))
    build: Add(Int(1), Mul(Int(2), Int(3)))

  loop: no more operators
  return Add(Int(1), Mul(Int(2), Int(3)))

Result: 1 + (2 * 3) — correct!

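The beauty of Pratt parsing is that adding an ordinary binary operator takes one line in the binding power table, with no grammar rules to rewrite and no ambiguity to resolve. Operators with special right-hand-side grammar, such as instanceof and |>, get a dedicated arm in the loop instead.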

Prefix parsing

Before looking for infix operators, the parser handles prefix constructs — things that start an expression:

| Prefix | What it parses |
|---|---|
| IntLiteral | Return IntLiteral node |
| FloatLiteral | Return FloatLiteral node |
| StringLiteral | Return StringLiteral node |
| true / false | Return BoolLiteral node |
| null | Return Null node |
| Variable | Return Variable node (with postfix ++/-- check) |
| throw | Parse the following expression at the lowest precedence and wrap it in ExprKind::Throw |
| print | Parse the operand at ternary-level precedence (bp=7, above word logical operators) and wrap it in ExprKind::Print |
| yield | Parse yield, yield expr, yield key => value, or contextual yield from expr |
| - (minus) | Parse inner expr at unary precedence (bp=35), return Negate |
| ! (not) | Parse inner expr at unary precedence (bp=35), return Not |
| ~ (bitwise not) | Parse inner expr at unary precedence (bp=35), return BitNot |
| @ (error control) | Parse inner expr at unary precedence (bp=35), return ErrorSuppress |
| ++ / -- | Return PreIncrement / PreDecrement |
| (int) / (float) / … | Parse inner expr, return Cast |
| ( | Parse inner expr, expect ), return inner expr (and allow a later postfix call like (expr)(args)) |
| [ | Parse comma-separated exprs, expect ], return ArrayLiteral |
| match + ( | Parse match (...) { ... } → Match |
| Identifier / \Identifier / qualified name + ( | Parse as function call with arguments |
| Identifier / \Identifier / qualified name + (...) | Parse as first-class callable → FirstClassCallable(CallableTarget::Function) |
| Identifier / \Identifier / qualified name (no () | Parse as constant reference → ConstRef |
| function + ( | Parse anonymous function (closure) → Closure |
| fn + ( | Parse arrow function → Closure (with is_arrow = true) |
| static + function / fn + ( | Parse static closure → Closure (with is_static = true); the type checker rejects $this inside the body |
| new + qualified name | Parse object instantiation → NewObject |
| new + self / static / parent + ( | Parse scoped object instantiation → NewScopedObject |
| <receiver>::class | Parse MyClass::class, \App\C::class, self::class, parent::class, static::class → ClassConstant |
| $this | Return This node |
| ... + expr | Parse spread/unpack → Spread |
| ptr_cast + <Type> + ( | Parse pointer cast syntax → PtrCast |
| buffer_new + <Type> + ( | Parse contiguous-buffer allocation → BufferNew |
| __DIR__ / __FILE__ / other magic constants | Parse magic constants → MagicConstant (__LINE__ becomes IntLiteral) |
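
The three identifier rows differ only in what follows the name, so a short lookahead is enough to tell them apart. The sketch below uses a hypothetical token type and a `classify_name` helper invented for this example; it handles only unqualified identifiers.

```rust
// Hypothetical tokens; qualified names and leading backslashes are left out.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Ident(String),
    Variable(String),
    LParen,
    RParen,
    Ellipsis,
    Semicolon,
}

#[derive(Debug, PartialEq)]
enum NameForm {
    FunctionCall,
    FirstClassCallable,
    ConstRef,
}

/// Decides how a leading identifier should be parsed by peeking ahead.
fn classify_name(tokens: &[Token]) -> Option<NameForm> {
    match tokens {
        // name ( ... )  with nothing else inside the parentheses
        [Token::Ident(_), Token::LParen, Token::Ellipsis, Token::RParen, ..] => {
            Some(NameForm::FirstClassCallable)
        }
        // name (  followed by anything else, including a spread argument
        [Token::Ident(_), Token::LParen, ..] => Some(NameForm::FunctionCall),
        // bare name
        [Token::Ident(_), ..] => Some(NameForm::ConstRef),
        _ => None,
    }
}
```

The first arm must come before the second: foo(...) and foo(...$args) share the same first three tokens, and only the closing parenthesis separates a first-class callable from a call with a spread argument.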

Postfix: calls, array access, and member access

After parsing a prefix, the parser checks for postfix operators:

  • ( for calling the result of an expression (ExprCall)
  • [ for array access
  • -> for property access or method call
  • ?-> for nullsafe property access or method call
  • :: for enum-case lookup, static method call, or static-method first-class callable (when the prefix is a parsed name)

At statement level, the stmt/ submodules also parse trait declarations and class/trait-body use clauses. That use handling is intentionally context-sensitive so it does not interfere with closure capture lists like function () use ($x) { ... }.

$arr[0]            ArrayAccess { array: Variable("arr"), index: IntLiteral(0) }
$arr[$i + 1]       ArrayAccess { array: Variable("arr"), index: BinaryOp(Add, ...) }
$p->x              PropertyAccess { object: Variable("p"), property: "x" }
$p->move(1, 2)     MethodCall { object: Variable("p"), method: "move", args: [...] }
$p?->x             NullsafePropertyAccess { object: Variable("p"), property: "x" }
$p?->move(1, 2)    NullsafeMethodCall { object: Variable("p"), method: "move", args: [...] }
Point::origin()    StaticMethodCall { receiver: Named("Point"), method: "origin", args: [] }
\Lib\Factory::make()  StaticMethodCall { receiver: Named("\\Lib\\Factory"), method: "make", args: [] }
parent::boot()     StaticMethodCall { receiver: Parent, method: "boot", args: [] }
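
Postfix handling is naturally a loop that keeps wrapping the expression parsed so far. The following sketch covers only [ with an integer index and -> with an optional empty argument list; `parse_postfix` and the token type are hypothetical, and the nullsafe and :: forms are omitted.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Token {
    LBracket,
    RBracket,
    Arrow,
    LParen,
    RParen,
    Ident(String),
    Int(i64),
}

#[derive(Debug, PartialEq)]
enum Expr {
    Variable(String),
    IntLiteral(i64),
    ArrayAccess { array: Box<Expr>, index: Box<Expr> },
    PropertyAccess { object: Box<Expr>, property: String },
    MethodCall { object: Box<Expr>, method: String, args: Vec<Expr> },
}

/// Wraps `left` in postfix nodes for as long as a postfix token follows.
fn parse_postfix(mut left: Expr, tokens: &[Token], pos: &mut usize) -> Result<Expr, String> {
    loop {
        match tokens.get(*pos) {
            Some(Token::LBracket) => {
                // This sketch only accepts an integer literal as the index.
                let index = match tokens.get(*pos + 1) {
                    Some(Token::Int(n)) => Expr::IntLiteral(*n),
                    other => return Err(format!("expected index, found {:?}", other)),
                };
                if tokens.get(*pos + 2) != Some(&Token::RBracket) {
                    return Err("expected `]`".to_string());
                }
                *pos += 3;
                left = Expr::ArrayAccess { array: Box::new(left), index: Box::new(index) };
            }
            Some(Token::Arrow) => {
                let name = match tokens.get(*pos + 1) {
                    Some(Token::Ident(name)) => name.clone(),
                    other => return Err(format!("expected member name, found {:?}", other)),
                };
                *pos += 2;
                // `()` right after the name makes it a method call (arguments omitted here).
                if tokens.get(*pos) == Some(&Token::LParen)
                    && tokens.get(*pos + 1) == Some(&Token::RParen)
                {
                    *pos += 2;
                    left = Expr::MethodCall { object: Box::new(left), method: name, args: Vec::new() };
                } else {
                    left = Expr::PropertyAccess { object: Box::new(left), property: name };
                }
            }
            _ => return Ok(left),
        }
    }
}
```

Because each iteration wraps the previous result, a chain such as $p->items[0]->move() nests left to right with the last operation as the outermost node.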

Statement parsing

Files: src/parser/stmt/, src/parser/control.rs

Statement parsing is simpler. After parse() has peeled off top-level extern blocks, stmt::parse_stmt() looks at the current token to decide what kind of statement to parse:

| Current token | Parse as |
|---|---|
| Echo | Echo statement: parse one or more comma-separated expressions, expect ; |
| Print | Expression statement: parse Print(...), expect ; |
| Throw | Throw statement: parse one expression, expect ; |
| IfDef | Build-time conditional statement |
| Variable | Assignment, compound assignment, array assign/push, or expression statement |
| If | If with optional elseif chain and else |
| Try | Try with one or more catch clauses and optional finally |
| While | While loop |
| Do | DoWhile loop |
| For | For loop with init/condition/update |
| Foreach | Foreach loop |
| Switch | Switch statement with cases and optional default |
| Function | Function declaration with parameters and body |
| Class / Abstract Class / Final Class / Readonly Class / combined class modifiers | Class declaration with properties and methods |
| Enum | Enum declaration |
| Packed | Packed class declaration |
| Interface | Interface declaration |
| Trait | Trait declaration with trait uses, properties, and methods |
| Extern | Handled one level up in parser/mod.rs via parse_extern_stmts() |
| Return | Return with optional expression |
| Break | Break statement with optional positive integer level |
| Continue | Continue statement with optional positive integer level |
| Include / Require / IncludeOnce / RequireOnce | Include statement (path is parsed as an expression and later folded by the resolver when it is a compile-time string) |
| Const | Constant declaration (const NAME = value;) |
| Namespace | Namespace declaration (namespace App\Core; or namespace App\Core { ... }) |
| Use | Namespace import declaration (use Foo\Bar;, use function Foo\bar as baz;) |
| Global | Global variable declaration (global $x, $y;) |
| Static | Static variable declaration (static $count = 0;) |
| [ | List destructuring ([$a, , $c] = expr;, ["id" => $id] = expr;) |
| Identifier + ( | Expression statement (function call) |
| Internal lowering, no source token | Synthetic statement sequence used for temporary-backed lowering of effectful compound assignment targets |

Assignment parsing

When the parser sees a Variable, it looks ahead to decide:

$x = 42;           Assign { name: "x", value: IntLiteral(42) }
$x += 5;           Assign { name: "x", value: BinaryOp(Add, Variable("x"), IntLiteral(5)) }
$x <<= 1;          Assign { name: "x", value: BinaryOp(ShiftLeft, Variable("x"), IntLiteral(1)) }
$x ??= 5;          Assign { name: "x", value: NullCoalesce(Variable("x"), IntLiteral(5)) }
$x = true and false;  ExprStmt(BinaryOp(Assignment(Variable("x"), BoolLiteral(true)), And, BoolLiteral(false)))
echo ($x += 1);    Echo(Assignment(Variable("x"), BinaryOp(Add, Variable("x"), IntLiteral(1))))
$arr[0] = 5;       ArrayAssign { array: "arr", index: IntLiteral(0), value: IntLiteral(5) }
$arr[0] += 5;      ArrayAssign { array: "arr", index: IntLiteral(0), value: BinaryOp(Add, ArrayAccess(...), IntLiteral(5)) }
$arr[] = 5;        ArrayPush { array: "arr", value: IntLiteral(5) }
$x++;              ExprStmt(PostIncrement("x"))

Compound assignments (+=, -=, *=, **=, /=, .=, %=, &=, |=, ^=, <<=, >>=) are desugared into regular assignments with binary operations. Null coalescing assignment (??=) is represented as a regular assignment with a NullCoalesce value; codegen recognizes this shape and emits a conditional store so the right-hand side is only evaluated when the current target value is null. In expression form, the type checker and optimizer treat ExprKind::Assignment, its hidden prelude, and its conditional value temp as observable assignment machinery so the assignment cannot be folded away or hidden by stale constant-propagation facts.
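
For a plain variable target, the desugaring amounts to rebuilding the right-hand side around a read of the same variable. The sketch below uses cut-down node types and a hypothetical `desugar_compound` helper; passing None for the operator to mean ??= is a convention of this example only.

```rust
#[derive(Debug, Clone, PartialEq)]
enum BinOp {
    Add,
    ShiftLeft,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    IntLiteral(i64),
    Variable(String),
    BinaryOp { left: Box<Expr>, op: BinOp, right: Box<Expr> },
    NullCoalesce { value: Box<Expr>, default: Box<Expr> },
}

#[derive(Debug, Clone, PartialEq)]
enum Stmt {
    Assign { name: String, value: Expr },
}

/// Desugars `$name <op>= rhs;`. `None` stands for `??=` in this sketch.
fn desugar_compound(name: &str, op: Option<BinOp>, rhs: Expr) -> Stmt {
    // The current value of the target becomes the left operand.
    let current = Box::new(Expr::Variable(name.to_string()));
    let value = match op {
        Some(op) => Expr::BinaryOp { left: current, op, right: Box::new(rhs) },
        None => Expr::NullCoalesce { value: current, default: Box::new(rhs) },
    };
    Stmt::Assign { name: name.to_string(), value }
}
```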

Compound assignments can target variables, object properties, static properties, and non-append array elements. For simple targets, the parser lowers directly to the final assignment node. For effectful non-local targets such as $obj->items[next_key()] += 1, the parser emits a StmtKind::Synthetic sequence that stores the receiver or index expressions in hidden temporaries, then performs the read-modify-write using those temporaries. This preserves PHP’s single-evaluation behavior without making codegen duplicate the target expressions.

try / catch / finally

control.rs parses exception handling statements with this general shape:

try {
    // body
} catch (TypeA | TypeB $e) {
    // handler
} catch (Exception) {
    // optional variable binding omitted
} finally {
    // cleanup
}

Each catch becomes a CatchClause { exception_types, variable, body }. exception_types always stores a vector, so single-type catches are just a one-element list.
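
Parsing the part between catch ( and ) could look roughly like this. The token type, the `CatchHead` struct, and `parse_catch_head` are hypothetical; the real CatchClause also stores the handler body.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Ident(String),
    Variable(String),
    Pipe,
    RParen,
}

#[derive(Debug, PartialEq)]
struct CatchHead {
    exception_types: Vec<String>,
    variable: Option<String>,
}

/// Parses `TypeA | TypeB $e )` and returns the head plus tokens consumed.
fn parse_catch_head(tokens: &[Token]) -> Result<(CatchHead, usize), String> {
    let mut pos = 0;
    let mut exception_types = Vec::new();
    loop {
        match tokens.get(pos) {
            Some(Token::Ident(name)) => exception_types.push(name.clone()),
            other => return Err(format!("expected exception type, found {:?}", other)),
        }
        pos += 1;
        if tokens.get(pos) == Some(&Token::Pipe) {
            pos += 1; // another type follows
        } else {
            break;
        }
    }
    // The variable binding is optional.
    let variable = match tokens.get(pos) {
        Some(Token::Variable(name)) => {
            pos += 1;
            Some(name.clone())
        }
        _ => None,
    };
    match tokens.get(pos) {
        Some(Token::RParen) => Ok((CatchHead { exception_types, variable }, pos + 1)),
        other => Err(format!("expected `)`, found {:?}", other)),
    }
}
```

Always collecting into a vector means later phases handle one shape only, whether the source named one exception type or several.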

How it connects

The parser’s output — Program (which is Vec<Stmt>) — first feeds into per-file magic-constant lowering, then elephc’s build-time conditional pass for ifdef, then into the resolver, then into the dedicated name-resolution pass that canonicalizes namespace-aware names, and finally into the type checker:

[(Token, Span), ...] → Parser → Program (Vec<Stmt>) → MagicConstants → Conditional → Resolver → NameResolver → Type Checker