6. LINQ-fold patterns — what _fold(...) recognizes¶
This is a catalog of every chain shape that the _fold macro
splices into a specialized loop. Each row names a chain pattern, the
splice arm in daslib/linq_fold.das it dispatches to, and a short
note on the emitted loop’s shape.
Pipelines that don’t match any row here fall through to the default
fold_linq_default path, which lowers via the standard
iterator-materializing surface (the same code path as the
non-spliced linq calls).
For the corresponding tutorial, see tutorial_linq_decs. For
nanosecond-level cost comparisons across the SQL / Array / Decs lanes,
see benchmarks/sql/results.md in the source tree.
6.1. How dispatch works¶
_fold walks the chain inside-out (terminator first), flattens the
ExprCall spine via flatten_linq, normalizes adjacent
where/select pairs and order |> reverse shapes via the pre-passes
described below, and hands the result to try_splice_patterns in
daslib/linq_fold.das. That dispatcher walks a single global
splice_patterns table (17 rows, one per arm listed in this
document) twice:
Decs adapter pass. Runs only when
extract_decs_bridge(top) != null(i.e. the source isfrom_decs_template(...)). Each row’srequirespredicates gate the match; rows with anarray_sourcepredicate fail here and fall through. First emit that returns non-null wins.Array adapter pass. Runs only when
extract_decs_bridge(top) == null, i.e. the source is not a decs eager bridge. Top is firstpeel_each-unwrapped. Decs chains never reach this pass: if the Decs pass above cascades on every row, control falls through tofold_linq_defaultinstead. (This is deliberate —peel_eachdoes not strip the eager-bridgeExprInvoke, so without this gate thedecs_sourcepredicate would still succeed on a decs source in the Array pass and decs-only rows could match and emit viaSourceAdapter::Array, silently dropping adapter-specific captures likeupstream_join.)
If neither pass emits, the Theme 6 perf-warn fires for
from_decs_template chains (telling the user the bridge will
materialize and tier-2 will run on the buffer), then control falls
through to fold_linq_default (the iterator-materializing tier-2
path) and finally to a raw passthrough.
Note
Labels of the form plan_<X> (e.g. plan_distinct,
plan_decs_reverse) in the catalog below refer to the
pre-PR-E per-planner stubs that were collapsed into the unified
dispatcher. Each label now corresponds to a row (or pair of rows)
in splice_patterns whose requires predicates and emit
function carry the same logic. The names are kept here because
they read more naturally than row names like
order_buffer_helper_dispatch — see daslib/linq_fold.das
for the precise row each catalog entry maps to.
6.2. Pre-dispatch normalizations¶
A few chain rewrites fire at the start of the relevant planners (right
after flatten_linq, before the per-arm pattern-match) so the rest of
that planner’s logic sees the normalized shape and a single arm covers
what would otherwise be many lookalike chains:
_order_by(K).reverse()→_order_by_descending(K)(and the three symmetric flips:order_by_descending → order_by,order → order_descending,order_descending → order). Applied bynormalize_order_reverse, called from everyplan_*order_familyandplan_*reverseplanner right afterflatten_linq. The registry pointer is swapped to the flipped variant in place — theExprCallarg list is identical for ascending/descending order variants, so no AST clone is needed. Iterative: a chain like_order_by(K).reverse().reverse()collapses to_order_by(K)in two passes._select(f) |> _select(g)(N consecutive) →_select(g(f(_))). Applied bycollapse_chained_selects, called fromplan_zip,plan_distinct,plan_decs_distinct,plan_reverse,plan_decs_reverse,plan_decs_join(alsocollapse_chained_wheresper PR D2), and defensively fromplan_order_family/plan_decs_order_family(which don’t currently accept any leading_selectbut would inherit collapse if they ever did). Mirrors how chained_wherealready compose via&&. Composition takes the INNER lambda’s structure (preserves param type), renames its bound param to a freshqn("cs", at)name to avoidapply_templaterecursive substitution when both lambdas share the boost-side_desugar, then overwrites its body with outer’s body where outer’s param is substituted by the renamed-inner body. Chain backlink rewired so subsequent planner passes see the shortened AST. Gated on!has_sideeffects(innerBody)— collapsing would shift evaluation count when outer references its param zero or many times (cascade always evaluates inner once per element). Chains with%/// user-call inner cascade to tier-2 (output remains correct). Also bails (cascades) when either selector is not a peelable single-arg, single-returnExprMakeBlocklambda — multi-statement projection bodies, captured/non-trivial lambda shapes, and function-pointer arguments all skip collapse.plan_loop_or_count,plan_group_by_core, andplan_decs_unrollalready handle chained selects natively via theirintermediateBinds/ chain-info machinery and don’t need the pre-pass.
6.3. Source-side entry points¶
Source |
Recognized by |
Note |
|---|---|---|
|
|
Strips the |
|
pattern |
Two-source zip. The three-argument form |
|
|
Surfaces a |
|
|
Runtime component-name list form. Same decs splices as the template form. |
6.4. Array-source patterns¶
Chain shape |
Splice arm |
Notes |
|---|---|---|
|
|
O(1) length read when the source has a known length. |
|
|
Single counter, no allocation; one pass over the array. |
|
|
Single-pass scalar reduce; |
|
|
Multi-output reduce; one scalar per output kept in the loop’s state. |
|
|
Early-exit terminator; loop breaks on first match ( |
|
|
Boolean fast-path; loop exits on first hit ( |
|
|
Bounded counter/accumulator; loop exits at N matches. |
|
|
Take cap ticks unconditionally; |
|
|
|
|
|
Single |
|
|
|
|
|
Theme 3 Phase 3 (audit C1/C5). The bounded-heap path gains a leading or middle |
|
|
Bounded-heap / streaming-min holds the raw element; projection |
|
|
Materializes + sorts. No bounded-heap shortcut. |
|
|
Single-hash set lane; |
|
|
Dedup table is built unconditionally so |
|
|
Theme 8 (audit 3b). The where_+order fused-loop path generalizes: when upstream |
|
pattern |
Per-key bucket reducer; single hash, one entry per group. PR D1: reducer dispatch is now a |
|
pattern |
HAVING filter on the bucket reference (pre-aggregate); can lift hidden reducer slots referenced by |
|
pattern |
HAVING filter on the constructed post-aggregate tuple (predicate references |
|
pattern |
Theme 3 Phase 2 (audit C2). Inline-cmp |
|
|
Sum archetype sizes, then walk tail-first with skip-counter and early-exit. |
|
|
Projection runs ≤K times at return on the R1-R4 buffer or on the surviving |
|
|
Theme 8 (audit 2a). Array source only. Walks source backward via index ( |
6.5. Decs-source patterns¶
Every array pattern above has a decs mirror that walks each archetype
of T’s template instead of iterating the array. The body shape is
identical — only the source iteration changes.
Note
As of PR C, the four plan_decs_* planners (_reverse /
_distinct / _order_family / _unroll) are thin pattern-table
stubs that reuse the same pattern rows as their array-side siblings
(plan_reverse_patterns / plan_distinct_patterns /
plan_order_family_patterns / plan_loop_or_count_patterns)
with a SourceAdapter::Decs adapter swap. The 7 array-side emit
archetypes consume the adapter (adapter_bind_name selects
it vs decs_tup for lambda peeling; adapter_wrap_source_loop
dispatches for (it in src) vs for_each_archetype + build_decs_inner_for_pruned;
adapter_wrap_invoke dispatches the outer invoke wrap). For
plan_decs_unroll (which feeds emit_loop_or_count_lane), the
Decs-arm dispatch (emit_loop_or_count_lane_decs) reconstructs a
calls array from captures and routes to the existing
emit_decs_* lane fns unchanged (state hoist above
for_each_archetype stays per-adapter; see masterplan D1).
Two decs-specific fast paths preserved: emit_decs_count_archsize
(bare count()) and emit_decs_reverse_skip_into_tail
(reverse |> take(N) |> to_array). Row 4 of
plan_order_family_patterns (buffer_helper_dispatch) is gated
to Array adapter via array_source — decs cascades to Row 3
(fused_prefilter) which materializes the buffer, matching the
imperative decs behavior. reverse |> distinct[_by] on decs
sources cascades to tier-2 (no decs equivalent of the array
backward-walk dset gate; deferred per masterplan D6).
As of PR D3, the GroupBySourceAdapter shim (a parallel adapter
used only by plan_group_by_core) is gone — group_by’s three
source shapes (Array / Decs / DecsJoin) all flow
through the same SourceAdapter variant as every other planner.
plan_group_by_core calls adapter_wrap_source_loop and
adapter_wrap_invoke directly. The decs-join branch of
adapter_wrap_source_loop carries the inline hash-collect +
probe + per-pair result-lam bind body shared with
emit_decs_join.
Chain shape (decs source) |
Splice arm |
Notes |
|---|---|---|
|
|
Sums |
|
|
Counter loop over the per-archetype walk. The bare |
|
|
Per-archetype accumulator; pruner keeps only the components read by |
|
|
Walk lane reads one component per loop iteration; element_at uses cumulative-size short-circuit. |
|
|
Boolean fast-path; walks until first match or end. |
|
|
Concatenates archetype slices; one allocation sized by |
|
|
Single |
|
|
Same heap pattern as the array variant; buffer size N. |
|
|
Decs mirror of the array-side |
|
|
Decs mirror of |
|
|
Streaming-min/max with key. |
|
|
Single-hash set lane mirroring |
|
|
Decs mirror of the array-side predicate-distinct splice. Same dedup-unconditional / counter-gated-on-P shape across archetypes. |
|
|
Whole-archetype skip + partial-archetype skip-counter + early-exit. |
|
|
Decs mirror of |
|
pattern |
Shared bucket-reducer with the array path; differs only in the per-element source. |
|
pattern |
Decs mirror of the array-side post-aggregate HAVING. Same predicate-on-output-tuple semantics. |
|
pattern |
Decs mirror of the array-side ORDER BY splice (Theme 3 Phase 2 C2). Shares the same in-place inline-cmp sort tail; only the bucket-fill source differs. |
|
pattern |
Theme 3 Phase 1 cross-arm composition. |
|
|
Hoists |
6.5.1. Decs-decs equi-join¶
plan_decs_join is the hashed equi-join splice over two
from_decs_template sources. It collects the right side into a
table<KEY; array<TUPB>> in one for_each_archetype pass, then
walks the left side and probes via table.get. The key must be a
primitive (int* / uint* / float / double / bool /
string); tuple keys cascade to the standard join_impl.
Chain shape |
Splice arm |
Notes |
|---|---|---|
|
pattern |
Hash-fill + probe; |
|
pattern |
Hash-fill + probe; |
|
pattern |
Single bind of the join result per matched pair, then projection. |
|
pattern |
Bind join result, evaluate predicate, gate |
|
|
Cross-arm composition. |
6.6. Zip patterns¶
Chain shape |
Splice arm |
Notes |
|---|---|---|
|
pattern |
Fuses to a single index-loop over the shorter side. |
|
pattern |
Three-source zip; same loop shape with three reads per iteration. |
|
pattern |
Theme 1 (audit 7a). The 3-arg form |
|
pattern |
|
|
pattern |
Early-exit terminator on the zipped pair. |
|
pattern |
The 2-arg |
|
pattern |
Theme 8 (audit C4). |
6.7. What falls back¶
The default path (fold_linq_default) fires when none of the
plan_* arms recognize the chain. This is the standard linq surface
— iterators materialize, callbacks fire, and there is no fusion.
Common cases that fall back:
Mixed-source operators like
union(a, b),except(a, b),intersect(a, b),concat(a, b)after the first source has been transformed (e.g.each(a)._select(F).union(b)).Joins other than decs-decs equi-join:
_left_join/_right_join/_full_outer_join/_cross_joindon’t splice; array-source_joinalso falls back. Only the decs-decs primitive-key_joinshape catalogued above splices (viaplan_decs_join); tuple keys, non-primitive keys, mixed array/decs sources, or chain ops beyond a single trailing_where/_selectall cascade tojoin_impl.Aggregations on lazy groupings:
_group_by_lazy(K)._select(F)with a non-bucket-reducing_select.Materialization-only chains that the standard linq surface already lowers efficiently — e.g.
to_table()on a finite array.Chained ``_select(f) |> _select(g)`` with an impure inner (
_ % N,_ / N, user-call inner that the typer can’t prove pure). Thecollapse_chained_selectspre-pass is gated on!has_sideeffects(innerBody)because collapsing would shift evaluation count when outer references its param zero or many times. Pure inner (_._field,_ + K,_ * K, etc.) collapses transparently and the downstream planner sees a single_select.
When a chain falls back, behavior is identical to writing the same
chain without _fold — correct, but not splice-fused.
6.7.1. Decs-bridge fall-off diagnostic¶
When a from_decs_template source survives _fold dispatch
without any tier-1 planner (decs or array-side) claiming it, the
bridge materializes into a temp res array before
fold_linq_default runs on top — an EXTRA allocation beyond
whatever cascade follows. LinqFold.visit detects this case right
before falling through to fold_linq_default: it destructures
flatten_linq(call.arguments[0]) into (top, calls) and fires
only when calls is non-empty (a real cascade is about to run) and
extract_decs_bridge(top) is non-null (the source IS a
from_decs_template bridge). Bare
_fold(from_decs_template(...)) with no chain ops is skipped —
there’s no cascade, just the bridge’s own materialization. When
fired, a *warning* goes to the compiler log naming the call
site:
user.das:42:8: *warning* `_fold`: from_decs_template source
survived dispatch — no `plan_decs_*` arm claimed this chain, so
the bridge materializes a temp `res` array and the tier-2 cascade
runs on the materialized buffer. Rewrite the chain to a
recognized decs shape (see
doc/source/reference/linq_fold_patterns.rst), or suppress with
`options _no_decs_perf_warn = true`.
The fix is usually to reorder ops so the chain matches a row in the
Decs section above (e.g. push _select past _skip_while /
_take_while since their predicates run on the source tuple, not
the projected value). Suppress per file with options
_no_decs_perf_warn = true for tests that intentionally exercise
cascade behavior as regression guards.
6.8. See also¶
tutorial_linq_decs — using
from_decs_templateand_foldover decs entities.tutorial_linq — the underlying linq surface (
where,select,order_by, etc.) and the_foldmacro./stdlib/generated/linq_fold — the
linq_foldmodule reference (the two public macros_foldand_old_fold)./stdlib/generated/linq — the full linq surface.
/stdlib/generated/linq_boost — the shorthand macros (
_where,_select,_order_by, etc.) and the surrounding macro family.