[WIP] Issue 1541: stack allocation escape analysis in ClangGen #1632

Draft
johnynek wants to merge 1 commit into main from codex/issue-1541-stack-alloc-checkpoint

@johnynek johnynek commented Feb 12, 2026

Owner

Summary

WIP prototype for issue #1541: add a stack-allocation optimization pass in ClangGen that rewrites eligible alloc_enumN/alloc_structN calls to BSTS_STACK_ALLOC_ENUMN/STRUCTN when escape analysis determines the value is local.

Implemented pieces in this checkpoint:

  • Added StackAllocPass in core/src/main/scala/dev/bosatsu/codegen/clang/ClangGen.scala.
  • Added stack allocation macros in c_runtime/bosatsu_runtime.h.
  • Added/updated ClangGenTest coverage for escaping vs non-escaping constructor cases and single-constructor enum/newtype behavior.
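To make the idea concrete, here is a toy sketch of the escape-analysis-then-rewrite shape described above. It is not the PR's `StackAllocPass`: the tuple-based mini-IR, the `stack_alloc` opcode, and the function names are all hypothetical and exist only for illustration. A value is treated as escaping if it is returned, passed to a call, or reachable through a field of something that escapes.

```python
# Hypothetical mini-IR (NOT Bosatsu's real representation):
#   ("alloc", dst, ctor, args)  allocate a constructor value into dst
#   ("read", dst, src, index)   read field `index` of src into dst
#   ("call", dst, fn, args)     call an unknown function; its args escape
#   ("ret", src)                return src; src escapes


def escaping_vars(body):
    """Return the variables whose values may outlive the function body."""
    escaped = set()
    fields = {}     # alloc dst -> list of variables stored in it
    read_from = {}  # read dst -> (src, index)
    for ins in body:
        op = ins[0]
        if op == "alloc":
            fields[ins[1]] = list(ins[3])
        elif op == "read":
            read_from[ins[1]] = (ins[2], ins[3])
        elif op == "call":
            escaped.update(ins[3])
        elif op == "ret":
            escaped.add(ins[1])

    # Propagate: whatever an escaping value holds (or aliases) escapes too.
    work = list(escaped)
    while work:
        v = work.pop()
        reached = list(fields.get(v, []))
        if v in read_from:
            src, idx = read_from[v]
            if src in fields:
                reached.append(fields[src][idx])
        for r in reached:
            if r not in escaped:
                escaped.add(r)
                work.append(r)
    return escaped


def rewrite(body):
    """Turn non-escaping allocs into the (hypothetical) stack_alloc opcode."""
    escaped = escaping_vars(body)
    return [
        ("stack_alloc",) + ins[1:]
        if ins[0] == "alloc" and ins[1] not in escaped
        else ins
        for ins in body
    ]


# Made-up function body: `pair` is only read locally, `box` is returned.
example = [
    ("alloc", "pair", "Tuple2", ["x", "y"]),
    ("read", "a", "pair", 0),
    ("alloc", "box", "Some", ["a"]),
    ("ret", "box"),
]
rewritten = rewrite(example)
```

In this example only `pair` becomes a `stack_alloc`; `box` is returned and stays a heap allocation. A real pass also has to handle control flow, closures, and aliasing that this sketch ignores.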

Static impact from generated C

Compared generated test_workspace C on main vs this branch:

  • Heap-like alloc calls (alloc_enumN, N>0 and alloc_structN, N>1):
    • 1986 -> 1899 (saved 87, -4.38%)
  • All alloc_* calls:
    • 3661 -> 3574 (saved 87, -2.38%)
  • Changed functions: 24
    • Biggest reductions were in collection test functions:
      • ___bsts_c_Bosatsu_l_Collection_l_Queue_l_tests (-25)
      • ___bsts_c_Bosatsu_l_List_l_tests (-17)
      • ___bsts_c_Bosatsu_l_Collection_l_TreeList_l_tests (-11)
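For reference, a count like the one above can be approximated with a small script over the generated C. This is a minimal sketch, not the method actually used for the numbers in this PR: the regexes, the `count_allocs` and `pct_change` helpers, and the sample C text are assumptions. It counts textual `alloc_*(` occurrences, so declarations, comments, and string literals would be counted too.

```python
import re
from collections import Counter

# Any identifier starting with alloc_ that is followed by "(".
ALLOC_CALL = re.compile(r"\balloc_(\w+)\s*\(")
# The enum/struct allocators with an explicit arity suffix.
ARITY = re.compile(r"(enum|struct)(\d+)")


def count_allocs(c_source):
    """Count all alloc_* sites and the 'heap-like' subset (enumN with N>0, structN with N>1)."""
    counts = Counter()
    for name in ALLOC_CALL.findall(c_source):
        counts["all"] += 1
        m = ARITY.fullmatch(name)
        if m is None:
            continue
        kind, arity = m.group(1), int(m.group(2))
        if (kind == "enum" and arity > 0) or (kind == "struct" and arity > 1):
            counts["heap_like"] += 1
    return counts


def pct_change(before, after):
    """Relative change in percent; negative means a reduction."""
    return (after - before) / before * 100.0


# Made-up C text for illustration; the real signatures may differ.
sample = """
Value f(Value x) {
  Value a = alloc_enum2(1, x, x);
  Value b = alloc_enum0(0);
  Value c = alloc_struct1(a);
  Value d = alloc_struct2(a, b);
  return d;
}
"""
sample_counts = count_allocs(sample)
```

Running `count_allocs` on the main and branch outputs and feeding the totals to `pct_change` reproduces the percentages quoted above from the raw counts, for example `pct_change(1986, 1899)`.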

Runtime benchmark findings

I ran A/B timings locally on macOS (main vs this branch), including generated binaries and custom harnesses.

Important finding: FibBench is not a good signal for this optimization; its Bosatsu/FibBench call path has 0 BSTS_STACK_ALLOC_* sites.

Results summary

  • Generated test_exe: no stable effect (deltas are small and their confidence intervals overlap zero).
  • Collection-focused workload (directly exercises stack-alloc-heavy functions):
    • paired run (iters=10, 24 rounds):
      • branch-main mean delta: -0.14%
      • 95% CI: [-0.79%, +0.51%]
    • Additional runs flip sign (roughly +0.8% to -0.5%), so the impact is at noise level.
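One way to compute a paired mean delta and confidence interval like the one above is sketched below. The PR does not say how the interval was derived, so this assumes a standard paired t-interval over per-round relative deltas; `paired_delta_ci` and the timing numbers are hypothetical.

```python
import math
import statistics


def paired_delta_ci(main_times, branch_times, t_crit):
    """Mean relative delta (branch vs main, in percent) with a t-based interval.

    t_crit is the two-sided critical value for len(main_times) - 1 degrees
    of freedom, passed in by the caller (about 2.069 for 24 rounds at 95%).
    """
    deltas = [(b - m) / m * 100.0 for m, b in zip(main_times, branch_times)]
    mean = statistics.mean(deltas)
    # Standard error of the mean, using the sample standard deviation.
    sem = statistics.stdev(deltas) / math.sqrt(len(deltas))
    return mean, (mean - t_crit * sem, mean + t_crit * sem)


# Made-up timings for four rounds (3 degrees of freedom, t_crit about 3.182).
demo_main = [100.0, 100.0, 100.0, 100.0]
demo_branch = [101.0, 99.0, 102.0, 98.0]
demo_mean, (demo_lo, demo_hi) = paired_delta_ci(demo_main, demo_branch, 3.182)
```

An interval that straddles zero, as in the demo data and in the results above, means the measurements do not distinguish the branch from main.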

Current recommendation

This is not merge-ready yet from a performance-benefit standpoint.

  • The optimization reduces allocations statically, but current runtime measurements do not show a consistent real-world speedup.
  • I think we should keep this work (do not throw it away), but continue as WIP while we:
    • benchmark on larger/more representative workloads,
    • potentially narrow/simplify the optimization scope,
    • and/or gate it behind a flag until we have clearer wins.

codecov Bot commented Feb 12, 2026

Codecov Report

❌ Patch coverage is 65.44343% with 113 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.67%. Comparing base (3e0b647) to head (cbe8a77).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ain/scala/dev/bosatsu/codegen/clang/ClangGen.scala | 65.44% | 113 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1632      +/-   ##
==========================================
- Coverage   83.84%   83.67%   -0.17%     
==========================================
  Files         146      147       +1     
  Lines       26870    27394     +524     
  Branches     6801     6945     +144     
==========================================
+ Hits        22528    22922     +394     
- Misses       4342     4472     +130     

