Skip to content

GriffinCanCode/bldr

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

bldr

A high-performance build system for polyglot monorepos, featuring runtime dependency discovery, lock-free parallel execution, and comprehensive incremental compilation. Built in D for maximum performance with zero-cost compile-time abstractions.

Overview

Builder advances the state of build systems through novel architectural approaches: dynamic graph discovery eliminates code generation complexity, a process-based plugin architecture enables true language-agnostic extensibility, and Chase-Lev work-stealing provides optimal parallel execution. The system achieves Bazel-class capabilities while introducing meaningful innovations in programmability, caching granularity, and developer experience.

Feature Status

Feature Status Notes
Dynamic Build Graphs ✅ Production Runtime dependency discovery
Work-Stealing Scheduler ✅ Production Chase-Lev deque, lock-free
Hermetic Builds ✅ Production Linux/macOS/Windows sandboxing
LSP Implementation ✅ Production VS Code extension available
Multi-Level Caching ✅ Production 3 tiers (target, action, remote/distributed)
AST-Level Incremental ✅ Production Symbol-level dependency tracking, class/function granularity
Query Language ✅ Production Bazel-compatible bldrquery
Build System Migration ✅ Production 11 systems supported
Tier 1 DSL ✅ Production Variables, loops, conditionals, 30+ built-ins
Tier 2 D Macros ✅ Production CTFE, template metaprogramming
Content-Defined Chunking ✅ Production Rabin fingerprinting
Bayesian Flaky Detection ✅ Production Statistical modeling
Language Support ✅ Production 27 language handlers
Distributed Execution ⚠️ Beta Architecture complete, production hardening in progress
Plugin System ⚠️ Beta Core works, external SDKs in development
Explain System ✅ Production AI-optimized documentation engine

Performance Note: Performance figures in this document are from internal benchmarks. Hardware: Modern x86_64/ARM64. Your results may vary based on workload, hardware, and configuration.

Core Innovations

1. Dynamic Build Graphs

Builder supports runtime dependency discovery—actions can extend the build graph during execution. Traditional build systems require all dependencies at analysis time, creating friction for code generation workflows.

The Problem: Protobuf compilers, template engines, and code generators produce files whose names depend on their inputs. Static graphs force awkward workarounds.

Our Solution: Actions implement the DiscoverableAction interface and emit DiscoveryMetadata during execution. The graph extends safely with automatic cycle detection and rescheduling.

// Protobuf generates .cpp files, creates compile targets automatically
// Template expands to multiple languages, discovers all outputs
// Dynamic at build-time, type-safe at compile-time

Impact: Eliminates preprocessing hacks, enables natural code generation patterns, maintains build graph correctness guarantees.

2. Process-Based Plugin Architecture ⚠️ Beta

Plugins are standalone executables communicating via JSON-RPC 2.0 over stdin/stdout—a fundamental departure from traditional dynamic library approaches.

Current Status: Core plugin system is functional (discovery, execution, lifecycle management). External SDKs (Python, Go, Rust) are in active development.

Why This Matters:

  • Language Agnostic: Write plugins in Python, Go, Rust, JavaScript—anything
  • Zero ABI Coupling: No shared library compatibility nightmares
  • Fault Isolation: Plugin crashes don't affect bldr
  • Simple Distribution: Each plugin is a separate Homebrew formula
  • Easy Testing: Plugins are just executables with stdin/stdout

vs Dynamic Libraries: Dynamic libraries require matching host language, share address space (crashes cascade), face ABI compatibility hell, and complicate distribution. Process-based plugins eliminate all these issues.

Current Status: Core plugin system is functional (discovery, execution, lifecycle management). External SDKs (Python, Go, Rust) are in active development.

3. Lock-Free Work-Stealing Scheduler

Implements Chase-Lev deque algorithm for optimal parallel task distribution. Owner threads operate without locks (zero contention fast path), while idle workers steal using lock-free CAS operations.

Architecture:

  • Each worker has a local deque (push/pop from bottom, O(1) uncontended)
  • Stealers take from top using atomic operations
  • Random victim selection prevents systemic imbalance
  • Exponential backoff reduces contention under high load

Performance: Designed for minimal latency, near-linear scaling to 64+ cores, and high CPU utilization on parallel workloads.

vs Standard Thread Pools: Traditional work queues use global locks (contention bottleneck). Work-stealing achieves lock-free hot paths and automatic load balancing.

Note: Performance targets (e.g., <50ns latency) are design goals based on algorithm characteristics, not measured benchmarks.

4. Three-Tier Programmability

Unique layered approach to build file programmability:

Tier 1 - Functional DSL (90% of use cases):

let packages = ["core", "api", "cli"];

for pkg in packages {
    target(pkg) {
        type: library;
        sources: glob("lib/" + pkg + "/**/*.d");
    }
}

Variables, loops, conditionals, functions, 30+ built-ins. Type-safe, not Turing-complete (prevents abuse).

Tier 2 - D Macros (9% of advanced cases):

// Full D language power, compile-time generation
Target[] generateMicroservices() {
    return services.map!(svc => 
        TargetBuilder.create(svc.name)
            .sources(["services/" ~ svc.name ~ "/**/*.go"])
            .build()
    ).array;
}

CTFE evaluation, template metaprogramming, zero runtime overhead.

Tier 3 - Plugins (1% of integrations): ⚠️ Beta External tool integration (Docker, Kubernetes, SonarQube, etc.). Language-agnostic, fault-isolated. Core system functional, SDKs in development.

Why Three Tiers: Most users need simple scripting (Tier 1). Power users get full language access (Tier 2). Integrations use plugins (Tier 3). Each tier has appropriate power and complexity.

5. Content-Defined Chunking

Rabin fingerprinting with rolling hash enables efficient network transfers—only changed chunks transmitted.

Algorithm: Rolling polynomial hash identifies content-defined boundaries. Inserting bytes shifts boundaries naturally; only affected chunks retransmit.

Performance: Typical bandwidth savings of 40-90% for modified large files. SIMD-accelerated rolling hash, BLAKE3 chunk hashing.

Applications: Artifact store uploads, distributed cache, remote execution inputs, graph cache synchronization.

6. Bayesian Flaky Test Detection

Statistical modeling with temporal pattern analysis identifies flaky tests automatically.

Method: Bayesian inference computes flakiness probability from pass/fail history. Temporal analysis detects time-of-day, day-of-week, and load-based patterns.

Actions: Automatic quarantine, confidence-based adaptive retries, test health metrics.

vs Simple Heuristics: "Failed twice = flaky" produces false positives. Bayesian modeling uses statistical confidence, temporal patterns, and historical context.

7. Set-Theoretic Hermetic Builds

Mathematical foundation for provable correctness using set operations.

Model:

  • I = Input paths (read-only)
  • O = Output paths (write-only)
  • T = Temp paths (read-write)
  • N = Network operations

Invariants:

  1. I ∩ O = ∅ (inputs and outputs disjoint)
  2. N = ∅ (no network for hermetic builds)
  3. Same I → Same O (deterministic)
  4. |T| = ∅ after build (no temp leaks)

Platform Implementation:

  • Linux: Namespaces (mount, PID, network, IPC, UTS, user) + cgroup v2 resource monitoring
  • macOS: sandbox-exec with SBPL + rusage monitoring
  • Windows: Job objects with resource limits + I/O accounting

Overhead: Measured overhead is minimal compared to build times—negligible for reproducible builds.

8. Complete LSP Implementation

Full Language Server Protocol for Builderfile editing with bundled binaries for all platforms.

Features: Autocomplete, diagnostics, go-to-definition, find-references, hover info, rename refactoring, document symbols.

Performance: Designed for sub-10ms response times for interactive editing.

Distribution: VS Code extension with bundled LSP binaries (macOS ARM64/x64, Linux x64, Windows x64). Zero setup—works out of the box.

vs Syntax Highlighting Only: Most build systems stop at syntax highlighting. Full LSP provides IDE-quality editing with semantic understanding.

Performance Optimizations

SIMD Acceleration

Hardware-agnostic runtime dispatch with fallback chains:

  • x86/x64: AVX-512 → AVX2 → SSE4.1 → SSE2 → Portable
  • ARM: NEON → Portable

BLAKE3 Hashing: 3-5x faster than SHA-256. Typical throughput: 600 MB/s (portable) → 3.6 GB/s (AVX-512 on modern hardware).

Implementation: C implementations with intrinsics, runtime CPU detection, D bindings with zero-copy dispatch.

Multi-Level Caching

Three distinct cache tiers, each optimized for its domain:

  1. Target Cache: Complete build outputs per target
  2. Action Cache: Individual build steps (compile, link, codegen)
  3. Remote/Distributed Cache: Shared cache across machines and CI

Binary Storage: Custom SIMD-accelerated format with schema versioning, ~10x faster than JSON, ~40% smaller.

Eviction: Hybrid strategy—LRU + age-based + size-based.

Incremental Everything

Analysis: Content-addressable cache reuses analysis for unchanged files. Savings: 5-10s on 10K-file monorepos.

Compilation: File-level dependency tracking rebuilds only affected sources. Reduction: 70-99% of files skip recompilation.

Test Selection: Dependency-aware test selection runs only affected tests. Typical: 90-99% of tests skipped.

Watch Mode: Native file watching (FSEvents on macOS, inotify on Linux) with proactive cache invalidation.

Language Support

27 languages with unified handler architecture. Centralized registry in source/languages/registry.d ensures consistency.

Compiled: C, C++, D, Rust, Go, Zig, Nim, OCaml, Haskell, Swift JVM: Java, Kotlin, Scala .NET: C#, F# Scripting: Python, JavaScript, TypeScript, Ruby, Perl, PHP, Lua, R, Elixir Web: JavaScript (esbuild/webpack/rollup), TypeScript (tsc/swc/esbuild), CSS, Elm Data: Protocol Buffers

Extensibility: Implement LanguageHandler interface (~150-200 lines), register in central registry, automatic CLI/wizard integration.

Distributed Execution ⚠️ Beta - Active Development

Remote execution with native OS sandboxing—no container runtime overhead.

Status: Architecture is complete and functional. Core components implemented (coordinator, worker, protocol, transport). Currently under production hardening. Suitable for testing and feedback, not recommended for production workloads yet.

Architecture:

  1. Build SandboxSpec from action
  2. Upload inputs to artifact store (chunked if >1MB)
  3. Send ActionRequest + SandboxSpec to coordinator
  4. Worker executes hermetically using native backend (namespaces/sandbox-exec/job objects)
  5. Worker uploads outputs (chunked)
  6. Return results with resource usage

Caching: Action cache integration—cache hits skip execution entirely.

Design Goal: Eliminate container overhead. Containers typically add 50-200ms per action. Native sandboxing aims for lower overhead while maintaining isolation guarantees.

Testing Infrastructure

Enterprise-grade test execution beyond industry standards:

Test Sharding: Adaptive strategy uses historical execution time for optimal load balancing. Content-based sharding ensures consistent distribution across CI runs.

Test Caching: Multi-level with hermetic environment verification. Cache keys include environment hash—prevents false cache hits.

Flaky Detection: Bayesian statistical modeling with temporal pattern analysis. Automatic quarantine and confidence-based retries.

Test Analytics: Health metrics, trend analysis, bottleneck identification, flakiness scoring.

JUnit XML: CI/CD integration (Jenkins, GitHub Actions, GitLab CI, CircleCI).

Query Language

Bazel-compatible query DSL for exploring dependency graphs:

bldr query 'deps(//src:app)'              # All dependencies
bldr query 'rdeps(//lib:utils)'           # Reverse dependencies  
bldr query 'shortest(//a:x, //b:y)'       # Shortest path
bldr query 'kind(test, //...)'            # Filter by type
bldr query 'deps(//...) & kind(library)'  # Set operations

Implementation: Algebraic query language with visitor pattern AST, optimized graph algorithms (BFS/DFS), multiple output formats (pretty, JSON, DOT).

Build System Migration

Comprehensive migration tools support moving from any major build system to Builder:

Supported Systems:

  • Bazel (BUILD, BUILD.bazel) - Rules, dependencies, compiler flags
  • CMake (CMakeLists.txt) - Executables, libraries, target properties
  • Maven (pom.xml) - Java projects, dependencies, plugins
  • Gradle (build.gradle, build.gradle.kts) - Java/Kotlin/Groovy projects
  • Make (Makefile) - Variables, targets, dependencies
  • Cargo (Cargo.toml) - Rust projects, dependencies
  • npm (package.json) - JavaScript/TypeScript projects
  • Go Modules (go.mod) - Go projects, module dependencies
  • DUB (dub.json) - D projects, configurations
  • SBT (build.sbt) - Scala projects
  • Meson (meson.build) - C/C++ projects

Features:

  • Intelligent auto-detection from file name and content
  • Preserves target structure and dependencies
  • Language-specific configuration translation
  • Detailed warnings for unsupported features
  • Dry-run mode for safe preview

Usage:

# Auto-detect and migrate
bldr migrate --auto BUILD

# Specify source system
bldr migrate --from=cmake --input=CMakeLists.txt

# Preview without writing
bldr migrate --auto pom.xml --dry-run

# List all supported systems
bldr migrate list

# Get system-specific info
bldr migrate info bazel

Architecture: Composable parser architecture with unified intermediate representation. Each migrator implements IMigrator interface, registered via central MigratorRegistry (follows LanguageRegistry pattern). Clean separation: parse → transform → emit.

Design: Parse-once strategy extracts targets to system-agnostic IR, then emits idiomatic Builderfile DSL. Warnings categorized by severity (info/warning/error). Metadata preservation for manual review of complex features.

AI-Native Documentation

Integrated explanation engine designed for the LLM era, providing semantic understanding of build concepts directly in the terminal:

bldr explain caching              # Explain multi-tier architecture
bldr explain determinism          # Understanding reproducibility
bldr explain "remote execution"   # Fuzzy search for concepts

Features:

  • Semantic Concepts: Explains why and how, not just flag syntax (e.g., hermetic vs sandboxed).
  • RAG-Optimized: Underlying data stored in granular, context-rich YAML optimized for LLM retrieval.
  • Smart Resolution: Handles aliases, fuzzy matching, and related topic suggestions.
  • Architecture-Aware: Documentation stays in sync with code via strict validation.

Observability

Distributed Tracing: OpenTelemetry-compatible with W3C Trace Context. Span tracking, context propagation, multiple exporters (Jaeger, Zipkin, Console).

Structured Logging: Thread-safe with configurable levels, JSON output option, performance overhead <0.5%.

Telemetry: Real-time metrics collection, bottleneck identification, regression detection, build analytics with binary storage (4-5x faster than JSON).

Visualization: Flamegraph generation (SVG), build replay for debugging, health monitoring.

CLI Architecture

Event-driven rendering with lock-free progress tracking:

Design: Build events published to subscribers (decoupled rendering), atomic operations for progress (zero contention), adaptive output based on terminal capabilities.

Modes: Interactive (progress bars, real-time updates), Plain (simple text), Verbose (detailed logging), Quiet (errors only), Auto (capability detection).

Performance: Zero-allocation hot paths, pre-allocated buffers, efficient ANSI sequences.

Installation

# macOS
brew install ldc dub
git clone https://github.com/GriffinCanCode/bldr.git
cd bldr
dub build --build=release

# Linux
sudo apt install ldc dub  # or equivalent
dub build --build=release

# Verify
./bin/bldr --version

Quick Start

# Initialize new project
bldr init

# Use interactive wizard
bldr wizard

# Migrate from other build systems
bldr migrate --from=bazel --input=BUILD --output=Builderfile
bldr migrate --auto CMakeLists.txt  # Auto-detect build system

# Build all targets
bldr build

# Build specific target
bldr build //path/to:target

# Run tests with JUnit output
bldr test --junit results.xml

# Watch mode for development
bldr build --watch

# Query dependencies
bldr query 'deps(//src:app)'

# View analytics
bldr telemetry recent 10

# Explain concepts
bldr explain caching
bldr explain "remote execution"

# Install VS Code extension
code --install-extension tools/vscode/builder-lang-2.0.0.vsix

Builderfile Example

// Modern DSL with full scripting support
let version = "1.0.0";
let buildFlags = ["-O2", "-Wall"];

target("core-lib") {
    type: library;
    language: d;
    sources: ["src/core/**/*.d"];
    flags: buildFlags;
}

target("app-${version}") {
    type: executable;
    language: d;
    sources: ["src/main.d"];
    deps: [":core-lib"];
    flags: buildFlags;
}

target("tests") {
    type: test;
    language: d;
    sources: ["tests/**/*.d"];
    deps: [":core-lib"];
}

Industry-Leading Innovations

World-First Features

Economic Cost Optimization - First build system to optimize for cost, not just time. Computes Pareto-optimal build plans across cost-time tradeoffs with budget constraints. At scale, this saves real money—a 2-minute build at $12 vs. an 8-minute build at $2 are both valid depending on your constraints.

SIMD-Accelerated Serialization - Custom binary format with C SIMD hot paths (AVX2/NEON). Zero-copy deserialization, ~10x faster than JSON, ~40% smaller. Cap'n Proto-inspired with schema evolution and compile-time codegen.

Three-Tier Caching - Target cache, action cache, and distributed cache work together for comprehensive build output caching. Combined with incremental compilation and dependency tracking for maximum reuse.

Enterprise-Grade Distributed Systems

Circuit Breakers & Resilience - Netflix Hystrix-quality fault tolerance with rolling window failure tracking, exponential backoff with jitter, and adaptive rate limiting. Prevents cascading failures in distributed builds.

Arena Allocators & Memory Pooling - Systems-level memory management with O(1) bump-pointer allocation (~5ns vs ~100ns GC), zero fragmentation, and specialized pools for network I/O. C++ performance in D.

OpenTelemetry Distributed Tracing - W3C Trace Context compliant with context propagation across threads. Zero overhead when disabled. Enabled by default—confidence in performance.

Technical Specs

Rust-Quality Error Handling - Full Result monad with map, andThen, orElse, traverse, sequence. Composable error pipelines prevent error loss. Specialized void handling for side effects.

Bayesian Flaky Detection - Statistical modeling with Beta distribution inference, not simple heuristics. Temporal pattern analysis (time-of-day, load-based) with confidence-based quarantine.

Unified Toolchain System - Single 2,850-line system replaces ~5,000 lines of fragmented per-language detection. Semver constraint solving, cross-compilation, remote toolchain providers.

Security Architecture - Command injection prevention, BLAKE3-HMAC cache integrity, TOCTOU-resistant temp directories, documented threat model. Most build systems have no security model.

Industry Comparison

Feature Bazel Buck2 CMake bldr
Dynamic Build Graphs
Cost Optimization
SIMD Serialization ⚠️ Protobuf ⚠️ bincode
Three-Tier Caching ❌ 2 tiers ❌ 2 tiers
Result Monads ❌ Exceptions ✅ Rust style ❌ Codes ✅ Advanced
Circuit Breakers
Distributed Tracing ⚠️ Basic ⚠️ Some ✅ OpenTelemetry
Security Model ⚠️ Basic ⚠️ Basic ✅ Documented
Statistical Flaky Detection ❌ Manual ✅ Bayesian

Verdict: Builder combines Bazel's hermetic builds, Buck2's performance focus, and adds genuinely novel features (economics, dynamic graphs, statistical testing). More innovations per line of code than any competitor.

Why D?

Compile-Time Metaprogramming: True CTFE, templates, and mixins generate optimized code at compile time. Not preprocessor tricks—actual language evaluation during compilation.

Zero-Cost Abstractions: Strong typing with Result monads, LanguageHandler interfaces, and domain objects compiled to optimal machine code. Runtime cost: zero.

Performance: LLVM backend (LDC) generates code comparable to C++. Native compilation, SIMD support, no garbage collection in hot paths.

Memory Safety: @safe by default with compile-time verification. Selective @trusted for C interop with documentation.

Modern Features: Ranges (lazy evaluation), UFCS (uniform function call syntax), templates, mixins, static introspection, compile-time function execution.

C/C++ Interop: Seamless integration with BLAKE3 C implementation, SIMD intrinsics, and existing build tools.

Project Statistics

  • Lines of Code: ~48,000 (D), ~3,000 (C for SIMD/BLAKE3)
  • Modules: 517 documented modules
  • Test Coverage: Comprehensive unit and integration tests
  • Languages Supported: 27 language handlers
  • Architecture Quality: Result monads throughout, zero any types, arena allocators, circuit breakers
  • Genuine Innovations: Dynamic build graphs, economic optimization, SIMD serialization, statistical flaky detection, process-based plugins

Architecture

The codebase follows clean architectural principles with modular separation:

  • source/runtime/ - Execution engine with service architecture
  • source/caching/ - Multi-tier caching with distributed support
  • source/analysis/ - Dependency analysis and incremental tracking
  • source/languages/ - Language handlers (27 languages)
  • source/config/ - DSL parsing, AST, scripting, macros
  • source/cli/ - Event-driven CLI rendering
  • source/testframework/ - Advanced test execution
  • source/distributed/ - Distributed build coordination
  • source/telemetry/ - Observability and analytics
  • source/plugins/ - Process-based plugin system
  • source/query/ - Query language implementation
  • source/errors/ - Type-safe error handling with Result monads
  • source/graph/ - Build graph with dynamic discovery
  • source/utils/ - SIMD, crypto (BLAKE3), concurrency primitives
  • source/lsp/ - Complete Language Server Protocol

Documentation

Contributing

See CONTRIBUTING.md for development guidelines.

License

Griffin License v1.0—See LICENSE for complete terms.

Key Terms:

  • ✅ Free to use, modify, and distribute
  • ✅ Commercial use permitted
  • ⚠️ Attribution required in derivative works
  • 🚫 No patents or trademarks on concepts herein

What Makes Builder Different

Economics-Aware: First build system to treat compute as economic asset. Optimize for cost, time, or both with Pareto frontiers.

Research-Quality Engineering: SIMD serialization with schema evolution. Statistical flaky detection with Bayesian inference. OpenTelemetry observability. Circuit breaker resilience patterns.

Three-Tier Caching: Target cache, action cache, and distributed cache provide comprehensive build output reuse at multiple granularities.

Modern Type Systems: Rust-style Result monads with full monadic operations. No exceptions in hot paths. Composable error handling prevents information loss.

Systems-Level Performance: Arena allocators, object pooling, zero-copy deserialization, SIMD everywhere possible. C++ performance with D safety.

Production Observability: Distributed tracing enabled by default. Structured logging. Flamegraph generation. Build replay. Circuit breakers with metrics.


Builder represents a generational advancement in build system architecture: dynamic graphs eliminate code generation complexity, economic optimization treats compute as an asset, SIMD-accelerated serialization provides exceptional performance, multi-tier caching maximizes reuse, and enterprise resilience patterns prevent cascading failures. Built for modern polyglot monorepos with production-grade distributed systems engineering.

About

Polyglot build framework - 5x less boilerplate than Bazel, easier learning curve than Buck2.

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Packages

No packages published