jespera/diffract

diffract

An OCaml library and CLI tool that parses source files with tree-sitter and matches patterns written in concrete syntax.

Features

  • Parse source files to S-expressions using tree-sitter grammars
  • Pattern matching with concrete syntax and metavariables
  • Semantic patch transforms (find-and-replace at the AST level)
  • Expansion transforms: join or restructure each element of a matched sequence
  • Support for TypeScript, Kotlin, PHP, Scala, and other languages (extensible)

Building

Prerequisites

  • OCaml 5.2+
  • opam
  • tree-sitter library (libtree-sitter) - see below
  • npm (to fetch grammar sources)

Installing tree-sitter

macOS:

brew install tree-sitter

Ubuntu/Debian:

sudo apt install libtree-sitter-dev

Arch Linux:

sudo pacman -S tree-sitter

Build Steps

# Install OCaml dependencies (--with-test also pulls in test/benchmark deps; drop it if you don't need them)
opam install . --deps-only --with-test

# Build grammar libraries (TypeScript, Kotlin)
cd grammars && ./build-grammars.sh && cd ..

# Build the project
dune build

# Run tests
dune test

# Format code (requires ocamlformat: opam install ocamlformat)
dune fmt

Formatting is managed via ocamlformat. Run dune fmt to reformat all OCaml and dune files before committing.

Note: dune fmt emits "Stray '@'" warnings for @@ delimiters appearing in doc comments. These are harmless and can be ignored.

Usage

# Parse a file and print its syntax tree
diffract example.ts

# Parse with explicit language
diffract --language kotlin example.kt

# Match a pattern against a single file
diffract --match pattern.txt source.ts

# Scan a directory for pattern matches
diffract --match pattern.txt --include '*.ts' src/

# Scan with custom directory exclusions
diffract --match pattern.txt --include '*.ts' -e vendor -e dist src/

# Apply a semantic patch (preview diff)
diffract --apply --match patch.txt source.ts

# Apply a semantic patch in place
diffract --apply --in-place --match patch.txt source.ts

# Apply across a directory
diffract --apply --match patch.txt --include '*.ts' src/

# List available languages
diffract --list-languages

Transforms (Semantic Patches)

Patterns can include -/+ prefixed lines to describe code transformations. For example, to rename console.log to logger.info:

patch.txt:

@@
match: strict
metavar $MSG: single
@@
- console.log($MSG)
+ logger.info($MSG)

$ diffract --apply --match patch.txt source.ts
--- a/source.ts
+++ b/source.ts
@@ -1,3 +1,3 @@
 function greet(name: string) {
-    console.log(name);
+    logger.info(name);
 }

Lines prefixed with - are matched and removed; lines with + are inserted. Unprefixed (or space-prefixed) lines are context that appears in both match and replace. Metavariables carry values from the match side to the replace side.
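The prefix rule above can be sketched in a few lines of TypeScript. This is only an illustration of how the two sides of a patch body relate, not diffract's implementation: splitPatchBody and PatchSides are hypothetical names, the @@ preamble is assumed to have been stripped already, and dropping leading indentation is a simplification.

```typescript
// Hypothetical illustration of the -/+ line rule; not diffract's actual code.
interface PatchSides {
  match: string[];   // lines the pattern must find in the source
  replace: string[]; // lines emitted in place of the match
}

function stripIndent(text: string): string {
  return text.replace(/^\s+/, "");
}

function splitPatchBody(lines: string[]): PatchSides {
  const sides: PatchSides = { match: [], replace: [] };
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    const marker = line.charAt(0);
    if (marker === "-") {
      // Removed line: appears on the match side only.
      sides.match.push(stripIndent(line.slice(1)));
    } else if (marker === "+") {
      // Inserted line: appears on the replace side only.
      sides.replace.push(stripIndent(line.slice(1)));
    } else {
      // Context line (unprefixed or space-prefixed): appears on both sides.
      const text = stripIndent(line);
      sides.match.push(text);
      sides.replace.push(text);
    }
  }
  return sides;
}

const sides = splitPatchBody(["- console.log($MSG)", "+ logger.info($MSG)"]);
// sides.match   is ["console.log($MSG)"]
// sides.replace is ["logger.info($MSG)"]
```

The sketch does not handle expansion-separator prefixes, which are described in the next section.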

Expansion transforms

Use a separator character as the line prefix (instead of +) to expand each element of a sequence metavar and join the results. Any punctuation character works as long as it is not a reserved spatch marker or an identifier character. ~ joins with a newline; every other character is used literally as the join string:

patch.txt (move one imported name, comma-join the rest):

@@
match: strict
metavar $BEFORE: sequence
metavar $AFTER: sequence
@@
- import { $BEFORE Stack $AFTER } from "@mui/system";
+ import {
,   $BEFORE $AFTER
+ } from "@mui/system";
+ import { Stack } from "@mui/not.system";
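The joining rule can be modelled as a plain string join. The sketch below is an assumption-laden illustration rather than diffract's code: expandAndJoin and joinString are hypothetical names, and the element values stand in for whatever $BEFORE and $AFTER happen to bind to in a given file.

```typescript
// Hypothetical model of separator-prefixed expansion lines; not diffract's actual code.

// "~" stands for a newline; any other separator character is used literally.
function joinString(separator: string): string {
  return separator === "~" ? "\n" : separator;
}

// Join the already-expanded elements of a sequence metavar.
function expandAndJoin(separator: string, elements: string[]): string {
  return elements.join(joinString(separator));
}

// Assumed bindings: $BEFORE matched "Box", $AFTER matched "Grid".
const rest = expandAndJoin(",", ["Box", "Grid"]);
// rest is "Box,Grid"

const chain = expandAndJoin("~", [".a()", ".b()"]);
// chain is ".a()" and ".b()" on separate lines
```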

For per-element transforms (e.g. converting a match expression to a method chain), use a two-section pattern and Match.transform_nested:

@@
match: strict
metavar $TAG: single
metavar $CASES: sequence
@@
- matchStringExhaustive($TAG, {
-   $CASES
- });
+ match($TAG)
~   $CASES
+   .exhaustive();
@@
match: field
on $CASES
metavar $KEY: single
metavar $VAL: single
@@
- $KEY: $VAL
+ .with("$KEY", $VAL)

Applied to:

matchStringExhaustive(tag, { A: () => 1, B: () => 2 });

Produces:

match(tag)
.with("A", () => 1)
.with("B", () => 2)
.exhaustive();
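To make the two-section pattern concrete, the sketch below hand-writes the same rewrite in TypeScript: an inner step rewrites each case, and an outer step joins the results with newlines between the fixed first and last lines. It operates on pre-extracted bindings rather than on a syntax tree, and rewriteCase, rewriteMatch, and CaseBinding are hypothetical names, so it shows the shape of the transform and not how Match.transform_nested works internally.

```typescript
// Hypothetical hand-written equivalent of the two-section patch above.
interface CaseBinding {
  key: string; // what $KEY matched, e.g. "A"
  val: string; // what $VAL matched, e.g. "() => 1"
}

// Inner section: "- $KEY: $VAL" becomes '+ .with("$KEY", $VAL)'.
function rewriteCase(c: CaseBinding): string {
  return '.with("' + c.key + '", ' + c.val + ")";
}

// Outer section: emit match($TAG), the rewritten cases joined by
// newlines (the "~" separator), then .exhaustive();
function rewriteMatch(tag: string, cases: CaseBinding[]): string {
  const lines: string[] = ["match(" + tag + ")"];
  for (let i = 0; i < cases.length; i++) {
    lines.push(rewriteCase(cases[i]));
  }
  lines.push(".exhaustive();");
  return lines.join("\n");
}

const rewritten = rewriteMatch("tag", [
  { key: "A", val: "() => 1" },
  { key: "B", val: "() => 2" },
]);
// rewritten is the four-line method chain shown under "Produces:" above
```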

See the Transform documentation for details on partial-mode, field-mode, and expansion transforms.

Directory Scanning

When the target is a directory, use --include to specify which files to scan:

Option                   Description
--include GLOB / -i      Glob pattern for files (e.g., *.ts, *.py). Required for directories.
--exclude DIR / -e       Directory names to skip (repeatable). Defaults: node_modules, .git, _build, target, __pycache__, .hg, .svn

Supported glob patterns:

  • *.ts - files ending with .ts
  • prefix* - files starting with prefix
  • *suffix - files ending with suffix
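The three glob shapes above all reduce to a prefix check plus a suffix check around a single *. The following sketch shows that idea together with directory exclusion. It is an assumption about the behaviour, not diffract's code: matchesGlob and shouldScan are hypothetical names, only one * per pattern is handled, a pattern without * is assumed to require an exact match, and whether -e extends or replaces the defaults is not modelled (the caller passes the full list).

```typescript
// Hypothetical sketch of file selection during a directory scan; not diffract's actual code.

// Default directory names skipped, as listed in the options table above.
const DEFAULT_EXCLUDES: string[] = [
  "node_modules", ".git", "_build", "target", "__pycache__", ".hg", ".svn",
];

// Match a file name against a glob containing at most one "*".
function matchesGlob(glob: string, fileName: string): boolean {
  const star = glob.indexOf("*");
  if (star === -1) {
    return glob === fileName; // assumption: no wildcard means exact match
  }
  const prefix = glob.slice(0, star);
  const suffix = glob.slice(star + 1);
  return (
    fileName.length >= prefix.length + suffix.length &&
    fileName.slice(0, prefix.length) === prefix &&
    fileName.slice(fileName.length - suffix.length) === suffix
  );
}

// Decide whether a relative path such as "src/api/auth.ts" should be scanned.
function shouldScan(path: string, glob: string, excludes: string[]): boolean {
  const segments = path.split("/");
  const fileName = segments[segments.length - 1];
  // Skip the file if any directory on its path is excluded.
  for (let i = 0; i < segments.length - 1; i++) {
    if (excludes.indexOf(segments[i]) !== -1) {
      return false;
    }
  }
  return matchesGlob(glob, fileName);
}

const scanned = shouldScan("src/api/auth.ts", "*.ts", DEFAULT_EXCLUDES);
// scanned is true
const skipped = shouldScan("src/node_modules/x.ts", "*.ts", DEFAULT_EXCLUDES);
// skipped is false
```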

Example output:

src/api/auth.ts:15: console.log("login")
  $msg = "login"

src/utils/logger.ts:8: console.log("initialized")
  $msg = "initialized"

Found 2 match(es) in 2 file(s) (scanned 47 files)

Documentation

Match Architecture (Quick Overview)

The matching pipeline is split into focused modules:

  • match_parse handles @@ preambles, metavars, ellipsis expansion, and spatch line classification.
  • match_engine performs the structural matching (strict, field, partial) and handles sequence metavars (except in partial mode).
  • match_search drives traversal, nested pattern contexts, indexing, and formatting.
  • match_transform computes edits from match results and applies them to source text.
  • match exposes the public API surface.

License

GPL-3.0-or-later
