LLM Controlled Generation

This homework explores three complementary approaches to controlling the output of Large Language Models, progressing from token-level constraints to multi-sample selection to iterative self-correction with formal verification feedback.

Overview

The homework is organized in three parts, each building on concepts from the previous one:

Part	Folder	Topic	Key idea
1	`structured_generation/`	Structured Generation	Constrain each token at generation time using regex-compiled FSMs
2	`meta_generation/`	Meta-Generation	Generate multiple samples and select the best using ranking strategies
3	`self_correction/`	Self-Correction	Iteratively refine code using verifier error feedback and tree search

📄 Complete instructions: llm_controlled_generation.pdf

Part 1 — Structured Generation

LLMs generate text token by token, sampling from the full vocabulary at each step. Structured generation guarantees that every generated token is part of a valid output by constraining the sampling at each step. You will build a pipeline that compiles a regex into a token-level finite state machine, then use it to mask logits during generation with a pure-NumPy GPT-2 implementation.

Modules: naive vs. DFA masking, character-to-token FSM compilation, coalescence & forced-token optimization, end-to-end constrained generation with picoGPT.

Part 2 — Meta-Generation

When a single LLM sample may or may not be correct, a natural idea is to generate multiple samples and select the best one. You will implement and compare four selection strategies — pass@1, best-of-n by log-probability, and two Minimum Bayes Risk (MBR) decoders using edit similarity and execution similarity — evaluated on the MBPP coding benchmark.

Modules: parallel sample generation, ranking by model confidence, MBR with Levenshtein distance, MBR with execution agreement.

Part 3 — Self-Correction

Some tasks require provably correct code. You will build a tree-search refinement pipeline that generates formally verified Rust code using Verus. The LLM generates candidates, Verus checks them, and compiler error messages drive iterative refinement. A REBASE sampling strategy focuses search effort on the most promising branches.

Modules: parallel initial generation, Verus integration and scoring, code validation and trivial-proof detection, iterative refinement with tree search and REBASE.

Setup

Requirements

Python >= 3.10
Ollama (for Parts 2 and 3)
Verus (for Part 3 only)

Installation

Clone the repo and then run:

cd llm_controlled-generation
uv sync

Dependencies

Installed automatically via uv sync:

numpy, scipy — numerical computation
regex, interegular — regex parsing and DFA construction (Part 1)
transformers — tokenizer and model weights (Part 1)
datasets — MBPP benchmark (Part 2)
jellyfish — Levenshtein distance (Part 2)
openai — OpenAI-compatible API client for Ollama (Parts 2 and 3)

Ollama setup (Parts 2 and 3)

ollama pull qwen2.5-coder:7b
ollama serve

Running

Part 1 — Structured Generation

Each module can be run independently:

uv run python structured_generation/deterministic_finite_automaton.py   # Naive vs. DFA masking benchmark
uv run python structured_generation/fsm_token.py                        # Character-to-token FSM demo
uv run python structured_generation/coalescence.py                      # Coalescence benchmark on GPT-2 vocab
uv run python structured_generation/picoGPT_generate.py                 # End-to-end constrained generation

Part 2 — Meta-Generation

uv run python meta_generation/meta_generation.py   # Evaluate selection strategies on MBPP

Part 3 — Self-Correction

uv run python self_correction/self_correction.py --verus-path /path/to/verus

Tests

Run all required tests:

uv run pytest tests/test_structured_generation_required.py   # Part 1
uv run pytest tests/test_meta_generation_required.py          # Part 2
uv run pytest tests/test_self_correction_required.py          # Part 3

Or run all benchmarks and tests at once:

./test_and_submit.sh

Project Structure

llm_controlled-generation/
├── structured_generation/
│   ├── structured_generation.md           # Part 1 description
│   ├── deterministic_finite_automaton.py  # Naive vs. DFA masking
│   ├── fsm_token.py                      # Character-level DFA to token-level FSM
│   ├── coalescence.py                    # Mask deduplication and forced tokens
│   ├── picoGPT_generate.py              # GPT-2 in NumPy + constrained generation
│   └── utils.py                          # GPT-2 weight loading
├── meta_generation/
│   ├── meta_generation.md                 # Part 2 description
│   ├── meta_generation.py               # Sample generation and ranking strategies
│   └── utils.py                          # Prompt construction and safe execution
├── self_correction/
│   ├── self_correction.md                 # Part 3 description
│   ├── self_correction.py                # Tree-search refinement pipeline
│   ├── utils.py                          # Verus integration, validation, scoring
│   └── examples/
│       └── incr_list.rs                  # Example Verus specification
├── tests/                                 # Test suite
├── pyproject.toml
└── README.md                              # This file

License

Apache 2.0

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

LLM Controlled Generation

Overview

Part 1 — Structured Generation

Part 2 — Meta-Generation

Part 3 — Self-Correction

Setup

Requirements

Installation

Dependencies

Ollama setup (Parts 2 and 3)

Running

Part 1 — Structured Generation

Part 2 — Meta-Generation

Part 3 — Self-Correction

Tests

Project Structure

License

About

Uh oh!

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
meta_generation		meta_generation
self_correction		self_correction
structured_generation		structured_generation
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
llm_controlled_generation.pdf		llm_controlled_generation.pdf
pyproject.toml		pyproject.toml
test_and_submit.sh		test_and_submit.sh
uv.lock		uv.lock

License

dataflowr/llm_controlled-generation

Folders and files

Latest commit

History

Repository files navigation

LLM Controlled Generation

Overview

Part 1 — Structured Generation

Part 2 — Meta-Generation

Part 3 — Self-Correction

Setup

Requirements

Installation

Dependencies

Ollama setup (Parts 2 and 3)

Running

Part 1 — Structured Generation

Part 2 — Meta-Generation

Part 3 — Self-Correction

Tests

Project Structure

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages