feat: Epiplexity Demo 1 - Game of Life Emergent Objects#10
Conversation
|
You have run out of free Bugbot PR reviews for this billing cycle. This will reset on March 7. To receive reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 229ed4d2d7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| _try_run_with_pynetlogo(args.ticks) | ||
| _generate_baseline_if_missing(rows=args.ticks, force=args.refresh_baseline) |
There was a problem hiding this comment.
Skip baseline generation after a successful live run
The script ignores the boolean result of _try_run_with_pynetlogo and always calls _generate_baseline_if_missing, which can silently replace real NetLogo outputs with simulated baseline data. In a clean results directory, a live run creates bounded-output.csv and persistent-output.csv but not demo-output.csv, so the baseline path runs and rewrites both files, corrupting the experiment summary and any downstream analysis.
Useful? React with 👍 / 👎.
| link.load_model(str(MODEL_PATH)) | ||
| link.command(f"set episode-length {ticks}") | ||
| link.command("run-episode-bounded") | ||
| link.command("run-episode-persistent") |
There was a problem hiding this comment.
Reapply episode-length before persistent NetLogo run
The live path sets episode-length only once before run-episode-bounded, but run-episode calls setup, and setup-defaults resets episode-length to 50 in the NetLogo model. That means run-episode-persistent runs at 50 ticks whenever --ticks is not 50, producing unequal run lengths and biasing the bounded-vs-persistent comparison.
Useful? React with 👍 / 👎.
Code Review: PR #10 - Game of Life Emergent ObjectsWhat Looks Good
Issues Found
Suggestions
ApprovalApproved with minor fix recommended. Core logic correctly validates Paradox 1. |
229ed4d to
1e3e843
Compare
Major changes:
- Convert from .nlogo to .nlogox (NetLogo 7.0.3 XML format)
- Replace random observer walk with deterministic waypoint navigation
visiting glider(10,10), blinker(30,30), block(20,20), random(25,25)
- Rich temporal memory: window-history-buffer stores (tick, grid, label)
tuples instead of flat text labels
- Fair comparison: both episodes re-seed identically for same GoL state
- Add real-time GUI: prediction accuracy plot, label/prediction monitors,
output widget with per-tick narration
- Add stop button for interrupting long episodes
- Configure for Ollama (local) instead of OpenAI
- Remove simulated baseline from test harness; require real CSV data
- Replace lower-case (not a NetLogo 7 builtin) with to-lower-case reporter
- Fix 'let label' shadowing NetLogo builtin turtle variable
- Update macro_predict.yaml template header for window history format
Known issues and improvements needed:
1. BLOCKING: Qwen 3 thinking mode — Qwen 3 models return empty content
field and put all output in a 'thinking' field. The LLM extension
reads message.content which is empty, so llm:choose falls back to
random selection. Fix options:
a) Use non-thinking models (qwen2.5, llama3, gemma2)
b) Patch OllamaProvider.scala to pass "think": false in request body
c) Patch OllamaProvider.parseProviderResponse to read message.thinking
when message.content is empty
2. Glider drift — The glider moves away from its seed position (10,10)
over GoL ticks, so the observer may see empty space at the waypoint.
Options: track glider position dynamically, increase observation
region, or accept it as part of the bounded-observer narrative.
3. BehaviorSpace headless — NetLogo 7.0.3 has a known bug where
BehaviorSpace headless mode fails with "head of empty list" even
on bundled sample models. Cannot use headless for automated testing.
Workaround: use sbt test for compilation checks, NetLogo GUI for
functional testing.
4. Config not tracked — config.txt is untracked (may contain API keys).
Currently set to provider=ollama model=qwen3:4b. Should be updated
to a non-thinking model (e.g., qwen2.5:3b) before running.
…-full-context # Conflicts: # .bundledFiles
…e per-tick LLM observer Completely rewrites the Game of Life demo so the LLM acts as a scientific observer — describing patterns in free text, predicting next grid states, and building theories over time. Memory vs no-memory is a live toggle. Key changes: - Per-tick LLM calls in `go` (describe + predict + periodic reflect) - New seeds: R-pentomino, Gosper glider gun, pulsar (dramatic evolution) - Interactive controls: memory-mode chooser, show-observations switch, reflect-every and episode-length sliders - 3 new YAML templates (describe, predict_grid, reflect) - Removed batch procedures (run-episode, run-comparison, analyze-results) - Removed constrained-choice labeling (llm:choose from fixed list) - Removed Python test harness (batch workflow obsolete) - History management: bounded clears per tick, persistent caps at 20 - Grid prediction accuracy scored cell-by-cell (0-100%) - Config monitor shows active file and model
Epiplexity Demo 1: Emergent Object Discovery
Validates Paradox 1 from From Entropy to Epiplexity (Finzi et al., 2026): arXiv:2601.03220
What This Does
Results
Files
demos/epiplexity-01-emergent-objects/game_of_life.nlogo- NetLogo modeldemos/epiplexity-01-emergent-objects/templates/- YAML prompts for LLMdemos/epiplexity-01-emergent-objects/results/- CSV output and plotsdemos/epiplexity-01-emergent-objects/tests/- Unit testsHow to Run
game_of_life.nlogoin NetLogo 6.4+config.txtsetupthengoresults/folder for CSV and SVG outputs