Let's Bring Brain To Robots
Visible robotics demos driven by VLM policies, OpenClaw, and AI coding agents.
Roboclaws is a thin demo repo for making AI-driven robotics behavior reviewable: frames, maps, tool traces, scores, and public/private evaluation boundaries are published as HTML reports instead of buried in terminal logs.
It answers three practical questions:
- How can an AI agent drive a robot?
- What context and tools does the agent need?
- What did the agent actually do in the simulated or robot-backed world?
Roboclaws treats reusable robot behavior as skills first and MCP tools as a bounded public robot capability surface.
| Principle | Practice |
|---|---|
| Start from open-ended goals | A user asks for work such as "clean the room" or "take useful photos"; an agent selects or creates a skill to do it. |
| Keep tasks as run surfaces | Public commands such as semantic-map-build and household-cleanup own parameters, reports, and acceptance gates. |
| Keep strategy in skills | Skills own prompt strategy, scripts, examples, checks, and task-specific loops such as photo capture or cleanup. |
| Keep MCP bounded | MCP tools expose semantic robot capabilities like observe, move, pick, place, and done; they should not hide a whole task behind one opaque call. |
| Profile public capabilities | Semantic profiles describe reusable capability environments that skills can require; profiles compose by requirement, not by copying another profile's tools. |
| Label privileged help | Simulator or demo helpers such as full object inventory and target-relative teleport are useful, but they stay labeled as privileged tools, not canonical robot abilities. |
| Protect private evaluation truth | Hidden mess sets, acceptable destinations, private manifests, and scoring truth stay out of public profile metadata and agent-facing skill inputs. |
| Let reports improve skills | Traces, artifacts, and evals feed the skill lifecycle: improve, split, merge, prune, or promote behavior only when the boundary is stable. |
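The profile and privileged-tool principles above can be sketched as a small requirement check. This is a minimal illustration, not code from the Roboclaws repository: the `Tool`, `Profile`, and `Skill` types, the `check_skill` helper, and the tool name `teleport_to_target` are all hypothetical. A skill declares the capabilities it requires, a profile lists the tools it exposes, and privileged helpers stay labeled so their use is always visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """Hypothetical MCP tool descriptor (not the Roboclaws schema)."""
    name: str
    privileged: bool = False  # simulator/demo helper, not a canonical robot ability


@dataclass(frozen=True)
class Profile:
    """Hypothetical capability profile: a named set of tools."""
    name: str
    tools: tuple

    def public_names(self):
        return {t.name for t in self.tools if not t.privileged}

    def privileged_names(self):
        return {t.name for t in self.tools if t.privileged}


@dataclass(frozen=True)
class Skill:
    """Hypothetical skill that declares the capabilities it requires."""
    name: str
    requires: frozenset


def check_skill(skill, profile):
    """Return (missing, privileged_used) as sorted lists of tool names."""
    available = profile.public_names() | profile.privileged_names()
    missing = skill.requires - available
    privileged_used = skill.requires & profile.privileged_names()
    return sorted(missing), sorted(privileged_used)


household = Profile(
    "household",
    (
        Tool("observe"), Tool("move"), Tool("pick"), Tool("place"), Tool("done"),
        Tool("teleport_to_target", privileged=True),  # assumed helper name
    ),
)
cleanup = Skill("cleanup", frozenset({"observe", "move", "pick", "place", "done"}))
demo = Skill("demo-shortcut", frozenset({"observe", "teleport_to_target"}))

print(check_skill(cleanup, household))  # ([], [])
print(check_skill(demo, household))     # ([], ['teleport_to_target'])
```

Reporting privileged use separately, rather than rejecting it, keeps the helpers usable for demos while making the dependency explicit in whatever report consumes the result.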
The working abstraction ladder is:

    open-ended goal
    -> runnable task
    -> agent skill
    -> capability profile requirements
    -> MCP capability tools
    -> backend variant

Default decision: improve or add a skill when behavior changes; add or rename a runnable task only when the public command, parameters, report shape, or acceptance gates change. Promote behavior into MCP only when multiple skills need it, the input/output shape is stable, public/private boundaries are clear, and traces can preserve the important substeps. The detailed profile and skill reference is docs/human/mcp-skills-and-semantic-profiles.md.
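The default decision above can be written down as a small checklist. The sketch below is only an illustration of the rule as stated in this README: the `Change` type, its field names, and the `decide` helper are hypothetical, and "multiple skills" is assumed to mean two or more.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical description of a proposed change."""
    behavior_changed: bool = False
    # True when the public command, parameters, report shape, or acceptance gates change.
    public_surface_changed: bool = False
    skills_needing_it: int = 0
    io_shape_stable: bool = False
    boundaries_clear: bool = False
    traces_preserve_substeps: bool = False


def decide(change):
    """Return the list of actions the rule suggests for a change."""
    actions = []
    if change.behavior_changed:
        actions.append("improve or add a skill")
    if change.public_surface_changed:
        actions.append("add or rename a runnable task")
    if (
        change.skills_needing_it >= 2  # assumption: "multiple" means at least two
        and change.io_shape_stable
        and change.boundaries_clear
        and change.traces_preserve_substeps
    ):
        actions.append("promote behavior into MCP")
    return actions


print(decide(Change(behavior_changed=True)))
# ['improve or add a skill']
```

All four promotion conditions must hold together, so a behavior shared by several skills still stays in skills while its input/output shape is changing.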
Install the project once:

    uv sync --extra dev --extra openclaw

For MolmoSpaces/MuJoCo cleanup demos, include the heavier extra:

    uv sync --extra dev --extra molmospaces

The public command grammar is:

    just task::run <task> <driver> [report|profile] [key=value ...]

For full command routing, profiles, and maintainer-only recipes, read just/README.md.
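To make the grammar concrete, here is a minimal Python sketch that splits the arguments after `just task::run` into their parts. It is not the repository's routing code: `TaskInvocation` and `parse_task_args` are hypothetical names, and the rule that the optional third token is recognized by the absence of `=` is an assumption inferred from the example commands in this README.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TaskInvocation:
    """Hypothetical parsed form of `<task> <driver> [report|profile] [key=value ...]`."""
    task: str
    driver: str
    mode: Optional[str] = None  # the optional [report|profile] token
    params: dict = field(default_factory=dict)


def parse_task_args(argv):
    """Parse the tokens that follow `just task::run`."""
    if len(argv) < 2:
        raise ValueError("expected: <task> <driver> [report|profile] [key=value ...]")
    task, driver, *rest = argv
    mode = None
    # Assumption: a third token without "=" is the report/profile selector.
    if rest and "=" not in rest[0]:
        mode = rest.pop(0)
    params = {}
    for token in rest:
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {token!r}")
        params[key] = value
    return TaskInvocation(task, driver, mode, params)


inv = parse_task_args("coverage vlm visual agents=2 steps=100".split())
print(inv.task, inv.driver, inv.mode, inv.params)
```

Values are kept as strings here; any type conversion (for example `steps=100` to an integer) would belong to the task that owns the parameter.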
To monitor and launch the supported local coding-agent household routes from a standalone browser console, run:

    just console::run

The console supports route metadata for local coding-agent drivers such as Codex and Claude Code; it does not accept arbitrary browser-submitted shell commands.
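One common way to enforce a "no arbitrary shell commands" boundary is an allowlist: the browser submits only a route identifier, and the server maps it to a fixed argument list. The sketch below shows that pattern under stated assumptions; the `ROUTES` table, the route identifiers, and `command_for` are hypothetical and do not describe how the Roboclaws console is actually implemented.

```python
# Hypothetical allowlist: route id -> fixed argv. The ids are illustrative only.
ROUTES = {
    "household-cleanup/codex": ["just", "task::run", "household-cleanup", "codex"],
    "household-cleanup/claude": ["just", "task::run", "household-cleanup", "claude"],
}


def command_for(route_id):
    """Return a copy of the fixed argv for a known route; reject anything else."""
    try:
        return list(ROUTES[route_id])
    except KeyError:
        raise ValueError(f"unsupported route: {route_id!r}") from None


print(command_for("household-cleanup/claude"))
```

Because the result is an argument list rather than a shell string, it can be handed to a process launcher without shell interpretation, so text submitted by the browser never becomes part of a command.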
GitHub Actions publishes the report site at
miaodx.com/roboclaws. If a link looks stale,
check the CI workflow:
Pages republishes from successful main runs.
| Demo | What it proves | Run it locally | Live CI report |
|---|---|---|---|
| AI2-THOR territory | Multiple robots compete for reachable cells in an iTHOR scene. | Local VLM route is being repaired; use mock/OpenClaw reports for now. | mock, Kimi smoke, OpenClaw |
| AI2-THOR coverage | Multiple robots cooperate to cover as much of the room as possible. | `just task::run coverage vlm visual agents=2 steps=100` | mock, Kimi smoke, OpenClaw |
| OpenClaw navigation | OpenClaw Gateway agents control robots through the shared Roboclaws APIs. | `just task::run ai2thor-nav openclaw visual` | openclaw/demo/report.html |
| Coding-agent MCP control | Docker-backed Codex or Claude Code drives the robot directly through MCP tools. | `just task::run ai2thor-nav codex visual` or `just task::run ai2thor-nav claude visual` | Local-only today; reports write to `output/runs/<stamp>/`. |
| Photo task | A robot navigates the room and photographs chairs/sofas. | `just task::run photo-chairs openclaw visual` | Local/OpenClaw report artifact. |
| Semantic map build | A no-cleanup sweep starts from the minimal navigation map and builds public runtime map evidence. Online `runtime_metric_map.json` output and converted Agibot `navigation_memory.json` can both feed the canonical Actionable Semantic Map Snapshot contract. | `just task::run semantic-map-build direct evidence_lane=world-oracle-labels seed=7 generated_mess_count=5` | Local artifact today. |
| Household cleanup | A cleanup agent tidies a generated household mess from minimal map context while private scoring stays hidden. | `just task::run household-cleanup direct evidence_lane=world-oracle-labels seed=7 generated_mess_count=5` | Molmo live index, Kimi K2.6, MiMo v2.5 Pro, MiMo v2.5 |
| Household live agent | Docker-backed Claude Code or Codex connects to the cleanup MCP server and produces the same cleanup report shape. | `just task::run household-cleanup claude evidence_lane=world-oracle-labels seed=7 generated_mess_count=5` | Same Molmo live index; CI currently runs Claude Code through Kimi/MiMo provider profiles. |
| Agent operator console | Standalone local browser console for supported Codex and Claude Code household routes with backend locks, route-specific gates, live state, and artifact links. | `just console::run` | Local-only operator surface. |
| Railway appliance | Single-container hosted demo with UI, viewer, Gateway, and AI2-THOR. | `DEMO_PASSWORD=demo just appliance::run local` | Local appliance surface. |
| Maintainer gate | Fast mock confidence check before shipping repo changes. | `just agent::verify mock` | CI status: workflow |
See ARCHITECTURE.md for the code map and the full operating mode contract.
| Need | Read |
|---|---|
| Code map and operating modes | ARCHITECTURE.md |
| Human setup/runbooks/domain docs | docs/human/README.md |
| Detailed MCP profile reference | docs/human/mcp-skills-and-semantic-profiles.md |
| Skill library convention | skills/README.md |
| Public command grammar | just/README.md |
| Local keys and report artifacts | docs/human/local-runtime.md |
| Coding-agent navigation guide | docs/human/coding-agent-nav-server.md |
| MolmoSpaces settings | docs/human/molmospaces-settings.md |
| Current project focus | STATUS.md |
| Agent operating rules | AGENTS.md |
- Roboharness - visual testing harness for AI coding agents in robot simulation
- Robowbc - whole-body-control experiments
- OpenClaw - open-source personal AI assistant
- ROSClaw - OpenClaw to ROS 2 bridge
- AI2-THOR - interactive 3D indoor simulation
MIT