diff --git a/docs/cli/configuration/settings.mdx b/docs/cli/configuration/settings.mdx
index 93e5b974..a05536cc 100644
--- a/docs/cli/configuration/settings.mdx
+++ b/docs/cli/configuration/settings.mdx
@@ -62,7 +62,7 @@ Choose the default AI model that powers your droid:
 - **`gpt-5.2`** - OpenAI GPT-5.2
 - **`haiku`** - Claude Haiku 4.5, fast and cost-effective
 - **`gemini-3-pro`** - Gemini 3 Pro
-- **`droid-core`** - GLM-4.6 open-source model
+- **`droid-core`** - GLM-4.7 open-source model
 - **`custom-model`** - Your own configured model via BYOK
 
 [You can also add custom models and BYOK.](/cli/configuration/byok)
diff --git a/docs/cli/droid-exec/overview.mdx b/docs/cli/droid-exec/overview.mdx
index 48ff98e4..5880434d 100644
--- a/docs/cli/droid-exec/overview.mdx
+++ b/docs/cli/droid-exec/overview.mdx
@@ -76,7 +76,7 @@ Supported models (examples):
 - gpt-5.1-codex
 - gpt-5.1
 - gemini-3-pro-preview
-- glm-4.6
+- glm-4.7
 
 See the [model table](/pricing#pricing-table) for the full list of available models and their costs.
 
diff --git a/docs/cli/user-guides/choosing-your-model.mdx b/docs/cli/user-guides/choosing-your-model.mdx
index 50a13fb5..98243cd2 100644
--- a/docs/cli/user-guides/choosing-your-model.mdx
+++ b/docs/cli/user-guides/choosing-your-model.mdx
@@ -4,11 +4,11 @@ description: Balance accuracy, speed, and cost by picking the right model and re
 keywords: ['model', 'models', 'llm', 'claude', 'sonnet', 'opus', 'haiku', 'gpt', 'openai', 'anthropic', 'choose model', 'switch model']
 ---
 
-Model quality evolves quickly, and we tune the CLI defaults as the ecosystem shifts. Use this guide as a snapshot of how the major options compare today, and expect to revisit it as we publish updates. This guide was last updated on Thursday, December 4th 2025.
+Model quality evolves quickly, and we tune the CLI defaults as the ecosystem shifts. Use this guide as a snapshot of how the major options compare today, and expect to revisit it as we publish updates. This guide was last updated in February 2026.
 
 ---
 
-## 1 · Current stack rank (December 2025)
+## 1 · Current stack rank (February 2026)
 
 | Rank | Model | Why we reach for it |
 | ---- | ----------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
@@ -20,7 +20,11 @@ Model quality evolves quickly, and we tune the CLI defaults as the ecosystem shi
 | 6 | **Claude Haiku 4.5** | Fast, cost-efficient for routine tasks and high-volume automation. |
 | 7 | **Gemini 3 Pro** | Strong at mixed reasoning with Low/High settings; helpful for researchy flows with structured outputs. |
 | 8 | **Gemini 3 Flash** | Fast, cheap (0.2× multiplier) with full reasoning support; great for high-volume tasks where speed matters. |
-| 9 | **Droid Core (GLM-4.6)** | Open-source, 0.25× multiplier, great for bulk automation or air-gapped environments; note: no image support. |
+| 9 | **Droid Core (GLM-4.7)** | Open-source, 0.25× multiplier, great for bulk automation or air-gapped environments; note: no image support. |
+
+<Note>
+  If your organization has access, **Claude Opus 4.6** (and **Opus 4.6 Fast Mode**) may appear as additional options. **Opus 4.6 Fast Mode** is available for some accounts at a promotional rate through **Monday, February 16**.
+</Note>
 
 <Note>
   We ship model updates regularly. When a new release overtakes the list above,
@@ -65,7 +69,7 @@ Tip: you can swap models mid-session with `/model` or by toggling in the setting
 - **GPT-5.2**: Low / Medium / High (default: Low)
 - **Gemini 3 Pro**: Low / High (default: High)
 - **Gemini 3 Flash**: Minimal / Low / Medium / High (default: High)
-- **Droid Core (GLM-4.6)**: None only (default: None; no image support)
+- **Droid Core (GLM-4.7)**: None only (default: None; no image support)
 
 Reasoning effort increases latency and cost—start low for simple work and escalate as needed. **Extra High** is only available on GPT-5.1-Codex-Max.
 
@@ -82,14 +86,14 @@ Factory ships with managed Anthropic and OpenAI access. If you prefer to run aga
 
 ### Open-source models
 
-**Droid Core (GLM-4.6)** is an open-source alternative available in the CLI. It's useful for:
+**Droid Core (GLM-4.7)** is an open-source alternative available in the CLI. It's useful for:
 
 - **Air-gapped environments** where external API calls aren't allowed
 - **Cost-sensitive projects** needing unlimited local inference
 - **Privacy requirements** where code cannot leave your infrastructure
 - **Experimentation** with open-source model capabilities
 
-**Note:** GLM-4.6 does not support image attachments. For image-based workflows, use Claude or GPT models.
+**Note:** GLM-4.7 does not support image attachments. For image-based workflows, use Claude or GPT models.
 
 To use open-source models, you'll need to configure them via BYOK with a local inference server (like Ollama) or a hosted provider. See [BYOK documentation](/cli/configuration/byok) for setup instructions.
 
diff --git a/docs/guides/building/droid-exec-tutorial.mdx b/docs/guides/building/droid-exec-tutorial.mdx
index 246dc57b..e14e4d83 100644
--- a/docs/guides/building/droid-exec-tutorial.mdx
+++ b/docs/guides/building/droid-exec-tutorial.mdx
@@ -78,8 +78,8 @@ The Factory example uses a simple pattern: spawn `droid exec` with `--output-for
 function runDroidExec(prompt: string, repoPath: string) {
   const args = ["exec", "--output-format", "debug"];
 
-  // Optional: configure model (defaults to glm-4.6)
-  const model = process.env.DROID_MODEL_ID ?? "glm-4.6";
+  // Optional: configure model (defaults to glm-4.7)
+  const model = process.env.DROID_MODEL_ID ?? "glm-4.7";
   args.push("-m", model);
 
   // Optional: reasoning level (off|low|medium|high)
@@ -105,7 +105,7 @@ function runDroidExec(prompt: string, repoPath: string) {
 - Alternative: `--output-format json` for final output only
 
 **`-m` (model)**: Choose your AI model
-- `glm-4.6` - Fast, cheap (default)
+- `glm-4.7` - Fast, cheap (default)
 - `gpt-5-codex` - Most powerful for complex code
 - `claude-sonnet-4-5-20250929` - Best balance of speed and capability
 
@@ -311,7 +311,7 @@ The example supports environment variables:
 
 ```bash
 # .env
-DROID_MODEL_ID=gpt-5-codex # Default: glm-4.6
+DROID_MODEL_ID=gpt-5-codex # Default: glm-4.7
 DROID_REASONING=low # Default: low (off|low|medium|high)
 PORT=4000 # Default: 4000
 HOST=localhost # Default: localhost
@@ -376,7 +376,7 @@ fs.writeFileSync('./repos/site-content/page.md', markdown);
 function runWithModel(prompt: string, model: string) {
   return Bun.spawn([
     "droid", "exec",
-    "-m", model, // glm-4.6, gpt-5-codex, etc.
+    "-m", model, // glm-4.7, gpt-5-codex, etc.
     "--output-format", "debug",
     prompt
   ], { cwd: repoPath });
diff --git a/docs/guides/building/droid-vps-setup.mdx b/docs/guides/building/droid-vps-setup.mdx
index e21ce934..494b4ee2 100644
--- a/docs/guides/building/droid-vps-setup.mdx
+++ b/docs/guides/building/droid-vps-setup.mdx
@@ -182,15 +182,15 @@ The real power of running droid on a VPS is `droid exec` - a headless mode that
 ### Basic droid exec usage
 
 ```bash
-# Simple query with a fast model (GLM 4.6)
-droid exec --model glm-4.6 "Tell me a joke"
+# Simple query with a fast model (GLM 4.7)
+droid exec --model glm-4.7 "Tell me a joke"
 ```
 
 ### Advanced: System exploration
 
 ```bash
 # Ask droid to explore your system and find specific information
-droid exec --model glm-4.6 "Explore my system and tell me where the file is that I'm serving with Nginx"
+droid exec --model glm-4.7 "Explore my system and tell me where the file is that I'm serving with Nginx"
 ```
 
 Droid will:
diff --git a/docs/guides/power-user/prompt-crafting.mdx b/docs/guides/power-user/prompt-crafting.mdx
index 973d9ff6..96ba0887 100644
--- a/docs/guides/power-user/prompt-crafting.mdx
+++ b/docs/guides/power-user/prompt-crafting.mdx
@@ -376,7 +376,7 @@ Match the model to the task:
 | **Feature implementation** | Sonnet 4.5 or GPT-5.1-Codex | Medium |
 | **Quick edits, formatting** | Haiku 4.5 | Off/Low |
 | **Code review** | GPT-5.1-Codex-Max | High |
-| **Bulk automation** | GLM-4.6 (Droid Core) | None |
+| **Bulk automation** | GLM-4.7 (Droid Core) | None |
 | **Research/analysis** | Gemini 3 Pro | High |
 
 ---
diff --git a/docs/guides/power-user/token-efficiency.mdx b/docs/guides/power-user/token-efficiency.mdx
index 8c056874..23ba1f56 100644
--- a/docs/guides/power-user/token-efficiency.mdx
+++ b/docs/guides/power-user/token-efficiency.mdx
@@ -134,13 +134,14 @@ Different models have different cost multipliers and capabilities. Match the mod
 
 | Model | Multiplier | Best For |
 |-------|------------|----------|
-| GLM-4.6 (Droid Core) | 0.25× | Bulk automation, simple tasks |
+| GLM-4.7 (Droid Core) | 0.25× | Bulk automation, simple tasks |
+| Gemini 3 Flash | 0.2× | High-volume tasks, quick processing |
 | Claude Haiku 4.5 | 0.4× | Quick edits, routine work |
 | GPT-5.1 / GPT-5.1-Codex | 0.5× | Implementation, debugging |
+| GPT-5.2 | 0.7× | Harder implementation, deeper reasoning |
 | Gemini 3 Pro | 0.8× | Research, analysis |
 | Claude Sonnet 4.5 | 1.2× | Balanced quality/cost |
 | Claude Opus 4.5 | 2× | Complex reasoning, architecture |
-| Claude Opus 4.1 | 6× | Maximum capability (use sparingly) |
 
 ### Task-Based Model Selection
 
diff --git a/docs/pricing.mdx b/docs/pricing.mdx
index bea6710f..adb4b24e 100644
--- a/docs/pricing.mdx
+++ b/docs/pricing.mdx
@@ -24,7 +24,7 @@ Different models have different multipliers applied to calculate Standard Token
 
 | Model | Model ID | Multiplier |
 | ------------------------ | ---------------------------- | ---------- |
-| Droid Core | `glm-4.6` | 0.25× |
+| Droid Core | `glm-4.7` | 0.25× |
 | Claude Haiku 4.5 | `claude-haiku-4-5-20251001` | 0.4× |
 | GPT-5.1 | `gpt-5.1` | 0.5× |
 | GPT-5.1-Codex | `gpt-5.1-codex` | 0.5× |
@@ -35,6 +35,10 @@
 | Claude Sonnet 4.5 | `claude-sonnet-4-5-20250929` | 1.2× |
 | Claude Opus 4.5 | `claude-opus-4-5-20251101` | 2× |
+
+<Note>
+  **Promo:** Claude Opus 4.6 **Fast Mode** is available for some accounts at a promotional rate through **Monday, February 16**.
+</Note>
 
 ## Thinking About Tokens
 
 As a reference point, using GPT-5.1-Codex at its 0.5× multiplier alongside our typical cache ratio of 4–8× means your effective Standard Token usage goes dramatically further than raw on-demand calls. Switching to very expensive models frequently—or rotating models often enough to invalidate the cache—will lower that benefit, but most workloads see materially higher usage ceilings compared with buying capacity directly from individual model providers. Our aim is for you to run your workloads without worrying about token math; the plans are designed so common usage patterns outperform comparable direct offerings.
diff --git a/docs/reference/cli-reference.mdx b/docs/reference/cli-reference.mdx
index 516361c7..17ca8b20 100644
--- a/docs/reference/cli-reference.mdx
+++ b/docs/reference/cli-reference.mdx
@@ -108,7 +108,7 @@ droid exec --auto high "Run tests, commit, and push changes"
 | `claude-haiku-4-5-20251001` | Claude Haiku 4.5 | Yes (Off/Low/Medium/High) | off |
 | `gemini-3-pro-preview` | Gemini 3 Pro | Yes (Low/High) | high |
 | `gemini-3-flash-preview` | Gemini 3 Flash | Yes (Minimal/Low/Medium/High) | high |
-| `glm-4.6` | Droid Core (GLM-4.6) | None only | none |
+| `glm-4.7` | Droid Core (GLM-4.7) | None only | none |
 
 Custom models configured via [BYOK](/cli/configuration/byok) use the format: `custom:`
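The `runDroidExec` changes in the tutorial patch above all touch one pattern: assemble the `droid exec` argument list, defaulting the model to `glm-4.7` unless `DROID_MODEL_ID` overrides it. A minimal standalone sketch of that argument-building step (the `buildDroidExecArgs` helper name and its `env` parameter are illustrative, not part of the tutorial):

```typescript
// Illustrative helper mirroring the tutorial's runDroidExec pattern:
// build the `droid exec` argv before spawning the CLI.
function buildDroidExecArgs(
  prompt: string,
  env: Record<string, string | undefined> = {},
): string[] {
  const args = ["exec", "--output-format", "debug"];

  // Optional: configure model (defaults to glm-4.7, as in the tutorial)
  const model = env.DROID_MODEL_ID ?? "glm-4.7";
  args.push("-m", model);

  // The prompt itself is the final positional argument.
  args.push(prompt);
  return args;
}

// Default model vs. an environment override:
console.log(buildDroidExecArgs("Tell me a joke").join(" "));
// → exec --output-format debug -m glm-4.7 Tell me a joke
console.log(
  buildDroidExecArgs("Tell me a joke", { DROID_MODEL_ID: "gpt-5-codex" }).join(" "),
);
```

The spawning step is unchanged from the tutorial, e.g. `Bun.spawn(["droid", ...args], { cwd: repoPath })`.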