Validate Mistral Vibe and Gemini one-shot rates #351

@ozymandiashh

Summary

MorsCordis reported that Mistral/Vibe and Gemini are showing 100% one-shot rates across tracked categories. They also noted that Mistral/Vibe does not appear to track cache hits, which can inflate estimated cost.

Context

This came up as a follow-up on #336 after the subagent tracking work in #340 and #343.

The redacted Vibe logging structure shared there shows meta.json.stats with aggregate session-level fields such as session_prompt_tokens, session_completion_tokens, context_tokens, and session_cost, but no cache-read/cache-write token fields. messages.jsonl contains message and tool-call structure, but no per-assistant-message usage or cache fields.
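A minimal sketch of how the presence or absence of these fields could be checked on a parsed meta.json. The four stats names come from the structure described above; the cache field names are hypothetical placeholders, since Vibe does not currently persist any, and `inspect_stats` is an illustrative helper rather than CodeBurn code.

```python
import json

# Field names reported in this issue as present under meta.json "stats".
KNOWN_STATS_FIELDS = (
    "session_prompt_tokens",
    "session_completion_tokens",
    "context_tokens",
    "session_cost",
)

# HYPOTHETICAL names: placeholders for cache fields Vibe might add later.
HYPOTHETICAL_CACHE_FIELDS = ("cache_read_tokens", "cache_write_tokens")


def inspect_stats(meta: dict) -> dict:
    """Report which known stats fields exist and whether any cache field does."""
    stats = meta.get("stats") or {}
    return {
        "present": [f for f in KNOWN_STATS_FIELDS if f in stats],
        "missing": [f for f in KNOWN_STATS_FIELDS if f not in stats],
        "has_cache_fields": any(f in stats for f in HYPOTHETICAL_CACHE_FIELDS),
    }


# Made-up sample values, shaped like the redacted structure described above.
sample = json.loads(
    '{"stats": {"session_prompt_tokens": 1200, "session_completion_tokens": 300,'
    ' "context_tokens": 900, "session_cost": 0.01}}'
)
report = inspect_stats(sample)
print(report)
```

A check like this would let a parser flag sessions whose cost estimate lacks cache data instead of silently treating them as exact.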

What to validate

  • Mistral/Vibe one-shot rates: CodeBurn currently parses Vibe usage as one coarse session-level provider call, which may not give the classifier enough per-turn structure to produce reliable one-shot/retry rates.
  • Gemini one-shot rates: Gemini now emits per-assistant-message calls, but the one-shot/retry classifier should still be validated against representative sessions because 100% across categories is suspicious.
  • Mistral/Vibe cache accounting: if Vibe does not persist cache-read/cache-write token fields locally, CodeBurn cannot recover exact cache accounting from logs. This limitation should be documented, and cache accounting should be wired up if a later Vibe release exposes cache token fields.
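The cache accounting concern can be illustrated with a small sketch. It assumes that cached input tokens are billed at a lower rate than uncached ones and that the prompt token count includes cached tokens; whether the latter holds for session_prompt_tokens is one of the open questions in this issue. The function name and all prices are hypothetical, not Mistral pricing or CodeBurn code.

```python
def estimate_input_cost(
    prompt_tokens: int,
    cached_tokens=None,
    *,
    input_price_per_mtok: float,
    cached_price_per_mtok: float,
) -> float:
    """Estimate input cost in dollars.

    ASSUMPTION: prompt_tokens already includes cached_tokens.
    """
    if cached_tokens is None:
        # No cache data in the log: every prompt token is priced at the full rate.
        cached_tokens = 0
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached = prompt_tokens - cached_tokens
    total = uncached * input_price_per_mtok + cached_tokens * cached_price_per_mtok
    return total / 1_000_000


# Illustrative numbers only (hypothetical prices per million tokens).
without_cache = estimate_input_cost(
    1_000_000, None, input_price_per_mtok=2.0, cached_price_per_mtok=0.2
)
with_cache = estimate_input_cost(
    1_000_000, 800_000, input_price_per_mtok=2.0, cached_price_per_mtok=0.2
)
print(without_cache, with_cache)
```

Under these assumptions, the estimate without cache data is an upper bound: it can only match or exceed the estimate computed with the real cached token count.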

Notes

CodeBurn one-shot metrics are not based on cache hits. They are based on edit turns with zero detected retries/self-corrections, so one-shot validation and cache accounting should be treated as related observations that need separate fixes.
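A minimal sketch of the one-shot definition above, assuming a simplified turn record. The `Turn` shape and `one_shot_rate` are hypothetical and do not reflect CodeBurn's actual classifier. The second example shows one way a session collapsed into a single coarse call could report 100%; this is a hypothesis to validate, not a confirmed cause.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    # HYPOTHETICAL record shape, for illustration only.
    is_edit: bool  # whether the turn performed an edit
    retries: int   # detected retries/self-corrections for this turn


def one_shot_rate(turns):
    """Fraction of edit turns with zero detected retries, or None if no edit turns."""
    edit_turns = [t for t in turns if t.is_edit]
    if not edit_turns:
        return None
    one_shot = sum(1 for t in edit_turns if t.retries == 0)
    return one_shot / len(edit_turns)


# Per-turn structure: three edit turns, one of which needed retries.
per_turn = [Turn(True, 0), Turn(True, 2), Turn(False, 0), Turn(True, 0)]

# Coarse structure: the whole session seen as one call, so no retry is visible.
coarse = [Turn(True, 0)]

print(one_shot_rate(per_turn))  # two of three edit turns are one-shot
print(one_shot_rate(coarse))    # the single visible turn has no detected retries
```

Comparing both shapes on the same representative session is one way to tell whether a 100% rate reflects real behavior or missing per-turn structure.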

Requested upstream Vibe fields that would help CodeBurn:

  • cache read input tokens
  • cache creation/write input tokens
  • whether session_prompt_tokens includes cached tokens
  • per-assistant-message usage, if available in a future Vibe release
