Feature: token usage display #3942
Summary of Changes
Hello @Templight41, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a new feature to the ADK development UI that provides real-time visibility into Gemini model token usage and associated costs. Previously, users lacked the ability to monitor token consumption and estimate API call expenses, hindering usage optimization. The solution integrates both backend logic for calculating costs based on live or cached pricing data and a frontend display that shows cumulative token counts and estimated USD costs directly within the UI, enhancing user awareness and control over their LLM interactions.
Code Review
This pull request introduces a valuable feature for displaying token usage and cost in the developer UI. The implementation is well-structured, with a clear separation between the backend pricing logic and the frontend display script. The use of live pricing with a fallback to hardcoded defaults is a robust design choice. My review includes suggestions to improve the maintainability of the JavaScript code, enhance the robustness of the HTML script injection, and fix a potential bug in the pricing page parser. Overall, this is a great addition to the project.
Thanks for the PR. Could you please open a PR against https://github.com/google/adk-web for the UI-related changes?
Sure, I'll work on it.
Removed frontend-specific code as per maintainer feedback:
- Deleted src/google/adk/cli/browser/token-usage-display.js
- Removed JavaScript injection endpoint from adk_web_server.py
Backend API remains intact:
- Token cost calculation in base_llm_flow.py
- cost_usd field in LlmResponse model
- Gemini pricing service with live API fetching
- All unit tests passing (12/12)
Frontend implementation will be done in the separate adk-web repository.
@seanzhou1023 I've updated the code here to remove the frontend, which I will continue in adk-web. Please take a look.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@seanzhou1023 I've created google/adk-web#323 for the frontend change.
Hi @Templight41, thank you for your contribution! We appreciate you taking the time to submit this pull request.
This web scraping does not rely on CSS classes, so it would be unaffected by class changes, but I agree that a page layout change could break it, since the tool looks for the table listing the models. I was unable to find any API for fetching model pricing and had to rely on web scraping as a last resort. As for currency formats, USD seems right, since Google itself publishes the pricing in USD only. Supporting local currencies would require conversion based on live exchange rates.
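To make the point about class-independent scraping concrete, here is a minimal sketch using only Python's standard-library `html.parser`. It is not the code from this PR: the names `TableTextParser` and `find_pricing_rows`, the `"gemini"` marker, and the sample HTML are all assumptions made for illustration. It locates a table by its text content rather than by CSS class, which also shows why a layout change (for example, pricing no longer rendered as a table) would break this approach.

```python
from html.parser import HTMLParser


class TableTextParser(HTMLParser):
    """Collects the text of every table cell, grouped by table and row."""

    def __init__(self):
        super().__init__()
        self.tables = []   # each table is a list of rows; each row a list of cell strings
        self._row = None   # cells of the row currently being read, or None
        self._cell = None  # text fragments of the cell currently being read, or None

    def handle_starttag(self, tag, attrs):
        # Attributes (including class names) are deliberately ignored.
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse internal whitespace so cells compare cleanly.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def find_pricing_rows(html, marker="gemini"):
    """Return the rows of the first table whose text mentions `marker`, else []."""
    parser = TableTextParser()
    parser.feed(html)
    for table in parser.tables:
        if any(marker in cell.lower() for row in table for cell in row):
            return table
    return []


# Hypothetical page fragment; the model name and prices are made up.
SAMPLE_HTML = """
<table class="anything"><tr><th>Region</th></tr><tr><td>us-central1</td></tr></table>
<table><tr><th>Model</th><th>Input</th><th>Output</th></tr>
<tr><td>gemini-example</td><td>$0.30</td><td>$2.50</td></tr></table>
"""

rows = find_pricing_rows(SAMPLE_HTML)
```

Because the lookup returns an empty list when no matching table is found, a caller can treat that as the signal to fall back to hardcoded defaults.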
Problem:
The ADK dev UI lacks visibility into token usage and associated costs when interacting with Gemini models. Users cannot track how many tokens are being consumed or estimate the cost of their API calls, making it difficult to monitor usage and optimize prompts.
Solution:
Implemented a comprehensive token usage and cost tracking feature that:
- Adds a cost_usd field to the LlmResponse and Event models that gets populated during streaming
Testing Plan
Unit Tests:
Summary of pytest results:
Manual End-to-End (E2E) Tests:
Setup:
- source .venv/bin/activate
- adk web test
- http://localhost:8000
Test Cases:
Live pricing fetch on first use:
Token cost calculation in SSE events:
- costUsd field appears in SSE events
{ "content": {...}, "usageMetadata": { "candidatesTokenCount": 9, "promptTokenCount": 35, "totalTokenCount": 44 }, "costUsd": 0.000033, "author": "root_agent" }
UI Display:
- $0.00 | 0 tokens
Server-side logging:
Fallback behavior:
Multiple model support:
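As a rough illustration of per-model cost calculation, the sketch below derives a cost from the token counts in a `usageMetadata` object. The function names, the model names, and the per-million-token rates are all assumptions chosen for the example; they are not taken from this PR and should not be read as current Gemini pricing. With the assumed `example-flash` rates, the token counts from the SSE event in the test cases work out to 0.000033.

```python
# Assumed (input, output) prices in USD per one million tokens.
# Both the model names and the numbers are illustrative only.
ASSUMED_RATES = {
    "example-flash": (0.30, 2.50),
    "example-pro": (1.25, 10.00),
}


def estimate_cost_usd(model, usage, rates=ASSUMED_RATES):
    """Estimate the cost of one response, or None if the model has no known rate."""
    if model not in rates:
        return None
    input_rate, output_rate = rates[model]
    prompt_tokens = usage.get("promptTokenCount", 0)
    output_tokens = usage.get("candidatesTokenCount", 0)
    cost = (prompt_tokens * input_rate + output_tokens * output_rate) / 1_000_000
    return round(cost, 6)


def format_summary(total_cost_usd, total_tokens):
    """Render cumulative totals in the 'cost | tokens' style shown under UI Display."""
    return f"${total_cost_usd:.2f} | {total_tokens} tokens"


# Token counts copied from the SSE event in the test cases.
usage = {"candidatesTokenCount": 9, "promptTokenCount": 35, "totalTokenCount": 44}

flash_cost = estimate_cost_usd("example-flash", usage)
pro_cost = estimate_cost_usd("example-pro", usage)
unknown_cost = estimate_cost_usd("not-a-listed-model", usage)
initial_display = format_summary(0.0, 0)
```

Returning `None` for an unlisted model is one possible design; a real implementation might instead fall back to a default rate or log a warning.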
Checklist
Additional context
Implementation Details:
Backend (Python):
- src/google/adk/utils/gemini_pricing.py with live pricing fetch and caching
- base_llm_flow.py to calculate costs in _finalize_model_response_event()
- cost_usd field to LlmResponse model with Pydantic alias costUsd for JSON serialization
- usage_metadata
Frontend (JavaScript):
- src/google/adk/cli/browser/token-usage-display.js injected into dev UI
- /run_sse responses to extract cost data
Pricing Architecture:
- https://cloud.google.com/vertex-ai/generative-ai/pricing
Testing Strategy:
- enable_fetch parameter to GeminiPricingService for test control
- enable_fetch=False to avoid network calls and use hardcoded defaults
Files Changed:
- src/google/adk/utils/gemini_pricing.py (NEW - 400+ lines)
- src/google/adk/cli/browser/token-usage-display.js (NEW - 465 lines)
- tests/unittests/utils/test_gemini_pricing.py (NEW - 12 comprehensive tests)
- src/google/adk/models/llm_response.py (added cost_usd field)
- src/google/adk/flows/llm_flows/base_llm_flow.py (added cost calculation logic)
- src/google/adk/cli/adk_web_server.py (added script injection endpoint)
Design Decisions:
Why one-time fetch vs. periodic refresh?
Why HTML parsing instead of official API?
Why inject JavaScript instead of modifying Angular app?
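For readers who want to picture the one-time fetch with cached results and a hardcoded fallback, here is a small sketch of that pattern. It is an assumption-laden illustration, not the `GeminiPricingService` from this PR: the class name `CachedPricing`, the `fetcher` callable, and the sample prices are hypothetical. Only the idea of an `enable_fetch` switch for tests is taken from the description above.

```python
class CachedPricing:
    """Fetches prices at most once, then serves them from memory.

    Falls back to hardcoded defaults when fetching is disabled, fails,
    or returns nothing.
    """

    def __init__(self, fetcher=None, defaults=None, enable_fetch=True):
        self._fetcher = fetcher              # callable returning a dict of prices
        self._defaults = dict(defaults or {})
        self._enable_fetch = enable_fetch    # set to False in tests to avoid network calls
        self._cache = None                   # filled on first use

    def get_prices(self):
        if self._cache is not None:
            return self._cache
        prices = None
        if self._enable_fetch and self._fetcher is not None:
            try:
                prices = self._fetcher()
            except Exception:
                # Any fetch or parse failure leads to the defaults below.
                prices = None
        self._cache = prices if prices else dict(self._defaults)
        return self._cache


# Hypothetical defaults and fetchers used only to exercise the sketch.
DEFAULTS = {"example-flash": (0.30, 2.50)}
fetch_calls = []


def fake_fetch():
    fetch_calls.append(1)
    return {"example-flash": (0.25, 2.00)}


def broken_fetch():
    raise RuntimeError("pricing page layout changed")


live = CachedPricing(fetcher=fake_fetch, defaults=DEFAULTS)
first = live.get_prices()
second = live.get_prices()  # served from the cache; fake_fetch is not called again

offline = CachedPricing(fetcher=fake_fetch, defaults=DEFAULTS, enable_fetch=False)
failed = CachedPricing(fetcher=broken_fetch, defaults=DEFAULTS)
```

A periodic refresh would add an expiry timestamp to the cache; this sketch omits it to stay with the one-time fetch the PR describes.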