[Docs] add benchmark refer #358
Conversation
Summary of Changes

Hello @jiapingW, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request addresses a documentation issue by fixing a broken link and significantly enhancing the benchmarking guide. It introduces a new, detailed document that provides users with clear instructions and examples for evaluating the performance of speculative decoding with the EAGLE3 algorithm, thereby improving the overall clarity and usability of the project's documentation.
Code Review
This pull request adds documentation for benchmarking, including fixing a broken link and creating a new guide. The changes are a good addition, but I've found a few issues in the new documentation's command examples. These include incorrect file paths, which would cause the commands to fail, and some minor inconsistencies and redundancies. My review provides specific suggestions to correct these issues, ensuring the documentation is accurate and easy for users to follow.
```shell
python3 bench_eagle3.py \
    --model-path meta-llama/Llama-3.1-8B-Instruct \
    --speculative-draft-model-path lmsys/sglang-EAGLE3-LLaMA3.1-Instruct-8B \
    --port 30000 \
    --trust-remote-code \
    --mem-fraction-static 0.8 \
    --tp-size 1 \
    --attention-backend fa3 \
    --config-list 1,0,0,0 1,3,1,4 \
    --benchmark-list mtbench gsm8k:5 ceval:5:accountant \
    --dtype bfloat16
```
The path to the `bench_eagle3.py` script is missing. Assuming users run commands from the repository root, the script will not be found. The path should be updated to `benchmarks/bench_eagle3.py` for the command to execute correctly.
Suggested change:

```diff
-python3 bench_eagle3.py \
+python3 benchmarks/bench_eagle3.py \
     --model-path meta-llama/Llama-3.1-8B-Instruct \
     --speculative-draft-model-path lmsys/sglang-EAGLE3-LLaMA3.1-Instruct-8B \
     --port 30000 \
     --trust-remote-code \
     --mem-fraction-static 0.8 \
     --tp-size 1 \
     --attention-backend fa3 \
     --config-list 1,0,0,0 1,3,1,4 \
     --benchmark-list mtbench gsm8k:5 ceval:5:accountant \
     --dtype bfloat16
```
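As a quick sanity check on the corrected path, either invocation below should print the script's usage, assuming the script exposes argparse's standard `--help` flag (an assumption, not confirmed in this thread):

```shell
# From the repository root, as the suggestion assumes:
python3 benchmarks/bench_eagle3.py --help

# Equivalently, from inside the benchmarks/ directory itself:
cd benchmarks && python3 bench_eagle3.py --help
```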
```shell
python bench_eagle3.py \
  --model-path meta-llama/Llama-3.1-8B-Instruct \
  --port 30000 \
  --config-list 1,3,1,4 \
  --benchmark-list mtbench:5 ceval:5:accountant gsm8k:5 humaneval:5 math500:5 mtbench:5 aime:1 \
  --skip-launch-server
```
This command example has several issues that should be addressed:

- The path to `bench_eagle3.py` is missing. It should be `benchmarks/bench_eagle3.py`.
- For consistency with other examples in this file, `python3` should be used instead of `python`.
- The indentation of arguments is inconsistent with other code blocks. Using a standard 4-space indent improves readability.
- The benchmark `mtbench:5` is listed twice in `--benchmark-list`, which is redundant.
Suggested change:

```diff
-python bench_eagle3.py \
-  --model-path meta-llama/Llama-3.1-8B-Instruct \
-  --port 30000 \
-  --config-list 1,3,1,4 \
-  --benchmark-list mtbench:5 ceval:5:accountant gsm8k:5 humaneval:5 math500:5 mtbench:5 aime:1 \
-  --skip-launch-server
+python3 benchmarks/bench_eagle3.py \
+    --model-path meta-llama/Llama-3.1-8B-Instruct \
+    --port 30000 \
+    --config-list 1,3,1,4 \
+    --benchmark-list mtbench:5 ceval:5:accountant gsm8k:5 humaneval:5 math500:5 aime:1 \
+    --skip-launch-server
```
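To make the role of `--skip-launch-server` concrete, here is a minimal sketch of the intended two-step workflow, assuming the server and the benchmark agree on `--port 30000` and that the server is started with the same target model (any speculative-decoding flags from the earlier example would be added to the server command as needed):

```shell
# Step 1: start a standalone SGLang server for the target model.
python3 -m sglang.launch_server \
    --model meta-llama/Llama-3.1-8B-Instruct \
    --port 30000

# Step 2: run the benchmark against the already-running server,
# skipping the script's own server launch.
python3 benchmarks/bench_eagle3.py \
    --model-path meta-llama/Llama-3.1-8B-Instruct \
    --port 30000 \
    --config-list 1,3,1,4 \
    --benchmark-list mtbench:5 \
    --skip-launch-server
```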
docs/benchmarks/benchmark.md (outdated)
```markdown
## Overview

We provided a unified script to test the performance of the Speculative Decoding with EAGLE3 algorithm on multiple datasets. You can follow the steps below to run the benchmarks.
```
For grammatical clarity and by convention in technical documentation, use the present tense 'provide' to describe the current functionality of the script.
Suggested change:

```diff
-We provided a unified script to test the performance of the Speculative Decoding with EAGLE3 algorithm on multiple datasets. You can follow the steps below to run the benchmarks.
+We provide a unified script to test the performance of the Speculative Decoding with EAGLE3 algorithm on multiple datasets. You can follow the steps below to run the benchmarks.
```
docs/benchmarks/benchmark.md (outdated)
```shell
# you can launch a server
python3 -m sglang.launch_server \
    --model meta-llama/Llama-3.1-8B-Instruct \
```
This line contains trailing whitespace. While it may not affect the command's execution, it's best to remove it for code cleanliness and to avoid potential issues with different shell environments.
Suggested change (the two lines look identical because the only difference is the invisible trailing whitespace after the backslash):

```diff
-    --model meta-llama/Llama-3.1-8B-Instruct \ 
+    --model meta-llama/Llama-3.1-8B-Instruct \
```
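For catching this class of issue mechanically, a couple of standard shell one-liners can help (the path assumes this PR's `docs/benchmarks/benchmark.md`; the in-place `sed -i` form below is GNU sed syntax):

```shell
# List lines with trailing whitespace, with line numbers:
grep -nE '[[:space:]]+$' docs/benchmarks/benchmark.md

# Strip trailing whitespace in place (GNU sed; on macOS use: sed -i '' ...):
sed -i 's/[[:space:]]*$//' docs/benchmarks/benchmark.md
```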
Motivation
Fix the docs bug here.