[Docs] add benchmark refer #358
Conversation
Summary of Changes

Hello @jiapingW, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request addresses a documentation issue by fixing a broken link and significantly enhancing the benchmarking guide. It introduces a new, detailed document that provides users with clear instructions and examples for evaluating the performance of speculative decoding with the EAGLE3 algorithm, thereby improving the overall clarity and usability of the project's documentation.
Code Review
This pull request adds documentation for benchmarking, including fixing a broken link and creating a new guide. The changes are a good addition, but I've found a few issues in the new documentation's command examples. These include incorrect file paths, which would cause the commands to fail, and some minor inconsistencies and redundancies. My review provides specific suggestions to correct these issues, ensuring the documentation is accurate and easy for users to follow.
```shell
python3 bench_eagle3.py \
    --model-path meta-llama/Llama-3.1-8B-Instruct \
    --speculative-draft-model-path lmsys/sglang-EAGLE3-LLaMA3.1-Instruct-8B \
    --port 30000 \
    --trust-remote-code \
    --mem-fraction-static 0.8 \
    --tp-size 1 \
    --attention-backend fa3 \
    --config-list 1,0,0,0 1,3,1,4 \
    --benchmark-list mtbench gsm8k:5 ceval:5:accountant \
    --dtype bfloat16
```
The path to the `bench_eagle3.py` script is missing. Assuming users run commands from the repository root, the script will not be found. The path should be updated to `benchmarks/bench_eagle3.py` for the command to execute correctly.
Suggested change:

```diff
-python3 bench_eagle3.py \
+python3 benchmarks/bench_eagle3.py \
     --model-path meta-llama/Llama-3.1-8B-Instruct \
     --speculative-draft-model-path lmsys/sglang-EAGLE3-LLaMA3.1-Instruct-8B \
     --port 30000 \
     --trust-remote-code \
     --mem-fraction-static 0.8 \
     --tp-size 1 \
     --attention-backend fa3 \
     --config-list 1,0,0,0 1,3,1,4 \
     --benchmark-list mtbench gsm8k:5 ceval:5:accountant \
     --dtype bfloat16
```
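As a quick sanity check on the corrected path, either invocation below should print the script's usage, assuming the script exposes argparse's standard `--help` flag (an assumption, not confirmed in this thread):

```shell
# From the repository root, as the suggestion assumes:
python3 benchmarks/bench_eagle3.py --help

# Equivalently, from inside the benchmarks/ directory itself:
cd benchmarks && python3 bench_eagle3.py --help
```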
```shell
python bench_eagle3.py \
  --model-path meta-llama/Llama-3.1-8B-Instruct \
  --port 30000 \
  --config-list 1,3,1,4 \
  --benchmark-list mtbench:5 ceval:5:accountant gsm8k:5 humaneval:5 math500:5 mtbench:5 aime:1 \
  --skip-launch-server
```
This command example has several issues that should be addressed:

- The path to `bench_eagle3.py` is missing. It should be `benchmarks/bench_eagle3.py`.
- For consistency with other examples in this file, `python3` should be used instead of `python`.
- The indentation of arguments is inconsistent with other code blocks. Using a standard 4-space indent improves readability.
- The benchmark `mtbench:5` is listed twice in `--benchmark-list`, which is redundant.
Suggested change:

```diff
-python bench_eagle3.py \
-  --model-path meta-llama/Llama-3.1-8B-Instruct \
-  --port 30000 \
-  --config-list 1,3,1,4 \
-  --benchmark-list mtbench:5 ceval:5:accountant gsm8k:5 humaneval:5 math500:5 mtbench:5 aime:1 \
-  --skip-launch-server
+python3 benchmarks/bench_eagle3.py \
+    --model-path meta-llama/Llama-3.1-8B-Instruct \
+    --port 30000 \
+    --config-list 1,3,1,4 \
+    --benchmark-list mtbench:5 ceval:5:accountant gsm8k:5 humaneval:5 math500:5 aime:1 \
+    --skip-launch-server
```
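To make the role of `--skip-launch-server` concrete, here is a minimal sketch of the intended two-step workflow, assuming the server and the benchmark agree on `--port 30000` and that the server is started with the same target model (any speculative-decoding flags from the earlier example would be added to the server command as needed):

```shell
# Step 1: start a standalone SGLang server for the target model.
python3 -m sglang.launch_server \
    --model meta-llama/Llama-3.1-8B-Instruct \
    --port 30000

# Step 2: run the benchmark against the already-running server,
# skipping the script's own server launch.
python3 benchmarks/bench_eagle3.py \
    --model-path meta-llama/Llama-3.1-8B-Instruct \
    --port 30000 \
    --config-list 1,3,1,4 \
    --benchmark-list mtbench:5 \
    --skip-launch-server
```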
docs/benchmarks/benchmark.md (outdated)
```markdown
## Overview

We provided a unified script to test the performance of the Speculative Decoding with EAGLE3 algorithm on multiple datasets. You can follow the steps below to run the benchmarks.
```
For grammatical clarity and by convention in technical documentation, use the present tense 'provide' to describe the current functionality of the script.
Suggested change:

```diff
-We provided a unified script to test the performance of the Speculative Decoding with EAGLE3 algorithm on multiple datasets. You can follow the steps below to run the benchmarks.
+We provide a unified script to test the performance of the Speculative Decoding with EAGLE3 algorithm on multiple datasets. You can follow the steps below to run the benchmarks.
```
docs/benchmarks/benchmark.md (outdated)
```shell
# you can launch a server
python3 -m sglang.launch_server \
    --model meta-llama/Llama-3.1-8B-Instruct \
```
This line contains trailing whitespace. While it may not affect the command's execution, it's best to remove it for code cleanliness and to avoid potential issues with different shell environments.
Suggested change (the two lines look identical because the only difference is the invisible trailing whitespace after the backslash):

```diff
-    --model meta-llama/Llama-3.1-8B-Instruct \ 
+    --model meta-llama/Llama-3.1-8B-Instruct \
```
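For catching this class of issue mechanically, a couple of standard shell one-liners can help (the path assumes this PR's `docs/benchmarks/benchmark.md`; the in-place `sed -i` form below is GNU sed syntax):

```shell
# List lines with trailing whitespace, with line numbers:
grep -nE '[[:space:]]+$' docs/benchmarks/benchmark.md

# Strip trailing whitespace in place (GNU sed; on macOS use: sed -i '' ...):
sed -i 's/[[:space:]]*$//' docs/benchmarks/benchmark.md
```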
Motivation
Fix the docs bug here.