> Note: Perf-sensitive tests/benches are skipped by default. To opt in, run with feature `perf-tests` and env `PERF_TESTS=1`. See "Performance Tests (opt-in)" below.
- Installation
- Features
- Types
- Collector
- Functions
- Macros
- Production Metrics (feature: metrics)
- Async Usage
- Disabled Mode Behavior
- Doctests and feature flags
- Performance Tests (opt-in)
- Examples
## Installation

Add this to your Cargo.toml:

```toml
[dependencies]
benchmark = "0.8.0"
```

Or via the command line:

```sh
# Basic installation (benchmarking feature only)
cargo add benchmark
```

For a true zero-overhead build, disable default features:

```toml
[dependencies]
# Disable default features for true zero-overhead
benchmark = { version = "0.8.0", default-features = false }
```

```sh
# Explicitly disabled - zero overhead
cargo add benchmark --no-default-features
```

— See FEATURES DOCUMENTATION for more information.
## Features

- `benchmark` (default): timing functions and macros.
- `collector` (default): `Collector`, `Stats`, and the built-in histogram backend.
- `metrics` (optional): production metrics (`Watch`, `Timer`, `stopwatch!`). Implies `collector`.
- `high-precision` (optional): enables the high-precision histogram backend. Implies `collector`.
- `hdr` (optional): HDR histogram backend via the optional `hdrhistogram` dependency. Requires `high-precision`. Initialization is non-panicking, with a safe fallback in release builds.

Notes:

- `std` is internal and implied by the above features; you do not enable it directly.
- Minimal build: use `default-features = false` and selectively opt in.
— See FEATURES DOCUMENTATION for more information.
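For instance, a minimal metrics-only build might look like this (a sketch; feature names as listed above, and `metrics` pulls in `collector` automatically):

```toml
[dependencies]
# Opt out of defaults, then opt back in to exactly what you need.
benchmark = { version = "0.8.0", default-features = false, features = ["metrics"] }
```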
Minimal examples to get productive fast. See the full examples below for more depth.

```rust
// Benchmarking (default features)
let (out, d) = benchmark::time!({ 2 + 2 });
assert_eq!(out, 4);
println!("elapsed: {}", d);
```

```rust
// Production metrics (features = ["metrics"])
let w = benchmark::Watch::new();
benchmark::stopwatch!(w, "op", { std::thread::sleep(std::time::Duration::from_millis(1)); });
assert!(w.snapshot()["op"].count >= 1);
```

## Types

### `Duration`

Represents a duration in nanoseconds, backed by a `u128` for wide range and precision.
```rust
use benchmark::Duration;
let d = Duration::from_nanos(1_500);
assert_eq!(d.as_micros(), 1);
assert_eq!(d.to_string(), "1.50µs");
```

- Constructors: `from_nanos(u128)`
- Accessors: `as_nanos() -> u128`, `as_micros() -> u128`, `as_millis() -> u128`, `as_secs_f64() -> f64`, `as_secs_f32() -> f32`
- Constants: `Duration::ZERO`
- Display: human-friendly units (ns/µs/ms/s/m)
### `Measurement`

Represents a single named timing with a timestamp (nanoseconds since the UNIX epoch by default under `std`).
```rust
use benchmark::{Duration, Measurement};
let m = Measurement::new("op", Duration::from_nanos(42), 0);
assert_eq!(m.name, "op");
```

- Fields: `name: &'static str`, `duration: Duration`, `timestamp: u128`
- Constructors: `new(name, duration, timestamp)`, `zero(name)`
- Notes: Timestamps may be `0` in Miri or restricted environments.
### `Stats`

Basic statistics for a set of measurements. Available with the `collector` feature.

- Fields: `count: u64`, `total: Duration`, `min: Duration`, `max: Duration`, `mean: Duration`
- Construction: returned by `Collector::stats()` / `Collector::all_stats()`
### `Histogram`

Fixed-range, high-performance histogram used by production metrics. Available with `feature = "collector"` at `benchmark::histogram`.

```rust
use benchmark::histogram::Histogram; // requires feature = "collector"
let mut h = Histogram::new(1, 1_000_000); // bounds: 1ns..=1_000_000ns
for _ in 0..1000 { h.record(500); }
assert_eq!(h.count(), 1000);
assert_eq!(h.percentile(50.0), 500);
let ps = h.percentiles(&[50.0, 90.0, 99.0]);
assert!(ps[2] >= ps[0]);
```

- Constructors: `new(lowest: u64, highest: u64)`
- Recording: `record(ns: u64)`, `record_duration(Duration)`
- Queries: `count()`, `min()`, `max()`, `median()`, `percentile(q)`, `percentiles(&qs)`
- Notes: Inputs are clamped to the configured bounds; percentile arguments `q` are clamped to [0.0, 100.0] (e.g., -0.1 -> 0.0, 120.0 -> 100.0). Choose bounds close to your expected SLOs for precision.
## Collector

Thread-safe aggregation of measurements. Available with `feature = "collector"`.

```rust
use benchmark::{Collector, Duration, Measurement};
let c = Collector::new();
let m = Measurement::new("op", Duration::from_nanos(10), 0);
c.record(&m); // since v0.2.0, record takes &Measurement
let stats = c.stats("op").unwrap();
assert_eq!(stats.count, 1);
```

- Constructors: `new()`, `with_capacity(usize)`
- Recording: `record(&Measurement)`, `record_duration(name, Duration)`
- Stats: `stats(name) -> Option<Stats>`, `all_stats() -> Vec<(String, Stats)>`
- Maintenance: `clear()`, `clear_name(name)`
- Concurrency: `Collector` is `Clone` and can be shared across threads; internally it uses `Arc<RwLock<...>>`.
Example: concurrent recording across threads

```rust
use benchmark::{Collector, Duration};
use std::sync::Arc;
use std::thread;

let collector = Arc::new(Collector::new());
let mut handles = vec![];
for _ in 0..4 {
    let c = collector.clone();
    handles.push(thread::spawn(move || {
        for _ in 0..1000 {
            c.record_duration("io", Duration::from_nanos(100));
        }
    }));
}
for h in handles { h.join().unwrap(); }
let s = collector.stats("io").unwrap();
assert_eq!(s.count, 4 * 1000);
```

## Functions

### `measure`

Measures the execution time of a closure and returns `(result, Duration)`.
```rust
use benchmark::measure;
let (result, duration) = measure(|| 2 + 2);
assert_eq!(result, 4);
```

- Enabled path: high-resolution timing via `std::time::Instant`.
- Disabled path (`!benchmark`): returns `Duration::ZERO`.
- Overhead: designed to be minimal and competitive with direct `Instant` usage.
### `measure_named`

Measures execution time and returns `(result, Measurement)` with a name.

```rust
use benchmark::measure_named;
let (result, m) = measure_named("add", || 2 + 2);
assert_eq!(result, 4);
assert_eq!(m.name, "add");
```

- Timestamp is set to UNIX-epoch nanoseconds (0 under Miri/isolation).
- Disabled path (`!benchmark`): returns `Measurement { duration: ZERO, timestamp: 0 }`.
## Macros

### `time!`

Times an expression and returns `(result, Duration)`.

```rust
use benchmark::time;
let (result, dur) = time!(2 + 2);
assert_eq!(result, 4);
```

Async example (requires `feature = "benchmark"`):

```rust
use benchmark::time;

#[tokio::main(flavor = "current_thread")]
async fn main() {
    let ((), d) = time!(tokio::time::sleep(std::time::Duration::from_millis(5)).await);
    assert!(d.as_millis() >= 5);
}
```

### `time_named!`

Times an expression with a name and returns `(result, Measurement)`.
```rust
use benchmark::time_named;
let (result, m) = time_named!("addition", 2 + 2);
assert_eq!(result, 4);
assert_eq!(m.name, "addition");
```

With `Collector` (requires `features = ["benchmark", "collector"]`):

```rust
use benchmark::{time_named, Collector};
let collector = Collector::new();
let (_, m) = time_named!("db", {
    // your operation
    1 + 1
});
collector.record(&m);
let s = collector.stats("db").unwrap();
assert_eq!(s.count, 1);
```

Async example:

```rust
use benchmark::time_named;

#[tokio::main(flavor = "current_thread")]
async fn main() {
    let ((), m) = time_named!("sleep", tokio::time::sleep(std::time::Duration::from_millis(3)).await);
    assert!(m.duration.as_millis() >= 3);
}
```

Disabled example (`default-features = false`):

```rust
// returns Duration::ZERO / a Measurement with zero duration
let (_out, d) = benchmark::time!(42);
assert_eq!(d.as_nanos(), 0);
```

### `benchmark_block!`

Runs a code block repeatedly and returns raw per-iteration durations as `Vec<Duration>`.
Forms:

```rust
// Default iterations: 10_000
let samples: Vec<benchmark::Duration> = benchmark::benchmark_block!({
    // code to benchmark
    std::hint::black_box(1 + 1);
});

// Explicit iterations
let n = 1_234usize;
let samples = benchmark::benchmark_block!(n, {
    std::hint::black_box(2 * 3);
});
```

Notes:

- Raw data enables flexible downstream stats (mean, percentiles, etc.).
- Async-compatible: you can `await` inside the block.
- Disabled path (`!benchmark`): the block runs once and an empty vec is returned, keeping overhead at zero.
Async example:

```rust
#[tokio::main(flavor = "current_thread")]
async fn main() {
    let samples = benchmark::benchmark_block!(100, {
        tokio::time::sleep(std::time::Duration::from_millis(1)).await;
    });
    assert_eq!(samples.len(), 100);
}
```

### `benchmark!`

Runs an expression repeatedly, labeling per-iteration samples. Returns `(Option<T>, Vec<Measurement>)`, where the `Option<T>` is the last result.
Forms:

```rust
// Default iterations: 10_000
let (last, samples) = benchmark::benchmark!("add", { 2 + 3 });

// Explicit iterations
let (last, samples) = benchmark::benchmark!("mul", 77usize, { 6 * 7 });
```

Notes:

- `samples[i].name == "add"` (or your provided name).
- Async-compatible: expressions/blocks may use `await`.
- Disabled path (`!benchmark`): runs once and returns `(Some(result), vec![])`.
Async example:

```rust
#[tokio::main(flavor = "current_thread")]
async fn main() {
    let (_last, samples) = benchmark::benchmark!("sleep", 50usize, {
        tokio::time::sleep(std::time::Duration::from_millis(1)).await;
    });
    assert_eq!(samples.len(), 50);
}
```

## Production Metrics (feature: `metrics`)

Provides production-friendly timing and percentile statistics with negligible overhead, and zero cost when disabled.

Installation with the feature enabled:

```toml
[dependencies]
benchmark = { version = "0.8.0", features = ["metrics"] }
```

### `Watch`

A thread-safe collector of nanosecond timings using a built-in, zero-dependency histogram under the hood.
```rust
use benchmark::Watch; // requires feature = "metrics"
let watch = Watch::new();
watch.record("db.query", 42_000);
let stats = watch.snapshot();
let s = &stats["db.query"];
assert!(s.p99 >= s.p50);
```

- Methods: `new()`, `builder() -> WatchBuilder`, `with_bounds(lowest, highest)`, `record(name, ns)`, `record_instant(name, start)`, `snapshot()`, `clear()`, `clear_name(name)`
- Concurrency: `Watch` is cheap to clone and is `Send + Sync`.
### `Timer`

Records elapsed time to a `Watch` automatically when dropped.

```rust
use benchmark::{Timer, Watch}; // requires feature = "metrics"
let watch = Watch::new();
{
    let _t = Timer::new(watch.clone(), "render");
    // do work...
}
let s = watch.snapshot()["render"];
assert!(s.count >= 1);
```

### `stopwatch!`

An ergonomic macro that times a scoped block and records to a `Watch`. Works in both sync and async contexts.
```rust
use benchmark::{stopwatch, Watch}; // requires feature = "metrics"
let watch = Watch::new();
stopwatch!(watch, "io", {
    std::thread::sleep(std::time::Duration::from_millis(1));
});
assert_eq!(watch.snapshot()["io"].count, 1);
```

Async example:

```rust
use benchmark::{stopwatch, Watch};

#[tokio::main(flavor = "current_thread")]
async fn main() {
    let watch = Watch::new();
    stopwatch!(watch, "sleep", {
        tokio::time::sleep(std::time::Duration::from_millis(1)).await;
    });
    assert_eq!(watch.snapshot()["sleep"].count, 1);
}
```

Notes:

- Percentiles are computed from histograms cloned outside locks, for low contention.
- Durations are clamped to the histogram bounds; the defaults cover 1ns to roughly 1 hour.
- Percentile inputs are clamped to [0.0, 100.0]; out-of-range queries map to the min/max.
- Internal histogram: fixed-size, lock-free recording with nanosecond precision; zero external dependencies.
The snippets below assume `features = ["metrics"]`.
Record per-iteration latency and export periodically.

```rust
use benchmark::{stopwatch, Watch};

#[tokio::main(flavor = "multi_thread")]
async fn main() {
    let watch = Watch::new();

    // Periodic exporter (e.g., log or scrape endpoint)
    let exporter = {
        let w = watch.clone();
        tokio::spawn(async move {
            loop {
                tokio::time::sleep(std::time::Duration::from_secs(10)).await;
                let snap = w.snapshot();
                for (name, s) in snap {
                    println!(
                        "metric={} count={} min={}ns p50={}ns p90={}ns p99={}ns max={}ns mean={:.1}",
                        name, s.count, s.min, s.p50, s.p90, s.p99, s.max, s.mean
                    );
                }
            }
        })
    };

    // Service work loop
    for i in 0..100u32 {
        stopwatch!(watch, "service.tick", {
            // do work (e.g., handle a batch)
            tokio::time::sleep(std::time::Duration::from_millis(5 + (i % 3) as u64)).await;
        });
    }

    exporter.abort();
}
```

Use metric names to encode the endpoint/method. Avoid using user-provided strings directly.
```rust
use benchmark::{stopwatch, Watch};

fn endpoint_metric(method: &str, path: &str) -> String {
    // Prefer a stable, low-cardinality naming scheme
    format!("http.{}:{}", method, path) // e.g., http.GET:/users/:id
}

fn handle_request(watch: &Watch, method: &str, route_path: &str) {
    let name = endpoint_metric(method, route_path);
    stopwatch!(watch.clone(), &name, {
        std::thread::sleep(std::time::Duration::from_millis(2));
    });
}
```

Measure I/O and processing separately, with clear names.
```rust
use benchmark::{stopwatch, Watch};

let watch = Watch::new();
for _ in 0..1000 {
    stopwatch!(watch, "worker.fetch", {
        std::thread::sleep(std::time::Duration::from_millis(1));
    });
    stopwatch!(watch, "worker.process", {
        std::thread::sleep(std::time::Duration::from_millis(3));
    });
}
let s = watch.snapshot();
assert!(s["worker.process"].p90 >= s["worker.fetch"].p90);
```

- Prefer `stopwatch!` inside async code; it scopes correctly and is ergonomic.
- `Timer` also works, but be mindful of lifetimes, and call `stop()` early if you need to record before the scope ends.
```rust
use benchmark::{Timer, Watch};

let w = Watch::new();
let mut t = Timer::new(w.clone(), "job.run");
// ... do the work being timed
let job_id = 42;
// early stop to record now, before the scope ends
t.stop();
assert_eq!(w.snapshot()["job.run"].count, 1);
println!("job {} recorded", job_id);
```

Set the histogram bounds to your SLOs to reduce memory use and improve precision.
```rust
use benchmark::Watch; // requires feature = "metrics"

// Builder: 100ns to 10s (fixed precision internally)
let watch = Watch::builder()
    .lowest(100)
    .highest(10_000_000_000)
    .build();
watch.record("op", 250);
```

Snapshot and reset on an interval to export windowed metrics:

```rust
use benchmark::Watch;

let w = Watch::new();
// ... record over a minute
let minute = w.snapshot();
// export `minute`
w.clear(); // start fresh for the next interval
```

Iterate and serialize to your logging/metrics system. Below is simple logging.
```rust
use benchmark::Watch;

fn export(w: &Watch) {
    for (name, s) in w.snapshot() {
        println!(
            "name={} count={} min={} p50={} p90={} p99={} max={} mean={:.2}",
            name, s.count, s.min, s.p50, s.p90, s.p99, s.max, s.mean
        );
    }
}
```

JSON export example (using serde_json):
```rust
// Add to Cargo.toml:
// serde = { version = "1", features = ["derive"] }
// serde_json = "1"
use benchmark::Watch; // requires feature = "metrics"
use serde::Serialize;

#[derive(Serialize)]
struct MetricRow<'a> {
    name: &'a str,
    count: u64,
    min: u64,
    p50: u64,
    p90: u64,
    p95: u64,
    p99: u64,
    p999: u64,
    max: u64,
    mean: f64,
}

fn export_json(w: &Watch) -> String {
    // Hold the snapshot in a local so the borrowed names outlive the rows.
    let snap = w.snapshot();
    let rows: Vec<MetricRow<'_>> = snap
        .iter()
        .map(|(name, s)| MetricRow {
            name: name.as_str(),
            count: s.count,
            min: s.min,
            p50: s.p50,
            p90: s.p90,
            p95: s.p95,
            p99: s.p99,
            p999: s.p999,
            max: s.max,
            mean: s.mean,
        })
        .collect();
    serde_json::to_string_pretty(&rows).unwrap()
}
```

- Clone `Watch` freely and pass it by value to tasks/threads.
- Use stable, low-cardinality metric names to keep the map small.
- If a path is extremely hot, consider sharding names (e.g., a per-core suffix) and merging snapshots offline.
## Async Usage

The macros inline timing using `std::time::Instant` under `feature = "benchmark"` and fully support `await` inside the macro body. They can be used with any async runtime (Tokio, async-std, etc.).

Notes:

- When `benchmark` is off, the macros return zero durations but still evaluate their expressions.
- Avoid holding locks across awaited code within your own operations.
## Disabled Mode Behavior

When compiled with `default-features = false` or without `benchmark`:

- `measure()` returns `(result, Duration::ZERO)`.
- `measure_named()` returns `(result, Measurement { duration: ZERO, timestamp: 0 })`.
- `time!` returns `(result, Duration::ZERO)`.
- `time_named!` returns `(result, Measurement::zero(name))`.
- `benchmark_block!` executes once and returns `Vec::new()`.
- `benchmark!` executes once and returns `(Some(result), Vec::new())`.
- `Collector` and `Stats` are `collector`-gated; if `collector` is disabled they are not available.
- Ensure the required features are enabled for copied snippets (`collector`, `metrics`).
- Zero durations are valid; avoid rewriting them at collection time. Clamp only at presentation.
- Tune histogram bounds near your SLOs for better percentile precision and lower memory use.
- Keep metric names low-cardinality and stable to reduce map contention.
- Avoid holding your own locks across `await` inside timed regions.
- Preserve fidelity in the data layer: zero durations are valid measurements for extremely fast ops.
- Apply a visualization floor only at presentation time, if necessary.
- Consider filtering 0ns samples when computing percentiles if they reflect timer granularity rather than business latency.
- If you must avoid zeros in histograms, clamp on export (`max(value, 1)`), not at collection.
## Examples

Create a minimal Rust benchmark that repeatedly measures a function and reports summary statistics using `Collector`.
```rust
use benchmark::{Collector, time};

fn fibonacci(n: u64) -> u64 {
    match n { 0 => 0, 1 => 1, _ => fibonacci(n - 1) + fibonacci(n - 2) }
}

fn main() {
    let c = Collector::new();
    for _ in 0..1_000 {
        let (_, d) = time!(fibonacci(20));
        c.record_duration("fib20", d);
    }
    let s = c.stats("fib20").unwrap();
    println!("iterations={} mean={}ns min={}ns max={}ns",
        s.count, s.mean.as_nanos(), s.min.as_nanos(), s.max.as_nanos());
}
```

Benchmark a code block by running it many times and collecting per-iteration durations using `benchmark_block!`.
```rust
use benchmark::benchmark_block;

fn hot() { std::hint::black_box(1 + 1); }

fn main() {
    // Default 10_000 iterations
    let samples = benchmark_block!({ hot() });
    assert_eq!(samples.len(), 10_000);

    // Explicit iterations
    let n = 5_000usize;
    let samples2 = benchmark_block!(n, { hot() });
    println!("n1={} n2={} first={}ns",
        samples.len(), samples2.len(), samples[0].as_nanos());
}
```

Measure small inner loops or tight functions; prefer deterministic inputs and avoid global state.
```rust
use benchmark::{Collector, time};

fn parse_u64(s: &str) -> u64 { s.parse().unwrap_or_default() }

fn main() {
    let c = Collector::new();
    for _ in 0..50_000 {
        let (_, d) = time!(parse_u64("123456"));
        c.record_duration("parse", d);
    }
    let s = c.stats("parse").unwrap();
    println!("count={} mean={}ns", s.count, s.mean.as_nanos());
}
```

Benchmark an end-to-end path (e.g., request handling) to capture realistic latencies across components.
```rust
use benchmark::{Collector, time};

fn handle_request() { std::thread::sleep(std::time::Duration::from_millis(3)); }

fn main() {
    let c = Collector::new();
    for _ in 0..1_000 {
        let (_, d) = time!(handle_request());
        c.record_duration("req", d);
    }
    let s = c.stats("req").unwrap();
    println!("count={} min={}ns max={}ns mean={}ns",
        s.count, s.min.as_nanos(), s.max.as_nanos(), s.mean.as_nanos());
}
```

Compare multiple implementations by sampling each separately with identical workloads.
```rust
use benchmark::Collector;

fn impl_a(buf: &[u8]) -> usize { buf.iter().filter(|b| **b % 2 == 0).count() }
fn impl_b(buf: &[u8]) -> usize { buf.chunks(2).map(|c| c.len()).sum() }

fn main() {
    let data = vec![0u8; 4096];
    let ca = Collector::new();
    let cb = Collector::new();
    for _ in 0..10_000 { let (_, d) = benchmark::time!(impl_a(&data)); ca.record_duration("a", d); }
    for _ in 0..10_000 { let (_, d) = benchmark::time!(impl_b(&data)); cb.record_duration("b", d); }
    let sa = ca.stats("a").unwrap();
    let sb = cb.stats("b").unwrap();
    println!("A mean={}ns | B mean={}ns", sa.mean.as_nanos(), sb.mean.as_nanos());
}
```

Sampling many iterations reduces noise and reveals the distribution; compute summary stats over the full run.
```rust
use benchmark::Collector;

fn main() {
    let c = Collector::with_capacity(100_000);
    for _ in 0..100_000 { let (_, d) = benchmark::time!({ 1 + 1 }); c.record_duration("op", d); }
    let s = c.stats("op").unwrap();
    println!("n={} mean={}ns", s.count, s.mean.as_nanos());
}
```

Generate sustained load to exercise systems and observe tail-latency behavior.
```rust
use benchmark::Collector;

fn io() { std::thread::sleep(std::time::Duration::from_millis(1)); }

fn main() {
    let c = Collector::new();
    for _ in 0..5_000 { let (_, d) = benchmark::time!(io()); c.record_duration("io", d); }
    let s = c.stats("io").unwrap();
    // Stats carries no percentiles; use Watch/Histogram when you need p50/p99.
    println!("min={}ns mean={}ns max={}ns", s.min.as_nanos(), s.mean.as_nanos(), s.max.as_nanos());
}
```

Record production timings with minimal overhead using the `metrics` feature.
```rust
// Requires: features = ["metrics"]
use benchmark::{stopwatch, Watch};

let watch = Watch::new();
stopwatch!(watch, "db.query", {
    std::thread::sleep(std::time::Duration::from_millis(2));
});
println!("count={}", watch.snapshot()["db.query"].count);
```

Model spans for sub-operations (e.g., DB, cache, remote call) by naming timers consistently.
```rust
// Requires: features = ["metrics"]
use benchmark::{stopwatch, Watch};

let w = Watch::new();
stopwatch!(w, "req.db", { std::thread::sleep(std::time::Duration::from_millis(1)); });
stopwatch!(w, "req.cache", { std::thread::sleep(std::time::Duration::from_millis(1)); });
stopwatch!(w, "req.http", { std::thread::sleep(std::time::Duration::from_millis(2)); });
```

Continuously collect and snapshot percentiles with negligible overhead.
```rust
// Requires: features = ["metrics"]
use benchmark::Watch;

let w = Watch::new();
for _ in 0..1000 { w.record("tick", 500); }
let s = w.snapshot()["tick"];
println!("p50={} p99={}", s.p50, s.p99);
```

Track endpoint health (e.g., TTFB and response time) and alert on SLO breaches.
```rust
// Requires: features = ["metrics"]
use benchmark::{stopwatch, Watch};

let watch = Watch::new();
stopwatch!(watch, "health.ping", {
    std::thread::sleep(std::time::Duration::from_millis(1));
});
let s = watch.snapshot()["health.ping"];
println!("p99={}ns", s.p99);
```

Export snapshots to your logging/metrics stack periodically.
```rust
// Requires: features = ["metrics"]
use benchmark::Watch;

fn export(w: &Watch) {
    for (name, s) in w.snapshot() {
        println!(
            "name={} count={} min={} p50={} p90={} p99={} max={} mean={:.2}",
            name, s.count, s.min, s.p50, s.p90, s.p99, s.max, s.mean
        );
    }
}
```

## Doctests and feature flags

Some examples require specific features to compile under doctest or when copy-pasted:

- `time!`, `measure`, `benchmark_block!`, `benchmark!`: require `feature = "benchmark"`.
- `Collector`, `Stats`, `histogram`: require `feature = "collector"`.
- `Watch`, `Timer`, `stopwatch!`: require `feature = "metrics"`.
When running doctests locally with a docs.rs-like configuration, consider enabling all features:

```sh
RUSTDOCFLAGS="--cfg docsrs" cargo test --doc --all-features
```

Alternatively, gate your local snippets with cfgs when experimenting.
## Performance Tests (opt-in)

Perf-sensitive tests/benches are gated to avoid noisy CI variance. Opt in explicitly:

```sh
# run perf tests (ignored by default)
PERF_TESTS=1 cargo test -F perf-tests -- --ignored

# run benches that exercise perf paths
PERF_TESTS=1 cargo bench -F perf-tests
```

Notes:

- The `perf-tests` feature gates perf-sensitive code in tests/benches.
- Tests also check `PERF_TESTS` at runtime and will exit early when it is not set.
COPYRIGHT © 2025 JAMES GOBER.