feat(decoders): add ajpegli benchmark support#38
Conversation
Adds the ajpegli decoder backend and dependency wiring, includes the five-platform ajpegli result matrix, refreshes README plots/tables, and updates paper asset validation for the 13/11 decoder scope.
Reviewer's GuideAdds an ajpegli decoder backend and wires it into the benchmark/plotting pipeline, enables selecting specific decoders (e.g., ajpegli) in GCP benchmark runs, and refreshes documentation, figures, and tests to reflect the expanded 13-decoder, 5-platform scope. Sequence diagram for selecting ajpegli in GCP benchmark runsequenceDiagram
actor User
participant run_sh as gcp_run_sh
participant GCE as gcp_instances
participant VM as vm_startup_sh
participant CLI as imread_benchmark_run
participant AJ as AJpegliDecoder
participant Lib as ajpegli
User->>run_sh: ./gcp/run.sh --libs ajpegli
run_sh->>GCE: instances create ... metadata libs=ajpegli
GCE-->>VM: start VM with metadata
VM->>VM: LIBS=$(meta libs)
VM->>CLI: imread-benchmark run --libs "$LIBS"
CLI->>AJ: load entry point ajpegli
loop for each image
CLI->>AJ: decode(data)
AJ->>Lib: imdecode(data, mode=RGB)
Lib-->>AJ: np.ndarray
AJ-->>CLI: np.ndarray
end
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- In
plot_fig02_amd_worker_delta,finite_deltascan be empty if all library/platform combinations are missing or NaN, which will causemin()/max()to raise; consider guarding for the empty case and falling back to the previous static x-limits. - In
AJpegliDecoder, you importajpegliinside bothdecodeanddecode_path; consider importing once at module scope (or caching the import) to avoid repeated import overhead in tight loops.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `plot_fig02_amd_worker_delta`, `finite_deltas` can be empty if all library/platform combinations are missing or NaN, which will cause `min()`/`max()` to raise; consider guarding for the empty case and falling back to the previous static x-limits.
- In `AJpegliDecoder`, you import `ajpegli` inside both `decode` and `decode_path`; consider importing once at module scope (or caching the import) to avoid repeated import overhead in tight loops.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 862ee8a031
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| --boot-disk-size=60GB \ | ||
| --boot-disk-type="$BOOT_DISK_TYPE" \ | ||
| --metadata="results-bucket=$RUN_GCS,imagenet-bucket=$IMAGENET_BUCKET,num-images=$NUM_IMAGES,num-runs=$NUM_RUNS,dl-runs=$DL_RUNS,workers=$WORKERS,repo-tarball=$RUN_GCS/repo.tar.gz,cache-bucket=$CACHE_BUCKET,force-rebuild=$FORCE_REBUILD,keep-on-failure=$([[ $KEEP_ON_FAILURE == true ]] && echo 1 || echo 0)" \ | ||
| --metadata="results-bucket=$RUN_GCS,imagenet-bucket=$IMAGENET_BUCKET,libs=$LIBS,num-images=$NUM_IMAGES,num-runs=$NUM_RUNS,dl-runs=$DL_RUNS,workers=$WORKERS,repo-tarball=$RUN_GCS/repo.tar.gz,cache-bucket=$CACHE_BUCKET,force-rebuild=$FORCE_REBUILD,keep-on-failure=$([[ $KEEP_ON_FAILURE == true ]] && echo 1 || echo 0)" \ |
There was a problem hiding this comment.
Encode --libs metadata before passing comma-separated values
--libs is documented and parsed as a comma-separated list, but this value is interpolated directly into gcloud ... --metadata, whose syntax is KEY=VALUE,KEY=VALUE,.... When a user passes multiple decoders (for example --libs opencv,pillow), the comma inside $LIBS is treated as a metadata separator, so the second token is no longer part of libs and can make instance creation fail (or at best corrupt the metadata payload). This breaks the new multi-library selection path introduced by this commit.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
Adds an ajpegli decoder backend to the benchmark suite and wires it into dependency management, result loading, paper-asset generation, and the GCP benchmarking workflow so ajpegli can be benchmarked and included in published tables/figures.
Changes:
- Introduce
AJpegliDecoderand register it viapyproject.tomlentry points; addajpeglito themainstreamoptional dependency set (and lockfile). - Extend results/paper tooling to include
ajpegliin library ordering/scope checks; refresh README tables and narrative to includeajpegli. - Add GCP
--libsplumbing so runs can target a subset of decoders (e.g., onlyajpegli) and commit newajpegliresult JSONs for the 5-platform matrix.
Reviewed changes
Copilot reviewed 27 out of 32 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| uv.lock | Adds the ajpegli locked dependency and wires it into the mainstream extra metadata. |
| pyproject.toml | Adds ajpegli to mainstream extras and registers the new decoder via entry points. |
| imread_benchmark/decoders/ajpegli_decoder.py | New decoder implementation using ajpegli.imread/ajpegli.imdecode. |
| tools/_results.py | Adds ajpegli to the canonical library ordering used by loaders/plots/README rendering. |
| tools/paper_assets.py | Updates paper-scope validation/robustness notes and makes Fig02 x-limits dynamic. |
| tests/test_decoders.py | Extends registry expectations to include ajpegli. |
| tests/conftest.py | Adds ajpegli distribution→module mapping for test skip behavior. |
| tests/test_paper_assets.py | Updates expected decoder counts/row counts for the expanded paper scope. |
| README.md | Updates benchmark tables/metadata date and documentation sections to include ajpegli. |
| gcp/run.sh | Adds --libs flag support and passes it via instance metadata. |
| gcp/vm_startup.sh | Reads libs metadata and passes it through to imread-benchmark run --libs. |
| docs/plotting_and_statistics.md | Adjusts wording to cover strict native decoders more generally (incl. ajpegli). |
| docs/gcp_benchmarks.md | Updates examples/placeholders and documents --libs for single-decoder runs. |
| output/linux_AMD-EPYC-9B14/run_summary.json | Updates run summary timestamp and recorded requested/run libs. |
| output/linux_AMD-EPYC-9B14/ajpegli_1t_results.json | Adds ajpegli 1-thread benchmark results for this platform. |
| output/linux_AMD-EPYC-9B14/ajpegli_dataloader_results.json | Adds ajpegli DataLoader benchmark results for this platform. |
| output/linux_AMD-EPYC-9B45/run_summary.json | Updates run summary timestamp and recorded requested/run libs. |
| output/linux_AMD-EPYC-9B45/ajpegli_1t_results.json | Adds ajpegli 1-thread benchmark results for this platform. |
| output/linux_AMD-EPYC-9B45/ajpegli_dataloader_results.json | Adds ajpegli DataLoader benchmark results for this platform. |
| output/linux_INTEL(R)-XEON(R)-PLATINUM-8581C-CPU-@-2.30GHz/run_summary.json | Updates run summary timestamp and recorded requested/run libs. |
| output/linux_INTEL(R)-XEON(R)-PLATINUM-8581C-CPU-@-2.30GHz/ajpegli_1t_results.json | Adds ajpegli 1-thread benchmark results for this platform. |
| output/linux_INTEL(R)-XEON(R)-PLATINUM-8581C-CPU-@-2.30GHz/ajpegli_dataloader_results.json | Adds ajpegli DataLoader benchmark results for this platform. |
| output/linux_Neoverse-V2/run_summary.json | Updates run summary timestamp and recorded requested/run libs. |
| output/linux_Neoverse-V2/ajpegli_1t_results.json | Adds ajpegli 1-thread benchmark results for this platform. |
| output/linux_Neoverse-V2/ajpegli_dataloader_results.json | Adds ajpegli DataLoader benchmark results for this platform. |
| output/linux_Neoverse-N1/run_summary.json | Updates run summary timestamp and recorded requested/run libs. |
| output/linux_Neoverse-N1/ajpegli_1t_results.json | Adds ajpegli 1-thread benchmark results for this platform. |
| output/linux_Neoverse-N1/ajpegli_dataloader_results.json | Adds ajpegli DataLoader benchmark results for this platform. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| finite_deltas = [d for deltas in delta_by_platform.values() for d in deltas if not pd.isna(d)] | ||
| x_min = min(-18.0, min(finite_deltas) - 5.0) | ||
| x_max = max(32.0, max(finite_deltas) + 8.0) | ||
|
|
| "libs_requested": [ | ||
| "imagecodecs", | ||
| "imageio", | ||
| "jpeg4py", | ||
| "kornia", | ||
| "opencv", | ||
| "pillow", | ||
| "pyvips", | ||
| "simplejpeg", | ||
| "skimage", | ||
| "tensorflow", | ||
| "torchvision", | ||
| "turbojpeg" | ||
| "ajpegli" | ||
| ], | ||
| "libs_skipped_platform": [], | ||
| "libs_run": [ | ||
| "imagecodecs", | ||
| "imageio", | ||
| "jpeg4py", | ||
| "kornia", | ||
| "opencv", | ||
| "pillow", | ||
| "pyvips", | ||
| "simplejpeg", | ||
| "skimage", | ||
| "tensorflow", | ||
| "torchvision", | ||
| "turbojpeg" | ||
| "ajpegli" | ||
| ], |
| "libs_requested": [ | ||
| "imagecodecs", | ||
| "imageio", | ||
| "jpeg4py", | ||
| "kornia", | ||
| "opencv", | ||
| "pillow", | ||
| "pyvips", | ||
| "simplejpeg", | ||
| "skimage", | ||
| "tensorflow", | ||
| "torchvision", | ||
| "turbojpeg" | ||
| "ajpegli" | ||
| ], | ||
| "libs_skipped_platform": [], | ||
| "libs_run": [ | ||
| "imagecodecs", | ||
| "imageio", | ||
| "jpeg4py", | ||
| "kornia", | ||
| "opencv", | ||
| "pillow", | ||
| "pyvips", | ||
| "simplejpeg", | ||
| "skimage", | ||
| "tensorflow", | ||
| "torchvision", | ||
| "turbojpeg" | ||
| "ajpegli" | ||
| ], |
| "libs_requested": [ | ||
| "imagecodecs", | ||
| "imageio", | ||
| "jpeg4py", | ||
| "kornia", | ||
| "opencv", | ||
| "pillow", | ||
| "pyvips", | ||
| "simplejpeg", | ||
| "skimage", | ||
| "tensorflow", | ||
| "torchvision", | ||
| "turbojpeg" | ||
| "ajpegli" | ||
| ], | ||
| "libs_skipped_platform": [], | ||
| "libs_run": [ | ||
| "imagecodecs", | ||
| "imageio", | ||
| "jpeg4py", | ||
| "kornia", | ||
| "opencv", | ||
| "pillow", | ||
| "pyvips", | ||
| "simplejpeg", | ||
| "skimage", | ||
| "tensorflow", | ||
| "torchvision", | ||
| "turbojpeg" | ||
| "ajpegli" | ||
| ], |
| "libs_requested": [ | ||
| "imagecodecs", | ||
| "imageio", | ||
| "jpeg4py", | ||
| "kornia", | ||
| "opencv", | ||
| "pillow", | ||
| "pyvips", | ||
| "simplejpeg", | ||
| "skimage", | ||
| "tensorflow", | ||
| "torchvision", | ||
| "turbojpeg" | ||
| "ajpegli" | ||
| ], | ||
| "libs_skipped_platform": [], | ||
| "libs_run": [ | ||
| "imagecodecs", | ||
| "imageio", | ||
| "jpeg4py", | ||
| "kornia", | ||
| "opencv", | ||
| "pillow", | ||
| "pyvips", | ||
| "simplejpeg", | ||
| "skimage", | ||
| "tensorflow", | ||
| "torchvision", | ||
| "turbojpeg" | ||
| "ajpegli" | ||
| ], |
| "libs_requested": [ | ||
| "imagecodecs", | ||
| "imageio", | ||
| "jpeg4py", | ||
| "kornia", | ||
| "opencv", | ||
| "pillow", | ||
| "pyvips", | ||
| "simplejpeg", | ||
| "skimage", | ||
| "tensorflow", | ||
| "torchvision", | ||
| "turbojpeg" | ||
| "ajpegli" | ||
| ], | ||
| "libs_skipped_platform": [], | ||
| "libs_run": [ | ||
| "imagecodecs", | ||
| "imageio", | ||
| "jpeg4py", | ||
| "kornia", | ||
| "opencv", | ||
| "pillow", | ||
| "pyvips", | ||
| "simplejpeg", | ||
| "skimage", | ||
| "tensorflow", | ||
| "torchvision", | ||
| "turbojpeg" | ||
| "ajpegli" | ||
| ], |
Adds the ajpegli decoder backend and dependency wiring, includes the five-platform ajpegli result matrix, refreshes README plots/tables, and updates paper asset validation for the 13/11 decoder scope.
Summary by Sourcery
Add the ajpegli JPEG decoder to the benchmark suite and update tooling, benchmarks, and documentation to include it in the five-platform paper matrix.
New Features:
Enhancements:
Tests: