Codex/update arxiv preprint citation by ternaus · Pull Request #35 · ternaus/imread_benchmark

ternaus · 2026-05-12T07:21:31Z

Summary by Sourcery

Add a new LaTeX platform recommendation table for zero-skip DataLoader choices, ensure generated plots use embeddable fonts, and update the README to reference the new arXiv preprint and citation.

New Features:

Generate a per-platform zero-skip DataLoader recommendation LaTeX table, including integration into the table generation workflow.

Enhancements:

Configure matplotlib PDF/PS font types for better compatibility with LaTeX and vector outputs.

Documentation:

Update README with the current preprint link and BibTeX citation, and clarify citation guidance.

Tests:

Add coverage to verify the platform recommendation table contents, formatting, and decoder inclusion/exclusion rules.

sourcery-ai · 2026-05-12T07:21:38Z

Reviewer's Guide

Adds a new LaTeX platform recommendation table generator (including escaping helpers and deterministic selection logic), configures matplotlib font types for vector output, and updates the README citation to the new arXiv preprint, with tests covering the new table output expectations.

File-Level Changes

Change	Details	Files
Configure matplotlib to use TrueType fonts in generated PDF/PS assets.	Import the top-level matplotlib module alongside pyplot, seaborn, and Patch in the plotting helper. Set rcParams pdf.fonttype and ps.fonttype to 42 after successful imports to ensure text is embedded as TrueType fonts.	`tools/paper_assets.py`
Add LaTeX-safe string escaping and a new platform recommendation LaTeX table generator for zero-skip DataLoader choices.	Introduce a _latex_escape helper that replaces LaTeX-reserved characters with escaped equivalents. Add helpers to format per-platform DataLoader choices and build table rows using paper-scoped, peak-throughput data and filtered zero-skip decoders. Generate a LaTeX tabularx table with caption, label, and notes describing per-platform zero-skip recommendations, and write it to a .tex file.	`tools/paper_assets.py`
Integrate the new platform recommendation table into the table generation pipeline and documentation.	Call generate_platform_recommendation_table from generate_tables to produce table07_platform_recommendations.tex alongside existing tables. Update the generated tables README section to list the new .tex table output.	`tools/paper_assets.py`
Add tests to validate the new platform recommendation table content and structure.	Import generate_platform_recommendation_table into the test module. Verify that the generated LaTeX file has one data row per platform, each with exactly three img/s entries, and excludes skip decoders, pyvips, and tensorflow. Assert that a set of expected formatted choices (decoder, throughput, workers) appear in the generated table for regression coverage.	`tests/test_paper_assets.py`
Update README to reference the new arXiv preprint and citation details.	Add a direct preprint link near the top of the README in the results description. Replace the old BibTeX entry with a new one for the updated preprint, including new title, year, identifier, primary class, and URL.	`README.md`

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

sourcery-ai

Hey - I've found 1 issue, and left some high level feedback:

The test_platform_recommendation_table_has_three_zero_skip_choices_per_platform test hard-codes exact throughput values and worker counts, which will make it very brittle to any change in source data; consider asserting structural properties (e.g., counts, ordering, presence/absence of libraries) rather than exact numeric strings.
The LaTeX table assembly in generate_platform_recommendation_table is largely built from inline raw string fragments; consider factoring the common LaTeX boilerplate or format patterns into small helpers/constants to reduce duplication and make future layout changes easier.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The `test_platform_recommendation_table_has_three_zero_skip_choices_per_platform` test hard-codes exact throughput values and worker counts, which will make it very brittle to any change in source data; consider asserting structural properties (e.g., counts, ordering, presence/absence of libraries) rather than exact numeric strings.
- The LaTeX table assembly in `generate_platform_recommendation_table` is largely built from inline raw string fragments; consider factoring the common LaTeX boilerplate or format patterns into small helpers/constants to reduce duplication and make future layout changes easier.

## Individual Comments

### Comment 1
<location path="tests/test_paper_assets.py" line_range="141-150" />
<code_context>
         assert expected_prefix in matching_line
+
+
+def test_platform_recommendation_table_has_three_zero_skip_choices_per_platform(tmp_path: Path) -> None:
+    root = _repo_output()
+    dest = tmp_path / "table07_platform_recommendations.tex"
+    generate_platform_recommendation_table(load_dataloader_results(root), dest)
+
+    text = dest.read_text(encoding="utf-8")
+    data_rows = [line for line in text.splitlines() if "GCP \\texttt" in line]
+    assert len(data_rows) == len(PAPER_PLATFORMS)
+    assert all(row.count("img/s") == 3 for row in data_rows)
+
+    for decoder in EXPECTED_SKIP_DECODERS:
+        assert f"\\texttt{{{decoder}}}:" not in text
+    assert r"\texttt{pyvips}:" not in text
+    assert r"\texttt{tensorflow}:" not in text
+
+    expected_choices = [
</code_context>
<issue_to_address>
**issue (testing):** Missing negative-path test for the case where a platform has fewer than three zero-skip DataLoader choices.

The new test only exercises the happy path with real benchmark data and doesn’t verify the error branch in `_platform_recommendation_rows` that raises `ValueError` when a platform has fewer than three zero-skip choices.

Please add a small synthetic test that builds a minimal `dl` where one platform has only 0–2 zero-skip choices (or temporarily patches `EXPECTED_DATALOADER_DECODERS` / `PAPER_PLATFORMS`), and asserts that `_platform_recommendation_rows` or `generate_platform_recommendation_table` raises `ValueError` with the expected message. This will lock in the failure mode and protect future refactors.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai · 2026-05-12T07:22:59Z

+def test_platform_recommendation_table_has_three_zero_skip_choices_per_platform(tmp_path: Path) -> None:
+    root = _repo_output()
+    dest = tmp_path / "table07_platform_recommendations.tex"
+    generate_platform_recommendation_table(load_dataloader_results(root), dest)
+
+    text = dest.read_text(encoding="utf-8")
+    data_rows = [line for line in text.splitlines() if "GCP \\texttt" in line]
+    assert len(data_rows) == len(PAPER_PLATFORMS)
+    assert all(row.count("img/s") == 3 for row in data_rows)
+


issue (testing): Missing negative-path test for the case where a platform has fewer than three zero-skip DataLoader choices.

The new test only exercises the happy path with real benchmark data and doesn’t verify the error branch in _platform_recommendation_rows that raises ValueError when a platform has fewer than three zero-skip choices.

Please add a small synthetic test that builds a minimal dl where one platform has only 0–2 zero-skip choices (or temporarily patches EXPECTED_DATALOADER_DECODERS / PAPER_PLATFORMS), and asserts that _platform_recommendation_rows or generate_platform_recommendation_table raises ValueError with the expected message. This will lock in the failure mode and protect future refactors.

Copilot

Pull request overview

Updates the repository’s publication-facing assets to reflect the new arXiv preprint, and extends the paper asset generator to emit an additional per-platform “starting points” LaTeX table (with a validating test).

Changes:

Add a new LaTeX table generator for per-platform zero-skip DataLoader recommendations, including LaTeX escaping utilities.
Add a regression test asserting three zero-skip choices per platform and expected selections from the committed output/ dataset.
Update README with the new preprint link and refreshed BibTeX citation; set Matplotlib font embedding defaults for PDF/PS outputs.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File	Description
tools/paper_assets.py	Adds LaTeX escaping + a new platform recommendation table generator; sets Matplotlib rcParams for font embedding.
tests/test_paper_assets.py	Adds a test validating the generated platform recommendation LaTeX table contents.
README.md	Updates the preprint link and BibTeX citation to the new arXiv preprint.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

ternaus added 2 commits May 6, 2026 15:36

Add generated platform recommendation table

b073c8c

docs: update arxiv preprint citation

fa5057a

Copilot AI review requested due to automatic review settings May 12, 2026 07:21

Copilot started reviewing on behalf of ternaus May 12, 2026 07:22 View session

sourcery-ai Bot reviewed May 12, 2026

View reviewed changes

Copilot AI reviewed May 12, 2026

View reviewed changes

ternaus merged commit 0cd6d69 into main May 12, 2026
10 checks passed

ternaus deleted the codex/update-arxiv-preprint-citation branch May 12, 2026 07:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Codex/update arxiv preprint citation#35

Codex/update arxiv preprint citation#35
ternaus merged 2 commits into
mainfrom
codex/update-arxiv-preprint-citation

ternaus commented May 12, 2026 •

edited by sourcery-ai Bot

Loading

Uh oh!

sourcery-ai Bot commented May 12, 2026 •

edited

Loading

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai Bot left a comment

Uh oh!

sourcery-ai Bot May 12, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

ternaus commented May 12, 2026 • edited by sourcery-ai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by Sourcery

Uh oh!

sourcery-ai Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewer's Guide

File-Level Changes

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

sourcery-ai Bot May 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ternaus commented May 12, 2026 •

edited by sourcery-ai Bot

Loading

sourcery-ai Bot commented May 12, 2026 •

edited

Loading