
fix(scheduler): set EnableTaskIdBasedBlobDigest in preheat to match dfdaemon default, fixing preheat cache miss #4787

Open
jasoncai1227 wants to merge 4 commits into dragonflyoss:main from jasoncai1227:feat/preheatEnableTaskIdBasedBlobDigestDefaultSetTrue

Conversation

@jasoncai1227


Description

This PR fixes a preheat cache miss caused by a task ID mismatch between the scheduler's preheat requests and the dfdaemon client's default task ID calculation.

Root Cause: When the dfdaemon client downloads OCI blob URLs (e.g., /v2/<repository>/blobs/sha256:<digest>), it defaults enable_task_id_based_blob_digest to true (see client/dragonfly-client-util/src/request/mod.rs), which means the task ID is derived from the blob digest rather than the full URL. However, when the scheduler sends preheat requests, it does not set EnableTaskIdBasedBlobDigest, so the task ID is calculated using TaskIDV2ByURLBased (based on the full URL). This results in different task IDs for the same blob, causing the preheated cache to be missed by subsequent dfdaemon downloads.
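The mismatch described above can be illustrated with a minimal Go sketch. The helpers `urlBasedID` and `digestBasedID` are hypothetical stand-ins that simply hash their input; they are not the actual `TaskIDV2ByURLBased` or digest-based algorithms, whose real inputs are not reproduced here. The point is only that two derivations with different inputs cannot agree on an ID for the same blob.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashOf is a placeholder for the real task ID hashing (assumption:
// the actual Dragonfly algorithm mixes in more parameters).
func hashOf(kind, value string) string {
	sum := sha256.Sum256([]byte(kind + "\x00" + value))
	return hex.EncodeToString(sum[:])
}

// urlBasedID models an ID derived from the full URL (hypothetical).
func urlBasedID(rawURL string) string { return hashOf("url", rawURL) }

// digestBasedID models an ID derived from the blob digest only (hypothetical).
func digestBasedID(digest string) string { return hashOf("digest", digest) }

func main() {
	digest := "sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"
	blobURL := "https://registry.example.com/v2/library/nginx/blobs/" + digest

	preheatID := urlBasedID(blobURL)   // what the scheduler computed before the fix
	downloadID := digestBasedID(digest) // what dfdaemon computes by default

	fmt.Println("IDs match:", preheatID == downloadID)
}
```

Because the preheated data is stored under one ID and looked up under the other, the lookup misses even though both sides refer to the same blob.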

Changes:

  1. pkg/digest/digest.go — Added IsBlobURL() and ExtractFromBlobURL() utilities to detect OCI blob URLs and extract the digest from them, mirroring the Rust client's is_blob_url() and BLOB_URL_REGEX in client/dragonfly-client-util/src/digest/mod.rs.

  2. pkg/idgen/task_id.go — Added TaskIDByBlobDigest() function that generates a task ID from the blob digest extracted from an OCI blob URL, consistent with the Rust client's TaskIDParameter::BlobDigestBased logic.

  3. scheduler/job/job.go — Updated three preheat functions (preheatV2SingleSeedPeerByURL, PreheatAllSeedPeers, PreheatAllPeers) to:

    • Detect whether the URL is an OCI blob URL using pkgdigest.IsBlobURL().
    • If it is a blob URL, use idgen.TaskIDByBlobDigest() for task ID calculation.
    • Set EnableTaskIdBasedBlobDigest: isBlobURL in the DownloadTaskRequest to align with the dfdaemon client's default behavior.
  4. pkg/digest/digest_test.go — Added unit tests for IsBlobURL() (9 cases) and ExtractFromBlobURL() (7 cases), covering valid blob URLs with various schemes, query params, nested repositories, and invalid URLs.

  5. pkg/idgen/task_id_test.go — Added unit tests for TaskIDByBlobDigest() (7 cases), including a key test that verifies the same blob digest from different registries produces the same task ID.
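As a rough sketch of how the changes listed above fit together, the following Go program detects an OCI blob URL, extracts the digest, chooses the ID strategy, and sets the request flag from the same check. Everything here is an assumption made for illustration: the path pattern, the function names `extractBlobDigest` and `buildPreheat`, the `preheatRequest` struct (a stand-in for `DownloadTaskRequest`), and the hashing in `sum` are not taken from the PR's code or from the Rust client's `BLOB_URL_REGEX`.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net/url"
	"regexp"
)

// blobPathRe is an assumed pattern for OCI blob paths; the PR's pattern may differ.
var blobPathRe = regexp.MustCompile(`^/v2/.+/blobs/(sha256:[0-9a-f]{64})$`)

// extractBlobDigest returns the digest when rawURL looks like an OCI blob URL.
func extractBlobDigest(rawURL string) (string, bool) {
	u, err := url.Parse(rawURL)
	if err != nil || u.Scheme == "" || u.Host == "" {
		return "", false
	}
	// u.Path excludes the query string, so query params do not affect the match.
	m := blobPathRe.FindStringSubmatch(u.Path)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// preheatRequest is a hypothetical stand-in for DownloadTaskRequest.
type preheatRequest struct {
	URL                         string
	EnableTaskIDBasedBlobDigest bool
}

// sum is a placeholder for the real task ID computation.
func sum(s string) string {
	h := sha256.Sum256([]byte(s))
	return hex.EncodeToString(h[:])
}

// buildPreheat picks the ID strategy and sets the flag from the same check.
func buildPreheat(rawURL string) (string, preheatRequest) {
	digest, isBlob := extractBlobDigest(rawURL)
	req := preheatRequest{URL: rawURL, EnableTaskIDBasedBlobDigest: isBlob}
	if isBlob {
		return sum("digest\x00" + digest), req
	}
	return sum("url\x00" + rawURL), req
}

func main() {
	const d = "sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"
	idA, reqA := buildPreheat("https://registry-a.example.com/v2/library/nginx/blobs/" + d)
	idB, _ := buildPreheat("http://registry-b.example.com/v2/team/app/nginx/blobs/" + d + "?ns=prod")
	idM, reqM := buildPreheat("https://registry-a.example.com/v2/library/nginx/manifests/latest")

	fmt.Println("same digest, different registries:", idA == idB, reqA.EnableTaskIDBasedBlobDigest)
	fmt.Println("blob vs manifest URL:", idA == idM, reqM.EnableTaskIDBasedBlobDigest)
}
```

Deriving both the task ID and the flag from one check keeps the two from drifting apart: a non-blob URL such as a manifest request falls back to the URL-based ID with the flag left false.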

Motivation and Context

When preheating OCI container images, the scheduler and dfdaemon client calculate different task IDs for the same blob URL, causing preheated data to never be served to subsequent download requests. This is because:

  • dfdaemon client (Rust): defaults enable_task_id_based_blob_digest to true for proxy requests, deriving the task ID from the blob's SHA256 digest for OCI blob URLs.
  • scheduler preheat: always uses TaskIDV2ByURLBased which incorporates the full URL, producing a different task ID.

This mismatch means preheat operations complete successfully (data is cached on seed peers), but when dfdaemon later downloads the same blob, it looks for a task ID based on the blob digest and cannot find the preheated data, resulting in a cache miss and a redundant download.
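The effect on the cache can be modelled in a few lines of Go. This is a toy model under stated assumptions: task IDs are plain strings, the cache is a map, and `byURL`, `byBlobDigest`, and `cacheHit` are hypothetical names that do not exist in Dragonfly.

```go
package main

import (
	"fmt"
	"strings"
)

// taskIDFunc maps a download URL to a cache key (a stand-in for a task ID).
type taskIDFunc func(rawURL string) string

// byURL models the URL-based ID.
func byURL(rawURL string) string { return "url:" + rawURL }

// byBlobDigest models the digest-based ID; it falls back to the URL-based ID
// when the URL has no "/blobs/" segment. Query strings are ignored in this toy.
func byBlobDigest(rawURL string) string {
	i := strings.LastIndex(rawURL, "/blobs/")
	if i < 0 {
		return byURL(rawURL)
	}
	return "digest:" + rawURL[i+len("/blobs/"):]
}

// cacheHit preheats rawURL under one ID function, then looks it up under another.
func cacheHit(preheatID, downloadID taskIDFunc, rawURL string) bool {
	cache := map[string]bool{}
	cache[preheatID(rawURL)] = true // preheat stores the data
	return cache[downloadID(rawURL)] // later download looks it up
}

func main() {
	blobURL := "https://registry.example.com/v2/library/nginx/blobs/sha256:" + strings.Repeat("a", 64)

	fmt.Println("before fix:", cacheHit(byURL, byBlobDigest, blobURL))
	fmt.Println("after fix: ", cacheHit(byBlobDigest, byBlobDigest, blobURL))
}
```

The preheat itself succeeds in both cases; only the key under which the data is stored changes, which is why the problem shows up as a silent miss rather than an error.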

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation Update (if none of the other choices apply)

Checklist

  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.

@codecov

codecov Bot commented May 27, 2026


Codecov Report

❌ Patch coverage is 28.35821% with 48 lines in your changes missing coverage. Please review.
✅ Project coverage is 28.14%. Comparing base (42aefc1) to head (194841a).
⚠️ Report is 16 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| scheduler/job/job.go | 0.00% | 48 Missing ⚠️ |

Additional details and impacted files


@@            Coverage Diff             @@
##             main    #4787      +/-   ##
==========================================
+ Coverage   28.09%   28.14%   +0.05%     
==========================================
  Files         232      233       +1     
  Lines       23136    23158      +22     
==========================================
+ Hits         6499     6518      +19     
- Misses      16197    16200       +3     
  Partials      440      440              

| Flag | Coverage Δ |
|---|---|
| unittests | 28.14% <28.35%> (+0.05%) ⬆️ |


| Files with missing lines | Coverage Δ |
|---|---|
| pkg/digest/digest.go | 80.76% <100.00%> (+2.99%) ⬆️ |
| pkg/idgen/task_id.go | 82.97% <100.00%> (+2.02%) ⬆️ |
| scheduler/job/task.go | 100.00% <100.00%> (ø) |
| scheduler/job/job.go | 0.00% <0.00%> (ø) |

caich1 added 3 commits May 27, 2026 15:31
…fdaemon default, fixing preheat cache miss

Signed-off-by: caich1 <caich1@chinatelecom.cn>
Signed-off-by: caich1 <caich1@chinatelecom.cn>
Signed-off-by: caich1 <caich1@chinatelecom.cn>
@jasoncai1227 jasoncai1227 force-pushed the feat/preheatEnableTaskIdBasedBlobDigestDefaultSetTrue branch from 99a90db to 388b46e Compare May 27, 2026 07:43
@gaius-qi gaius-qi added the enhancement New feature or request label May 27, 2026
@gaius-qi

Member

@jasoncai1227 Please fix lint.



4 participants