Skip to content

feat(c-binding): expose custom collector callback#310

Closed
adeshkumar1 wants to merge 1 commit intomainfrom
c-binding/collector-callback
Closed

feat(c-binding): expose custom collector callback#310
adeshkumar1 wants to merge 1 commit intomainfrom
c-binding/collector-callback

Conversation

@adeshkumar1
Copy link
Copy Markdown
Contributor

Summary

Adds dd_tracer_conf_set_collector_callback(handle, callback, user_data) so FFI callers can intercept finished trace chunks in-process instead of POSTing to an agent. The callback receives the exact msgpack bytes /v0.4/traces would have served, produced by the same encoder the DatadogAgent uses (wrapped in a 1-element outer array to match the wire format).

When a callback is installed, telemetry and remote-configuration polling are disabled automatically so the tracer performs zero network I/O.

Motivation

The Kong plugin's integration test (spec/02-integration-tests/01-agent_spec.lua) is currently pending because helpers.http_mock accepts one connection but dd-trace-cpp opens two (telemetry first, then traces). This callback lets the test intercept traces in-process with no HTTP mock.

Notes

  • send is invoked synchronously on the thread finishing each span. Blocking in the callback stalls span finishing on that thread.
  • user_data must outlive the tracer (the callback may fire during dd_tracer_free as pending spans auto-finish).
  • Auto-disabling telemetry + remote-config is one-way for this conf handle. Passing callback == NULL clears the collector but does not re-enable them.
  • Chose raw msgpack bytes over an accessor struct: reuses the existing encoder, keeps the C surface to 1 typedef + 1 function, and any msgpack library decodes it.

Test plan

  • ./build-c/test/tests \"[c_binding]\" — 11 cases / 58 assertions
  • bin/check-format
  • Downstream: link the Kong plugin against this branch, flip 01-agent_spec.lua off pending, confirm it passes with no HTTP mock

🤖 Generated with Claude Code

@adeshkumar1
Copy link
Copy Markdown
Contributor Author

@codex review

@pr-commenter
Copy link
Copy Markdown

pr-commenter Bot commented Apr 17, 2026

Benchmarks

Benchmark execution time: 2026-04-17 20:18:32

Comparing candidate commit 3cc9c15 in PR branch c-binding/collector-callback with baseline commit 1f76c5c in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 1 metrics, 0 unstable metrics.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because often changes are not that big:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ee1edb7346

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread binding/c/src/tracer.cpp Outdated
return;
}
cfg->collector = std::make_shared<CallbackCollector>(callback, user_data);
cfg->telemetry.enabled = false;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Override env telemetry when callback mode is enabled

Setting cfg->telemetry.enabled = false here does not reliably disable telemetry in callback mode, because telemetry finalization gives environment variables higher precedence (pick(env_config->enabled, user_config.enabled, true) in src/datadog/telemetry/configuration.cpp). In environments with DD_INSTRUMENTATION_TELEMETRY_ENABLED=true, the tracer will still start telemetry network traffic even after dd_tracer_conf_set_collector_callback, which breaks the documented "zero network I/O" behavior for this API.

Useful? React with 👍 / 👎.

@datadog-official
Copy link
Copy Markdown

datadog-official Bot commented Apr 17, 2026

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 90.94%

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 3cc9c15 | Docs | Datadog PR Page | Was this helpful? React with 👍/👎 or give us feedback!

@adeshkumar1 adeshkumar1 force-pushed the c-binding/collector-callback branch from ee1edb7 to a6108c6 Compare April 17, 2026 19:55
@adeshkumar1
Copy link
Copy Markdown
Contributor Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a6108c673f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread binding/c/src/tracer.cpp

const auto *config = reinterpret_cast<const dd::TracerConfig *>(conf_handle);
const auto validated_config = dd::finalize_config(*config);
auto validated_config = dd::finalize_config(*config);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Bypass agent URL validation in callback collector mode

dd_tracer_new always runs finalize_config(*config) before handling the callback-collector path, so tracer creation still fails when DD_TRACE_AGENT_URL/DD_OPT_AGENT_URL is malformed even though traces are being diverted to the in-process callback. finalize_config validates agent config unconditionally (src/datadog/tracer_config.cpp), which makes this new callback mode unusable in callback-only deployments that intentionally do not rely on agent transport.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documented as a caveat in the header docstring — malformed DD_TRACE_AGENT_URL / DD_OPT_AGENT_URL still fails finalize_config before the callback path is taken. Fixing this properly requires either unsetenv() (process-wide side effect) or refactoring finalize_config to defer agent validation, both of which are out of scope for this PR. Callers in callback mode should simply not set a malformed URL.

Comment thread binding/c/src/tracer.cpp
Comment on lines +204 to +206
validated_config->telemetry.enabled = false;
validated_config->telemetry.report_metrics = false;
validated_config->telemetry.report_logs = false;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Disable telemetry globally when callback collector is used

This only mutates the local tracer config, but telemetry is process-global and initialized once (src/datadog/telemetry/telemetry.cpp uses a static singleton), so if another tracer already initialized telemetry as enabled, creating a callback-configured tracer here will not actually stop telemetry network traffic. In multi-tracer processes, this violates the new API contract that callback mode performs zero network I/O.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documented as a caveat in the header docstring. Telemetry is process-global; the config setter can only affect this tracer's finalized config. In multi-tracer processes where an earlier tracer already initialized telemetry as enabled, that traffic continues. Fully silencing it from the C binding would require touching the telemetry module's singleton, which is out of scope. The typical C-binding use case (Kong) is single-tracer-per-process, where this is a non-issue.

@adeshkumar1
Copy link
Copy Markdown
Contributor Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5f810b75d8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread binding/c/src/tracer.cpp Outdated
Comment on lines +176 to +179
// In callback mode the agent URL is never contacted. Clear any user-set
// value so finalize_config's agent URL validation uses the internal
// default and doesn't fail on a malformed DD_OPT_AGENT_URL.
cfg->agent.url.reset();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve agent URL when clearing collector callback

Avoid mutating unrelated configuration when enabling the callback: this path unconditionally calls cfg->agent.url.reset(), but the clear path (callback == NULL) only resets collector. If a caller sets DD_OPT_AGENT_URL, enables the callback, and later clears it (which the API advertises as supported), the original agent URL is silently lost and subsequent tracers will send to the default endpoint instead of the configured one.

Useful? React with 👍 / 👎.

Add dd_tracer_conf_set_collector_callback so FFI callers can intercept
finished trace chunks in-process instead of POSTing to an agent. The
callback receives the exact msgpack bytes that /v0.4/traces would have
served, produced by the same encoder the DatadogAgent uses (wrapped in
a 1-element outer array to match the wire format).

When a callback is installed, telemetry and remote-configuration
polling are disabled automatically so the tracer performs zero network
I/O — the whole point of installing a custom collector is not to
depend on an agent.

Motivated by the Kong plugin's integration test
(spec/02-integration-tests/01-agent_spec.lua), which is currently
pending because helpers.http_mock accepts only one connection and
dd-trace-cpp's default transport opens two (telemetry first, then
traces), so the traces assertion never fires.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@adeshkumar1
Copy link
Copy Markdown
Contributor Author

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Chef's kiss.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@adeshkumar1 adeshkumar1 deleted the c-binding/collector-callback branch April 22, 2026 20:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant