Skip to content

feat: RFC Appendix B evaluation-reason system tests for Java FFE [java@typo/full-spec-evaluator]#6793

Draft
typotter wants to merge 14 commits intomainfrom
typo/ffe-reason-tests
Draft

feat: RFC Appendix B evaluation-reason system tests for Java FFE [java@typo/full-spec-evaluator]#6793
typotter wants to merge 14 commits intomainfrom
typo/ffe-reason-tests

Conversation

@typotter
Copy link
Copy Markdown
Contributor

Motivation

RFC Appendix B defines 26 test cases covering all valid (return value type, reason, error code) combinations for Feature Flag Evaluation. This adds system tests validating that SDK implementations emit the correct evaluation reason and error details in span tags — work that was previously untested at this level.

Changes

tests/ffe/test_flag_eval_reasons.py (new, 1466 lines)

  • Covers REASON-2 through REASON-26 from RFC Appendix B
  • REASON-1, 3 (missing flag), 4, 5, 6, 12, 15 are already covered in test_flag_eval_metrics.py (cross-referenced in the file header)
  • 8 tests pass against Java ≥ v1.61.0
  • 12 tests are missing_feature pending Java RFC compliance gaps (see manifest comment block for details)
  • Uses REASON-## naming throughout (self-documenting vs opaque B-## references)

manifests/java.yml

  • Replaces a file-level missing_feature entry (which defeated all per-test version pins due to additive manifest semantics) with 20 individual per-test entries
  • Fixes stale class names carried over from pre-rename state
  • Upgrades several test_flag_eval_metrics.py entries from bug (FFL-1972)v1.61.0 where the underlying Java issues are now resolved

Workflow

  1. ⚠️ Create your PR as draft ⚠️
  2. Work on you PR until the CI passes
  3. Mark it as ready for review
    • Test logic is modified? -> Get a review from RFC owner.
    • Framework is modified, or non obvious usage of it -> get a review from R&P team

🚀 Once your PR is reviewed and the CI green, you can merge it!

🛟 #apm-shared-testing 🛟

Reviewer checklist

  • Anything but tests/ or manifests/ is modified ? I have the approval from R&P team
  • A docker base image is modified?
    • the relevant build-XXX-image label is present
  • A scenario is added, removed or renamed?

…va FFE

Adds test_flag_eval_reasons.py covering REASON-2 through REASON-26 from RFC
Appendix B (evaluation reason + error code correctness). 8 tests pass against
Java ≥ v1.61.0; 12 are marked missing_feature pending Java RFC compliance work.

Updates java.yml: replaces file-level missing_feature (which defeated per-test
version pins) with 20 individual entries. Fixes stale class names from pre-rename.
Upgrades several test_flag_eval_metrics.py entries from bug→v1.61.0 now that
the underlying Java issues are resolved.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 22, 2026

CODEOWNERS have been resolved as:

ffe-reason-ufc-examples.md                                              @DataDog/system-tests-core
tests/ffe/test_flag_eval_reasons.py                                     @DataDog/feature-flagging-and-experimentation-sdk @DataDog/system-tests-core
manifests/java.yml                                                      @DataDog/asm-java @DataDog/apm-java

Fixes in dd-trace-java master (PRs #11036, #11037, #11071) resolve the
FFL-1972 bugs. Use v1.62.0-SNAPSHOT so these tests run as active (not
xfail) in CI against Java master.

Also removes the file-level missing_feature entry and fixes two YAML
explicit-key entries for the parse-error tests.
@typotter typotter changed the title feat: RFC Appendix B evaluation-reason system tests for Java FFE feat: RFC Appendix B evaluation-reason system tests for Java FFE [java@master] Apr 22, 2026
typotter added 11 commits April 22, 2026 16:04
Manifest validator requires string-sorted keys. REASON_10 < REASON_2_ etc.
because '0' (48) < '_' (95) in ASCII. Reorders all 20 REASON entries.
- Add feature_flag.key tag assertion to all 20 test methods (H-1)
- Add variant=on assertion to REASON-11 test (H-3)
- Move rc.tracer_rc_state.reset().apply() inside try block in REASON-2
  setup so it is covered by the finally cleanup (H-4)
- Wrap REASON-2 setup body in try/finally for exception-safe mock
  restoration (M-1); wrap finally cleanup in try/except so cleanup
  failures do not mask the primary exception (M-2)
- Fix ruff formatting on long assert in REASON-22
…ests

- Replace try/except/pass with contextlib.suppress (SIM105, S110)
- Activate all 12 remaining missing_feature reason tests at v1.62.0-SNAPSHOT
REASON-11 (StaticNoSplit) passes at v1.62.0-SNAPSHOT.
11 tests remain missing_feature with failure-mode comments.
Surface SDKs that fail to parse or evaluate an empty splits array.
Previously used vacuous-split form which masked this class of bug.
An allocation with splits:[] cannot produce a variant. Correct expected
result is coded default / DEFAULT (not STATIC). Update test assertions,
fixture docstring, and UFC examples doc accordingly.
'No matching split' means the allocation has splits with shards but the
targeting key falls outside every shard range — allocation skipped →
DEFAULT. The empty splits:[] form is an unexpected structural edge case
covered by a separate spec point.

Fixture: salt 'b11-static-no-match' + range [3921,10000) guarantees
user-1 (shard 3920) deterministically misses. Manifest reset to
missing_feature pending a verified test run against Java master.
"No split" means no matching split — the split entry exists but the subject's
hash lands outside all shard ranges. Allocation skipped, waterfall exhausted
→ coded default / DEFAULT. Distinct from REASON-12 (vacuous split → STATIC)
and from the empty-splits-array edge case (separate spec point).

Rename fixture to make_shard_miss_static_fixture, class to
Test_FFE_REASON_11_StaticNoMatchingSplit. Update manifest and UFC examples doc.
Shard-miss fixture (no matching split → DEFAULT) verified passing
against Java 1.62.0-SNAPSHOT. 9 REASON tests now active.
@typotter typotter changed the title feat: RFC Appendix B evaluation-reason system tests for Java FFE [java@master] feat: RFC Appendix B evaluation-reason system tests for Java FFE [java@typo/full-spec-evaluator] Apr 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant