
docs(flaky-tests): document AI Investigation tab #539

Draft
samgutentag wants to merge 1 commit into main from sam-gutentag/flaky-tests-ai-investigation-tab

@samgutentag
Member

Summary

  • Adds a new page flaky-tests/ai-investigation.md documenting the Analysis tab on the Flaky Tests test detail page (trunk-io/trunk2#3504)
  • Covers viewing investigation results, triggering new investigations, applying fix recommendations, and viewing investigation history
  • Documents all seven fact types surfaced in Key Findings
  • Adds the page to summary.md

Source

Test plan

  • Preview in GitBook

Adds a new page covering the Analysis tab introduced in trunk-io/trunk2#3504.
Documents how to view AI investigation results, trigger new investigations,
apply fix recommendations, and browse investigation history.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@trunk-staging-io

Merging to main in this repository is managed by Trunk.

  • To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here.

@claude
Contributor

claude Bot commented Apr 11, 2026

Claude finished @samgutentag's task in 1m 39s


PR Review

  • Gather context and read files
  • Review flaky-tests/ai-investigation.md
  • Review summary.md changes
  • Post inline review comments

Overall the page is well-structured, follows the repo's documentation conventions, and all relative links resolve to existing files. Two issues found:

Blocking

  • Line 31 — Confidence score ranges overlap: "yellow for 50%+" includes the 80%+ range already labeled green. The three bands need to be mutually exclusive (e.g., green 80%+, yellow 50–79%, orange below 50%). Inline suggestion posted.

Non-blocking

  • Line 55 — The "CI Logs" fact type description reads as if Trunk is augmenting test failure outputs rather than analyzing CI logs. Inline suggestion posted for a clearer phrasing, but not blocking.


At the top, you will see:

- The **overall confidence score** (color-coded green for 80%+, yellow for 50%+, orange below 50%)

The confidence score ranges overlap: "yellow for 50%+" technically includes the 80%+ range already described as green. The three bands should be mutually exclusive.

Suggested change (current line, then proposed replacement):
- The **overall confidence score** (color-coded green for 80%+, yellow for 50%+, orange below 50%)
- The **overall confidence score** (color-coded green for 80%+, yellow for 50–79%, orange below 50%)
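
To illustrate what mutually exclusive bands look like in practice, here is a minimal sketch that maps a score to exactly one color. The function name `confidence_color` is hypothetical and this is not Trunk's implementation; the thresholds are simply the ones from the suggested wording above, and the sketch assumes that a fractional score such as 79.5 falls in the yellow band.

```python
def confidence_color(score_percent: float) -> str:
    """Return the single color band for a confidence score given in percent.

    Hypothetical helper for illustration only; thresholds follow the
    suggested doc wording (green 80%+, yellow 50-79%, orange below 50%).
    """
    if not 0 <= score_percent <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score_percent}")
    # Checks run from the highest threshold down, so a score can only
    # ever match one band.
    if score_percent >= 80:
        return "green"
    if score_percent >= 50:
        return "yellow"
    return "orange"


# Print the band for each boundary value and its neighbor.
for score in (100, 80, 79, 50, 49, 0):
    print(score, confidence_color(score))
```

Because the checks are ordered, the boundary values 80 and 50 each land in one band only, which is the property the original "50%+" wording failed to convey.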


| Fact type | What it analyzes |
|---|---|
| **CI Logs** | Supplements test failure outputs with CI workflow logs |

The phrasing "Supplements test failure outputs with CI workflow logs" reads as though the fact type is augmenting the outputs rather than analyzing the logs themselves. Suggest clarifying the direction:

Suggested change (current line, then proposed replacement):
| **CI Logs** | Supplements test failure outputs with CI workflow logs |
| **CI Logs** | CI workflow logs that provide additional context beyond test failure output |
