Skip to content

feat: use tree_tests_rollup for tree details summary#1842

Merged
gustavobtflores merged 3 commits intokernelci:mainfrom
gustavobtflores:feat/tree-details-denormalization-view
Apr 10, 2026
Merged

feat: use tree_tests_rollup for tree details summary#1842
gustavobtflores merged 3 commits intokernelci:mainfrom
gustavobtflores:feat/tree-details-denormalization-view

Conversation

@gustavobtflores
Copy link
Copy Markdown
Contributor

@gustavobtflores gustavobtflores commented Apr 6, 2026

Description

Tree details summary now reads pre-aggregated test and boot data from tree_tests_rollup instead of scanning per-test rows from a single large query. Builds still come from a dedicated builds query. This matches the denormalized rollup model and should reduce work for the summary endpoint.

Changes

  • Add treeDetailsRollup.py helpers to normalize build rows and to apply filters, issues, and status summaries from rollup rows.
  • Refactor treeDetailsSummaryView to call both queries and process builds and rollup rows in separate sanitization paths.
  • Extend seed_test_data to build tree_tests_rollup rows from seeded tests/incidents.
  • Add TestsRollupFactory and export it from test factories.

How to test

  1. Ensure migrations/schema include tree_tests_rollup and run seed_test_data or populate database with ingester data.
  2. Call the tree details summary API for a commit that has builds and rollup data; confirm summaries and filters behave as before for representative trees.
  3. Run the backend test suite covering tree details / summary if present.

Closes #1801

@gustavobtflores gustavobtflores changed the title Feat/tree details denormalization view feat: use tree_tests_rollup for tree details summary Apr 6, 2026
@gustavobtflores gustavobtflores force-pushed the feat/tree-details-denormalization-view branch 2 times, most recently from c003c95 to 3a35e46 Compare April 6, 2026 18:12
@gustavobtflores gustavobtflores self-assigned this Apr 6, 2026
@gustavobtflores gustavobtflores added the Backend Most or all of the changes for this issue will be in the backend code. label Apr 6, 2026
ROLLUP_TEST_ID = "rollup_test"


def normalize_build_dict(row_dict: dict) -> dict:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice touch

):
return True

if len(platform_filters) > 0 and test_platform not in platform_filters:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefer to check if platform_filters instead of using len

if len(platform_filters) > 0 and test_platform not in platform_filters:
return True

if len(origin_filters) > 0 and test_origin not in origin_filters:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

origin_summary = instance.test_summary["origins"]
typed_summary = instance.test_summary_typed

arch_key = "%s-%s" % (build_arch, build_compiler)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use a tuple as a key here?

lab_entry = typed_summary.labs.setdefault(test_lab, StatusCount())
setattr(lab_entry, status_name, getattr(lab_entry, status_name) + count)

if is_status_failure(status_name):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we can avoid this conditional, and just use the value:
has_status_failure |= is_status_failure(status_name)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even simpler, just add current test_platform to platforms_failing here

issue_comment: Optional[str],
issue_report_url: Optional[str],
starting_count_status: Optional[DatabaseStatusValues],
autoincrement: bool = True,
Copy link
Copy Markdown
Contributor

@alanpeixinho alanpeixinho Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to add this flag? Cant it be the default behaviour?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it still behaves as before because the flag defaults to True. however, I needed to reuse this function for rollup aggregations in the view, and in that context it's not appropriate to increment the status after creating the IssueTyped.

this also highlights some technical debt: a function named create_issue_typed having a hidden side effect (incrementing status values) is confusing.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the tree details summary endpoint to use the denormalized tree_tests_rollup table for test/boot aggregation (instead of scanning per-test rows), while keeping builds fetched from a dedicated builds query to reduce load and improve performance.

Changes:

  • Refactor summary view to fetch/process builds and rollup rows via separate queries and sanitization paths.
  • Add treeDetailsRollup.py helpers for build normalization plus rollup-based filters, issues, and status summaries.
  • Extend seed data + factories to generate tree_tests_rollup rows for tests/incidents; enhance status counting utilities to support increment-by-count.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
backend/kernelCI_app/views/treeDetailsSummaryView.py Switch summary aggregation to two-query model (builds + rollup) and separate sanitization.
backend/kernelCI_app/helpers/treeDetailsRollup.py New helper module to aggregate filters/issues/summaries from rollup rows and normalize build rows.
backend/kernelCI_app/queries/tree.py Add get_tree_details_builds and get_tree_details_rollup queries (cached) to support the new summary flow.
backend/kernelCI_app/management/commands/seed_test_data.py Seed tree_tests_rollup rows derived from seeded tests/incidents and clear rollup table on reset.
backend/kernelCI_app/utils.py Extend create_issue_typed with autoincrement to support rollup-driven issue counts.
backend/kernelCI_app/typeModels/common.py Allow StatusCount.increment() to increment by an arbitrary count (for rollups).
backend/kernelCI_app/tests/factories/tree_tests_rollup_factory.py Add a factory for TreeTestsRollup model rows.
backend/kernelCI_app/tests/factories/init.py Export the new TreeTestsRollupFactory.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

):
row_dict["issue_id"] = UNCATEGORIZED_STRING

return row_dict
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function seems pretty nice, but do you thing that using this code below might improve readability? Feel free to resolve it if you disagree.

defaults = {
    "build_status": NULL_STATUS,
    "build_architecture": UNKNOWN_STRING,
    "build_compiler": UNKNOWN_STRING,
    "build_config_name": UNKNOWN_STRING,
}
for key, default in defaults.items():
    row_dict.setdefault(key, default)

)
config_entry = config_map.setdefault(build_config, {})

is_env_compatible = hardware_key != UNKNOWN_STRING and hardware_key != test_platform
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: is_env_compatible = hardware_key not in {test_platform, UNKNOWN_STRING} looks more pythonic to me as it has less repetitions

tree_name_full_clause = "AND " + tree_name_clause if tree_name_clause else ""
git_url_full_clause = "AND " + git_url_clause if git_url_clause else ""

query = f"""
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: have you thought of creating another file with bigger queries? I believe that would help improve the view readability

@gustavobtflores gustavobtflores force-pushed the feat/tree-details-denormalization-view branch 2 times, most recently from e582372 to a57bb74 Compare April 9, 2026 23:51
@gustavobtflores gustavobtflores force-pushed the feat/tree-details-denormalization-view branch from a57bb74 to 689d120 Compare April 10, 2026 12:58
@gustavobtflores gustavobtflores requested review from MarceloRobert, alanpeixinho and dede999 and removed request for alanpeixinho and dede999 April 10, 2026 13:05
}

rows = get_query_cache(cache_key, params)
if rows is None:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: prefer to return early if not None

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as it is kinda of a pattern within the other functions, I would prefer to leave it as it is

Copy link
Copy Markdown
Contributor

@alanpeixinho alanpeixinho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines +504 to +508
git_branch_clause = checkout_clauses.get("git_branch_clause")
tree_name_clause = checkout_clauses.get("tree_name_clause")
git_url_clause = checkout_clauses.get("git_url_clause")
tree_name_full_clause = "AND " + tree_name_clause if tree_name_clause else ""
git_url_full_clause = "AND " + git_url_clause if git_url_clause else ""
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very nit: why not a full_clause for git_branch?

Comment on lines +779 to +783
git_branch_clause = checkout_clauses.get("git_branch_clause")
tree_name_clause = checkout_clauses.get("tree_name_clause")
git_url_clause = checkout_clauses.get("git_url_clause")
tree_name_full_clause = "AND " + tree_name_clause if tree_name_clause else ""
git_url_full_clause = "AND " + git_url_clause if git_url_clause else ""
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

git_url_full_clause = "AND " + git_url_clause if git_url_clause else ""

query = f"""
WITH RELEVANT_CHECKOUTS AS (
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not blocker but maybe we could move the relevant_checkouts query to a separate builder method so it can be reused between the queries that require it

@gustavobtflores gustavobtflores added this pull request to the merge queue Apr 10, 2026
Merged via the queue into kernelci:main with commit 6b46d3f Apr 10, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Backend Most or all of the changes for this issue will be in the backend code.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: denormalize tree details with rollup table to improve summary performance

5 participants