diff --git a/services/libs/tinybird/datasources/project_insights_copy_ds.datasource b/services/libs/tinybird/datasources/project_insights_copy_ds.datasource
index cdc97c9681..fd1554607e 100644
--- a/services/libs/tinybird/datasources/project_insights_copy_ds.datasource
+++ b/services/libs/tinybird/datasources/project_insights_copy_ds.datasource
@@ -5,6 +5,16 @@ DESCRIPTION >
     - `id` column is the primary key identifier for the project.
     - `name` column is the human-readable project name.
     - `slug` column is the URL-friendly identifier used in routing and filtering.
+    - `logoUrl` column is the URL to the project's logo image.
+    - `isLF` column indicates whether the project is a Linux Foundation project (1 = true, 0 = false).
+    - `contributorCount` column is the total number of contributors for the project.
+    - `organizationCount` column is the total number of organizations for the project.
+    - `softwareValue` column is the estimated economic value of the software.
+    - `contributorDependencyCount` column is the number of contributors making up 51% of contributions (bus factor).
+    - `contributorDependencyPercentage` column is the combined contribution percentage of dependent contributors.
+    - `organizationDependencyCount` column is the number of organizations making up 51% of contributions.
+    - `organizationDependencyPercentage` column is the combined contribution percentage of dependent organizations.
+    - `achievements` column is an array of tuples (leaderboardType, rank, totalCount) representing project achievements.
     - `healthScore` column is the overall health score for the project (0-100).
     - `firstCommit` column is the timestamp of the first commit in the project.
     - `starsLast365Days` column is the count of stars in the last 365 days.
@@ -22,7 +32,16 @@ SCHEMA >
     `id` String,
     `name` String,
     `slug` String,
+    `logoUrl` String,
+    `isLF` UInt8,
+    `contributorCount` UInt64,
+    `organizationCount` UInt64,
     `softwareValue` UInt64,
+    `contributorDependencyCount` UInt64,
+    `contributorDependencyPercentage` Float64,
+    `organizationDependencyCount` UInt64,
+    `organizationDependencyPercentage` Float64,
+    `achievements` Array(Tuple(String, UInt64, UInt64)),
     `healthScore` Nullable(Float64),
     `firstCommit` Nullable(DateTime64(3)),
     `starsLast365Days` UInt64,
diff --git a/services/libs/tinybird/pipes/project_insights.pipe b/services/libs/tinybird/pipes/project_insights.pipe
index e5fc6c699e..a73d3be3fb 100644
--- a/services/libs/tinybird/pipes/project_insights.pipe
+++ b/services/libs/tinybird/pipes/project_insights.pipe
@@ -1,9 +1,12 @@
 DESCRIPTION >
     - `project_insights.pipe` serves project insights data for a specific project.
-    - Returns comprehensive metrics including health score, first commit, and activity metrics for both current and previous 365-day periods.
+    - Returns comprehensive metrics including project metadata (name, logoUrl, isLF), health score, contributor count, software value, contributor and organization dependency metrics, leaderboard achievements, and activity metrics for both current and previous 365-day periods.
     - Parameters:
-    - `slug`: Required string for project slug (e.g., 'kubernetes', 'tensorflow')
-    - Response: Single project record with all insights metrics
+    - `slug`: Optional string for a single project slug (e.g., 'kubernetes')
+    - `slugs`: Optional array of project slugs for multi-project query (e.g., ['kubernetes', 'tensorflow'])
+    - `ids`: Optional array of project ids for multi-project query
+    - At least one of `slug`, `slugs`, or `ids` should be provided.
+    - Response: Project records with all insights metrics including achievements as array of (leaderboardType, rank, totalCount) tuples
 
 TAGS ""Insights, Widget", "Project""
 NODE project_insights_endpoint
@@ -13,7 +16,16 @@ SQL >
         id,
         name,
         slug,
+        logoUrl,
+        isLF,
+        contributorCount,
+        organizationCount,
         softwareValue,
+        contributorDependencyCount,
+        contributorDependencyPercentage,
+        organizationDependencyCount,
+        organizationDependencyPercentage,
+        achievements,
         healthScore,
         firstCommit,
         starsLast365Days,
@@ -28,5 +40,13 @@ SQL >
     WHERE
         1 = 1
         {% if defined(slug) %}
-            AND slug = {{ String(slug, description="Project slug", required=True) }}
+            AND slug = {{ String(slug, description="Project slug", required=False) }}
+        {% end %}
+        {% if defined(slugs) %}
+            AND slug
+            IN {{ Array(slugs, 'String', description="Filter by project slug list", required=False) }}
+        {% end %}
+        {% if defined(ids) %}
+            AND id
+            IN {{ Array(ids, 'String', description="Filter by project id list", required=False) }}
         {% end %}
diff --git a/services/libs/tinybird/pipes/project_insights_copy.pipe b/services/libs/tinybird/pipes/project_insights_copy.pipe
index 29ec737da8..21509fb2ad 100644
--- a/services/libs/tinybird/pipes/project_insights_copy.pipe
+++ b/services/libs/tinybird/pipes/project_insights_copy.pipe
@@ -1,11 +1,56 @@
 NODE project_insights_copy_base_projects
 DESCRIPTION >
-    Returns base project information (id, name, slug, segmentId, softwareValue, healthScore, firstCommit)
+    Returns base project information (id, name, slug, segmentId, logoUrl, isLF, contributorCount, organizationCount, softwareValue, healthScore, firstCommit)
 
 SQL >
-    SELECT id, name, slug, segmentId, softwareValue, healthScore, firstCommit
+    SELECT
+        id,
+        name,
+        slug,
+        segmentId,
+        logoUrl,
+        isLF,
+        contributorCount,
+        organizationCount,
+        softwareValue,
+        healthScore,
+        firstCommit
     FROM insights_projects_populated_ds
-    GROUP BY id, name, slug, segmentId, softwareValue, healthScore, firstCommit
+    GROUP BY
+        id,
+        name,
+        slug,
+        segmentId,
+        logoUrl,
+        isLF,
+        contributorCount,
+        organizationCount,
+        softwareValue,
+        healthScore,
+        firstCommit
+
+NODE project_insights_copy_dependency_metrics
+DESCRIPTION >
+    Get contributor and organization dependency metrics from health_score_copy_ds
+
+SQL >
+    SELECT
+        slug,
+        contributorDependencyCount,
+        contributorDependencyPercentage,
+        organizationDependencyCount,
+        organizationDependencyPercentage
+    FROM health_score_copy_ds
+
+NODE project_insights_copy_achievements
+DESCRIPTION >
+    Aggregate leaderboard achievements per project from leaderboards_copy_ds (latest snapshot)
+
+SQL >
+    SELECT slug, groupArray(tuple(leaderboardType, rank, totalCount)) AS achievements
+    FROM leaderboards_copy_ds
+    WHERE snapshotId = (SELECT max(snapshotId) FROM leaderboards_copy_ds)
+    GROUP BY slug
 
 NODE project_insights_copy_last_365_days_metrics
 DESCRIPTION >
@@ -50,7 +95,16 @@ SQL >
         base.id AS id,
         base.name AS name,
         base.slug AS slug,
+        base.logoUrl AS logoUrl,
+        base.isLF AS isLF,
+        base.contributorCount AS contributorCount,
+        base.organizationCount AS organizationCount,
         base.softwareValue AS softwareValue,
+        COALESCE(dep.contributorDependencyCount, 0) AS contributorDependencyCount,
+        COALESCE(dep.contributorDependencyPercentage, 0) AS contributorDependencyPercentage,
+        COALESCE(dep.organizationDependencyCount, 0) AS organizationDependencyCount,
+        COALESCE(dep.organizationDependencyPercentage, 0) AS organizationDependencyPercentage,
+        COALESCE(ach.achievements, []) AS achievements,
         base.healthScore AS healthScore,
         base.firstCommit AS firstCommit,
         l365.starsLast365Days AS starsLast365Days,
@@ -62,6 +116,8 @@ SQL >
         p365.activeContributorsPrevious365Days AS activeContributorsPrevious365Days,
         p365.activeOrganizationsPrevious365Days AS activeOrganizationsPrevious365Days
     FROM project_insights_copy_base_projects AS base
+    LEFT JOIN project_insights_copy_dependency_metrics AS dep ON base.slug = dep.slug
+    LEFT JOIN project_insights_copy_achievements AS ach ON base.slug = ach.slug
     LEFT JOIN project_insights_copy_last_365_days_metrics AS l365 USING (segmentId)
     LEFT JOIN project_insights_copy_previous_365_days_metrics AS p365 USING (segmentId)
 