fix: resolve CI failures — gofmt and errcheck lint#245
Conversation
Run gofmt on 6 skill-related files from PR GoogleCloudPlatform#399 and replace os.Setenv with t.Setenv in pipeline_health_test.go from PR GoogleCloudPlatform#401 to satisfy the errcheck linter.
There was a problem hiding this comment.
Code Review
This pull request introduces a new build command for container images, integrates OpenTelemetry metrics export to Cloud Monitoring for the Hub server, adds self-monitoring metrics to the telemetry pipeline, and enhances the test-login endpoint with input validation and size limits. The review feedback suggests performing atomic writes when updating config.yaml to prevent corruption, passing context.Context through the GCP token metrics recorder to align with OpenTelemetry best practices, and avoiding the use of high-cardinality broker_id as a metric attribute to prevent cardinality explosion.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| if err := os.WriteFile(configPath, updatedData, 0644); err != nil { | ||
| return fmt.Errorf("failed to write updated config.yaml: %w", err) | ||
| } |
There was a problem hiding this comment.
Writing directly to config.yaml using os.WriteFile can leave the file truncated or corrupted if the write operation is interrupted or fails (e.g., due to a full disk or process termination).
To ensure robustness and prevent file corruption, perform an atomic write by writing the updated data to a temporary file in the same directory first, and then renaming it to the target path using os.Rename.
tmpConfigPath := configPath + ".tmp"
if err := os.WriteFile(tmpConfigPath, updatedData, 0644); err != nil {
return fmt.Errorf("failed to write temporary config.yaml: %w", err)
}
if err := os.Rename(tmpConfigPath, configPath); err != nil {
_ = os.Remove(tmpConfigPath)
return fmt.Errorf("failed to update config.yaml atomically: %w", err)
}| type GCPTokenMetricsRecorder interface { | ||
| RecordAccessTokenRequest(success bool, latency time.Duration) | ||
| RecordIDTokenRequest(success bool, latency time.Duration) | ||
| RecordRateLimitRejection() | ||
| GetSnapshot() *GCPTokenMetricsSnapshot | ||
| } |
There was a problem hiding this comment.
To align with OpenTelemetry best practices, the GCPTokenMetricsRecorder interface methods should accept a context.Context as their first parameter.
Passing the active context allows the OpenTelemetry SDK to associate measurements with the active span/trace (enabling exemplars) and propagates baggage or other context-scoped attributes.
| type GCPTokenMetricsRecorder interface { | |
| RecordAccessTokenRequest(success bool, latency time.Duration) | |
| RecordIDTokenRequest(success bool, latency time.Duration) | |
| RecordRateLimitRejection() | |
| GetSnapshot() *GCPTokenMetricsSnapshot | |
| } | |
| type GCPTokenMetricsRecorder interface { | |
| RecordAccessTokenRequest(ctx context.Context, success bool, latency time.Duration) | |
| RecordIDTokenRequest(ctx context.Context, success bool, latency time.Duration) | |
| RecordRateLimitRejection(ctx context.Context) | |
| GetSnapshot() *GCPTokenMetricsSnapshot | |
| } |
| func (r *OTelGCPTokenMetrics) RecordAccessTokenRequest(success bool, latency time.Duration) { | ||
| ctx := context.Background() | ||
| r.accessRequests.Add(ctx, 1) | ||
| if success { | ||
| r.accessSuccesses.Add(ctx, 1) | ||
| } else { | ||
| r.accessFailures.Add(ctx, 1) | ||
| } | ||
| r.iamDuration.Record(ctx, float64(latency.Milliseconds())) | ||
| r.snap.RecordAccessTokenRequest(success, latency) | ||
| } | ||
|
|
||
| func (r *OTelGCPTokenMetrics) RecordIDTokenRequest(success bool, latency time.Duration) { | ||
| ctx := context.Background() | ||
| r.idRequests.Add(ctx, 1) | ||
| if success { | ||
| r.idSuccesses.Add(ctx, 1) | ||
| } else { | ||
| r.idFailures.Add(ctx, 1) | ||
| } | ||
| r.iamDuration.Record(ctx, float64(latency.Milliseconds())) | ||
| r.snap.RecordIDTokenRequest(success, latency) | ||
| } | ||
|
|
||
| func (r *OTelGCPTokenMetrics) RecordRateLimitRejection() { | ||
| r.rateLimitRejects.Add(context.Background(), 1) | ||
| r.snap.RecordRateLimitRejection() | ||
| } |
There was a problem hiding this comment.
Update the OTelGCPTokenMetrics recorder implementation to accept and propagate the active context.Context instead of using context.Background(). This enables proper trace-metric correlation (exemplars) and context propagation.
func (r *OTelGCPTokenMetrics) RecordAccessTokenRequest(ctx context.Context, success bool, latency time.Duration) {
r.accessRequests.Add(ctx, 1)
if success {
r.accessSuccesses.Add(ctx, 1)
} else {
r.accessFailures.Add(ctx, 1)
}
r.iamDuration.Record(ctx, float64(latency.Milliseconds()))
r.snap.RecordAccessTokenRequest(success, latency)
}
func (r *OTelGCPTokenMetrics) RecordIDTokenRequest(ctx context.Context, success bool, latency time.Duration) {
r.idRequests.Add(ctx, 1)
if success {
r.idSuccesses.Add(ctx, 1)
} else {
r.idFailures.Add(ctx, 1)
}
r.iamDuration.Record(ctx, float64(latency.Milliseconds()))
r.snap.RecordIDTokenRequest(success, latency)
}
func (r *OTelGCPTokenMetrics) RecordRateLimitRejection(ctx context.Context) {
r.rateLimitRejects.Add(ctx, 1)
r.snap.RecordRateLimitRejection()
}|
|
||
| func (r *OTelMetricsRecorder) RecordAuthAttempt(brokerID string, success bool, latency time.Duration) { | ||
| ctx := context.Background() | ||
| attrs := metric.WithAttributes(attribute.String("broker_id", brokerID)) |
There was a problem hiding this comment.
Using broker_id (which is a high-cardinality unique identifier like a UUID) as a metric attribute/dimension can lead to a cardinality explosion in the metrics backend (e.g., Google Cloud Monitoring). This can significantly increase ingestion costs, degrade query performance, or cause metrics to be dropped due to backend limits.
Consider whether tracking individual broker_ids is strictly necessary at the metrics layer. If individual broker tracking is required, it is generally better suited for logs or distributed traces. For metrics, consider aggregating by a lower-cardinality attribute (such as broker version, region, or status) or removing the broker_id dimension entirely.
Summary
gofmton 6 skill-related files from PR Scion/skill bank M1 GoogleCloudPlatform/scion#399 that were merged without formattingos.Setenvwitht.Setenvinpipeline_health_test.gofrom PR feat: wire hub OTel metrics pipeline with GCP Cloud Monitoring export GoogleCloudPlatform/scion#401 to satisfy theerrchecklinterCI Failures Fixed
gofmt -l .flagged 6 files:caching_skill_resolver_test.go,skill_resolver.go,skill_uri.go,skill_handlers.go,skills.go,skill_store.goos.Setenvreturn values inpipeline_health_test.golines 19-21Test plan
gofmt -l .returns no outputgo build ./...passesgo vet ./...passesgo test ./pkg/sciontool/telemetry/...passes