-
Notifications
You must be signed in to change notification settings - Fork 49
use tls for prometheus metrics in ssp operator and template validator #1567
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Hi @Ruclo. Thanks for your PR. I'm waiting for a kubevirt member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
| return &promv1.SecretOrConfigMap{ | ||
| ConfigMap: &v1.ConfigMapKeySelector{ | ||
| LocalObjectReference: v1.LocalObjectReference{ | ||
| Name: "openshift-service-ca.crt", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make name and key a const? Also, this introduces further hard dependencies to OCP. Is there no way to make it configurable?
| }, | ||
| }, | ||
| } | ||
| tlsConfig.ServerName = ptr.To("virt-template-validator.kubevirt.svc") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Read that from the cert to be sure? The namespace can vary.
| return &promv1.SecretOrConfigMap{ | ||
| Secret: &v1.SecretKeySelector{ | ||
| LocalObjectReference: v1.LocalObjectReference{ | ||
| Name: "ssp-operator-service-cert", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Names of objects that are not reconciled by SSP should be configurable.
| // Variable to store OLM deployment info (set from main) | ||
| var isOLMDeployment bool | ||
|
|
||
| func SetOLMDeployment(isOLM bool) { | ||
| isOLMDeployment = isOLM | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a global variable is an antipattern. Can you instead pass the value into functions that need it?
|
|
||
| // getCAConfigForServiceMonitor returns the appropriate CA configuration | ||
| func getCAConfigForServiceMonitor() *promv1.SecretOrConfigMap { | ||
| if isOLMDeployment { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you pass this global as a parameter to this function?
| } | ||
|
|
||
| func newSspServiceMonitor(namespace string) *promv1.ServiceMonitor { | ||
| certPath := filepath.Join(DefaultCertsDirectory, CertFilename) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hardcoding a path like this makes this code hard to unit test. The path will not exist on the machine where tests run. Can you pass the hostname as a parameter to this function?
5d3cb20 to
1893326
Compare
akrejcir
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, looks good. I mostly have code style comments.
The CI is failing, because some files need to be regenerated using make bundle.
Can you please write more information into the commit message? A good commit message explains the reason why is the change done, not only what is changed.
It will be useful when someone looks at git history, to find out why a code is written in a certain way.
| PrometheusServiceAccountName = "prometheus-k8s" | ||
| MetricsPortName = "http-metrics" | ||
|
|
||
| TemplateValidatorMetricsServiceName = "template-validator-metrics" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use this constant also in template-validator/resources.go file, so that we are sure that it is the same string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above comment still applies.
The string template-validator-metrics is used to connect 2 resources in the cluster: Service and ServiceMonitor. It is a good code style if the 2 strings are also the same constant in the code. Then we can be sure, that there is not a typo in one of them.
Can you extract the constant to a different package, so it does not cause import loop?
| MetricsPortName = "http-metrics" | ||
|
|
||
| TemplateValidatorMetricsServiceName = "template-validator-metrics" | ||
| MetricsServiceName = "ssp-operator-metrics" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use this constant also in controllers/services_contorller.go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same reason as in my comment above.
| ServiceCABundle = "openshift-service-ca.crt" | ||
| ServiceCABUndleKey = "service-ca.crt" | ||
| OLMManagedCert = "ssp-operator-service-cert" | ||
| OLMManagedCertKey = "olmCAKey" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These 4 constants don't need to be exported, because they are only used in a single function in this file.
Can you move them to the functions where they are used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still valid.
| } | ||
| } | ||
|
|
||
| func olmManagedCABundle() promv1.SecretOrConfigMap { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you inline this function? It is only used in one place, and it is short.
|
|
||
| func olmManagedCABundle() promv1.SecretOrConfigMap { | ||
| return promv1.SecretOrConfigMap{ | ||
| Secret: &v1.SecretKeySelector{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment, that OLM uses a self-signed CA, and that it is not available in a ConfigMap.
So that future developers know why we use a secret.
596b681 to
f527836
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- There’s a typo in ServiceCABUndleKey (should be ServiceCABundleKey) and you may want to correct it to avoid confusion.
- The TLSConfig construction in newValidatorServiceMonitor and newSspServiceMonitor is almost identical—consider extracting a shared helper to reduce duplication.
- extractHostnameFromCert references sdkTLSDir which no longer exists—ensure you introduce or update the constant/path so the code compiles and reads the intended cert.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- There’s a typo in ServiceCABUndleKey (should be ServiceCABundleKey) and you may want to correct it to avoid confusion.
- The TLSConfig construction in newValidatorServiceMonitor and newSspServiceMonitor is almost identical—consider extracting a shared helper to reduce duplication.
- extractHostnameFromCert references sdkTLSDir which no longer exists—ensure you introduce or update the constant/path so the code compiles and reads the intended cert.
## Individual Comments
### Comment 1
<location> `internal/operands/metrics/resources.go:110` </location>
<code_context>
+ )
+}
+
+func newValidatorServiceMonitor(request common.Request) *promv1.ServiceMonitor {
+ tlsConfig := &promv1.TLSConfig{
+ SafeTLSConfig: promv1.SafeTLSConfig{
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring the three ServiceMonitor constructors into a single, parameterized function to reduce duplication and centralize logic.
Here’s one way to collapse all three builders into a single, parameterized constructor. You keep the same inputs (name, namespace, label sets, port, CA bundle, serverName), but only write the service‐monitor shape once:
```go
// newServiceMonitor constructs any ServiceMonitor.
// - labels: service labels to merge with the defaults (cluster-monitoring & k8s-app)
// - selector: the MatchLabels for the ServiceMonitor selector
// - port: the metrics port name
// - ca: the CA bundle source
// - serverName: optional server name override
func newServiceMonitor(
name, namespace string,
labels, selector map[string]string,
port string,
ca promv1.SecretOrConfigMap,
serverName *string,
) *promv1.ServiceMonitor {
return &promv1.ServiceMonitor{
ObjectMeta: metav1.ObjectMeta{
Name: name,
Namespace: namespace,
Labels: serviceMonitorLabels(labels),
},
Spec: promv1.ServiceMonitorSpec{
NamespaceSelector: promv1.NamespaceSelector{Any: true},
Selector: metav1.LabelSelector{MatchLabels: selector},
Endpoints: []promv1.Endpoint{{
Port: port,
Scheme: "https",
TLSConfig: &promv1.TLSConfig{
SafeTLSConfig: promv1.SafeTLSConfig{
CA: ca,
ServerName: serverName,
},
},
HonorLabels: true,
}},
},
}
}
```
Then each specialized builder becomes a one‐liner:
```go
func newValidatorServiceMonitor(req common.Request) *promv1.ServiceMonitor {
svcLabels := template_validator.PrometheusServiceLabels()
return newServiceMonitor(
template_validator.MetricsServiceName,
req.Namespace,
svcLabels,
svcLabels,
template_validator.MetricsPortName,
serviceCABundle(),
ptr.To(fmt.Sprintf("%s.%s.svc", template_validator.VirtTemplateValidator, req.Namespace)),
)
}
func newSspServiceMonitor(req common.Request) *promv1.ServiceMonitor {
svcLabels := resources.PrometheusServiceLabels()
return newServiceMonitor(
resources.MetricsServiceName,
req.Namespace,
svcLabels,
svcLabels,
resources.MetricsPortName,
getCAConfigForServiceMonitor(req.OLMDeployment),
ptr.To(req.SSPServiceHostname),
)
}
```
And you can drop the old `newServiceMonitorCR`, `TemplateValidatorServiceMonitorLabels` and `SspServiceMonitorLabels` helpers completely—everything funnels through that single `newServiceMonitor`.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
akrejcir
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only done partial review.
Can you please put the refactoring commit that fixes import cycle before the commit that implements new functionality, or can you merge the 2 commits?
A commit should be a code change that compiles and preferably passes tests. So having a second commit that fixes the first commit is unusual.
Can you also add more text to the commit messages? Mainly in the first commit, the message should explain why is the symlink removed.
internal/common/labels.go
Outdated
| func PrometheusServiceLabels(serviceName string) map[string]string { | ||
| return map[string]string{ | ||
| PrometheusLabelKey: serviceName, | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is trivial, and can be inlined. PrometheusLabelKey is exported variable, so it can be accessed from other packages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps exporting only the function would be better? i also renamed the function to MetricsServiceLabels().
| func serviceMonitorLabels(serviceLabels map[string]string) map[string]string { | ||
| labels := map[string]string{ | ||
| "openshift.io/cluster-monitoring": "true", | ||
| PrometheusLabelKey: PrometheusLabelValue, | ||
| "k8s-app": "kubevirt", | ||
| } | ||
|
|
||
| for k, v := range serviceLabels { | ||
| labels[k] = v | ||
| } | ||
|
|
||
| return labels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is unnecessarily complex. Can you use the same function that was here, but make the PrometheusLabelValue a parameter?
| return common.PrometheusServiceLabels(MetricsServiceName) | ||
| } | ||
|
|
||
| func SspMetricsService(namespace string) *v1.Service { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is only used in one file, and a functional test file. I think it would not cause an import cycle, if it is left in the services_controller.go where it is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it causes an import cycle so i had to extract it out of the controllers package.
internal/constants.go
Outdated
| const ( | ||
| GoldenImagesNamespace = "kubevirt-os-images" | ||
| DefaultOperatorVersion = "devel" | ||
| SspDeploymentName = "ssp-operator" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This constant is only used in one place and does not need to be here.
This file contains only constants that change value in the build scripts for downstream. The name of SSP deployment is the same in upstream and downstream, so this constant doesn't need to be here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now, it's used in more places. I think this constant could be extracted somewhere else if this file is not adequate.
a7edeb2 to
884d98e
Compare
|
|
||
| ctrl.SetLogger(zap.New(zap.UseFlagOptions(&opts))) | ||
|
|
||
| err := createCertificateSymlinks() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add a text to the commit message explaining why the symlinks are not needed anymore?
It is not obvious to me, and it would be helpful for others, when they look at commit history.
| metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" | ||
| "k8s.io/utils/ptr" | ||
|
|
||
| "kubevirt.io/ssp-operator/internal/common" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a newline above this line to split imports into 3 blocks.
| PrometheusServiceAccountName = "prometheus-k8s" | ||
| MetricsPortName = "http-metrics" | ||
|
|
||
| TemplateValidatorMetricsServiceName = "template-validator-metrics" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above comment still applies.
The string template-validator-metrics is used to connect 2 resources in the cluster: Service and ServiceMonitor. It is a good code style if the 2 strings are also the same constant in the code. Then we can be sure, that there is not a typo in one of them.
Can you extract the constant to a different package, so it does not cause import loop?
| MetricsPortName = "http-metrics" | ||
|
|
||
| TemplateValidatorMetricsServiceName = "template-validator-metrics" | ||
| MetricsServiceName = "ssp-operator-metrics" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same reason as in my comment above.
| ServiceCABundle = "openshift-service-ca.crt" | ||
| ServiceCABUndleKey = "service-ca.crt" | ||
| OLMManagedCert = "ssp-operator-service-cert" | ||
| OLMManagedCertKey = "olmCAKey" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still valid.
| } | ||
| tlsConfig.ServerName = ptr.To(request.SSPServiceHostname) | ||
|
|
||
| serviceMonitor := newServiceMonitor(rules.RuleName, request.Namespace, tlsConfig, metav1.LabelSelector{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rules.RuleName us not a good name for ServiceMonitor. Rules are not related to metrics collection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, the Service monitor can be named after the service, just as it's done in the template validator.
|
I tried to extract the related functionality in the third commit that was part of this branch. Was this a bad thing ? Do you mind giving me a review on that ? |
I meant to only extract the constants to a separate file, and use it in multiple files. Like this: In the previous commit, you've extracted a function, but that was not needed, because it was only used in a single file. |
| return cert.Subject.CommonName, nil | ||
| } | ||
|
|
||
| return cert.DNSNames[0], nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CommonName and DNSNames[0] should match, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think they should match, can this be kept in place as a fallback?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a check to verify that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you do that? (CommonName == DNSNames[0])
akrejcir
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have only a few code style comments. Otherwise it looks good.
Can you also please add more text to the second commit message?
| ServiceCABundle = "openshift-service-ca.crt" | ||
| ServiceCABUndleKey = "service-ca.crt" | ||
| OLMManagedCert = "ssp-operator-service-cert" | ||
| OLMManagedCertKey = "olmCAKey" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These constants are only used in one function in this file, so they don't need to be exported.
Can you please inline them?
internal/constants.go
Outdated
| SspOperatorMetricsServiceName = "ssp-operator-metrics" | ||
| TemplateValidatorMetricsServiceName = "template-validator-metrics" | ||
| VirtTemplateValidator = "virt-template-validator" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move these constants to a different file.
This file only contains constants that are changed by the build process downstream. We use a simple sed command to change the values, so it would be best if there are small number of constants here.
tests/misc_test.go
Outdated
|
|
||
| . "github.com/onsi/ginkgo/v2" | ||
| . "github.com/onsi/gomega" | ||
| "kubevirt.io/ssp-operator/internal" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this import to the last block.
tests/service_controller_test.go
Outdated
|
|
||
| . "github.com/onsi/ginkgo/v2" | ||
| . "github.com/onsi/gomega" | ||
| "kubevirt.io/ssp-operator/internal" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same as above.
tests/validator_test.go
Outdated
|
|
||
| . "github.com/onsi/ginkgo/v2" | ||
| . "github.com/onsi/gomega" | ||
| "kubevirt.io/ssp-operator/internal" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same as above.
0xFelix
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/ok-to-test
| return cert.Subject.CommonName, nil | ||
| } | ||
|
|
||
| return cert.DNSNames[0], nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you do that? (CommonName == DNSNames[0])
| } | ||
| } | ||
|
|
||
| func createCertificateSymlinks() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we test CSV deployment in the upstream lanes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we don't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rucle Can you please make sure this will not break stuff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested the metrics, both when manually deployed and also OLM-deployed and the metrics were correctly reported in both scenarios.
|
The CI failures are unrelated. Recently |
The certificates get mounted to the correct mount points in the filesystem, Symlink creation is unnecessary. Signed-off-by: Michal Vavrinec <[email protected]>
|
Can you update the failing functional tests, and include a new test for the second ServiceMonitor? |
This commit implements TLS encryption for both SSP operator and template validator metrics services. The implementation handles both OLM and non-OLM deployments with proper certificate management. Changes: - Add --olm-deployment flag to distinguish OLM-managed deployments - Create separate ServiceMonitors for SSP and template validator metrics - Configure TLS settings with appropriate CA bundles based on deployment type - Move common service names to constants package for reusability - Update ServiceMonitor selector labels to differentiate between services Technical details: - Non-OLM deployments use service-ca.crt from ConfigMap - OLM deployments use ssp-operator-service-cert Secret with olmCAKey Signed-off-by: Michal Vavrinec <[email protected]>
|
|
Looks good now. /approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: akrejcir The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |



What this PR does / why we need it:
Creates a separate service monitor targeting the template validator metrics and verifies the metrics service tls certificate when querying prometheus metrics.
Which issue(s) this PR fixes:
Fixes CNV-55455
Release note: