-
Notifications
You must be signed in to change notification settings - Fork 266
feat: implement agentic identities authentication via cert bound tokens. #1842
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…trol the sleep time for threads. Fixed the agent_spiffe_cert.pem
nbayati
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall a really good PR! Just a few small comments on the implementation. I'll review the unit tests next.
oauth2_http/java/com/google/auth/oauth2/AgentIdentityUtils.java
Outdated
Show resolved
Hide resolved
oauth2_http/java/com/google/auth/oauth2/AgentIdentityUtils.java
Outdated
Show resolved
Hide resolved
oauth2_http/java/com/google/auth/oauth2/AgentIdentityUtils.java
Outdated
Show resolved
Hide resolved
e0d450a to
5353838
Compare
oauth2_http/java/com/google/auth/oauth2/AgentIdentityUtils.java
Outdated
Show resolved
Hide resolved
| final class AgentIdentityUtils { | ||
| private static final Logger LOGGER = Logger.getLogger(AgentIdentityUtils.class.getName()); | ||
|
|
||
| static final String GOOGLE_API_CERTIFICATE_CONFIG = "GOOGLE_API_CERTIFICATE_CONFIG"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a constant here:
| static final String CERTIFICATE_CONFIGURATION_ENV_VARIABLE = "GOOGLE_API_CERTIFICATE_CONFIG"; |
Could we re-use the constant there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I see it, it is present in a different module and I don't think it is worth it to add a cross module dependency for one constant import.
| boolean warned = false; | ||
|
|
||
| // Deterministic polling loop based on pre-calculated intervals. | ||
| for (long sleepInterval : POLLING_INTERVALS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned about this given that this is to be opt-out and not opt-in. By default, customers may now experience up to 30s of busy waiting just because the filesystem on certain environments can't be loaded fast enough. This applies for customers who don't touch Agent Identities.
IIUC the code from above, the only way to to know if Agent Identities shouldn't be used:
- Load the cert file (up to 30s of busy waiting)
- Then load and parse the cert file to check the SAN
OR
Require users to explicitly add the new env var to their workloads + envs
Can you let me know if my understanding is correct? If this is the case that the cert has to be loaded, then I feel that opt-in would be better to avoid the possibility busy waiting for all customers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your understanding is majorly right: first the logic will look for a config file which contains the path to the cert file both of these could take time to load and hence the wait.
I believe that for 99%+ of the cases the cert bundle is loaded within <1s as per this
The understanding is that typically if a customer has set their GOOGLE_API_CERT_CONFIG env variable to a path, the config file and the cert file should already be present.
oauth2_http/java/com/google/auth/oauth2/AgentIdentityUtils.java
Outdated
Show resolved
Hide resolved
| for (long sleepInterval : POLLING_INTERVALS) { | ||
| try { | ||
| if (Files.exists(Paths.get(certConfigPath))) { | ||
| String certPath = extractCertPathFromConfig(certConfigPath); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, null is used to represent that the cert_path doesn't exist inside the certConfigPath. I think this would be a case where we wouldn't want to retry anymore since the certConfigPath is found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting thought, in that case the cert-config file itself is malformed which I didn't account for.
Your thoughts @nbayati, would this even be possible?
As a corollary, what if the cert file is malformed?
| try { | ||
| if (Files.exists(Paths.get(certConfigPath))) { | ||
| String certPath = extractCertPathFromConfig(certConfigPath); | ||
| if (!Strings.isNullOrEmpty(certPath) && Files.exists(Paths.get(certPath))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is testing that the certPath from certConfig exists, right? I think this would be the happy path.
Just want to confirm: What about the test case that the certPath from certConfig doesn't exist on the file system. Should we stop retrying and fail instead of continue to retry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, I will defer to @nbayati's opinion on this:
-
What if the GOOGLE_API_CERTIFICATE_CONFIG points to a non-existent config file?
-
What if the certPath pointed to by the config file is non-existent?
| private static final int FAST_POLL_CYCLES = 50; | ||
| private static final long FAST_POLL_INTERVAL_MS = 100; // 0.1 seconds | ||
| private static final long SLOW_POLL_INTERVAL_MS = 500; // 0.5 seconds | ||
| private static final long TOTAL_TIMEOUT_MS = 30000; // 30 seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any specific reason for these pre-configured values? I'm guessing the data shows that 5s resolves for a vast majority of cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe as per this design doc's comment thread.
…mports are now listed.
This feature implements the ability for agentic identities to authenticate themselves via X509 cert bound tokens. We are limiting the scope here to only cloud run based agentic workloads.