[RFC-0010] Multi-Tenant Workload Identity #5209

matheuscscp · 2025-02-23T04:32:34Z

In this RFC we aim to add support for multi-tenant workload identity in Flux, i.e. the ability to specify at the object-level which set of cloud provider permissions must be used for interacting with the respective cloud provider on behalf of the reconciliation of the object. In this process, credentials must be obtained automatically, i.e. this feature must not involve the use of secrets. This would be useful in a number of Flux APIs that need to interact with cloud providers, including all controllers except helm-controller.

Link: https://github.com/fluxcd/flux2/blob/main/rfcs/0010-multi-tenant-workload-identity/README.md

Umbrella issue for implementation: #5022

pjbgf

@matheuscscp good job putting this together. At a high level, it is a great feature and would further bolster Flux's least-privilege model.

I added a few nits, but overall multi-tenancy is a charged term, so it is quite important to define what precisely we mean by that and what guarantees we are providing as part of this RFC.

My take is that this supports the multiple-teams model out of the box. The multiple-customers model is achievable, but requires stronger security boundaries at the cluster topology level to cope with a more hostile environment.

It would be nice to explicitly call out in the RFC:

- The cache key format, so that it is crystal clear that it will be resistant to cross-tenant takeovers. As you already pointed out, when sharing the same cache storage, the cache key will require some level of tenant- or object-level information to provide isolation. What in the format picked will result in a malicious tenant not being able to forge it (e.g. the fully qualified service account name plus x, y and z)?
- What security controls are in place (or can be implemented by the user) to ensure that even if tenant A knows the cloud provider annotations for tenant B, it won't be able to impersonate them. This can be as simple as enforcing rules via admission controllers (e.g. Kubewarden), as in some examples of Flux multi-tenancy, or something more sophisticated as part of the new feature.
- The limitations of this feature for multi-tenancy in hostile environments where the tenants are not trustworthy. Suggestions on how to overcome them are optional, as that can be a larger topic, e.g. stronger isolation so that tenants can't impersonate each other's identities even if they know them and can bypass controls that may be operating in a degraded state (e.g. admission controllers).

matheuscscp · 2025-03-29T18:32:12Z

@pjbgf Thank you very much for the review!

> multi-tenancy is a charged term, so it is quite important to define what precisely we mean by that and what guarantees we are providing as part of this RFC.
> My take is that this supports the multiple-teams model out of the box. The multiple-customers model is achievable, but requires stronger security boundaries at the cluster topology level to cope with a more hostile environment.

Agreed! I added a section at the beginning to clarify this 👍

> The cache key format, so that it is crystal clear that it will be resistant to cross-tenant takeovers. As you already pointed out, when sharing the same cache storage, the cache key will require some level of tenant- or object-level information to provide isolation. What in the format picked will result in a malicious tenant not being able to forge it (e.g. the fully qualified service account name plus x, y and z)?

Format added 👍 And specifically about:

> What in the format picked will result in a malicious tenant not being able to forge it (e.g. the fully qualified service account name plus x, y and z)?

In the Cache Key section, the five paragraphs following the list of components justify the presence of each component, either to prevent malicious tenants from forging or stealing tokens from other tenants, or to make sure that served tokens match the request specification and hence are valid for the use case in question. I added a ##### Justification header at the beginning of these paragraphs to make it clearer.
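To illustrate why putting tenant-level information in the key gives isolation, here is a minimal Go sketch. The component set (`Provider`, `Namespace`, `Name`, `Audiences`) and the hashing scheme are assumptions made for this example only; the actual format is the one defined in the RFC's Cache Key section. The sketch also assumes the controller fills in the ServiceAccount namespace from the object being reconciled, so a tenant cannot choose it freely.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// tokenRequest holds hypothetical cache key components (not the RFC's exact list).
type tokenRequest struct {
	Provider  string   // e.g. "gcp"
	Namespace string   // ServiceAccount namespace, taken from the reconciled object
	Name      string   // ServiceAccount name
	Audiences []string // audiences the token is requested for
}

// cacheKey hashes all components into a fixed-size key.
func cacheKey(r tokenRequest) string {
	h := sha256.New()
	parts := []string{r.Provider, r.Namespace, r.Name, strings.Join(r.Audiences, ",")}
	for _, p := range parts {
		// Length-prefix each part so ("a", "bc") and ("ab", "c") cannot collide.
		fmt.Fprintf(h, "%d:%s;", len(p), p)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := cacheKey(tokenRequest{"gcp", "tenant-a", "flux", []string{"kms"}})
	b := cacheKey(tokenRequest{"gcp", "tenant-b", "flux", []string{"kms"}})
	// Same ServiceAccount name in different namespaces yields different keys.
	fmt.Println(a != b)
}
```

Because the namespace is part of the hashed input, a token cached for one tenant's ServiceAccount is never looked up under another tenant's key, even when the ServiceAccount names match.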

> What security controls are in place (or can be implemented by the user) to ensure that even if tenant A knows the cloud provider annotations for tenant B, it won't be able to impersonate them.
> This can be as simple as enforcing rules via admission controllers (e.g. Kubewarden), as in some examples of Flux multi-tenancy, or something more sophisticated as part of the new feature.

Great point. There's a paragraph in the Technical Background section pointing out how this works. This is inherently built into the workload identity features of each cloud provider: you must create a strong link between the Kubernetes ServiceAccount and the cloud provider identity by granting the former permission to impersonate the latter. See the original paragraph I wrote:

> Another aspect of workload identity that is important for this RFC is how the cloud identities are associated with the Kubernetes ServiceAccounts. In most cases, an identity from the IAM service of the cloud provider (e.g. a GCP IAM Service Account, or an AWS IAM Role) is associated with a Kubernetes ServiceAccount by the process of impersonation. Permission to impersonate the cloud identity is granted to the ServiceAccount.

Here's the updated paragraph with a bit more detail at the end:

Another aspect of workload identity that is important for this RFC is how the cloud
identities are associated with the Kubernetes ServiceAccounts. In most cases, an
identity from the IAM service of the cloud provider (e.g. a GCP IAM Service Account,
or an AWS IAM Role) is associated with a Kubernetes ServiceAccount by the process
of impersonation. Permission to impersonate the cloud identity is granted to the
ServiceAccount through a configuration that points to the fully qualified name of
the Kubernetes ServiceAccount, i.e. the name and namespace of the ServiceAccount
and which cluster it belongs to in the name/address system of the cloud provider.

So essentially the identities are not secret: knowing which cloud identity a tenant uses gives a malicious neighbor tenant no advantage whatsoever.

> The limitations of this feature for multi-tenancy in hostile environments where the tenants are not trustworthy. Suggestions on how to overcome them are optional, as that can be a larger topic, e.g. stronger isolation so that tenants can't impersonate each other's identities even if they know them and can bypass controls that may be operating in a degraded state (e.g. admission controllers).

As discussed right above, no admission controllers are required: the impersonation permission is implemented and enforced by the cloud provider. You grant ServiceAccount A permission to impersonate cloud identity X at the cloud provider level. ServiceAccount B will never be able to impersonate cloud identity X unless you give it the same permission. The model here is zero trust: ServiceAccount B has no impersonation permissions by default.
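The deny-by-default behavior described above can be modeled in a few lines of Go. This is only a toy stand-in for what the cloud provider's IAM service enforces, not any provider's real API: the `grants` type, the `principal` struct and the cluster identifier are hypothetical. The one real-world detail is the subject format, `system:serviceaccount:<namespace>:<name>`, which is what Kubernetes sets in ServiceAccount tokens.

```go
package main

import "fmt"

// principal identifies a Kubernetes ServiceAccount as the cloud provider sees it:
// the cluster it belongs to plus the token subject.
type principal struct {
	Cluster string // hypothetical cluster identifier, e.g. an OIDC issuer URL
	Subject string
}

// serviceAccountSubject returns the subject Kubernetes sets in ServiceAccount tokens.
func serviceAccountSubject(namespace, name string) string {
	return fmt.Sprintf("system:serviceaccount:%s:%s", namespace, name)
}

// grants is a toy IAM policy: cloud identity -> principals allowed to impersonate it.
type grants map[string]map[principal]bool

// allow records an explicit impersonation grant.
func (g grants) allow(cloudIdentity string, p principal) {
	if g[cloudIdentity] == nil {
		g[cloudIdentity] = map[principal]bool{}
	}
	g[cloudIdentity][p] = true
}

// canImpersonate is false unless a grant was recorded (deny by default).
func (g grants) canImpersonate(cloudIdentity string, p principal) bool {
	return g[cloudIdentity][p]
}

func main() {
	const cluster = "https://issuer.example/cluster-1"
	a := principal{cluster, serviceAccountSubject("tenant-a", "flux")}
	b := principal{cluster, serviceAccountSubject("tenant-b", "flux")}

	g := grants{}
	g.allow("identity-x", a)

	fmt.Println(g.canImpersonate("identity-x", a)) // true
	fmt.Println(g.canImpersonate("identity-x", b)) // false
}
```

Tenant B knowing the name `identity-x` changes nothing: the lookup is keyed on who is asking, not on what they ask for.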

Workload identity is pretty solid :)

matheuscscp · 2025-03-31T04:33:47Z

SOPS 3.10.0 has been released with the GCP KMS oauth2.TokenSource authentication method, and the new version has already been bumped in kustomize-controller:

fluxcd/kustomize-controller#1410

matheuscscp · 2025-04-07T12:43:22Z

I'm now addressing the offline comments I got during KubeCon EU 2025.

From @stealthybox:

An alternative to using service account tokens would be using a token whose subject string encodes a direct reference to the respective Flux resource. This way, a resource would be its own identity. This is more secure than having a configuration knob to define another resource (a service account in this case) as the object identity, as it prevents another object in the same namespace from abusing the same service account and its cloud permissions.
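A rough Go sketch of the difference between the two identity models follows. The `flux:<kind>:<namespace>:<name>` subject layout is purely hypothetical and was invented for this example; nothing in the thread specifies a format. Only the `system:serviceaccount:<namespace>:<name>` subject is an actual Kubernetes convention.

```go
package main

import (
	"fmt"
	"strings"
)

// objectRef points at a Flux resource.
type objectRef struct {
	Kind, Namespace, Name string
}

// objectSubject builds a hypothetical per-object subject string.
func objectSubject(o objectRef) string {
	return strings.Join([]string{"flux", strings.ToLower(o.Kind), o.Namespace, o.Name}, ":")
}

// serviceAccountSubject is the subject Kubernetes sets in ServiceAccount tokens.
func serviceAccountSubject(namespace, name string) string {
	return fmt.Sprintf("system:serviceaccount:%s:%s", namespace, name)
}

func main() {
	apps := objectRef{"Kustomization", "tenant-a", "apps"}
	infra := objectRef{"Kustomization", "tenant-a", "infra"}

	// Per-object identity: each resource gets its own subject.
	fmt.Println(objectSubject(apps))
	fmt.Println(objectSubject(infra))

	// ServiceAccount identity: both objects can reference the same subject.
	fmt.Println(serviceAccountSubject("tenant-a", "shared"))
}
```

With per-object subjects, cloud permissions granted to `apps` cannot be picked up by `infra`, whereas any object in the namespace that is allowed to reference the shared ServiceAccount inherits its permissions.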

From @hiddeco:

We should move the interfaces in interfaces.go to multiple files with names matching the interface names themselves.

From @stefanprodan:

To avoid introducing kustomizations.spec.decryption.key.serviceAccountName for disambiguating single-tenant and multi-tenant workload identity, we should instead introduce a boolean flag in the controllers to switch to and enforce multi-tenant workload identity, e.g. --require-service-account-for-provider-auth.
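As a sketch of how such a flag could gate reconciliation, the Go snippet below uses the standard `flag` package. The flag name is the example given above; the `checkProviderAuth` helper and its error are hypothetical and only illustrate the enforcement idea, not how the controllers implement it.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// errServiceAccountRequired is a hypothetical error for objects that omit a ServiceAccount.
var errServiceAccountRequired = errors.New("a ServiceAccount is required for provider authentication")

// checkProviderAuth rejects objects without a ServiceAccount when enforcement is on.
// With enforcement off, an empty name means falling back to the controller's own identity.
func checkProviderAuth(requireServiceAccount bool, serviceAccountName string) error {
	if requireServiceAccount && serviceAccountName == "" {
		return errServiceAccountRequired
	}
	return nil
}

func main() {
	fs := flag.NewFlagSet("controller", flag.ContinueOnError)
	require := fs.Bool("require-service-account-for-provider-auth", false,
		"enforce multi-tenant workload identity")

	// Simulate starting the controller with the flag set.
	if err := fs.Parse([]string{"--require-service-account-for-provider-auth"}); err != nil {
		panic(err)
	}

	fmt.Println(checkProviderAuth(*require, ""))         // rejected
	fmt.Println(checkProviderAuth(*require, "tenant-a")) // accepted
}
```

A controller-level switch keeps the object API unchanged: cluster administrators opt in once, and every object is then held to the same rule.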

matheuscscp · 2025-04-07T16:57:47Z

@stealthybox @hiddeco @stefanprodan Comments from KubeCon addressed, please feel free to do another pass/approve.

pjbgf

LGTM

hiddeco

🚀

stefanprodan

LGTM

Thanks @matheuscscp 🏅

Signed-off-by: Matheus Pimenta <matheuscscp@gmail.com>

matheuscscp added the area/rfc Feature request proposals in the RFC format label Feb 23, 2025

matheuscscp force-pushed the rfc-multi-tenant-workload-identity branch 2 times, most recently from b795adf to 3034741 Compare February 23, 2025 04:33

stefanprodan changed the title ~~[RFC-0010] Multi-Tenant Workload Identity~~ [RFC] Multi-Tenant Workload Identity Feb 23, 2025

stefanprodan marked this pull request as draft February 23, 2025 10:39

matheuscscp force-pushed the rfc-multi-tenant-workload-identity branch 15 times, most recently from 5a5f4ac to 614bbc8 Compare March 1, 2025 17:03

matheuscscp force-pushed the rfc-multi-tenant-workload-identity branch 10 times, most recently from f505036 to 3e9b36a Compare March 8, 2025 01:36

matheuscscp force-pushed the rfc-multi-tenant-workload-identity branch 2 times, most recently from f3dc52d to 2f0b445 Compare March 29, 2025 01:51

pjbgf reviewed Mar 29, 2025

dipti-pai approved these changes Mar 31, 2025

pjbgf reviewed Apr 1, 2025

matheuscscp force-pushed the rfc-multi-tenant-workload-identity branch from af55097 to ac8a8de Compare April 7, 2025 09:11

matheuscscp force-pushed the rfc-multi-tenant-workload-identity branch from 6de8de0 to c76ea14 Compare April 7, 2025 17:34

stefanprodan reviewed Apr 7, 2025

pjbgf approved these changes Apr 8, 2025

matheuscscp mentioned this pull request Apr 8, 2025

[RFC-0010] Multi-Tenant Workload Identity Implementation #5022

Open

15 tasks

matheuscscp force-pushed the rfc-multi-tenant-workload-identity branch 4 times, most recently from 64e29f8 to 0f4cf9f Compare April 11, 2025 02:17

stefanprodan reviewed Apr 11, 2025

hiddeco approved these changes Apr 12, 2025

matheuscscp force-pushed the rfc-multi-tenant-workload-identity branch from 0f4cf9f to d0a69fe Compare April 12, 2025 13:41

matheuscscp commented Apr 12, 2025

matheuscscp commented Apr 12, 2025

stefanprodan changed the title ~~[RFC] Multi-Tenant Workload Identity~~ [RFC-0010] Multi-Tenant Workload Identity Apr 14, 2025

stefanprodan approved these changes Apr 14, 2025

[RFC-0010] Multi-Tenant Workload Identity

a7e41df

Signed-off-by: Matheus Pimenta <matheuscscp@gmail.com>

matheuscscp force-pushed the rfc-multi-tenant-workload-identity branch from 0ca7ef6 to a7e41df Compare April 14, 2025 10:34

matheuscscp merged commit 9127181 into main Apr 14, 2025
6 checks passed

matheuscscp deleted the rfc-multi-tenant-workload-identity branch April 14, 2025 10:44

matheuscscp mentioned this pull request Apr 14, 2025

[RFC-0010] Add core auth library fluxcd/pkg#906

Open
