cilium
Service mesh: add mTLS auth method
#24263
Merged

Service mesh: add mTLS auth method #24263

aditighag merged 3 commits into cilium:master from meyskens:mtls-spiffe
meyskens
meyskens2 years ago👍 3🎉 4❤ 4🚀 5

This adds an mTLS auth handler to the Serice Mesh auth package.
It will listen on a given port and does a mutual TLS handshake with
SPIFFE IDs it received. This will assure the both sides got the needed
certificates.

In order to integrate with the datapath tables it also improves the SPIFFE
interface to use the Cilium Numeric Identities. And convert them from and
to valid SNI fields. As well as implement code to validate the URI SANS
inside the certificates.

This can be enabled in the in the Helm chart.

Fixes: #23807

Add mtls-spiffe as auth mode in the CiliumNetworkPolicy
maintainer-s-little-helper maintainer-s-little-helper added dont-merge/needs-release-note-label
meyskens meyskens force pushed 2 years ago
meyskens meyskens force pushed 2 years ago
meyskens
meyskens meyskens added kind/feature
meyskens meyskens added release-note/major
meyskens meyskens added area/servicemesh
maintainer-s-little-helper maintainer-s-little-helper removed dont-merge/needs-release-note-label
meyskens meyskens marked this pull request as ready for review 2 years ago
meyskens meyskens requested a review 2 years ago
meyskens meyskens requested a review 2 years ago
meyskens meyskens requested a review 2 years ago
meyskens meyskens requested a review 2 years ago
meyskens meyskens requested a review 2 years ago
meyskens meyskens requested a review from nathanjsweet nathanjsweet 2 years ago
meyskens meyskens requested a review from christarazi christarazi 2 years ago
meyskens meyskens requested a review from youngnick youngnick 2 years ago
meyskens meyskens requested a review from kaworu kaworu 2 years ago
meyskens meyskens force pushed 2 years ago
meyskens meyskens force pushed 2 years ago
meyskens meyskens force pushed 2 years ago
meyskens meyskens force pushed 2 years ago
meyskens meyskens force pushed 2 years ago
meyskens meyskens requested a review 2 years ago
meyskens meyskens requested a review from qmonnet qmonnet 2 years ago
christarazi
christarazi commented on 2023-03-09
christarazi2 years ago

Nice work!

One major concern I have is that I don't see any tests. I realize from reviewing that this seems to be more of integration code so unit testing probably wasn't obvious. However, I can imagine a test that can spin off two TLS clients / servers and validate the code paths without too much complication.

Otherwise, I only have code nits.

Conversation is marked as resolved
Show resolved
pkg/auth/mtls_authhandler.go
31
32func newMTLSAuthHandler(lc hive.Lifecycle, cfg MTLSConfig, params mtlsParams, log logrus.FieldLogger) authHandlerResult {
33 if cfg.MTLSListenerPort == 0 {
34
log.Info("mTLS Auth Handler disabled is disabled as no port is configured")
christarazi2 years ago

Nit: "disabled is disabled" :)

Conversation is marked as resolved
Show resolved
pkg/auth/mtls_authhandler.go
66}
67
68func (cfg MTLSConfig) Flags(flags *pflag.FlagSet) {
69
flags.IntVar(&cfg.MTLSListenerPort, "mesh-auth-mtls-listener-port", 0, "Port on which the Cilium Agent will perfom mTLS handshakes between agents on")
christarazi2 years ago

Nit: "Port on which the Cilium Agent will perfom mTLS handshakes between other Agents"

Conversation is marked as resolved
Show resolved
pkg/auth/mtls_authhandler.go
86 }
87 clientCert, err := m.cert.GetCertificateForNumericIdentity(ar.localIdentity)
88 if err != nil {
89
return nil, fmt.Errorf("Failed to get certificate for local identity %s: %w", ar.localIdentity.String(), err)
90
}
91
92
caBundle, err := m.cert.GetTrustBundle()
93
if err != nil {
94
return nil, fmt.Errorf("Failed to get CA bundle: %w", err)
95
}
96
97
// set up TCP connection
98
conn, err := net.Dial("tcp", fmt.Sprintf("%s:%d", ar.remoteHostIP.String(), m.cfg.MTLSListenerPort))
99
if err != nil {
100
return nil, fmt.Errorf("Failed to dial %s:%d: %w", ar.remoteHostIP.String(), m.cfg.MTLSListenerPort, err)
christarazi2 years ago

Nit: Error strings usually don't start with a capital case (https://github.com/golang/go/wiki/CodeReviewComments#error-strings)

Conversation is marked as resolved
Show resolved
pkg/auth/mtls_authhandler.go
95 }
96
97 // set up TCP connection
98
conn, err := net.Dial("tcp", fmt.Sprintf("%s:%d", ar.remoteHostIP.String(), m.cfg.MTLSListenerPort))
christarazi2 years ago

net.JoinHostPort

meyskens2 years ago

TIL

Conversation is marked as resolved
Show resolved
pkg/auth/spire/certificate_provider.go
7477}
78
79func (s *SpireDelegateClient) NumericIdentityToSNI(id identity.NumericIdentity) string {
80
return fmt.Sprintf("%s.%s", id.String(), s.cfg.SpiffeTrustDomain)
christarazi2 years ago

Can we do a simple string concat where we can instead of using fmt.Sprintf? It usually comes up in flamegraphs. I'm not sure how hot this code path is, but we might as well avoid it since it's trivial to and not have to deal with optimizing this later.

Conversation is marked as resolved
Show resolved
pkg/auth/spire/delegate.go
246 conn, err := grpc.Dial(unixPath, grpc.WithTransportCredentials(insecure.NewCredentials()),
247 grpc.WithDefaultCallOptions(
248 grpc.MaxCallRecvMsgSize(20*1024*1024),
249
grpc.MaxCallSendMsgSize(20*1024*1024))) // setting this to 20MB to handle large bundles TODO: not make this the random chosen 20MB
christarazi2 years ago

Typically we file an issue for followups like these to avoid having TODOs (at least new ones) in the code

meyskens2 years ago👍 2

I forgot this one was in there, it was more of a note to myself when working on this :)
Having a call with some SPIFFE maintainers later today where I hope to bring this up for advice on large number of workloads in general.

meyskens2 years ago

Result was that changes are needed upstream, made an issue and linked it

Conversation is marked as resolved
Show resolved
pkg/auth/mtls_authhandler.go
40 continue
41 }
42 if cert != nil {
43
log.Fatal("Multiple certificate providers configured, but only one is supported")
44
}
45
cert = c
46
}
47
if cert == nil {
48
log.Fatal("No certificate provider configured, but one is required")
christarazi2 years ago

Do we need to provide a bit more context into these fatal msgs to make it easier for the user to know what to do? I feel they are a bit terse.

meyskens
meyskens meyskens force pushed 2 years ago
youngnick
youngnick commented on 2023-03-10
youngnick2 years ago

Amazing work @meyskens! I'm really impressed with how quickly you've pulled functional code together.

However, I'm in a similar boat to @christarazi here, the changes look great, but I'd like to see unit tests, particularly for mtls_authhandler.go and certificate_provider.go. I think just tests that exercise the code paths inside the if statements are a great place to start, even for the internal functions. Let's lock in tests while this is a small change, and then we can have greater confidence as the changesets and assoicated testing get bigger.

meyskens meyskens force pushed 2 years ago
meyskens
meyskens meyskens force pushed 2 years ago
meyskens meyskens force pushed 2 years ago
qmonnet
qmonnet approved these changes on 2023-03-10
qmonnet2 years ago

Everything's in order from my side. Nice!

kaworu
kaworu requested changes on 2023-03-10
kaworu2 years ago

Great work @meyskens! 🎉

I have some Helm suggestions and questions around the mTLS codepath.

While trying the patch I hit the following issue where the Cilium agent can't connect to the spire server:

cilium-98l82 cilium-agent level=info msg="Connecting to SPIRE Delegate API Client" subsys=spire-delegate
cilium-98l82 cilium-agent level=error msg="error in delegate stream, restarting" error="rpc error: code = PermissionDenied desc = no identity issued" subsys=spire-delegate
meyskens meyskens force pushed 2 years ago
meyskens meyskens requested a review from kaworu kaworu 2 years ago
kaworu
kaworu requested changes on 2023-03-13
kaworu2 years ago

Thanks for the update @meyskens!

Couple of comments about context handling and locking, but overall LGTM.

meyskens meyskens force pushed 2 years ago
meyskens meyskens requested a review from kaworu kaworu 2 years ago
meyskens meyskens requested a review from mhofstetter mhofstetter 2 years ago
mhofstetter
mhofstetter commented on 2023-03-14
mhofstetter2 years ago

looks great 🎉 just some small suggestions

jrajahalme
jrajahalme approved these changes on 2023-03-14
jrajahalme2 years ago

Looks great to me!

meyskens meyskens force pushed 2 years ago
mhofstetter
mhofstetter approved these changes on 2023-03-14
nathanjsweet
nathanjsweet approved these changes on 2023-03-14
nathanjsweet2 years ago

LGTM for API changes

christarazi
christarazi commented on 2023-03-14
kaworu
kaworu approved these changes on 2023-03-15
kaworu2 years ago

Thanks @meyskens! Couple of nitpicking comments but the patch LGTM 🎉

kaworu
meyskens
christarazi
christarazi approved these changes on 2023-03-15
christarazi
meyskens Add mTLS auth controller
08c7000a
meyskens Remove the debug statement for processing SVIDs
b21dac49
meyskens Add mTLS auth to the Helm chart
d6990867
meyskens meyskens force pushed to d6990867 2 years ago
meyskens
youngnick
youngnick approved these changes on 2023-03-17
youngnick2 years ago

LGTM, nice work @meyskens !

meyskens
sayboras
sayboras
sayboras sayboras added ready-to-merge
aditighag aditighag merged 971ec321 into master 2 years ago
meyskens meyskens deleted the mtls-spiffe branch 2 years ago

Login to write a write a comment.

Login via GitHub

Assignees
No one assigned
Labels
Milestone