So, about THAT GnuTLS session ticket bug (CVE-2020-13777)

It’s been almost two weeks since I discovered the GnuTLS bug now known as CVE-2020-13777. The report is issue #1011 on the GnuTLS bug tracker. Here I want to talk a bit about how I discovered the bug and share some thoughts on its impact.

Session resumption for mod_gnutls

I was working on session resumption support for proxy connections in mod_gnutls. To test I had a setup with two Apache servers with mod_gnutls:

  1. a front-end server, which should cache session tickets and try to resume sessions, and
  2. a back-end server which should issue and use the session tickets.

After session resumption worked as it should, I restarted the back-end server to invalidate the session ticket the front-end server had cached. On server stop mod_gnutls wipes the primary key used to protect session tickets from memory, so the freshly started server should have had no way to decrypt the old ticket. But somehow both sides still reported successful session resumption. That absolutely shouldn’t happen, so I started investigating.

At first I worried about a bug in mod_gnutls: Maybe the log output about resumption wasn’t correct? Or, way worse, was the key not used correctly? Over time I ruled those out, and instead started to suspect a bug in GnuTLS.

Testing GnuTLS itself

To check that, I tested gnutls-serv, a relatively simple TLS server included with GnuTLS.

As the client I had to use OpenSSL’s s_client, because unlike gnutls-cli it offers a way to cache sessions to disk and load them again. I needed that because my test required swapping the server between the initial session and the resumption, and the --resume flag in gnutls-cli reconnects way too fast to do that manually.

This brought me to the steps I described in the bug report to reproduce the problem:

  1. Start a server with valid credentials: gnutls-serv --x509keyfile=authority/server/secret.key --x509certfile=authority/server/x509.pem
  2. Connect to the server and store resumption data: openssl s_client -connect localhost:5556 -CAfile authority/x509.pem -verify_return_error -sess_out session.cache
  3. Stop the server started in step 1.
  4. Start a server with bogus credentials at the same address and port (imagine a real attacker redirecting connections only if the client is attempting resumption): gnutls-serv --x509keyfile=rogueca/mitm/secret.key --x509certfile=rogueca/mitm/x509.pem
  5. Connect again, using the stored resumption data: openssl s_client -connect localhost:5556 -CAfile authority/x509.pem -verify_return_error -sess_in session.cache

That worked flawlessly on the first try. At that point it sank in that I had a serious security issue on my hands, one that allowed man-in-the-middle (MITM) attacks on GnuTLS servers, so I did a round of cycling to clear my mind.

After a few more tests I wrote up the bug report. It was clearly too bad to sit on until I had all the details: what I had was already more than bad enough to require an immediate fix. A quick attempt to reproduce the problem with TLS 1.2 had failed (looking back I must’ve made some mistake in the commands), so I guessed there was some difference in how GnuTLS handled tickets for TLS 1.3 and 1.2.

It gets worse

Daiki Ueno commented on my report:

“Looking at the code path, ticket encryption key and decryption key are all-zero, until the first rotation happens. In TLS 1.3, that can only bypass the authentication, but in TLS 1.2, it may allow attackers to recover the previous conversations.”

With that hint I was able to find the relevant GnuTLS code pretty quickly, and there was no sign of different ticket encryption for the different TLS versions. So I tried to reproduce the problem with TLS 1.2 again, and this time it worked. With a bit of fiddling I was able to get the server to write the key to disk during the resumed session, and decrypt the previous, initial session. Mind that this was just the easiest way to do it; it is possible to get the same data just from captured network traffic.
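To illustrate why an all-zero ticket key is so devastating, here is a minimal Python sketch. It is a toy: the cipher is an HMAC-based stand-in and the ticket layout is made up for this example; it is not the GnuTLS ticket format, and all function names are hypothetical. The point is only that a ticket protected with a key everyone knows is not protected at all.

```python
import hashlib
import hmac

KEY_LEN = 32


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 in counter mode (stand-in for a real cipher)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def seal_ticket(key: bytes, nonce: bytes, state: bytes) -> bytes:
    """Encrypt and authenticate session state (toy construction, illustration only)."""
    body = bytes(a ^ b for a, b in zip(state, keystream(key, nonce, len(state))))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def open_ticket(key: bytes, ticket: bytes) -> bytes:
    """Verify and decrypt a ticket; raises ValueError if the key does not match."""
    nonce, body, tag = ticket[:16], ticket[16:-32], ticket[-32:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("ticket authentication failed")
    return bytes(a ^ b for a, b in zip(body, keystream(key, nonce, len(body))))


# Assumed server state before the first rotation: the ticket key is still all zero.
uninitialized_key = bytes(KEY_LEN)
ticket = seal_ticket(uninitialized_key, b"\x01" * 16, b"master secret of the session")

# The attacker needs no secret at all, just the same all-zero key.
recovered = open_ticket(bytes(KEY_LEN), ticket)
```

With a properly random key the same open_ticket() call fails authentication, which is the behaviour I had expected after restarting the back-end server.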

Allowing MITM with TLS 1.3 is already really bad, but effectively unprotected tickets with TLS 1.2 are way worse, because it means it’s possible to decrypt past recorded sessions. If the NSA has such a TLS connection from a year ago stored somewhere? Whoops, they can decrypt it now.

This is because of design flaws in TLS 1.2:

  • Resumed sessions use exactly the same secret keys as the initial session. So if you get access to the secrets at any point (like from a not properly protected ticket here), you can decrypt all the sessions.
  • Session tickets are sent from the server to the client right before starting to use the negotiated encryption. For this bug that means TLS 1.2 sessions are vulnerable as soon as the server sends a ticket, not only on resumption.

TLS 1.3 mostly avoids those issues by doing a fresh Diffie-Hellman exchange during resumption (simply put, it negotiates new keys, and just keeps authentication in place – which allowed the MITM here), and sending the ticket to the client encrypted. During session resumption the client naturally has to send the ticket in the clear.
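To make the TLS 1.2 problem concrete: the traffic keys are derived from the master secret and the two hello random values, as specified in RFC 5246 (sections 5 and 6.3). The random values are sent in the clear, so whoever learns the master secret from a ticket can derive the keys of every session that used it. The sketch below assumes a cipher suite with the SHA-256 PRF and uses placeholder byte strings instead of real handshake data.

```python
import hashlib
import hmac


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash from RFC 5246, section 5, instantiated with HMAC-SHA256."""
    out = b""
    a = seed  # A(0)
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i)
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]


def tls12_key_block(master_secret: bytes, client_random: bytes,
                    server_random: bytes, length: int) -> bytes:
    """key_block = PRF(master_secret, "key expansion", server_random + client_random)."""
    seed = b"key expansion" + server_random + client_random
    return p_sha256(master_secret, seed, length)


# Placeholder values. In an attack the master secret comes from the decrypted
# ticket, and the random values are read from the recorded hello messages.
master_secret = b"\x11" * 48
initial = (b"\xc1" * 32, b"\x51" * 32)  # (client_random, server_random)
resumed = (b"\xc2" * 32, b"\x52" * 32)

# 40 bytes would cover two 16-byte keys and two 4-byte IVs (AES-128-GCM).
keys_initial = tls12_key_block(master_secret, *initial, 40)
keys_resumed = tls12_key_block(master_secret, *resumed, 40)
```

Both key blocks depend on nothing secret except the one master secret, which is why a single leaked ticket exposes the initial session and every resumption of it.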

I wrote “mostly” above, because there is one part where TLS 1.3 should be vulnerable to passive decryption, that I haven’t seen discussed much yet: Early data. I say “should” because I haven’t tested it, but based on what I know about how early data works it should be. Early data is sent with the request for session resumption to speed up communication (hence the name), so it cannot be protected by the new Diffie-Hellman exchange yet and uses key material from the ticket, similar to what TLS 1.2 does for the whole session. “Early data” tends to be small (if it is used at all, mod_gnutls does not support it), but if that small piece of data contains, say, an authentication token… Not good.
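The reasoning about early data follows from the TLS 1.3 key schedule in RFC 8446 (section 7.1): the client early traffic secret is derived only from the pre-shared key (PSK) and the ClientHello, with no Diffie-Hellman input. Below is a minimal sketch of that derivation, assuming SHA-256 as the hash and assuming the attacker obtained the PSK from the ticket. The input values are placeholders, and like the claim above this is untested against a real connection.

```python
import hashlib
import hmac

HASH = hashlib.sha256
HASH_LEN = 32


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869): PRK = HMAC-Hash(salt, IKM)."""
    return hmac.new(salt, ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869)."""
    out, block, counter = b"", b"", 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        out += block
        counter += 1
    return out[:length]


def hkdf_expand_label(secret: bytes, label: bytes, context: bytes, length: int) -> bytes:
    """HKDF-Expand-Label from RFC 8446, section 7.1."""
    full_label = b"tls13 " + label
    info = (length.to_bytes(2, "big")
            + bytes([len(full_label)]) + full_label
            + bytes([len(context)]) + context)
    return hkdf_expand(secret, info, length)


def client_early_traffic_secret(psk: bytes, client_hello: bytes) -> bytes:
    """Derive-Secret(Early Secret, "c e traffic", ClientHello)."""
    early_secret = hkdf_extract(bytes(HASH_LEN), psk)  # salt is all zero
    transcript_hash = HASH(client_hello).digest()
    return hkdf_expand_label(early_secret, b"c e traffic", transcript_hash, HASH_LEN)


# Placeholders: the PSK would come from the ticket, the ClientHello from the wire.
psk = b"\x42" * 32
client_hello = b"recorded ClientHello bytes"
early_secret_for_0rtt = client_early_traffic_secret(psk, client_hello)
```

Everything that goes into the derivation is either in the ticket or visible on the network, so a passive attacker who can decrypt the ticket should be able to compute the same secret.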

Consequences

First off, big thanks to the GnuTLS team for handling the bug report and fix well!

Secondly, this demonstrates why the design flaws in TLS 1.2 I described above are so bad. If a ticket is compromised, all associated sessions are. These are not the only flaws, and even though most of them can be avoided with careful implementation: Don’t rely on TLS 1.2 (much less older versions) any more if you can avoid it! From the client side that’s hard to enforce (for example, way too many websites still offer only TLS 1.2), but seriously consider it if you’re operating a TLS server. I understand that may be difficult, too: All modern browsers support TLS 1.3, but I have no idea how many legacy devices, like phones and IoT devices that might not have seen updates in a long time, might still need to connect to your servers.

After this it gets murky. Should you still use session tickets with TLS 1.3? I think it’s reasonable to, assuming of course that the implementation is sound. Should you use early data? I’m leaning towards no, and have no plans to support it with mod_gnutls, because it doesn’t offer forward secrecy in case of ticket compromise (unlike the rest of TLS 1.3), and I doubt the slight speedup is worth the complexity.

The issue with session ticket key rotation

A particularly interesting case is the session ticket key rotation as implemented in GnuTLS. While looking at the code that caused this bug I learned that the rotation doesn’t in fact rotate the primary key (as generated using gnutls_session_ticket_key_generate()) based on time; it only derives the key used for a particular session from it in a time-based manner. That means it does not protect against an attacker who is able to steal the primary key from server memory, which I consider to be the main risk for tickets: if you can steal that key, you can decrypt all tickets, and at the very least do MITM attacks.
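A small sketch of why time-based derivation does not help against theft of the primary key. The construction here (a hash over a time step and the primary key) only illustrates the general idea; the exact derivation, the rotation period, and the function name are assumptions for this example and not taken from the GnuTLS source.

```python
import hashlib

ROTATION_PERIOD = 6 * 60 * 60  # hypothetical rotation interval in seconds


def derived_ticket_key(primary_key: bytes, now: int) -> bytes:
    """Derive the ticket key for the time window containing `now` (illustrative only)."""
    step = now // ROTATION_PERIOD
    return hashlib.sha3_512(step.to_bytes(8, "big") + primary_key).digest()


primary_key = b"\x5a" * 64  # placeholder for the long-lived primary key

# The server uses different derived keys in different time windows ...
key_monday = derived_ticket_key(primary_key, 1_591_000_000)
key_tuesday = derived_ticket_key(primary_key, 1_591_000_000 + 24 * 60 * 60)

# ... but an attacker who stole the primary key can recompute any of them,
# for past windows as well as future ones.
stolen_primary_key = primary_key
attacker_key_monday = derived_ticket_key(stolen_primary_key, 1_591_000_000)
```

The derived keys change over time, yet they are all a deterministic function of one secret that never changes. Only replacing the primary key itself limits what a stolen key can decrypt.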

Some have called the rotation scheme entirely useless because of that. I’m not yet sure if I agree with that, but I do want to implement a rotation of the primary key in mod_gnutls to protect against server compromise. However, I also see that it is extremely difficult for a library like GnuTLS to offer a key rotation mechanism that reliably works for all possible use cases. For example with mod_gnutls I’d have to synchronize a truly random new key across multiple server processes. Maybe clarifying the documentation on exactly what is and isn’t rotated is the best GnuTLS can do on the library side.
