Java, Scala, Kotlin and TLS1.0 / TLS1.1


On April 20th the open source OpenJDK project will be releasing new versions of JDK11, JDK8, and JDK7 that remove TLS1.0 and TLS1.1 from the default list of supported TLS/SSL protocols. If a Java, Scala, Kotlin, or other JVM-based application interoperates with systems that only support the TLS1.0 or TLS1.1 protocols, this is a breaking change. TLS1.0 and TLS1.1 connections that were working before will start failing with low-level exceptions once the JVM is updated and restarted. Depending on the type of exception that the application is catching and how good the logging is, operators may have little to go on to understand what is happening and why.

Web browsers have been pushing similar changes for a while now, though different browsers are at different stages of removing support for TLS1.0 and TLS1.1, and they’ve been able to go more incrementally by using click-through security warnings to increase friction before completely disallowing connections to servers that can only manage TLS1.0 or TLS1.1. The OpenJDK change has been sign-posted in the OpenJDK crypto roadmap for quite a while, but breaking changes like this are still a big deal.

TLS1.2 has been available for a very long time. The JDK has supported it for over 10 years and it’s the default; Java, Scala, and Kotlin applications will all use TLS1.2 when it’s available. That support can’t be downgraded by attackers either: TLS has built-in defenses against downgrade attacks. Removing TLS1.0 and TLS1.1 from the default set of supported protocols doesn’t directly move any traffic to TLS1.2; that shift has already happened. The change is more about forbidding any use of TLS1.0/1.1 and “weeding out” lingering usage.
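You can see for yourself which protocols your JVM will offer by default. This minimal sketch just asks the default SSLContext for its protocol list; the class name is mine, everything else is the standard javax.net.ssl API:

```java
import javax.net.ssl.SSLContext;

public class ShowProtocols {
    public static void main(String[] args) throws Exception {
        // Protocols the default SSLContext will offer on new connections.
        // On a JDK with this change applied, TLSv1 and TLSv1.1 are absent.
        SSLContext ctx = SSLContext.getDefault();
        String[] enabled = ctx.getDefaultSSLParameters().getProtocols();
        System.out.println(String.join(", ", enabled));
    }
}
```

On any recent JDK the output will include TLSv1.2 (and TLSv1.3 where supported), which is why most traffic moved over long ago without anyone noticing.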

Where I work at AWS, every service has long supported at least TLS1.2 and this change won’t impact communication to or from those services. I haven’t checked, but other cloud providers are almost certainly in a similar situation. Sadly, things are a little different on non-cloud networks like co-lo and on-premises datacenters, or on industrial and home networks where it’s not uncommon to encounter legacy appliances and applications that haven’t been, or can’t be updated to support TLS1.2. 

In my experience of working with customers, there are three common causes of TLS1.0 or TLS1.1 getting stuck in a network. By far the most common is legacy hardware load balancers. Some simply don’t support TLS1.2, or don’t support AES-GCM (the main new feature in TLS1.2) in hardware, but otherwise they work fine, and they are expensive, hard-to-remove devices. If a device like this is in the network, then JVM-based applications connecting to the load balancer will break, and so will JVM-based applications that are behind the load balancer (like Tomcat).

The next most common reason is that the network is performing TLS interception to inspect and vet all traffic on the network. TLS is designed to thwart interception, but administrators can still do it when they either know the RSA private key for the services they are intercepting, or when they can load a certificate authority into the clients. This practice is controversial and some security experts recommend against it, but administrators who have compliance or security requirements to detect potential exfiltration of sensitive data find it useful. The technique tends to be associated with financial services and government networks, and it’s not that uncommon for interception appliances to “force” TLS1.0 because that’s what they know.

The final common reason is other legacy appliances and devices that didn’t ship with automatic “over the air” update capabilities, or that their vendors stopped supporting, but that are still present in a network. This can be everything from industrial control systems to IoT devices in the home. There are robots, cars, TVs, Blu-ray players, and more still running ten-year-old TLS stacks.

Although web browsers have been great at pushing TLS1.0 and TLS1.1 out of the web, that doesn’t help with most of these situations, which are invisible to web browsers. It’s also not uncommon for application testing and development to happen on different networks than production (in fact, that’s a good practice), but the testing setup may not have these legacy appliances. All of this together means that if you operate JVM-based applications on a legacy network, the first time you find out about all of this may well be when you deploy a new JVM to production and connections start breaking confusingly. In other words, an outage.

It’s very hard to get good data on just how many networks and applications may be impacted here. At AWS, we are also working to retire TLS1.0 and TLS1.1, and we have already disabled them on some services where we’ve been able to do so safely. We do this by working with customers to identify and update sources of TLS1.0 or TLS1.1 traffic. But we also see that about 1% of incoming connections to some other services are still using TLS1.0. We suspect that traffic interception devices on customers’ networks are responsible for a good portion of it. I’ve also reached out to industry friends and colleagues for their own estimates of TLS1.0 and TLS1.1 usage on non-cloud networks, and their estimates varied from “about one percent too” to as high as “tens of percent”.

Besides rolling back to the previous version of the JVM, operators and administrators can also re-enable TLS1.0 and TLS1.1 in the default set of supported protocols. For Amazon Corretto, the OpenJDK distribution that we maintain at AWS, we’re going to keep TLS1.0 and TLS1.1 in the default set for a while longer, to avoid the breaking change. Even though Amazon Corretto is mostly used on AWS, where we have strong support for TLS1.2, we know that some customers run it on their own networks too. To speed up the move away from TLS1.0 and TLS1.1, we’re going to add some instrumentation to Amazon Corretto that will allow operators and administrators to see how many connections are forced to use TLS1.0 and TLS1.1. The goal is to give operators and administrators a way to determine if it’s safe to remove TLS1.0 and TLS1.1 before doing it.
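The mechanism behind re-enabling is the `jdk.tls.disabledAlgorithms` security property, which is where this change adds the TLSv1 and TLSv1.1 tokens. As a sketch (the class name is mine), an application can strip those two tokens at startup, provided it runs before any TLS classes are initialized:

```java
import java.security.Security;
import java.util.Arrays;
import java.util.stream.Collectors;

public class ReEnableLegacyTls {
    public static void main(String[] args) {
        // Must run before any TLS connection is attempted.
        String disabled = Security.getProperty("jdk.tls.disabledAlgorithms");
        String updated = Arrays.stream(disabled.split(","))
                .map(String::trim)
                // Drop only the protocol tokens; all other restrictions
                // (SSLv3, weak key sizes, etc.) stay in place.
                .filter(t -> !t.equals("TLSv1") && !t.equals("TLSv1.1"))
                .collect(Collectors.joining(", "));
        Security.setProperty("jdk.tls.disabledAlgorithms", updated);
        System.out.println(Security.getProperty("jdk.tls.disabledAlgorithms"));
    }
}
```

The same effect is available without code changes: edit `jdk.tls.disabledAlgorithms` in the JDK’s `conf/security/java.security` file, or point `-Djava.security.properties` at an override file. Treat any of these as a temporary measure while you track down the legacy endpoints.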

What’s wrong with TLS1.0 and TLS1.1?

Since this change in the JDK is causing some administrators and operators to ask exactly what the trade-offs are, and whether it’s safe to keep running TLS1.0 or TLS1.1 for some period of time, I thought it would be useful to summarize what we know. If I learn any new information or some numbers change, I’ll update this post and make a note.

Let’s start with this – TLS1.2 and TLS1.3 are unambiguously better than TLS1.0 and TLS1.1. It’s always better to use at least TLS1.2 if you can. Thankfully OpenJDK already gets this right and TLS1.2 and TLS1.3 are always used when possible. If you do have legacy applications, appliances, and devices that only support TLS1.0 or TLS1.1 you should absolutely be working on removing them. 
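If you want to go further than the defaults and make a client refuse anything older than TLS1.2 outright, you can pin the enabled protocols on a socket. A minimal sketch using the standard API (the class name is mine; it assumes a JDK with TLS1.3 support, i.e. JDK 11+ or a recent JDK 8 update):

```java
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

public class RequireModernTls {
    public static void main(String[] args) throws Exception {
        SSLSocket socket =
                (SSLSocket) SSLSocketFactory.getDefault().createSocket();
        // Refuse to negotiate anything below TLS1.2 on this socket,
        // regardless of what the defaults or the peer allow.
        socket.setEnabledProtocols(new String[] {"TLSv1.3", "TLSv1.2"});
        System.out.println(String.join(", ", socket.getEnabledProtocols()));
        socket.close();
    }
}
```

Frameworks and HTTP clients usually expose the same knob through their own configuration, but it all bottoms out in `SSLParameters`/`setEnabledProtocols`.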

TLS1.0 and TLS1.1 have some known security issues. None of the issues are as serious as the problems found in the older versions of the same protocol, SSLv2 and SSLv3, which is why we haven’t seen the same urgency to turn them off, but we all know that attacks only improve over time.

If you’re using Java, Scala, or Kotlin on a private network you may not consider active traffic interception as part of your threat model. It’s not uncommon to see applications that use self-signed certificates and don’t validate certificates. This still protects against passive interception (e.g. network taps), but not active. If you’re in that category, some of the issues may not represent a new threat at all, so I’ll make a note of that as we go through each. Let’s get into the issues! 

Issue 1: The BEAST attack

The BEAST attack impacts TLS1.0 only; TLS1.1 and higher are immune. The BEAST attack is a sophisticated attack that utilizes both active traffic interception and manipulation of a browser (the “B” in BEAST stands for “Browser”). The underlying cryptographic flaw relates to CBC-mode ciphers and initialization vectors, but the short version is that the flaw allows an attacker to decrypt one byte of data if the same data is sent over and over on many connections. On its own that isn’t very interesting, but BEAST uses a clever combination of triggering browsers to retry the same request and manipulating that request a little to move interesting data into the position that can be decrypted, byte by byte. Over a very large number of requests (tens of thousands), the attacker can gradually decrypt more and more data.

If active traffic interception is not in your threat model, you don’t care about this issue. Otherwise, OpenJDK includes mitigations for the BEAST attack. These mitigations protect the traffic that OpenJDK is sending, but the traffic OpenJDK is receiving may be unprotected. When BEAST came out, it caused a big stir, and in my experience most systems that support only TLS1.0 and TLS1.1 do also include BEAST mitigations.

Additionally, it’s hard to imagine cases where Java, Scala or Kotlin applications would be vulnerable to the same kinds of manipulation as browsers. BEAST uses cross-domain Javascript to do its manipulation. 

Issue 2: The SHA1/MD5 transcript hash

To ensure that a sender or receiver isn’t being tricked by any kind of attacker in the middle, TLS uses a secure record of all the data sent and received called a transcript hash. In TLSv1.0 and TLSv1.1 this hash uses either a novel combination of the MD5 and SHA1 algorithms, or the SHA1 algorithm alone. Both MD5 and SHA1 are considered cryptographically broken hashing algorithms. 
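To make the MD5/SHA1 combination concrete: in TLS1.0 and TLS1.1 the handshake hash is, in essence, an MD5 digest and a SHA-1 digest of the transcript concatenated together. This is only an illustration of that digest construction, not a TLS implementation; the class name and the placeholder transcript bytes are mine:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class TranscriptHashDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder stand-in for the real handshake bytes.
        byte[] transcript =
                "ClientHello|ServerHello".getBytes(StandardCharsets.UTF_8);
        byte[] md5 = MessageDigest.getInstance("MD5").digest(transcript);
        byte[] sha1 = MessageDigest.getInstance("SHA-1").digest(transcript);
        // TLS1.0/1.1 concatenate the two digests: 16 + 20 = 36 bytes.
        byte[] combined = new byte[md5.length + sha1.length];
        System.arraycopy(md5, 0, combined, 0, md5.length);
        System.arraycopy(sha1, 0, combined, md5.length, sha1.length);
        System.out.println(combined.length);
    }
}
```

TLS1.2 replaced this construction with a single modern hash (SHA-256 or stronger), which is one of its less-advertised improvements.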

An attacker could use this weakness to insert themselves into a connection between two other parties. Doing this doesn’t exactly require brute-forcing the hash in the usual sense, like a password cracker would. Instead, the attacker has to generate a series of messages to each party that will ultimately result in the same hash values. Karthikeyan Bhargavan and Gaëtan Leurent have demonstrated this with some great research, but there are some caveats.

They estimate that it takes about 150 trillion trillion (that’s 150 trillion, times a trillion) operations to brute-force match a TLSv1.0 or TLSv1.1 transcript hash. To succeed, an attacker has to do this in real time, before connections time out. Today, performing the attack within 30 seconds would require hundreds of millions of devices operating in parallel. Cryptographers are rightly concerned about this issue: processors will only get faster, and at some point this attack may fall within the capabilities of large actors such as nation states and organized crime.
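As a rough sanity check on those numbers (treating $10^{16}$ hash operations per second per device as an assumed ballpark for specialized hardware), 150 trillion trillion is $1.5 \times 10^{26}$, so:

\[
\frac{1.5 \times 10^{26}\ \text{ops}}{30\ \text{s}} = 5 \times 10^{24}\ \text{ops/s},
\qquad
\frac{5 \times 10^{24}\ \text{ops/s}}{10^{16}\ \text{ops/s per device}} = 5 \times 10^{8}\ \text{devices}
\]

which is where the “hundreds of millions of devices” figure comes from.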

Again, if active traffic interception is not part of your threat model, this issue isn’t that interesting to you in the first place. 

Issue 3: Weak ciphers

TLS1.0 and TLS1.1 support only what are now considered weak ciphers. There’s RC4, AES-CBC, and 3DES-CBC and they all have marks against them. OpenJDK removed RC4 from the default set quite a while ago, so we don’t need to concern ourselves with that.

That leaves AES-CBC and 3DES-CBC. As it happens, removing support for TLS1.0 and TLS1.1 doesn’t mean these can’t be used. They are supported by TLS1.2 too, and some hardware systems force these ciphers even when you’re using TLS1.2, so it’s worth checking whether turning off TLS1.0 and TLS1.1 really makes any difference here. But in general, TLS1.2 includes and prefers AES-GCM and ChaCha20-Poly1305, which are better.
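It’s easy to inspect which cipher suites your JVM will offer and in what preference order. A small sketch (class name mine) using the default socket factory:

```java
import javax.net.ssl.SSLSocketFactory;

public class ShowCipherSuites {
    public static void main(String[] args) {
        SSLSocketFactory factory =
                (SSLSocketFactory) SSLSocketFactory.getDefault();
        // Cipher suites the JVM offers by default, in preference order.
        // On recent JDKs the AEAD suites (GCM) rank above the CBC suites.
        for (String suite : factory.getDefaultCipherSuites()) {
            System.out.println(suite);
        }
    }
}
```

If you run this and then compare it against what your load balancer or interception appliance actually negotiates (the handshake details are visible with `-Djavax.net.debug=ssl:handshake`), you can tell whether your traffic is landing on a CBC suite even over TLS1.2.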

Why are these ciphers considered weak? Firstly, both CBC-mode ciphers are impacted by Lucky13. Lucky13 isn’t a practical attack against TLS, even in extremely favorable circumstances for an attacker, but it is practical against DTLS (the version of TLS for UDP). If you’re not using DTLS you don’t need to worry about it. Regardless, OpenJDK includes mitigations and uses a constant-time padding algorithm, which is what matters: traffic sent by OpenJDK is protected. Lucky13 also requires an active traffic interceptor, so if that’s outside of your threat model, account for that.

The second weakness is Sweet32, which is specific to 3DES. This issue impacts applications that send very large amounts of data over the same connection, and it requires passive tapping by an attacker. To give you an idea, the Sweet32 researchers were able to recover HTTP cookies on a connection that had around 785GB of traffic on it. Similar to the BEAST attack, it helps the attacker when the same information is transmitted many times, though unlike BEAST, the data has to be on the same connection.

This is one reason why 3DES is the least preferred cipher and generally only used as a last resort. If passive tapping is in your threat model, it’s worth avoiding 3DES. 
