Is WebAssembly Susceptible to Log4Shell-style Attacks?
Matt Butcher
log4shell
webassembly
security
WebAssembly runs in a language virtual machine that is similar in many respects to the JVM (Java’s runtime) or the CLR (.NET’s runtime). But a key difference in WebAssembly’s architecture (built, as it was, for the browser) is that by default the runtime does not trust the code that it executes.
Late last year, a hack against a prominent open source Java library rocked the software world. A vulnerability in a logging system allowed attackers to execute arbitrary commands on the host. The recent Log4j attack provides an opportunity to talk about that characteristic and to see why WebAssembly is resistant to this kind of attack. But I end with a warning: WebAssembly’s security model is only as good as the host runtime that enforces it, and a troubling tendency to insecurely expose underlying host facilities to WebAssembly modules could undermine that model.
The Log4Shell Attack
In a nutshell, Log4Shell exploits a couple of weaknesses in the Log4j logging library (a venerable and nearly ubiquitous logging engine) in order to execute untrusted code locally with the permissions of the Java application. In the worst cases, it can result in a hostile takeover of a server. At this point in time, there are at least two related CVEs and two security patches to Log4j to close this loophole.
An excellent post by Sophos explains the contributing factors for the attack, and I’ll use their description as an outline for discussing WebAssembly.
Primarily, what we are interested in is not whether a binary compiled to WebAssembly can have a security hole. (Spoiler: It can.) The question is whether such a hole can be exploited in order to “break out” of the WebAssembly sandbox and take over the server.
To make this distinction clear, we’ll talk about two facets of the WebAssembly runtime:
- The guest code is a WebAssembly binary that runs on a host. This code is supplied by an application developer.
- The host runtime is a WebAssembly runtime (language virtual machine, sometimes called the execution environment) that runs on the host operating system and executes WebAssembly guest code in a sandbox.
The fundamental thesis of the WebAssembly security model is that guest code, even if it has vulnerabilities, should not thereby be able to break out of the host runtime’s sandbox and attack the host.
Note that there are projects to compile Java and other JVM languages to WebAssembly.
Issue 1: Formatting Strings
The first step in the exploit had to do with allowing users to supply string data that would be evaluated as a formatting string. Common formatting string functions have a signature that looks like `printf(FORMAT_STRING, arg1, arg2...)`. The format string is interpolated, meaning special sequences of characters are interpreted as commands, and the formatter substitutes values for those commands.
Most mainstream languages support some variant of format strings. Some refuse to compile if the `FORMAT_STRING` is not a literal string, but that is the exception rather than the rule. Many languages will happily interpolate a `FORMAT_STRING` even if it is populated from a source that should rightly be considered untrusted.
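Rust is one of the languages that refuses to compile a non-literal format string, which blocks this class of injection at compile time. Here is a minimal sketch (the input string is invented for illustration):

```rust
fn main() {
    // Untrusted input containing format-style directives.
    let user_input = "username: ${USER} %n";

    // Safe: the format string is a literal, and the untrusted data is
    // substituted as plain text, never interpreted as directives.
    let line = format!("received: {}", user_input);
    assert_eq!(line, "received: username: ${USER} %n");

    // This would NOT compile: Rust's format! macro requires its format
    // string to be a string literal, so a user-supplied format string
    // is rejected before the program can ever run.
    // let line = format!(user_input);

    println!("{}", line);
}
```

Languages that accept a runtime value as the format string give up exactly this compile-time guarantee, which is why the vulnerability lands in the guest code rather than in WebAssembly itself.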
In this case, WebAssembly does not offer much protection: it is the programming language compiled to WebAssembly that must provide the relevant protections. But this is not dire news for WebAssembly, as formatting issues like this are not enough (on their own) to allow an attacker to break out of the runtime. In the worst case, a format-string bug in the guest may leak user-supplied data into console output or a log file.
This represents a case where guest code can have a security vulnerability, but where that vulnerability cannot thereby be exploited to break out of the sandbox.
Issue 2: Environment variable access
While this was not technically part of the Log4Shell attack, the Sophos article showed how untrusted formatting strings could be used to leak environment variables into output. For example, if you could find a way to inject the formatting string `username: ${USER}`, you might be able to get a program to output the value of the `$USER` environment variable. Since any given environment may have dozens of variables, some of which contain sensitive information, this could lead to disclosure of important things, even including passwords.
WebAssembly’s approach to this issue is interesting, and starts to show where WebAssembly veers in a different direction from other tools.
Core WebAssembly does not allow access to the environment at all. At the base level, then, this is not a problem.
However, the community does in fact want and need access to environment variables for cloud-side WebAssembly (and for other use cases, too). And this is where the WebAssembly System Interface (WASI) comes in: WASI provides a capabilities-based system for accessing certain system resources. And one of the very first features introduced in WASI is support for a guest module to look up environment variables.
However, when a guest module looks up environment variables, it does not get access to the system. It only gets access to a set of name/value pairs that the host allows it to see (and which name/value pairs these are is up to you). These pairs are copied into the host runtime memory and then made available to the WebAssembly module when it loads.
This vastly reduces the attack surface.
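From the guest's point of view, the lookup is ordinary environment access; only the host decides what is visible. A minimal sketch of a guest module (the variable name `APP_GREETING` is invented for illustration):

```rust
use std::env;

fn main() {
    // In a WASI guest, env::var() sees only the name/value pairs the
    // host copied in at instantiation time -- not the host's real
    // environment. The same code compiles natively or to wasm32-wasi.
    match env::var("APP_GREETING") {
        Ok(v) => println!("greeting: {}", v),
        Err(_) => println!("the host did not provide APP_GREETING"),
    }
}
```

On the host side, a runtime like wasmtime lets the operator pass variables through explicitly (e.g. something like `wasmtime run --env APP_GREETING=hello module.wasm`; exact flag syntax varies by runtime and version). Anything not passed simply does not exist as far as the guest is concerned.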
In the worst case, then, a user may configure the WebAssembly runtime to add the environment variable `MY_PASSWORD="Sw0rdf1sh"`, and then proceed to write code that allows a user to request the value of `$MY_PASSWORD`. But the damage is limited to the guest module. The guest module can neither read nor alter the host’s environment.
Issue 3: Network Access
Network access is where things really begin to get interesting. In the Log4Shell attack, network access was a precondition for the attacker to be able to fetch their code from a remote source. But the only reason it surfaced in any way as a vulnerability is because it became possible for the attacker to specify a resource to be fetched. In other words, just as with environment variables, it is not the ability to access the network that is a problem, it’s the ability to hijack this facility. Be that as it may, WebAssembly has taken an interesting approach specifically because poor networking implementations can introduce plenty of opportunities for breakouts or break-ins.
- Core WebAssembly does not offer network access
- Currently, no accepted WASI features include network access
- But, there are proposals and implementations of various levels of WASI network access
Without networking in WebAssembly and WASI, we were able to build useful web services using techniques where the runtime re-encodes the request data into WASI types like files and environment variables. This was the approach we took in Wagi. In this model, it’s “impossible” to do network-related attacks because there is no network API surface exposed. (I put impossible in quotes because if there is one thing I have learned about skilled hackers, it’s that sometimes when they think out of the box, they think way out of the box… and succeed.)
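A Wagi handler follows CGI-flavored conventions: the request arrives through plain WASI facilities (environment variables and stdin) and the response is written to stdout. Here is a minimal sketch of a handler in that style; the specific variable name `PATH_INFO` follows CGI convention and is an assumption here:

```rust
use std::env;
use std::io::{self, Read};

// Build a CGI-style response: headers, a blank line, then the body.
fn respond(path: &str, body_len: usize) -> String {
    format!(
        "Content-Type: text/plain\n\nyou requested {} with {} bytes of body\n",
        path, body_len
    )
}

fn main() -> io::Result<()> {
    // The runtime re-encodes the HTTP request as env vars and stdin;
    // no socket API is ever exposed to the guest.
    let path = env::var("PATH_INFO").unwrap_or_else(|_| "/".to_string());
    let mut body = String::new();
    io::stdin().read_to_string(&mut body)?;
    print!("{}", respond(&path, body.len()));
    Ok(())
}
```

The guest never opens a connection, never sees a socket, and never learns anything about the host's network; it only transforms bytes in and bytes out.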
But limiting network interactions in this way is not particularly helpful to the developer. It is well within reason for a developer to say, “I need to make an outbound HTTP request to an arbitrary server.” Thus, in our desire to eventually get such a facility into WASI, we have written an experimental host extension to do this. This service supports some level of capabilities. The host can, for example, limit which hosts an outbound HTTP connection can request.
But this is an area where we need to think hard about security. For example, one of the reasons we haven’t put a whole lot of effort into exposing an entire networking stack inside of WebAssembly is that we are justifiably suspicious that doing so would open up too wide of an attack surface. A small omission or flaw could inadvertently let a guest module inspect the other connections on the same host. And that would be a violation of the sandboxing model. So higher-level APIs (like HTTP or perhaps SMTP) might be safer to expose to the guest module.
The bottom line here is that as of this moment, there is not enough standard support for networking to make this a possible vector of attack, and our efforts to add support to WASI are being done with the objective of not introducing an attack surface against the host.
So as with the previous two issues, while WebAssembly can’t protect guest module authors from introducing security bugs in their own code, there is no known way in which this can result in a breakout into the host environment. So again, a Log4Shell-style attack is stymied. (But after we cover Issue 4, we’ll see how this could all fall apart.)
Issue 4: Executing Helper Code
The coup de grâce of the Log4Shell attack is that in the end it could execute untrusted code on the host. And it is here that WebAssembly should be the most resistant. The reason can be explained by a core difference in the assumptions Java’s VM makes versus the assumptions the WebAssembly runtime makes.
For the most part (and, yes, you can go to great lengths to change this about the JVM if you want to), the JVM takes a posture of trust. Any code that runs in the JVM is assumed to be trustworthy, and is thus allowed to access local resources like files, process tables, sockets, and so on. And by default it can do so with the permissions of the system user that is running the JVM. In other words, the guest-to-host surface of the Java VM is “porous” by default. The guest can access many system-level things via the host (and as the host). Hard to configure or not, the vast majority of the JVMs out there are configured to allow these porous defaults. (If you are interested in a comparison of WebAssembly to Applets, check out this blog post by Steve Klabnik.)
WebAssembly takes the opposite security stance. By default, the guest cannot access any host resources. No files, no processes, no environment variables, no network connections… nothing. WASI allows the host to explicitly enable some sorts of resource access, and even then it can do so in constrained ways. For example, if the host provides the guest module access to files, it may do so via a preopen. For that matter, it may merely expose the guest to what looks like a filesystem, but is actually in-memory data. The filesystem may be read-only, read-write, or a hybrid where writes are not flushed back to stateful storage. And the interesting thing about this is that the guest has no way of knowing what the underlying configuration is.
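From the guest side, this looks like ordinary file I/O that simply fails when no capability was granted. A minimal sketch (the path `/data/config.txt` is invented for illustration):

```rust
use std::fs;

fn main() {
    // The guest addresses files by path, but a path only resolves if
    // the host preopened a directory for it. Without that capability
    // the open fails, and the guest cannot tell whether the path even
    // exists on the host's real filesystem, is read-only, or is an
    // in-memory stand-in.
    match fs::read_to_string("/data/config.txt") {
        Ok(text) => println!("config: {}", text),
        Err(e) => println!("no capability for /data/config.txt: {}", e),
    }
}
```

On the host side, the operator grants the mapping explicitly (e.g. wasmtime's `--dir` flag; exact syntax varies by version), so the guest's whole filesystem view is whatever the host chose to construct.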
When it comes to executing arbitrary code on the host, there is currently no way to do this in WebAssembly. A WebAssembly guest module can neither fork/spawn a process nor can it dynamically load another WebAssembly binary into the host runtime (which is effectively what Java’s classloaders do).
In this case, we can probably make a stronger claim than in the previous cases. And that claim is that we do not know of a way that a guest module developer could load arbitrary code at runtime, let alone accidentally leave a hole open that would allow an attacker to do so.
It would be all well and good at this point to simply write up as a conclusion: No, WebAssembly applications are simply not vulnerable to Log4Shell-style attacks.
But…
A Warning for WebAssembly Runtime Developers
A worrying trend is arising in the WebAssembly ecosystem as individuals and companies begin to build generalized services based on WebAssembly. That trend is to add host-level services that expose new features to guest modules.
Earlier I talked about our very own HTTP client library, which does exactly this. It allows guest code to ask the host to run an HTTP request on its behalf. The host performs the request and then returns the data to the module.
I contrasted that approach with a deeper socket-level model that might allow a guest module to ask the host to create new socket servers or request information about the network configuration of the host. And I pointed out that this lower level API would introduce considerably more attack surface.
The important thing to understand here is that in both of these cases, the new feature is added by the host runtime, over and above what the core WebAssembly runtime or the WASI extensions provide.
This can be a very good thing for the WebAssembly ecosystem. A strategic selection of host extensions opens endless possibilities for the kinds of applications we can write in WebAssembly. And ultimately this is what people like me desire. I want WebAssembly to be a powerful development platform for cloud services.
But we, as WebAssembly ecosystem developers, must, as an absolute imperative, be careful about what we build and how we do it. As one of Fermyon’s founding engineers, Ivan Towlson, pointed out, all it takes to render this entire security model moot is for a host extension to directly expose the Log4j API; suddenly there is a nice clean hole from the guest module into a vulnerable part of the host environment.
It is incumbent on the host runtime authors and implementors to build layers of resistance. Here are at least a few rudimentary guidelines that will help us all build attack-resistant host extensions:
- Design with capabilities models in mind:
- The default is no permissions to access any resources
- Host runtime operators can choose (at a fine-grained level) what to allow
- It is impossible for the guest module to change or influence the enforcement of capabilities (e.g. there are no guest-side “escape hatches”)
- Provide small attack surfaces. If a host extension adds 40 new functions, there is a far greater likelihood of introducing a flaw than if the host extension adds only two or three narrowly defined functions. (Likewise with data structures.)
- Favor higher level functionality over low-level functionality. We think it is safer to expose an HTTP client extension than a full socket library extension. Obviously, this criterion must be approached artfully, as this is about striking the right balance.
- Absolutely positively avoid-like-the-plague any extension that would allow guest modules to start new processes or run code on the host platform. This is an instant violation of the entire sandboxing model.
- Whenever possible, sandbox, validate, and sanitize data along the host/guest interface. Yes, this may introduce (usually negligible) performance overhead, but that is almost always the better tradeoff when the result is better security.
- Develop host extensions collaboratively as open source. Yes, openness did not prevent the Log4j vulnerability. But we should be honest: countless security vulnerabilities have been found and shut down precisely because many eyeballs scan the code.
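The first guideline, a default-deny capabilities model, can be sketched as a small gate in the host runtime. Everything here (the type and method names) is hypothetical; the point is the shape: the list lives on the host side, the default is deny, and the guest has no way to touch it.

```rust
use std::collections::HashSet;

// Hypothetical host-side gate for an outbound-HTTP extension. This
// code runs in the host runtime, not in the guest; the guest can
// neither read nor modify the allow list.
struct HttpCapability {
    allowed_hosts: HashSet<String>,
}

impl HttpCapability {
    fn new(hosts: &[&str]) -> Self {
        Self {
            allowed_hosts: hosts.iter().map(|h| h.to_string()).collect(),
        }
    }

    // Default-deny: permit a request only if its destination host is
    // explicitly on the operator-supplied allow list.
    fn permits(&self, host: &str) -> bool {
        self.allowed_hosts.contains(host)
    }
}

fn main() {
    let cap = HttpCapability::new(&["api.example.com"]);
    assert!(cap.permits("api.example.com"));
    assert!(!cap.permits("evil.example.net"));
    println!("capability checks passed");
}
```

Note how this also illustrates the second and third guidelines: the surface is a single narrow check rather than a general API, and it gates a high-level operation (an HTTP request) rather than raw sockets.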
These guidelines might not make our host extensions bulletproof, but they will at least help us take a resistant-by-default posture. And hopefully we can minimize, if not eliminate, the potential for a host break-out in WebAssembly code.