Prototype-Poisoning
The following is an article written by Eran Hammer. It is reproduced here for posterity with permission. It has been reformatted from the original HTML source to Markdown source, but otherwise remains the same. The original HTML can be retrieved from the above permission link.
A Tale of (prototype) Poisoning
This story is a behind-the-scenes look at the process and drama created by a particularly interesting web security issue. It is also a perfect illustration of the efforts required to maintain popular pieces of open source software and the limitations of existing communication channels.
But first, if you use a JavaScript framework to process incoming JSON data, take a moment to read up on Prototype Poisoning in general, and the specific technical details of this issue. I'll explain it all in a bit, but since this could be a critical issue, you might want to verify your own code first. While this story is focused on a specific framework, any solution that uses JSON.parse() to process external data is potentially at risk.
BOOM
Our story begins with a bang.
The engineering team at Lob (long time generous supporters of my work!) reported a critical security vulnerability they identified in our data validation module — joi. They provided some technical details and a proposed solution.
The main purpose of a data validation library is to ensure the output fully complies with the rules defined. If it doesn't, validation fails. If it passes, you can blindly trust that the data you are working with is safe. In fact, most developers treat validated input as completely safe from a system integrity perspective. This is crucial.
In our case, the Lob team provided an example where some data was able to sneak by the validation logic and pass through undetected. This is the worst possible defect a validation library can have.
Prototype in a nutshell
To understand this story, you need to understand a bit about how JavaScript works. Every object in JavaScript can have a prototype. It is a set of methods and properties it "inherits" from another object. I put inherits in quotes because JavaScript isn't really an object-oriented language.
A long time ago, for a bunch of irrelevant reasons, someone decided that it would be a good idea to use the special property name __proto__ to access (and set) an object's prototype. This has since been deprecated but is nevertheless fully supported.
To demonstrate:
> const a = { b: 5 };
> a.b;
5
> a.__proto__ = { c: 6 };
> a.c;
6
> a;
{ b: 5 }
As you can see, the object doesn't have a c property, but its prototype does. When validating the object, the validation library ignores the prototype and only validates the object's own properties. This allows c to sneak in via the prototype.
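To make the own-versus-inherited distinction concrete, here is a minimal sketch that repeats the example above using only standard built-ins. The variable names are illustrative and not part of any library.

```javascript
// Rebuild the object from the example above.
const a = { b: 5 };
Object.setPrototypeOf(a, { c: 6 }); // same effect as a.__proto__ = { c: 6 }

// Own enumerable keys: all that a check limited to own properties will see.
const ownKeys = Object.keys(a);

// Property lookup walks the prototype chain, so c is still reachable.
const inheritedValue = a.c;
const isOwn = Object.prototype.hasOwnProperty.call(a, 'c');

console.log(ownKeys, inheritedValue, isOwn);
```

Any check that iterates over own keys only will report b and never look at c, even though reading a.c succeeds.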
Another important part of this story is the way JSON.parse() (a utility provided by the language to convert JSON-formatted text into objects) handles this magic __proto__ property name.
> const text = '{ "b": 5, "__proto__": { "c": 6 } }';
> const a = JSON.parse(text);
> a;
{ b: 5, __proto__: { c: 6 } }
Notice how a has a __proto__ property. This is not a prototype reference. It is a simple object property key, just like b. As we've seen from the first example, we can't actually create this key through assignment as that invokes the prototype magic and sets an actual prototype. JSON.parse(), however, sets a simple property with that poisonous name.
By itself, the object created by JSON.parse() is perfectly safe. It doesn't have a prototype of its own. It has a seemingly harmless property that just happens to overlap with a built-in JavaScript magic name.
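One way to check this claim is to inspect the parsed object directly. The sketch below uses only standard built-ins, and the variable names are illustrative.

```javascript
const text = '{ "b": 5, "__proto__": { "c": 6 } }';
const parsed = JSON.parse(text);

// __proto__ is an ordinary own key on the parsed object...
const hasOwnProtoKey = Object.prototype.hasOwnProperty.call(parsed, '__proto__');

// ...and the real prototype is still the default Object.prototype.
const prototypeUntouched = Object.getPrototypeOf(parsed) === Object.prototype;

// Nothing is inherited from the payload, so c is not reachable on parsed.
const reachable = parsed.c;

console.log(hasOwnProtoKey, prototypeUntouched, reachable);
```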
However, other methods are not as lucky:
> const x = Object.assign({}, a);
> x;
{ b: 5 }
> x.c;
6
If we take the a object created earlier by JSON.parse() and pass it to the helpful Object.assign() method (used to perform a shallow copy of all the top level properties of a into the provided empty {} object), the magic __proto__ property "leaks" and becomes x's actual prototype.
Surprise!
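The leak can be observed without relying on how the console prints objects. This is a minimal sketch using standard built-ins. The underlying reason is that Object.assign() copies with ordinary assignment, which is what triggers the __proto__ setter on the target.

```javascript
const a = JSON.parse('{ "b": 5, "__proto__": { "c": 6 } }');

// Object.assign() writes each key with ordinary assignment, so writing
// "__proto__" on the target sets its prototype instead of adding a key.
const x = Object.assign({}, a);

const keysOfX = Object.keys(x);             // own keys only
const protoOfX = Object.getPrototypeOf(x);  // the object supplied in the JSON text
const sameObject = protoOfX === a['__proto__'];

console.log(keysOfX, sameObject, x.c);
```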
Put together, if you get some external text input, parse it with JSON.parse(), then perform some simple manipulation of that object (say, shallow clone and add an id), and then pass it to our validation library, anything passed through via __proto__ would sneak in undetected.
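The whole chain can be sketched end to end. The validator below is a hypothetical stand-in written for this illustration (it is not joi's API), and the field names name, id, and isAdmin are made up.

```javascript
// Hypothetical validator: accepts an object only if every own key is allowed.
// It stands in for any validation that ignores the prototype.
function validateOwnKeys(value, allowedKeys) {
  return Object.keys(value).every((key) => allowedKeys.includes(key));
}

// 1. External text arrives and is parsed.
const input = JSON.parse('{ "name": "x", "__proto__": { "isAdmin": true } }');

// 2. Before validation, the object is shallow-cloned and given an id.
const record = Object.assign({}, input, { id: 1 });

// 3. Validating the raw object fails because __proto__ is an unexpected own key,
//    but the clone passes because the key has become its prototype.
const rawPasses = validateOwnKeys(input, ['name', 'id']);
const clonePasses = validateOwnKeys(record, ['name', 'id']);

// 4. The inherited value is still visible to later code.
const smuggled = record.isAdmin;

console.log(rawPasses, clonePasses, smuggled);
```

The raw parsed object is rejected, while the cloned one passes and still exposes the smuggled value through property lookup.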
Oh joi!
The first question is, of course, why does the validation module joi ignore the prototype and let potentially harmful data through? We asked ourselves the same question and our instant thought was "it was an oversight". A bug. A really big mistake. The joi module should not have allowed this to happen. But…
While joi is used primarily for validating web input data, it also has a significant user base using it to validate internal objects, some of which have prototypes. The fact that joi ignores the prototype is a helpful "feature". It allows validating the object's own properties while ignoring what could be a very complicated prototype structure (with many methods and literal properties).
Any solution at the joi level would mean breaking some currently working code.
The right thing
At this point, we were looking at a devastatingly bad security vulnerability. Right up there in the upper echelons of epic security failures. All we knew is that our extremely popular data validation library fails to block harmful data, and that this data is trivial to sneak through. All you need to do is add __proto__ and some crap to a JSON input and send it on its way to an application built using our tools.
(Dramatic pause)
We knew we had to fix joi to prevent this but given the scale of this issue, we had to do it in a way that will put a fix out without drawing too much attention to it — without making it too easy to exploit — at least for a few days until most systems received the update.
Sneaking a fix isn't the hardest thing to accomplish. If you combine it with an otherwise purposeless refactor of the code, and throw in a few unrelated bug fixes and maybe a cool new feature, you can publish a new version without drawing attention to the real issue being fixed.
The problem was, the right fix was going to break valid use cases. You see, joi has no way of knowing if you want it to ignore the prototype you set, or block the prototype set by an attacker. A solution that fixes the exploit will break code and breaking code tends to get a lot of attention.
On the other hand, if we released a proper (semantically versioned) fix, marked it as a breaking change, and added a new API to explicitly tell joi what you want it to do with the prototype, we would share with the world how to exploit this vulnerability while also making it more time consuming for systems to upgrade (breaking changes never get applied automatically by build tools).
Lose — Lose.
A detour
While the issue at hand was about incoming request payloads, we had to pause and check if it could also impact data coming via the query string, cookies, and headers. Basically, anything that gets serialized into objects from text.
We quickly confirmed that Node's default query string parser was fine, as was its header parser. I identified one potential issue with base64-encoded JSON cookies as well as the usage of custom query string parsers. We also wrote some tests to confirm that the most popular third-party query string parser, qs, was not vulnerable (it is not!).
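To illustrate why base64-encoded JSON cookies deserve the same scrutiny, here is a rough sketch assuming a cookie whose value is simply the base64 encoding of a JSON document. The cookie content and variable names are hypothetical, and Buffer is Node's built-in.

```javascript
// Hypothetical cookie value: base64 of a JSON document chosen by the client.
const cookieValue = Buffer
  .from('{ "theme": "dark", "__proto__": { "c": 6 } }')
  .toString('base64');

// Server side: decode, then parse. The path ends in JSON.parse(),
// exactly like a request payload.
const decoded = Buffer.from(cookieValue, 'base64').toString('utf8');
const session = JSON.parse(decoded);

// The poisonous key survives the round trip as an own property.
const carriesProtoKey = Object.keys(session).includes('__proto__');

console.log(carriesProtoKey);
```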
A development
Throughout this triage, we just assumed that the offending input with its poisoned prototype was coming into joi from hapi, the web framework connecting the hapi.js ecosystem. Further investigation by the Lob team found that the problem was a bit more nuanced.
hapi used JSON.parse() to process incoming data. It first set the result object as a payload property of the incoming request, and then passed that same object for validation by joi before being passed to the application business logic for processing. Since JSON.parse() doesn't actually leak the __proto__ property, it would arrive at joi with an invalid key and fail validation.
However, hapi provides two extension points where the payload data can be inspected (and processed) prior to validation. It is all properly documented and well understood by most developers. The extension points are there to allow you to interact with the raw inputs prior to validation for legitimate (and often security related) reasons.
If during one of these two extension points, a developer used Object.assign() or a similar method on the payload, the __proto__ property would leak and become an actual prototype.
Sigh of relief
We were now dealing with a much different level of awfulness. Manipulating the payload object prior to validation is not common, which meant this was no longer a doomsday scenario. It was still potentially catastrophic, but the exposure dropped from every joi user to some very specific implementations.
We were no longer looking at a secretive joi release. The issue in joi is still there, but we can now address it properly with a new API and breaking release over the next few weeks.
We also knew that we can easily mitigate this vulnerability at the framework level since it knows which data is coming from the outside and which is internally generated. The framework is really the only piece that can protect developers against making such unexpected mistakes.
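As an illustration of what a mitigation at the parsing boundary could look like, the sketch below rejects any __proto__ key while parsing external text. It is one possible approach under stated assumptions, not necessarily what the framework shipped. The helper name parseExternalJson is hypothetical, and it relies only on the standard reviver argument of JSON.parse().

```javascript
// Hypothetical helper: parse external JSON and reject any "__proto__" key,
// at any depth. The reviver is called for every key/value pair.
function parseExternalJson(text) {
  return JSON.parse(text, (key, value) => {
    if (key === '__proto__') {
      throw new SyntaxError('Object contains forbidden prototype property');
    }
    return value;
  });
}

// Usage: clean input parses normally, poisoned input is rejected.
const clean = parseExternalJson('{ "b": 5 }');

let rejected = false;
try {
  parseExternalJson('{ "b": 5, "nested": { "__proto__": { "c": 6 } } }');
} catch (err) {
  rejected = err instanceof SyntaxError;
}

console.log(clean, rejected);
```

Rejecting outright is a design choice made for this sketch. Returning undefined from the reviver for that key would strip it instead of failing the request.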