Our production traffic broke overnight on Nov 13-14 (CET) without us making any changes to our systems. After a lot of debugging, we isolated the outage to Netlify percent-encoding query parameters when proxying them to our backend.
We’ve had the proxy rule in place for a couple of months without any problems, so we’re pretty sure this was changed around the time of our outage.
For a bit more detail: our backend protects certain resources using Nginx Secure Link. This involves adding an expires (irrelevant for this issue) and an auth query parameter. The auth param is a filename-safe base64 encoded md5 hash, basically using the dash (-) and underscore (_) characters instead of the more common plus (+) and slash (/).
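For illustration, a token of this shape can be produced in Node.js roughly as follows. This is only a minimal sketch: the function name, the secret, the path, and the order of the hashed fields are assumptions made up for the example. The real expression is whatever the secure_link_md5 directive on the backend is configured with.

```javascript
const crypto = require('crypto');

// Hypothetical helper: builds a Secure Link style token.
// Assumes the backend hashes "<expires><path> <secret>"; the real
// expression is whatever the Nginx secure_link_md5 directive specifies.
function makeAuthToken(expires, path, secret) {
  const digest = crypto
    .createHash('md5')
    .update(`${expires}${path} ${secret}`)
    .digest('base64');
  // Convert standard base64 to the filename-safe variant and drop padding.
  return digest.replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

// Made-up values, for illustration only.
const token = makeAuthToken(2147483647, '/s/link', 'secret');
console.log(`/s/link?auth=${token}&expires=2147483647`);
```

An md5 digest is 16 bytes, so the resulting token is always 22 characters long and may contain - and _ anywhere in it.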
Now here is our problem: the Nginx secure_link module doesn’t decode the query parameters, so it detects a mismatch of the md5 hash and returns 403 (Forbidden).
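The mismatch can be reproduced in isolation. The sketch below is a simulation under stated assumptions, not Netlify's or Nginx's actual code: encodeUnreserved is a hypothetical stand-in for what the proxy appears to do to - and _, and rawMatches mimics a check that compares the raw query value without decoding it first.

```javascript
// Hypothetical stand-in for the proxy behaviour we observed:
// percent-encodes "-" and "_" even though they are unreserved characters.
function encodeUnreserved(value) {
  return value.replace(/-/g, '%2D').replace(/_/g, '%5F');
}

// Mimics a check that compares the raw (undecoded) query value.
function rawMatches(received, expected) {
  return received === expected;
}

const expected = 'ab-cd_ef'; // made-up token, for illustration only
const proxied = encodeUnreserved(expected);

console.log(proxied);                                           // ab%2Dcd%5Fef
console.log(rawMatches(proxied, expected));                     // false
console.log(rawMatches(decodeURIComponent(proxied), expected)); // true
```

The two values are equivalent as URIs, but they only compare equal once the received value has been decoded.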
I think percent-encoding the - and _ characters is not necessarily the right move, as they belong to the “unreserved set” of characters. The Wikipedia article on percent-encoding states:
Characters from the unreserved set never need to be percent-encoded.
encodeURIComponent() doesn’t encode them either:
encodeURIComponent('hello-there') // 'hello-there'
We figured out that our frontend (running on Netlify) is not encoding the query parameters. We confirmed this by running the frontend locally and proxying to the backend using http-proxy: our Nginx logs showed unencoded strings coming from the locally hosted frontend. So really all signs point to the Netlify redirects being the culprit here.
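For anyone wanting to run the same check against their own logs, here is a rough sketch of a helper that flags percent-escapes that did not need to be there. The function name and the sample request URIs are made up for illustration; the character set is the unreserved set from RFC 3986.

```javascript
// RFC 3986 unreserved characters: letters, digits, "-", ".", "_", "~".
const UNRESERVED = /^[A-Za-z0-9\-._~]$/;

// Returns the percent-escapes in a raw URI that decode to unreserved
// characters, i.e. escapes that did not need to be there.
function findNeedlessEscapes(rawUri) {
  const escapes = rawUri.match(/%[0-9A-Fa-f]{2}/g) || [];
  return escapes.filter((esc) =>
    UNRESERVED.test(String.fromCharCode(parseInt(esc.slice(1), 16)))
  );
}

// Made-up log entries, for illustration only.
console.log(findNeedlessEscapes('/file?auth=ab%2Dcd%5Fef&expires=1')); // [ '%2D', '%5F' ]
console.log(findNeedlessEscapes('/file?auth=ab-cd_ef&expires=1'));     // []
```

Escapes such as %20 are left alone, since a space is not an unreserved character and does have to be encoded.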
We’ve implemented a workaround on our end, but wanted to open this issue regardless, if anything to make you guys aware of the breaking change that was likely introduced.