Set headers depending on http status

Hi,

I have a Vue.js app with a router and I am trying to improve my Netlify deployment.

Currently I set the caching for everything in the /assets/ folder to 1 week

/assets/*
  Cache-Control: public, s-maxage=604800

My problem is that when I request a nonexistent asset (for example /assets/not-found), the 1-week caching is also set. Is there a way to prevent that and set the cache header depending on the HTTP status? (200 => cache for 1 week, 404 => s-maxage=0)

I didn’t find anything like that in the documentation (Custom headers | Netlify Docs)

Hi Oliver,

Good suggestion, but not one we have implemented yet. You can approximate this by scoping your header rule a bit: use a path of /assets/*js instead. That would still put a long cache header on the 404 response for /assets/not-the-right-filename.js, though.

Thank you for your answer.

What I was worried about is some kind of cache poisoning, for instance:

  • jquery is not yet used on my website
  • an “attacker” requests the non-existing /assets/jquery.js asset (which returns a 404)
  • if I then deploy my site with this new asset, all clients (or proxies) that made the earlier request will have the 404 cached, breaking their page

This scenario might be unlikely, but if somehow the attacker gets access to my development deploys, he will know in advance which assets will exist in a future release and attempt to break it.

I really think that a new option is needed for the headers, to prevent such an attack:

  • either depending on the HTTP Status code
  • or depending on the existence of the file (but it must be taken into consideration for SPA, where all URLs give back the file index.html)

What do you think?


In the meantime, I made a similar hack to what I have done for the _redirects: after the build, scan all the assets and add one header line per file (this is a vue-cli plugin):

module.exports = api => {
  api.registerCommand(
    "generate-headers",
    {
      description: "Generates the _headers file for netlify",
      usage: "vue-cli-service generate-headers"
    },
    () => {
      // Walk part copied from https://stackoverflow.com/a/5827895
      var fs = require("fs");
      var path = require("path");
      var walk = function(dir, done) {
        var results = [];
        fs.readdir(dir, function(err, list) {
          if (err) return done(err);
          var pending = list.length;
          if (!pending) return done(null, results);
          list.forEach(function(file) {
            file = path.join(dir, file);
            fs.stat(file, function(err, stat) {
              if (stat && stat.isDirectory()) {
                walk(file, function(err, res) {
                  results = results.concat(res);
                  if (!--pending) done(null, results);
                });
              } else {
                results.push(file);
                if (!--pending) done(null, results);
              }
            });
          });
        });
      };

      walk("dist/assets/", (err, assets) => {
        if (err) {
          throw err;
        }
        let output = assets
          .map(a => {
            return (
              a.slice("dist".length) +
              "\n  Cache-Control: public, s-maxage=604800\n"
            );
          })
          .join("\n");
        fs.appendFileSync("dist/_headers", output);

        console.log(`Headers:\n`, output);
      });
    }
  );
};
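For readers who want to see what the plugin above ends up writing, here is a minimal sketch of the same mapping step as a standalone function. The file names are hypothetical (real ones come from the build output), `toHeaderRules` is a name introduced only for this illustration, and the directive is written with the standard `s-maxage` spelling.

```javascript
// Hypothetical asset list; in the plugin this comes from walking dist/assets/.
const assets = [
  "dist/assets/js/app.3f2a1c.js",
  "dist/assets/css/app.9b8d7e.css"
];

// Turn each built file into one _headers rule: the public path on its own
// line, followed by an indented Cache-Control header.
function toHeaderRules(files) {
  return files
    .map(f => f.slice("dist".length) + "\n  Cache-Control: public, s-maxage=604800\n")
    .join("\n");
}

console.log(toHeaderRules(assets));
```

Because every rule names one existing file, a request for a path that was never built matches no rule and gets no long-lived cache header.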

Yup, that’s definitely a potential attack vector if you set non-specific headers like that. That’s why we recommend against setting any custom cache headers - our CDN doesn’t need them:

Yes, but those assets have a hash in their filename (cache busting), so there is no good reason to force the client to revalidate them every time.
It only adds some requests (which is more - useless - work for the client and for your CDN).

My script seems to work fine and to be attack proof, so you can consider this thread as solved :slight_smile:

Thank you for the quick and relevant answers!

We strongly suggest not using hashed filenames. They slow down return-visitor loads of content that hasn’t itself changed (such as an index.html that changed ONLY to point to filenames with updated hashes), as mentioned here:

And, it leads to tons of problems like these:

You can of course configure things however you want, but there are downsides to every config, so you make the call about what works best for you :slight_smile:

In my build process (vuejs), if a file didn’t change, its hash does not change either. So my build is still “making the most of Netlify’s CDN cache” (I checked the last deploy and it has only “3 new files uploaded”).


Yes, but your solution (max-age=0) is probably not perfect either: you will serve the newer file (preventing the loading error), which might be incompatible with the rest of the code that the client downloaded earlier, since a deploy happened between the first download and the file being downloaded right now.
The link to the Spectrum chat posted in this topic is quite instructive! (It also tells me that I shouldn’t use code-splitting without thinking twice about it.)

I don’t get the error Uncaught SyntaxError: Unexpected token <, since my app automatically generates _redirects rules based on my vuejs router: every request for a path that does not exist returns a 404 error (so the browser does not try to parse the response).

To sum up: there is no perfect solution, and one should be aware of the advantages and limitations of every technique.


I totally agree :slight_smile: (and I will refrain from using code-splitting as much as possible :wink: )

Ah - you are in the very rare minority who has things configured well, Oliver - congrats!

For most folks who have hashes in filenames, any build == new hash (cachebusting being the primary motivator). What it sounds like you use instead is asset fingerprinting, which is not problematic. It can still result in the chunking error if index.html is open in an out-of-date visitor’s tab and the code has changed underneath them, but it sounds like you’re now aware of those perils and may be able to load code chunks intelligently (== force a hard reload if a chunk returns a 404).
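One possible shape for that "hard reload on a missing chunk" idea is sketched below. It assumes a webpack-based build (where a failed dynamic import is rejected with an error named ChunkLoadError and a "Loading chunk ... failed" message) and a vue-router instance, which exposes onError. The function names are made up for this example, and the reload action is passed in so the logic can be tried outside a browser.

```javascript
// Assumption: webpack-style chunk failures, recognised by error name or message.
function isChunkLoadError(err) {
  if (!err) return false;
  return (
    err.name === "ChunkLoadError" ||
    /Loading chunk .+ failed/.test(String(err.message))
  );
}

// `router` is assumed to be a vue-router instance; `reload` is whatever
// should happen when a lazily loaded chunk is gone.
function reloadOnMissingChunk(router, reload) {
  router.onError(err => {
    if (isChunkLoadError(err)) reload();
  });
}

// In the browser this might be wired as:
//   reloadOnMissingChunk(router, () => window.location.reload());
```

A real setup would probably also want a guard against reload loops (for example a flag in sessionStorage), since a chunk that is missing for good would otherwise trigger a reload on every attempt.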

Definitely agree with your assessment that it is a hard problem to solve with a single pattern. Thanks for being so constructive during this conversation!

Hi,

I’m having the same issue and would love to be able to set headers based on the HTTP status. Is there a way to do so now?

You can use Edge Functions to do most of that now.
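As a rough sketch of what that could look like for the case discussed in this thread (200 => cache for 1 week, 404 => no shared caching): the helper below takes a standard Response and returns a copy whose Cache-Control depends on the status. The helper names and the file path in the comment are made up for this example, and the wiring at the bottom is an assumption about how the edge function would be declared, so check it against the current Edge Functions docs.

```javascript
// Choose a Cache-Control value from the status; null means "leave as is".
function cacheControlFor(status) {
  if (status === 200) return "public, s-maxage=604800"; // 1 week
  if (status === 404) return "public, s-maxage=0";
  return null;
}

// Return a copy of `response` with Cache-Control set according to its status.
function withStatusCacheControl(response) {
  const value = cacheControlFor(response.status);
  if (value === null) return response;
  const headers = new Headers(response.headers);
  headers.set("Cache-Control", value);
  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers
  });
}

// Assumed wiring in an edge function file, e.g. a hypothetical
// netlify/edge-functions/asset-cache.js:
//
//   export default async (request, context) =>
//     withStatusCacheControl(await context.next());
//   export const config = { path: "/assets/*" };
```

Keeping the status-to-header decision in a small pure function makes it easy to adjust (for instance to treat other 4xx responses like a 404) without touching the wiring.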