Header path incorrectly matches all paths

I’m trying to write a header rule that sets x-robots-tag: noindex on all content under a certain path (/archive/*), but Netlify is applying it to all content. Here is the relevant section of my netlify.toml; the new header I’m adding is the first one:

[[headers]]
  for = "/archive/*"
  [headers.values]
    x-robots-tag = "noindex"

[[headers]]
  for = "/*.yaml"
  [headers.values]
    content-type = "text/yaml"

[[headers]]
  for = "/*.yml"
  [headers.values]
    content-type = "text/yaml"

[[headers]]
  for = "/*.sh"
  [headers.values]
    content-type = "text/x-shellscript"

[[headers]]
  for = "/*.bash"
  [headers.values]
    content-type = "text/x-shellscript"

It is correctly applying x-robots-tag when I curl children of that path (/archive/v3.10/):

$ curl -sD - -o /dev/null https://deploy-preview-3429--calico.netlify.app/archive/v3.10/
HTTP/2 200 
cache-control: public, max-age=0, must-revalidate
content-type: text/html; charset=UTF-8
date: Tue, 21 Apr 2020 19:58:23 GMT
etag: "50416202897305cbec1db49442bdc0c1-ssl"
strict-transport-security: max-age=31536000
x-robots-tag: noindex
age: 0
server: Netlify
x-nf-request-id: 05277d69-0353-4822-b15a-a988b4a0517f-16716263

But it is also, unexpectedly, applying x-robots-tag when I curl pages outside of that path (/introduction/):

$ curl -sD - -o /dev/null https://deploy-preview-3429--calico.netlify.app/introduction/
HTTP/2 200 
cache-control: public, max-age=0, must-revalidate
content-type: text/html; charset=UTF-8
date: Tue, 21 Apr 2020 20:04:14 GMT
etag: "b41e731a7892b6ec7645c4953e0ab9ab-ssl"
strict-transport-security: max-age=31536000
x-robots-tag: noindex
age: 0
server: Netlify
x-nf-request-id: f4103a38-9eab-4ed6-86b1-01f35de1d9be-23403330

Is my understanding of how Netlify processes wildcards off?

Also interesting: my Netlify deploy preview still says "4 header rules processed: All header rules deployed without errors." even though I added a 5th rule.

Thanks in advance

Hey @ozdanborne,
This header file looks correct to me, and the behavior you documented is definitely buggy/unintended! Thanks for the detailed report and sorry you’re running into this.

It looks like there have been a bunch of deploys in the last few days. Did you get to the bottom of it or are you still looking for troubleshooting help?

If the latter, could you share what your current setup is now? So:

  • a commit with just the misbehaving netlify.toml file
  • a deploy preview

Thanks for the reply, sorry it took so long for me to respond.

I finally realized that Netlify takes control of the x-robots-tag header specifically: it always sets it for deploy previews and always clears it for main deploys. In my basic experimentation, I found that the user cannot modify that behavior.

In light of this, I wrote some code to add the <meta name="robots" content="noindex"> HTML tag where necessary, which accomplishes the same thing.

Cheers!

Awesome! That's a great approach. Would you mind sharing your code for others who are trying to accomplish the same thing?
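
For anyone landing here in the meantime: this is not the original poster's code, just a minimal sketch of the idea in Python. It assumes the site is already built to static HTML and that each page has an opening <head> tag; the function names (add_noindex, tag_directory) and the _site/archive path are made up for illustration.

```python
import re
from pathlib import Path

# The tag from the post above.
NOINDEX_TAG = '<meta name="robots" content="noindex">'

# Matches an opening <head> tag (with or without attributes), but not <header>.
HEAD_OPEN = re.compile(r"<head(\s[^>]*)?>", re.IGNORECASE)

# Detects an existing robots meta tag so pages are not tagged twice.
ROBOTS_META = re.compile(r"<meta\s[^>]*name=[\"']robots[\"']", re.IGNORECASE)


def add_noindex(html: str) -> str:
    """Return html with the noindex meta tag inserted right after <head>.

    The input is returned unchanged if it already has a robots meta tag
    or has no <head> tag at all.
    """
    if ROBOTS_META.search(html):
        return html
    match = HEAD_OPEN.search(html)
    if match is None:
        return html
    return html[:match.end()] + "\n" + NOINDEX_TAG + html[match.end():]


def tag_directory(root: Path) -> int:
    """Rewrite every .html file under root in place; return how many changed."""
    changed = 0
    for path in root.rglob("*.html"):
        original = path.read_text(encoding="utf-8")
        updated = add_noindex(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

You would run something like tag_directory(Path("_site/archive")) after the site build and before the deploy, swapping in whatever your generator's output directory actually is. If your static site generator supports templates, adding the tag in the layout for the archived pages is probably cleaner than post-processing the output.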