Netlify answers slowly on static file access (after deploy?)

Hi,

I’ve noticed that sometimes, some static files hosted on Netlify are slow to be served to the clients.

By slow, I mean the first byte might take many seconds to be sent, and the request stays pending for a while.
After the first byte is sent, accessing the file again with caching disabled is fast, as expected.

I can’t confirm yet but I suspect this happens after a new deployment, probably due to using a lazy strategy to propagate the files across the world/CDN?

The problem I encounter:

  • I have a Gatsby site I modify/deploy often, and some pages have quite low access rates
  • When clicking on a Gatsby link, Gatsby code-splits the JS into chunks, so it has to download the next page’s JS chunk before navigating to it
  • Because some JS chunks are served slowly by Netlify (probably after a deploy), users click on a link and get no feedback at all, and then many seconds later, navigation happens

I could probably add some feedback, such as a GitHub-style progress bar (nprogress etc…).

But I’d rather find a more general solution.
Is it possible to initialize the Netlify deployment eagerly, in such a way that no file is ever served slowly? What would you recommend to solve this problem?

Thanks!

Your intuition that there is lazy loading going on is correct: each CDN node (and there may be up to a dozen in one location, which is the case on the US east and west coasts) maintains its own cache, and that cache is invalidated at deploy time and refilled only when a request is received. On our non-enterprise CDN, there is very high cache contention, so the file contents will also fall out of cache quite quickly if your page has only sporadic access on a particular CDN node.
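To illustrate why a rarely visited file keeps missing the cache, here is a toy simulation of a single node. It assumes a plain LRU eviction policy and made-up sizes; the actual eviction policy of the CDN is not stated in this thread, and the names `LRUCache` and `simulate` are purely illustrative.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache standing in for one CDN node (a toy model only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def request(self, key):
        """Return True on a cache hit, False on a miss (a miss fills the cache)."""
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            return True
        self._items[key] = True
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry
        return False

    def purge(self):
        """Model a deploy: every cached entry is invalidated at once."""
        self._items.clear()


def simulate(capacity, other_files_between_visits, visits):
    """Count cache hits for one rarely visited file on a node shared with other traffic."""
    node = LRUCache(capacity)
    hits = 0
    counter = 0
    for _ in range(visits):
        if node.request("rare-page.js"):
            hits += 1
        # Unrelated requests competing for the same cache between two visits.
        for _ in range(other_files_between_visits):
            node.request(f"other-{counter}")
            counter += 1
    return hits
```

With little competing traffic the file stays cached after its first miss; once the traffic between two visits exceeds the node's capacity, every visit is a miss again, which is the contention effect described above.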

Now, the TTFB still shouldn’t be seconds - more like hundreds of milliseconds - unless some DNS misconfiguration keeps you from using our CDN optimally, or you have some other odd configuration such as a proxy in front of us, which is not a supported configuration. Two requests:

  1. Can you tell me what the hostname for your site is so I can check the DNS config?
  2. If you see such a slow load and can get us the value of the HTTP response header called x-nf-request-id, that would help us work out whether the delay is inside our network or outside (we maintain timing records for every request in our internal logs and can reference them by that ID).
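One way to collect that header together with a rough TTFB is a small script like the sketch below. It uses only the Python standard library; the function name `fetch_timing` and the returned keys are illustrative choices, and the timing is an approximation (it starts after the connection is open and stops once the response headers have been read).

```python
import http.client
import time
from urllib.parse import urlsplit


def fetch_timing(url, timeout=30.0):
    """Fetch url once and return status, approximate TTFB and x-nf-request-id."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        conn.connect()  # TCP/TLS setup happens here and is excluded from the timing
        start = time.perf_counter()
        conn.request("GET", path, headers={"Cache-Control": "no-cache"})
        response = conn.getresponse()  # returns once the status line and headers are read
        ttfb = time.perf_counter() - start
        request_id = response.getheader("x-nf-request-id")
        response.read()  # drain the body so the connection closes cleanly
        return {
            "status": response.status,
            "ttfb_seconds": ttfb,
            "request_id": request_id,
        }
    finally:
        conn.close()
```

Calling `fetch_timing("https://your-site.example/some-chunk.js")` (a placeholder URL) in a loop and keeping the results whose `ttfb_seconds` is high gives a list of request IDs to report, along with the time at which each was observed.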

We don’t intend for you to push your site contents to the cache, as we don’t want to cache them if there are no requests, so there’s nothing you can do aside from following the advice here to optimize that setup:

Hi and thanks for this answer.

We have a pretty normal setup, no proxy or anything like that, and we have seen file access take many seconds (like, 20 seconds). But maybe it’s my local network; I will have to check.

I’ll try to see if I can get an x-nf-request-id, but as it’s not so easy to reproduce, it may take some time.

I understand that you need to evict inactive deployments from your caches. Is it possible to have an idea of this eviction policy?

Is there a way to ensure a website is kept in the cache? Other providers like Zeit offer the ability to do so for a limited number of deployments (a “scale” option, I think). It would be cool to ensure that the files of a production deployment stay in the cache.

We are currently on the free plan and would be fine with paying for something like that, but going from $0 to the $500 enterprise plan for this is not really an option for us at the moment.

Hi @slorber, even on the free tier you can have a high cache hit rate if your site is accessed often. But the only way to increase it further would be to move to one of our custom or enterprise plans and move on to the Enterprise CDN network. That said, you shouldn’t be having TTFB as high as 20 seconds even on our free tier. Once you provide that x-nf-request-id or at least the URL for an asset that took a long time to load, and the date that you saw it, then we can dig further to see what’s happening.

Hi and thanks.

I’ve run some tests and I see some bad TTFB on a few requests, not necessarily after a redeploy.

Example request:
x-nf-request-id: e66ddb89-2360-4940-a71b-2331115c8dfa-108035021

Actually, I’ve run some tests with someone who tried reaching the website through different Netlify IPs, and one was significantly worse than the others in terms of TTFB: 104.198.14.52

According to the DNS doc, it’s your load balancer.

Can you tell me how using the load balancer IP in the DNS config leads to such results? Is this normal? If using the load balancer makes TTFB 2 to 10x worse, what about documenting better how this choice can significantly impact performance?
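For anyone who wants to reproduce this kind of per-IP comparison, a minimal sketch follows. It connects to a chosen IP while still sending the site's hostname for TLS (SNI) and the `Host` header, then times the first response byte. The function names, the sample count, and the hostname in the usage note are assumptions, not part of the tests described above.

```python
import socket
import ssl
import statistics
import time


def ttfb_via_ip(ip, hostname, path="/", port=443, use_tls=True, timeout=30.0):
    """Seconds between sending a GET and receiving the first byte, connecting to ip."""
    sock = socket.create_connection((ip, port), timeout=timeout)
    try:
        if use_tls:
            # SNI and certificate validation use the site hostname, not the IP.
            context = ssl.create_default_context()
            sock = context.wrap_socket(sock, server_hostname=hostname)
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {hostname}\r\n"
            "Connection: close\r\n\r\n"
        )
        start = time.perf_counter()
        sock.sendall(request.encode("ascii"))
        first_byte = sock.recv(1)  # blocks until the server starts answering
        elapsed = time.perf_counter() - start
        if not first_byte:
            raise ConnectionError(f"{ip} closed the connection without responding")
        return elapsed
    finally:
        sock.close()


def median_ttfb_by_ip(hostname, ips, samples=5, **kwargs):
    """Median TTFB per IP over a few samples; kwargs are passed to ttfb_via_ip."""
    return {
        ip: statistics.median(
            ttfb_via_ip(ip, hostname, **kwargs) for _ in range(samples)
        )
        for ip in ips
    }
```

For example, `median_ttfb_by_ip("www.example.com", ["104.198.14.52", "167.99.137.12"], path="/some-chunk.js")` would compare two of the IPs mentioned in this thread for a placeholder hostname. Using the median rather than the mean keeps one very slow sample from dominating the comparison.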

Note you can contact me in private if needed to get perf reports with/without the loadbalancer. I’m not able to contact you privately and I can’t post my client’s url on public forums.

lorber.sebastien@gmail.com or @sebastienlorber on Twitter

I’ve also noticed slow TTFB on deploy preview sites
Ip: 167.99.137.12 / 2a03:b0c0:3:e0::1b:1
Request: e8e4ed40-357f-4220-9f6a-0ad73eb74278-5390482

Are deploy preview sites slower than normal production deployments? As far as I understand, they don’t use your load balancer IP, so maybe it’s not only a DNS-related problem.

Note that I have a few concurrent requests happening at the same time, can this be related?

I would also like to chime in to state that I’ve just recently set up my site, and I am getting the same problem.

Content served after page-load (such as ajax requests for json or video file loads) stops at pending for up to 20(!) seconds before being served.

If there are any suggestions I would love to try them.

mind sharing a HAR file of such an experience? It can help us understand the problem: https://toolbox.googleapps.com/apps/har_analyzer/

@fool I am having the same issues. A HAR file can be found here:

https://drive.google.com/file/d/1QMVTzuzq_GXD3zNCda12krNosYclt0QI/view?usp=sharing

That HAR file seems corrupted, @rshea - I can’t load it (it is cut off somewhere in the middle). Could you try gathering another one for us?

My bad – @fool try this one? https://drive.google.com/file/d/1l4_4ieKXIyf-spOhUsa0RgP0rj9MSNrU/view?usp=sharing

Thanks. I can definitely see what you’re talking about in there, this request taking 12 seconds seems like a big bummer:

[image]

…but on our side in our internal logs, we show we sent it in 6ms (the “x-nf-request-id” HTTP response header is a unique value that we can correlate to a specific request in our logs, and so I looked it up).
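When sharing a HAR file, it can help to list the slow entries and their request IDs up front. The sketch below reads the standard HAR fields (`log.entries[].timings.wait`, `request.url`, `response.headers`); the 1000 ms threshold and the function names are arbitrary choices for illustration.

```python
import json


def load_har(path):
    """Load a HAR file (which is plain JSON) from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def slow_requests(har, threshold_ms=1000.0):
    """Return entries whose wait time is at least threshold_ms, slowest first."""
    rows = []
    for entry in har["log"]["entries"]:
        # "wait" is the time spent waiting for the server's response, in ms
        # (-1 when the browser could not measure it).
        wait_ms = entry.get("timings", {}).get("wait", -1)
        if wait_ms is None or wait_ms < threshold_ms:
            continue
        headers = {
            header["name"].lower(): header["value"]
            for header in entry["response"].get("headers", [])
        }
        rows.append({
            "url": entry["request"]["url"],
            "started": entry.get("startedDateTime"),
            "wait_ms": wait_ms,
            "request_id": headers.get("x-nf-request-id"),
        })
    return sorted(rows, key=lambda row: row["wait_ms"], reverse=True)
```

Running `slow_requests(load_har("capture.har"))` (a placeholder file name) yields the URL, start time, and `x-nf-request-id` of each slow request, which are the details needed to look a request up on the server side.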

Was your internet generally working well at the time?

Hi,

On my side, I haven’t had any news since my last attempt to solve this problem via Twitter.

On our side, it turns out that putting CloudFlare in front of Netlify significantly improves TTFB. It is the exact same deployment, with a single toggle switched on the CF side to enable the proxy.

Swyx should have given you both DNS domains (with/without the CloudFlare proxy) and told me you were studying the problem (fool+gerald), but there has been no news for 1 month.

I feel like I gave you everything I can on my side, and can’t do anything more, except maybe paying 500€/month to solve this problem. CloudFlare is actually a way cheaper solution, and it’s what I end up recommending to my customers currently until this problem is solved. Despite Netlify saying CF on top of Netlify is useless, it turns out it’s not for me.

@fool

I was on a wireless connection, so I just ran this on a fiber 1gbps connection:

Still looks like the load time is happening.

@slorber Interesting – I might try this in the meantime, despite using Netlify’s DNS, in the hope that Cloudflare improves TTFB.