Does hitting Netlify CDN count towards bandwidth usage?

Hey Netlify,

I have an image-heavy site that averages 200 users per day. Over the weekend, the site was hit by a bot, which meant that I had 6000 users in under 24 hours. This blew through my 100GB bandwidth allowance.

My understanding of a CDN is that if 100 users visit my site from London, only the first user will hit the servers for the content. The following 99 will retrieve it from the CDN. In the case of Netlify CDN, the etag will be the same for all of these users, so it will hit the server to check the etags match and then return a 304, instructing that the content hasn’t changed and can be served from Netlify CDN. Thus, out of 100 users, only 1 would count toward the bandwidth usage quota.

However, based on the volume of traffic to my site from the bots and the size of the requests to that particular page, it looks like every request is being treated as consuming bandwidth from the Netlify server.

Firstly, it’d be good to get clarification on whether retrieving content from the CDN still counts towards the bandwidth quota.

It’s difficult to find a clear definition in the docs of what constitutes bandwidth usage for Netlify. I’ve seen this question crop up in quite a few support requests, particularly this recent one: Excessive Bandwidth Usage

Secondly, do you have any alerting tools/emails for this situation? In this case, traffic to my site surged by roughly 30x (from 200 to 6000 users in a day), a clear indicator that something is not right. Additionally, I couldn’t find a way to investigate my bandwidth consumption in the Dashboard, apart from my team’s total bandwidth usage for the current month. If I didn’t have GA, I’d be totally blind to the cause and wouldn’t be able to find a solution.

Look forward to hearing back

Hey @Jessica,
I’m sorry to hear you were hit by a bot :frowning: Thanks for your very thorough report. I (and my colleague @luke whose knowledge is scattered throughout this response) will try to address the questions you raised and hope it’ll be helpful to you and others who come across this.

To your first question: retrieving content from the CDN does count towards the bandwidth quota. You’re correct that the trip from a CDN node to your site visitor is shorter than the trip from our origin server. And your understanding of how a CDN works is half correct: we don’t resend things if we don’t have to, as described in https://www.netlify.com/blog/2017/02/23/better-living-through-caching/, but the nitty-gritty is a little more complicated. We cache site assets at our CDN nodes, but those caches don’t last forever. If no one requests your assets from a given node for a while, that node’s cache will be empty on the next visit and the request will have to go to our origin after all.

As for 304s, they are specific to each individual web browser, not the CDN node. 304s confirm the content is unchanged from what that specific browser already downloaded and therefore servers don’t send the content again. This depends on that specific web browser already having downloaded a copy of the content and stored it on a local filesystem somewhere.

If 100 visitors all reach the same CDN node, all 100 requests will get a 200 response and be sent the full content. A visitor can’t see an image downloaded by someone else’s computer; we have to send the content to each person’s browser. We will send it gzipped automatically, but bandwidth is used for all 100 visitors. It doesn’t matter whether they all reach the same CDN node or 100 different CDN nodes: the bandwidth used will be the same.
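To put rough numbers on that, here is a minimal sketch of the arithmetic. The visitor counts (200 and 6000) come from this thread; the 5 MB page weight is purely an assumed figure for illustration, and `bandwidth_used` is a hypothetical helper, not anything from Netlify's tooling.

```python
def bandwidth_used(visitors: int, page_bytes: int) -> int:
    """Bytes sent when every visitor arrives with an empty browser cache.

    The number of CDN nodes is deliberately not a parameter: each visitor's
    browser needs its own full copy, whichever node serves it.
    """
    return visitors * page_bytes


# Assumption: 5 MB of (already gzipped) page weight. Not a measured value.
PAGE_BYTES = 5 * 1000**2
GB = 1000**3

normal_day = bandwidth_used(200, PAGE_BYTES)
bot_day = bandwidth_used(6000, PAGE_BYTES)

print(f"normal day: {normal_day / GB:.1f} GB")
print(f"bot day:    {bot_day / GB:.1f} GB")
```

Under that assumed page weight, one bot day uses as much bandwidth as thirty normal days, so a spike like this can eat most of a month's allowance on its own.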

A 304 is sent only when the same visitor returns to the site in the same browser. If that visitor switches to a new browser on the same device, the new browser won’t have the cached content. So if they revisit the site in a new browser (same computer/phone though), the CDN node will send a 200 and the full content bandwidth is used again. If the user clears their browser cache and revisits the site, again there is a 200 and the full bandwidth is used.
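The 200-versus-304 behaviour described above can be sketched as a small simulation. This is a toy model only: `Node` and `Browser` are hypothetical classes, the ETag is just a hash of the body, and nothing here reflects how Netlify's CDN is actually implemented.

```python
import hashlib


class Node:
    """Toy stand-in for a CDN node serving a single asset."""

    def __init__(self, body: bytes):
        self.body = body
        self.etag = hashlib.sha256(body).hexdigest()
        self.bytes_sent = 0  # running total of body bytes sent to browsers

    def get(self, if_none_match=None):
        # The browser already holds this exact version: reply 304, no body.
        if if_none_match == self.etag:
            return 304, self.etag, b""
        # Otherwise the full content is sent and counts as bandwidth.
        self.bytes_sent += len(self.body)
        return 200, self.etag, self.body


class Browser:
    """Toy browser with its own private cache for that asset."""

    def __init__(self):
        self.cache = None  # (etag, body) once something has been downloaded

    def visit(self, node: Node) -> int:
        etag = self.cache[0] if self.cache else None
        status, new_etag, body = node.get(etag)
        if status == 200:
            self.cache = (new_etag, body)
        return status

    def clear_cache(self):
        self.cache = None


node = Node(b"x" * 1000)

# 100 different visitors, same node: every one gets a 200 and a full copy.
first_visits = [Browser().visit(node) for _ in range(100)]
bytes_after_first_visits = node.bytes_sent

# One returning visitor: 200, then 304, then 200 again after clearing the cache.
returning = Browser()
returning_statuses = [returning.visit(node), returning.visit(node)]
returning.clear_cache()
returning_statuses.append(returning.visit(node))

print(bytes_after_first_visits, returning_statuses, node.bytes_sent)
```

The only request in this run that adds nothing to `bytes_sent` is the repeat visit from a browser that still has the asset cached.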

Phew, hopefully that was not TMI! All of that said, the real culprit here, as you described, was likely your bot attack. But wanted to fill you in on how our system works.

To your second point about notifications: we agree that real-time notifications would be a great feature. We currently attempt to tell you as you approach (50%, 75%) and cross paid (100%, 200%, etc) thresholds for our metered features. We succeed, as long as your growth pattern is fairly slow! But you’re correct that this currently does not work in cases where traffic spikes very quickly due to an attack. That’s something our engineering team is working on improving.
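As a rough illustration of why a fast spike slips past threshold-based warnings, here is a sketch of a periodic usage check. The function name, the check cadence, and the logic are all assumptions made for the example, not a description of our actual notification system; only the 50%, 75%, 100% and 200% thresholds come from the paragraph above.

```python
def crossed_thresholds(previous_bytes, current_bytes, quota_bytes,
                       thresholds=(0.5, 0.75, 1.0, 2.0)):
    """Return the thresholds passed between two consecutive usage checks.

    Hypothetical helper: usage is only looked at when a check runs, so
    everything that happens between two checks is seen all at once.
    """
    return [t for t in thresholds
            if previous_bytes < t * quota_bytes <= current_bytes]


GB = 1000**3
QUOTA = 100 * GB  # the 100GB allowance mentioned in this thread

# Slow growth: one threshold per check, so the early warning arrives in time.
slow = crossed_thresholds(40 * GB, 55 * GB, QUOTA)

# Sudden spike between two checks: every threshold is passed in one step,
# so the 50% and 75% warnings arrive together with the overage notice.
spike = crossed_thresholds(40 * GB, 250 * GB, QUOTA)

print(slow)
print(spike)
```

With slow growth each warning lands well before the next threshold. With a spike, all of them show up at the same moment, which is too late to act on.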

Please let us know if we can answer any other questions about this.

Hey @jen and @luke

Thanks for your detailed response Jen, definitely not TMI and appreciate you getting into the nitty gritty.

It does clear up a few things around how and why my usage even without a bot attack is higher than I’d expect. I’ve taken some steps to optimise my image assets and reduce my bandwidth consumption.

Thanks again :slight_smile:

Great thread though :heart_eyes: love the info!
