Trouble connecting to external API during deployment (ENOTFOUND)

Briefly summarize the issues you have been experiencing.

I have a Gatsby website that uses gatsby-source-graphql to pull in some of the data used for the site. The API endpoint it connects to is a Hasura application that I host myself on Vultr. Builds had been deploying without problems for a week, then started failing between 8.29am and 10.49am on July 16th (with no code changes in between). They now error out with the following message:

11:28:59 PM: error #11321 PLUGIN request to https://graphql.knutmelvaer.no/v1/graphql failed, reason: getaddrinfo ENOTFOUND graphql.knutmelvaer.no graphql.knutmelvaer.no:443
11:28:59 PM: "gatsby-source-graphql" threw an error while running the sourceNodes lifecycle:
11:28:59 PM: request to https://graphql.knutmelvaer.no/v1/graphql failed, reason: getaddrinfo ENOTFOUND graphql.knutmelvaer.no graphql.knutmelvaer.no:443
11:28:59 PM: See our docs page for more info on this error: https://gatsby.dev/issue-how-to
11:28:59 PM: 
11:28:59 PM:   FetchError: request to https://graphql.knutmelvaer.no/v1/graphql failed, reason: getaddrinfo ENOTFOUND graphql.knutmelvaer.no graphql.knutmelvaer.no:443
11:28:59 PM:   
11:28:59 PM:   - index.js:133 ClientRequest.<anonymous>
11:28:59 PM:     [web]/[node-fetch]/index.js:133:11
11:28:59 PM:   
11:28:59 PM:   - destroy.js:91 emitErrorNT
11:28:59 PM:     internal/streams/destroy.js:91:8
11:28:59 PM:   
11:28:59 PM:   - destroy.js:59 emitErrorAndCloseNT
11:28:59 PM:     internal/streams/destroy.js:59:3
11:28:59 PM:   
11:28:59 PM:   - next_tick.js:63 process._tickCallback
11:28:59 PM:     internal/process/next_tick.js:63:19
11:28:59 PM:   
11:28:59 PM: 
11:29:00 PM: npm ERR! code ELIFECYCLE
11:29:00 PM: npm ERR! errno 1
11:29:00 PM: npm
11:29:00 PM: ERR!

The site builds fine locally.

Please provide a link to your live site hosted on Netlify

What have you tried as far as troubleshooting goes? Do you have an idea what is causing the problem?

My hunch is that it might be a DNS problem since I’m getting “getaddrinfo ENOTFOUND”? FWIW, I actually use Netlify as a nameserver. The Reset cache and deploy function makes no difference.
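
To separate a DNS failure from anything Gatsby-specific, one option is a small Node preflight that resolves the hostname the same way the failing request does (through getaddrinfo). This is only a sketch: `checkHost` and its return shape are illustrative names, not part of Gatsby or Netlify, and it assumes a Node version that ships `dns.promises`.

```javascript
// Minimal DNS preflight sketch. checkHost is a hypothetical helper name.
const dns = require('dns');

// Resolve `hostname` via dns.lookup (getaddrinfo), the same path Node's
// http/https clients use by default. `lookup` is injectable so the function
// can be exercised without network access.
async function checkHost(hostname, lookup = (h) => dns.promises.lookup(h)) {
  try {
    const { address, family } = await lookup(hostname);
    return { hostname, ok: true, address, family };
  } catch (err) {
    // ENOTFOUND here means the name could not be resolved at all.
    return { hostname, ok: false, code: err.code };
  }
}

// Example (performs a real lookup when uncommented):
// checkHost('graphql.knutmelvaer.no').then((r) => console.log(JSON.stringify(r)));
```

Running this as a step before `gatsby build` would show whether the build environment itself fails to resolve the name, independent of gatsby-source-graphql.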

Do you have any other information that is relevant, such as links to docs, libraries, or other resources?

I run a basic nginx setup with Let’s Encrypt on Vultr.

I concur with your judgment that it was/is a DNS error, but it is hard to say what the root cause is. I looked at frequent causes of DNS failure, such as DNSSEC configuration (http://dnsviz.net/d/knutmelvaer.no/dnssec/ reports no errors) and other problems with NS/SOA records, and checked the incident listings for our DNS provider (https://www.nsonestatus.net/), and didn’t see anything that would cause it.

I did see another case like this in the helpdesk, and upon review of our logs a very small handful of folks (5 total) are suffering from it in various guises (using different, possibly truly invalid hostnames), but none as reliably as the builds in your account.

I decided to try to diagnose from inside our build container locally, and it worked well (well, the only tool I personally had to test with was ping, since dig/nslookup/host aren’t in the build image). However, when running in our build network, that DID fail reliably:

5:05:04 PM: ping: unknown host graphql.knutmelvaer.no

I tried the only trick I know, which is flushing Google’s nameserver cache (we use them sometimes, and the state has been known on more than one occasion to be “funky” in their pool of nameservers), but it had no effect.
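
Since dig/nslookup/host aren’t in the build image but Node is, a rough way to see whether individual nameservers disagree is to query them directly from Node. The sketch below is an assumption-laden illustration: `resolveVia` and `compareResolvers` are made-up helper names, the nameserver IPs in the usage note are just examples, and whether outbound DNS to those servers is permitted from a given build network is not something this thread confirms.

```javascript
// Sketch: ask specific nameservers for A records and compare the answers.
// resolveVia and compareResolvers are hypothetical helper names.
const { Resolver } = require('dns').promises;

// Query a single nameserver. Resolver#resolve4 sends a DNS query directly,
// bypassing getaddrinfo and the local hosts file.
// `createResolver` is injectable so this can be tested offline.
async function resolveVia(server, hostname, createResolver = () => new Resolver()) {
  const resolver = createResolver();
  resolver.setServers([server]);
  try {
    return { server, addresses: await resolver.resolve4(hostname) };
  } catch (err) {
    return { server, error: err.code };
  }
}

// Query several nameservers in parallel; results keep the input order.
async function compareResolvers(hostname, servers, createResolver) {
  return Promise.all(servers.map((s) => resolveVia(s, hostname, createResolver)));
}
```

For example, `compareResolvers('graphql.knutmelvaer.no', ['8.8.8.8', '1.1.1.1']).then(console.log)` would print one entry per nameserver, so a single resolver returning ENOTFOUND while the others answer stands out quickly.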

I’ve escalated this to our platform team for their assistance since it appears to be something about our infrastructure at this point. I’m not certain they’ll be able to look at it before the weekend but I suspect we’ll get an answer from them early next week sometime - stay tuned!


Thanks for the thorough answer! Glad it seems to be an edge case and nothing on my part or too severe. Fortunately this is just my blog, and I get by with pushing the build directly from the CLI while waiting for a fix. I’ll stay tuned then!

I just had a successful build triggered via a hook, so chances are this is fixed.
