With the Node.js Hammer, Everything Looks Like a Nail
Sunday, September 08, 2019
request module (everything wrapped with promises and using async/await), yet the first thing that I ran
into was that requests were timing out.
Unlike a case where one could simply put a few promises into an array and call Promise.all(), this was much more involved.
Finally, I worked around this by adjusting the socket count allocated to the request library, which seemed to help things.
(As a side note, Node's great ecosystem sometimes serves as one of its challenges: should I have been using a different library,
such as request-promise or one of its several forks? You often don't know if a particular library
is going to work until it doesn't work...)
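To make the socket-count workaround concrete, here is a minimal sketch using only Node's built-in https module. The post does not say which setting or value was actually used, so the limit of 25 and the buildOptions helper are assumptions made for illustration. The request library also accepted an agent (or a pool with maxSockets) through its options, but that wiring is left out here.

```javascript
const https = require('https');

// Hypothetical limit; the post does not state the value that was used.
const MAX_SOCKETS = 25;

// An Agent caps the number of concurrent sockets per host.
// Requests beyond the cap wait in the agent's queue instead of opening new sockets.
const agent = new https.Agent({ keepAlive: true, maxSockets: MAX_SOCKETS });

// Hypothetical helper: build options for https.request() that share the agent.
function buildOptions(hostname, path) {
  return { hostname, path, method: 'GET', agent, timeout: 30000 };
}
```

Passing the same agent to every request is what makes the cap effective; a fresh agent per request would each get its own socket budget.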
Next, as the process ran, additional network errors popped up. After some more digging, I ended up
at another Stack Overflow post and found a comment about DNS lookups and Node's thread pool.
The suggested fix was to adjust the value of process.env.UV_THREADPOOL_SIZE, which I'd never even
heard of until then. That fix (along with a few others suggested) seemed to help, so I marched on.
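For context, dns.lookup() runs on libuv's thread pool, which defaults to four threads, so a burst of lookups can queue up behind each other. The sketch below shows one way the variable might be set; the value 16 is an assumption, not the figure from the Stack Overflow comment.

```javascript
// Set this before any work that uses the thread pool (DNS lookups, fs, crypto),
// since the pool size is read when the pool is first created.
// The value 16 is a hypothetical choice for illustration.
process.env.UV_THREADPOOL_SIZE = process.env.UV_THREADPOOL_SIZE || '16';

const dns = require('dns');

// Hypothetical helper: resolve a hostname through the pooled dns.lookup() call.
function lookup(hostname) {
  return dns.promises.lookup(hostname).then((result) => result.address);
}
```

Setting the variable in the shell (for example `UV_THREADPOOL_SIZE=16 node app.js`) is likely the more dependable option, because it does not rely on this line running before anything else touches the pool.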
Eventually, I got things to a point where trying to run even a few of the atomic chunks in parallel
caused the API to slow down (throttling, I'm assuming) and ultimately start to reject requests.
This complicated things because, as you may recall, each atomic chunk contained a variable
number of sub-requests which could run asynchronously, but all of them had to complete before the
other synchronous tasks within that chunk could run. To make matters worse, if the number of
concurrent sub-requests was too large, the API would start rejecting requests. This was getting painful.
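The constraint described above (many sub-requests per chunk, a cap on how many are in flight, and follow-up steps that wait for all of them) can be sketched as a small concurrency limiter. This is not the code from the project; mapWithLimit, runChunk, fetchOne, and afterAll are hypothetical names introduced for illustration.

```javascript
// Run worker(item) for every item, with at most `limit` calls in flight at once.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner claims the next unprocessed index until none remain.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Hypothetical chunk runner: finish every sub-request, then run the follow-up step.
async function runChunk(subRequests, limit, fetchOne, afterAll) {
  const responses = await mapWithLimit(subRequests, limit, fetchOne);
  return afterAll(responses);
}
```

Results keep the order of the input even though the requests finish in any order. If a worker rejects, the returned promise rejects right away while the other runners keep going in the background, so real code would probably want retries or explicit cancellation on top of this.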
Finally, I got things working, only to then find that even after all the API requests and database calls
had completed, trying to close the mssql connection pool would throw an exception – "Connection is closed".
Failing to close the connection pool, however, resulted in the Node process never exiting. As far as I
could tell, despite every request having received a response and every response having been committed
to the database, there were still unresolved promises floating around somewhere. Ultimately, the only
solution that worked was waiting 5-10 seconds before closing the connection pool, which
worked consistently and without error (but felt kludgy nonetheless).
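If the cause really was stray pending promises (which the post only suspects), one alternative to a fixed delay would be to track each pending operation and wait for all of them before closing the pool. The sketch below assumes exactly that; PendingWork and shutdown are hypothetical names, and `pool` stands in for any object with an async close() method rather than a real mssql pool.

```javascript
// Tracks promises so that shutdown can wait for every one of them to settle.
class PendingWork {
  constructor() {
    this.pending = new Set();
  }

  // Register a promise and forget it once it resolves or rejects.
  track(promise) {
    this.pending.add(promise);
    const forget = () => this.pending.delete(promise);
    promise.then(forget, forget);
    return promise;
  }

  // Wait until nothing is pending; loop in case settling work tracked more work.
  async drain() {
    while (this.pending.size > 0) {
      await Promise.allSettled([...this.pending]);
    }
  }
}

// Close the pool only after all tracked work has settled.
async function shutdown(work, pool) {
  await work.drain();
  await pool.close();
}
```

In use, every database write would be wrapped as `work.track(someWrite())`, and the final step of the program would be `await shutdown(work, pool)`. This only helps if every pending operation is actually tracked, which is the hard part in practice.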
I was done – and I was glad to be done, because the entire effort felt like an uphill battle
against Node.
The entire exercise kind of bothered me on principle alone – I felt like I'd used the wrong tool
for the job. Now, call it cosmic guidance or call it masochism, but for whatever reason, I then
opted to write the exact same application in Python, which to me felt better suited to the task
at hand.
The Python version ended up with half as many lines of code as the Node version, and
it took me less than a third of the time to write. Granted, some of the time saved was the
result of not having to re-learn the nuances of the third-party API, but I'm confident that had I
written the Python version first, it would have taken (at most) half the time that the Node version took me.
When all was said and done, I delivered both the Node and Python versions of the application,
and I explained why I'd written the Python version. Ultimately, I was told, "Oh, we only asked for it
to be written in Node because that's what we're using elsewhere. Python is fine, too."
The takeaway here is two-fold: