Node.js Event Loop Architecture

Javascript Feb 15, 2020

The Node.js Event Loop

This is one of the most important concepts you need to understand about the Node.js architecture. In the previous part of this series, I dove into Node.js processes and the single thread. The first thing you need to know is that the event loop is where all the application code inside callback functions is executed. So basically, all code that is not top-level code runs in the event loop, while some expensive tasks get offloaded to the thread pool. The event loop handles this automatically.

Node.js is built around callback functions, i.e. functions that are called as soon as some work is finished. It works this way because Node.js uses an event-driven architecture.

The event loop receives an event each time something happens and calls the necessary callbacks. In summary, the event loop receives events such as a new HTTP request, calls their callback functions, and offloads the most expensive tasks to the thread pool. But the question you might be asking is: how do all these things happen behind the scenes, and in what order are the callbacks executed? Let's dive deep to answer that.

Whenever we start our Node.js application, the event loop starts running as soon as the top-level code has finished. The event loop has multiple phases, and each phase has a callback queue (this is where the callbacks of your asynchronous code are pushed to wait for execution).

The event loop has two stages, and in the first stage we will look at its four most important phases.

[Diagram: the phases of the event loop]

The first phase takes care of the callbacks of expired timers, e.g. setTimeout:


setTimeout(() => { 
  console.log('timer expired') 
}, 0);

If there are callback functions from timers that have expired, these are the first ones to be processed by the event loop. If a timer expires later, while one of the other phases is being processed, its callback will only be called once the event loop comes back to the first phase. Callbacks in each queue are processed one by one until there is no callback left in the queue; only then does the event loop enter the next phase.
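To make the point about late timers concrete, here is a minimal sketch. It assumes a slow callback in another phase (simulated with a busy-wait inside setImmediate) and a timer that asks for 50 ms; the variable names and the durations are made up for illustration.

```javascript
const scheduledAt = Date.now();
let actualDelay = null;

// Ask for the callback after 50 ms.
setTimeout(() => {
  actualDelay = Date.now() - scheduledAt;
  console.log(`asked for 50 ms, ran after ${actualDelay} ms`);
}, 50);

// A slow callback in another phase: it keeps the single thread busy
// for roughly 200 ms, so the timer expires while it is still running.
setImmediate(() => {
  const blockUntil = Date.now() + 200;
  while (Date.now() < blockUntil) {
    // busy-wait on purpose; real code should never do this
  }
});
```

The timer callback cannot run until the busy callback returns and the loop gets back to the timers phase, so the reported delay should be around 200 ms rather than 50 ms. The delay passed to setTimeout is a minimum, not a guarantee.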

The second phase is the polling and execution of I/O (input/output) callbacks. Polling basically means looking for new I/O events (such as file access, networking, compression, and so on) that are ready to be processed and putting their callbacks into the callback queue.

The third phase is for setImmediate callbacks: setImmediate is a special kind of timer that we can use if we want to process callbacks immediately after the I/O polling execution phase which can be important in some advanced use cases.

The fourth phase is for close callbacks: in this phase, all close events are processed, for example when a web server or a WebSocket shuts down. These are the most important phases of the event loop.

Note: besides these four phases there are two other queues: the process.nextTick() queue and the microtask queue (for resolved promises). Callbacks in these queues do not wait for the loop to go all the way around; they are executed right after the currently running callback finishes (this is the behavior since Node.js 11; older versions ran them only when the current phase finished). For example, imagine that a promise resolves with data from an API call while the callback of an expired timer is running. In this case, the promise callback will be executed right after the timer callback finishes, without having to wait for the four phases to complete. This is just a high-level overview; you can check the Node.js documentation for more information.
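A small sketch of that behavior, assuming Node.js 11 or later. Two timers expire together, and the first one schedules a process.nextTick() callback and a promise callback; the `order` array is only there to record what ran when.

```javascript
const order = [];

setTimeout(() => {
  order.push('timer A');
  // Both of these are queued while timer A's callback is running.
  process.nextTick(() => order.push('nextTick'));
  Promise.resolve().then(() => order.push('promise'));
}, 0);

setTimeout(() => {
  order.push('timer B');
  console.log(order.join(' -> '));
  // Node.js 11+: timer A -> nextTick -> promise -> timer B
}, 0);
```

Even though timer B has already expired, the nextTick and promise callbacks run first, because their queues are drained as soon as timer A's callback returns.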

In the second stage of the event loop, Node.js checks whether there are any pending timers or I/O tasks still running in the background. If there are none, Node.js exits the program; if there are, it keeps running the event loop, starting again from the first phase. For example, when we are listening for incoming HTTP requests, we are basically running an I/O task, so Node.js keeps running and listening for new HTTP requests instead of exiting the program. Likewise, writing or reading a file in the background is an I/O task, so it makes sense that Node.js doesn't exit the program. It is very important to understand the event loop so that you can write performant code and debug your own code when something unexpected happens.
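Here is a minimal sketch of that exit check. It assumes nothing else is pending in the program: an interval keeps the loop alive, and once it is cleared there is nothing left to wait for. The `ticks` counter and the 10 ms interval are made up for illustration.

```javascript
let ticks = 0;

// While this interval is active, there is a pending timer,
// so the event loop keeps going around.
const interval = setInterval(() => {
  ticks += 1;
  console.log('tick', ticks);
  if (ticks === 3) {
    clearInterval(interval); // no pending timers after this
  }
}, 10);

// 'exit' fires once the loop has nothing left to do.
process.on('exit', () => {
  console.log('nothing left to wait for, exiting after', ticks, 'ticks');
});
```

If you comment out the clearInterval call, the process never exits on its own, because there is always one more pending timer.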

Summary: the most important thing I want you to understand in this part is that the event loop is what makes asynchronous programming possible in Node.js and sets Node.js apart from other platforms. It takes care of incoming requests, executes callbacks, offloads heavy tasks to the thread pool, and runs the simple work itself. And remember, we need the event loop because all of our JavaScript code runs in a single thread in Node.js. This makes Node.js lightweight and scalable, but it also comes with the danger of blocking the execution of your code, which can make the entire app slow or even stop all users from accessing it.

In languages like PHP running on an Apache server, a new thread is created for each user. This is far more resource-intensive, but on the other hand, one user's slow request does not block the code running for everyone else.

Here are a couple of guidelines that will help you avoid blocking the event loop:

  • Don't use the sync versions of functions in the fs, crypto, and zlib modules in your callback functions.
  • Don't perform complex calculations inside a callback.
  • Be careful with JSON for large objects (parsing and stringifying them can take a long time).
  • Don't use overly complex regular expressions.
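When a long calculation cannot be avoided, one common way to follow the second guideline is to split it into slices and hand control back to the event loop between slices. The sketch below is only one possible approach; `sumInChunks`, its parameters, and the numbers are hypothetical names and values chosen for illustration.

```javascript
// Sum the integers 0..n-1 in slices, yielding to the event loop
// between slices so timers and I/O callbacks can run in between.
function sumInChunks(n, chunkSize, done) {
  let total = 0;
  let i = 0;

  function runChunk() {
    const end = Math.min(i + chunkSize, n);
    for (; i < end; i++) {
      total += i;
    }
    if (i < n) {
      setImmediate(runChunk); // continue on a later turn of the loop
    } else {
      done(total);
    }
  }

  runChunk();
}

let chunkedTotal = null;

sumInChunks(2000000, 10000, (total) => {
  chunkedTotal = total;
  console.log('sum finished:', total); // sum finished: 1999999000000
});
```

Each slice is short, so other callbacks never wait long. For really heavy work, moving it off the main thread entirely is usually the better option.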

So let's now write some code in order to get a better understanding of the event loop. It is extremely difficult to simulate the event loop properly, because we can't put many callbacks into all the callback queues we talked about at the same time. That situation happens when a lot of requests are coming into our app, and it is very hard to replicate locally. Nonetheless, we can still do some interesting experiments using what we have learned. We will write some code, try to figure out in what order it will be executed in the event loop, and analyze whether the result we get actually makes sense. So let's get started.

I will start by writing a setTimeout that expires after 0 milliseconds and logs a string to the console:

            setTimeout(() => {
              console.log('timer 1 expired')
            }, 0)

Let's also use setImmediate and log a string to the console:

         setTimeout(() => console.log('timer 1 expired'), 0)

         setImmediate(() => console.log('setImmediate is finished'))

Next, we will read some text from a text file. If you are following along, you can create a file with the .txt extension and put some lorem ipsum text in it.

      const fs = require('fs');

      setTimeout(() => console.log('timer 1 expired'), 0)

      setImmediate(() => console.log('setImmediate is finished'))

      fs.readFile('text-flie.txt', () => {
         console.log('I/O finished')
      })

Finally, let us log a string to the console. This is top-level code because it is not inside a callback:

      const fs = require('fs');

      setTimeout(() => console.log('timer 1 expired'), 0)

      setImmediate(() => console.log('setImmediate is finished'))

      fs.readFile('text-flie.txt', () => {
         console.log('I/O finished')
      })

      console.log('this is a top-level code');

So let us try to figure out what should happen when we run the code above. Whenever we start a Node.js process, the top-level code gets executed first, so right away we should see the output of the top-level console.log. All the callbacks will then run in the event loop.

   * this is a top-level code
   * timer 1 expired
   * I/O finished
   * setImmediate is finished

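The first output is the top-level console.log, which is what we expected, and the outputs from the callbacks follow. One thing to note is that the order of the other three outputs is not guaranteed, because none of these callbacks was scheduled from inside an I/O cycle. When setTimeout with a delay of 0 and setImmediate are called from top-level code, which one fires first depends on how quickly the process gets going, and the I/O callback runs whenever the file read completes. So in your result, the order might differ from mine. In other words, the order here tells us very little about the phases of the event loop.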

But before we run some more advanced code, let us think about why the Node.js process exited after it printed the results to the console. Do you remember how Node.js decides whether it should keep running the event loop? It does so by checking whether there are any timers or I/O tasks still pending in the background. If so, it will not exit the program; if there are none, it exits. In our case, once all the callbacks have run, there is nothing left pending in the background.

Now let's schedule the setTimeout and setImmediate from inside the I/O callback function:

      const fs = require('fs');

      setTimeout(() => console.log('timer 1 expired'), 0)

      fs.readFile('text-flie.txt', () => {
         console.log('I/O finished')

         console.log('---------------------------------')
         setTimeout(() => console.log('timer 2 expired'), 0)
         setTimeout(() => console.log('timer 3 expired'), 3000)
         setImmediate(() => console.log('setImmediate 1 is finished'))
      })

      console.log('this is a top-level code');

Note that here we copy both the setTimeout and the setImmediate into the I/O callback, and I also create a new setTimeout that expires after 3 s. I also log a separator to make the result easier to read. So let's run the program.

          * this is a top-level code
          * timer 1 expired
          * I/O finished
          * ---------------------------------
          * setImmediate 1 is finished
          * timer 2 expired
          * timer 3 expired

The first thing to note when running this program is that the Node.js process didn't exit immediately. This is because we have a timer that is still running in the background; after 3 s, that timer fires and the process exits.

Next, the first three outputs are the same as before, but right after the separator we have three logs that were scheduled from inside the I/O callback. Let us analyze the result. Initially, you might think that timer 2 should finish before the setImmediate (see the diagram above), so why does the setImmediate output appear before the setTimeout one? I did not want to introduce this concept earlier because it can be confusing. The event loop actually waits for things to happen in the polling phase, which is where I/O callbacks are handled. When the poll queue is empty, as in our case, and only timers are left, the event loop waits in this phase until a timer expires. But if a callback has been scheduled with setImmediate, the loop moves straight on from the polling phase to the setImmediate phase, before it gets back to the timers phase. In this case, timer 2 expired almost right away, but because it was scheduled from inside an I/O callback, the loop was still in the polling phase and had to pass through the setImmediate phase before returning to the timers. That is why setImmediate comes before the expired setTimeout. I know this sounds confusing, but that is really how Node.js works.

Next, let's add process.nextTick():

      const fs = require('fs');

      setTimeout(() => console.log('timer 1 expired'), 0)

      fs.readFile('text-flie.txt', () => {
         console.log('I/O finished')

         console.log('---------------------------------')
         setTimeout(() => console.log('timer 2 expired'), 0)
         setTimeout(() => console.log('timer 3 expired'), 3000)
         setImmediate(() => console.log('setImmediate 1 is finished'))
         process.nextTick(() => console.log('process.nextTick()'))
      })

      console.log('this is a top-level code');

      

What do you think will happen in this case? Let's run the program:

    * this is a top-level code
    * timer 1 expired
    * I/O finished
    * ---------------------------------
    * process.nextTick()
    * setImmediate 1 is finished
    * timer 2 expired
    * timer 3 expired

The first callback that was executed after the separator is the one from process.nextTick(). So why did it appear first? Remember that process.nextTick() callbacks sit in their own queue which, like the microtask queue, is processed as soon as the current callback finishes and before the event loop moves on. So this callback runs before the loop even reaches the phase where setImmediate runs.

I also mentioned that some expensive tasks get offloaded to the thread pool, so let us see that in action. We will use the crypto module that Node.js offers; note that the asynchronous versions of its CPU-heavy functions, such as pbkdf2, are offloaded to the thread pool. pbkdf2 is a password-based key derivation (hashing) function; how it is implemented is not what we are concerned with for now.

      const fs = require('fs');
      const crypto = require('crypto')

      setTimeout(() => console.log('timer 1 expired'), 0)
      const start = Date.now();

      fs.readFile('text-flie.txt', () => {
         console.log('I/O finished')
      
         console.log('---------------------------------')
         setTimeout(() => console.log('timer 2 expired'), 0)
         setTimeout(() => console.log('timer 3 expired'), 3000)
         setImmediate(() => console.log('setImmediate 1 is finished'))
         process.nextTick(() => console.log('process.nextTick()'))

         crypto.pbkdf2('mypassword', 'salt', 100000, 1024, 'sha512', () => {
           console.log(Date.now() - start, 'password encrypted')
         } )
      })

      console.log('this is a top-level code');

      

    * this is a top-level code
    * timer 1 expired
    * I/O finished
    * ---------------------------------
    * process.nextTick()
    * setImmediate 1 is finished
    * timer 2 expired
    * timer 3 expired
    * 5882 'password encrypted'

Now let's call this hashing function four times and check the timings:

      const fs = require('fs');
      const crypto = require('crypto')

      setTimeout(() => console.log('timer 1 expired'), 0)
      const start = Date.now();

      fs.readFile('text-flie.txt', () => {
         console.log('I/O finished')
      
         console.log('---------------------------------')
         setTimeout(() => console.log('timer 2 expired'), 0)
         setTimeout(() => console.log('timer 3 expired'), 3000)
         setImmediate(() => console.log('setImmediate 1 is finished'))
         process.nextTick(() => console.log('process.nextTick()'))

         crypto.pbkdf2('mypassword', 'salt', 100000, 1024, 'sha512', () => {
           console.log(Date.now() - start, 'password encrypted')
         });

         crypto.pbkdf2('mypassword', 'salt', 100000, 1024, 'sha512', () => {
           console.log(Date.now() - start, 'password encrypted')
         });

         crypto.pbkdf2('mypassword', 'salt', 100000, 1024, 'sha512', () => {
           console.log(Date.now() - start, 'password encrypted')
         });

         crypto.pbkdf2('mypassword', 'salt', 100000, 1024, 'sha512', () => {
           console.log(Date.now() - start, 'password encrypted')
         });
      })

      console.log('this is a top-level code');

      

    * this is a top-level code
    * timer 1 expired
    * I/O finished
    * ---------------------------------
    * process.nextTick()
    * setImmediate 1 is finished
    * timer 2 expired
    * timer 3 expired
    * 20277 'password encrypted'
    * 22715 'password encrypted'
    * 23993 'password encrypted'
    * 24233 'password encrypted'

The four crypto calls took approximately the same time. This is because the thread pool has four threads by default, so the four hashes are computed at the same time, each on its own thread.

Now let's change the thread pool size and see what happens:

      const fs = require('fs');
      const crypto = require('crypto')

      const start = Date.now();
      process.env.UV_THREADPOOL_SIZE = 1

      setTimeout(() => console.log('timer 1 expired'), 0)
      

      fs.readFile('text-flie.txt', () => {
         console.log('I/O finished')
      
         console.log('---------------------------------')
         setTimeout(() => console.log('timer 2 expired'), 0)
         setTimeout(() => console.log('timer 3 expired'), 3000)
         setImmediate(() => console.log('setImmediate 1 is finished'))
         process.nextTick(() => console.log('process.nextTick()'))

         crypto.pbkdf2('mypassword', 'salt', 100000, 1024, 'sha512', () => {
           console.log(Date.now() - start, 'password encrypted')
         });

         crypto.pbkdf2('mypassword', 'salt', 100000, 1024, 'sha512', () => {
           console.log(Date.now() - start, 'password encrypted')
         });

         crypto.pbkdf2('mypassword', 'salt', 100000, 1024, 'sha512', () => {
           console.log(Date.now() - start, 'password encrypted')
         });

         crypto.pbkdf2('mypassword', 'salt', 100000, 1024, 'sha512', () => {
           console.log(Date.now() - start, 'password encrypted')
         });
      })

      console.log('this is a top-level code');

      

    * this is a top-level code
    * timer 1 expired
    * I/O finished
    * ---------------------------------
    * process.nextTick()
    * setImmediate 1 is finished
    * timer 2 expired
    * timer 3 expired
    * 4291 'password encrypted'
    * 8382 'password encrypted'
    * 12911 'password encrypted'
    * 17822 'password encrypted'

As you can see, the hashes now finish one after the other, roughly four seconds apart. This is because we only have one thread, which has to handle the hashing calls one at a time. You can always refer back to the Node.js documentation to learn more, and check other articles to understand this concept well. In the next part of this series, I will write about Events and Event-Driven Architecture.

You can also connect with me via twitter

Calvin puram

I thrive to design and develop ideas into a project that makes life easy, solve problems, and implement innovations.