Node.js is one of the most popular choices for building web applications. However, as your application grows, you may need to scale it to handle a higher volume of traffic. There are two main ways to scale a Node.js application: keep it as a monolith and use clustering, child processes, or worker threads, or split it into a microservice application with multiple Node.js containers and an nginx server for load balancing. In this article, I will explore both options and provide code samples to help you understand when to use each one.
SECTION 1: Monolith App
A monolith app is a single codebase that handles all the functionality of your application. It is a straightforward way to build and deploy a Node.js application, as it requires only a single process to run. However, as the application grows, a monolith can become challenging to scale, because the entire application must be scaled together. Still, there are a few things we can do to scale a monolith app: we can use clustering, child processes, or worker threads.
Clustering
Clustering involves creating multiple instances of your Node.js application, each running on a separate core of your CPU. The instances communicate with each other and distribute the load among themselves. It enables the application to make full use of the available CPU cores and memory, resulting in better performance and higher throughput.
Here is a code sample that demonstrates how to achieve clustering in Node.js
const cluster = require("cluster");
const http = require("http");
const numCpus = require("os").cpus().length;

if (cluster.isMaster) {
  console.log(`master is running, its pid is ${process.pid}`);
  // fork one worker per CPU core
  for (let i = 0; i < numCpus; i++) {
    cluster.fork();
  }
} else {
  // workers share the same port
  http.createServer((req, res) => {
    // write your logic here
  }).listen(8000);
  console.log(`worker ${process.pid} started`);
}
The code reads the number of CPU cores available on your machine and forks one process per core.
The master is only in charge of the workers; the workers handle incoming requests, read files, and so on.
Each worker gets its own event loop, memory, and V8 instance.
When using clustering, the incoming requests to a Node.js application will be distributed across the multiple processes running on different CPU cores.
When a client makes a request to the application, the master process distributes it to one of the child processes (round-robin by default on most platforms). That child process handles the request and sends the response back to the client.
There is an awesome package called pm2 that can help you with clustering. It uses the cluster module behind the scenes.
sudo npm i -g pm2
To start your app in cluster mode, run
pm2 start index.js -i max
pm2 will autodetect the number of available CPU cores and run one process per core.
You can also create a JSON file with the pm2 configuration.
{
"name": "my-app",
"script": "index.js",
"instances": "max",
"exec_mode": "cluster",
"max_memory_restart": "1G",
"env": {
"NODE_ENV": "production"
}
}
To use this configuration file, save it as ecosystem.config.json in the root directory of your Node.js application, and then run the following command:
pm2 start ecosystem.config.json
child_process.fork()
Forking, in Unix terms, means cloning a process. child_process.fork() is a method used to create a new child process that runs a Node.js module. The new child process is an independent instance of the Node.js runtime, which means it has its own event loop and its own memory space, completely separate from the parent process.
The child process can communicate with the parent process using inter-process communication (IPC). This means that the parent and child processes can exchange messages and data with each other.
child_process.fork() is often used in Node.js applications to run CPU-intensive tasks in a separate child process, so that the main event loop of the application is not blocked. For example, you could use child_process.fork() to run a script that generates a large amount of data, and then send that data back to the parent process using IPC.
Here is a sample of how child_process.fork() works:
const { fork } = require('child_process');
// Create a new child process
const childProcess = fork('child.js');
// Send data to the child process
childProcess.send({ message: 'Hello from the parent process!' });
// Listen for messages from the child process
childProcess.on('message', (data) => {
console.log(`Received message from child process: ${JSON.stringify(data)}`);
});
// Listen for the child process to exit
childProcess.on('exit', (code) => {
console.log(`Child process exited with code ${code}`);
});
In this example, we create a new child process using child_process.fork() and pass it the path to a child JavaScript file (child.js). We then send a message to the child process using the send() method, passing it an object containing a message property.
In the child process (child.js), we can listen for this message using the process.on('message', ...) event. When we receive the message, we can log it to the console and send a response back to the parent process using the process.send() method.
Here's what the child.js file might look like:
// Listen for messages from the parent process
process.on('message', (data) => {
console.log(`Received message from parent process: ${JSON.stringify(data)}`);
// Send a response back to the parent process
process.send({ message: 'Hello from the child process!' });
});
In this file, we listen for messages from the parent process using the process.on('message', ...) event. When a message arrives, we log it to the console and send a response back to the parent process using the process.send() method.
worker_threads
In Node.js, worker threads are a way to run JavaScript code in separate threads, which can help improve performance for CPU-intensive tasks. Worker threads allow you to take advantage of multiple cores on your machine, by running JavaScript code in parallel.
Worker threads are similar to child processes, in that they allow you to run JavaScript code in parallel. However, worker threads are more lightweight than child processes, and they can share memory with the main thread (via SharedArrayBuffer). This makes them more efficient than child processes, especially when working with large amounts of data.
Here's an example of how to use worker threads in Node.js:
const { Worker } = require('worker_threads');
// Create a new worker thread
const worker = new Worker('./worker.js');
// Send a message to the worker thread
worker.postMessage({ data: 'Hello from the main thread!' });
// Listen for messages from the worker thread
worker.on('message', (data) => {
console.log(`Received message from worker thread: ${JSON.stringify(data)}`);
});
// Listen for the worker thread to exit
worker.on('exit', (code) => {
console.log(`Worker thread exited with code ${code}`);
});
In this example, we create a new worker thread using the Worker class and pass it the path to a worker JavaScript file (worker.js). We then send a message to the worker thread using the postMessage() method, passing it an object containing some data.
In the worker (worker.js), we can listen for this message using the parentPort.on('message', ...) event. When we receive the message, we can log it to the console and send a response back to the main thread using the parentPort.postMessage() method.
Here's what the worker.js file might look like:
const { parentPort } = require('worker_threads');
// Listen for messages from the main thread
parentPort.on('message', (data) => {
console.log(`Received message from main thread: ${JSON.stringify(data)}`);
// Send a response back to the main thread
parentPort.postMessage({ data: 'Hello from the worker thread!' });
});
In this file, we listen for messages from the main thread using the parentPort.on('message', ...) event. When a message arrives, we log it to the console and send a response back to the main thread using the parentPort.postMessage() method.
The Microservices approach
Let's say you have a Node.js application that consists of a server, a database, and a caching layer. You could break this application up into three microservices: one for the server, one for the database, and one for the caching layer. Each microservice would have its own codebase and run in its own process or container.
You could then use an API gateway, such as Nginx, to manage the communication between these microservices. The API gateway would route incoming requests to the appropriate microservice, based on the URL or other criteria.
Each microservice could be scaled independently of the others, based on the needs of the application. For example, if the server was receiving a lot of traffic, you could add more instances of the server microservice to handle the load. Similarly, if the database was experiencing high read or write requests, you could add more instances of the database microservice.
By breaking up your Node.js application into smaller, independent microservices, you can achieve greater scalability, flexibility, and resilience.
One thing I have done in the past is to build an app using docker-compose and nginx as a load balancer.
When I made a request to port 3000 on my local machine, the request would get forwarded to port 80 of the Nginx container. Nginx would then forward the request to port 3000 of one of the many node containers that were running. This way the app could handle large amounts of traffic efficiently. Here is what a config file for Nginx might look like; this is not a complete configuration, I've only included what's important.
server {
    listen 80;

    location / {
        proxy_pass http://node-app:3000;
    }
}
Each container has its own IP address, and DNS is used to resolve it: in node-app:3000, node-app is the name of our node service, and Docker's embedded DNS resolves it to a container's IP address.
So any traffic to nginx will be forwarded to a Node container.
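If you want more control over how requests are balanced, you can also declare an explicit upstream block. Here is a minimal sketch, reusing the node-app:3000 service name from above; note that nginx resolves the name once when it loads the configuration, so a reload is needed after rescaling:

```nginx
upstream node_cluster {
    least_conn;              # pick the backend with the fewest active connections
    server node-app:3000;    # resolved via Docker's embedded DNS
}

server {
    listen 80;

    location / {
        proxy_pass http://node_cluster;
    }
}
```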
docker-compose up -d --scale node-app=10
That command creates 10 instances of node containers and requests will be load-balanced to all those containers.
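For reference, a minimal docker-compose.yml matching this setup might look like the following; the service names, ports, and file paths are assumptions based on the description above:

```yaml
version: "3.8"
services:
  nginx:
    image: nginx:alpine
    ports:
      - "3000:80"          # host port 3000 -> nginx port 80
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
    depends_on:
      - node-app
  node-app:
    build: .               # image for the Node.js app
    expose:
      - "3000"             # reachable by nginx on the internal network
```

Note that node-app only uses expose, not ports: publishing a fixed host port would conflict when --scale creates multiple containers.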
Conclusion
That's it for this article. I hope you now understand the two approaches; choosing one is a matter of what your project needs.