MongoDB is an open source document-oriented NoSQL database. Unlike in the traditional relational databases, MongoDB makes use of JSON like collections and documents where each document consists of key-value pairs. It is typically used for high volume data storage.
However, as the size of the database increases, the query execution time might tend to increase as with any other database. To resolve such issues, MongoDB provides an option of sharding.
The concept of sharding can be read here. In MongoDB sharding requires three sets of different servers.
- Shards (at least 2 sets of 3 servers replica set): Each shard contains a subset of data. The shards should be deployed as replica sets. Within each shard the data will get replicated across its servers.
- Mongos: It acts as a query router, handling both read and write operations. The query requests get dispatched to the relevant shards and subsequently the result from each shard gets aggregated before the final response is delivered.
- Config (at least a set of 3 servers replica set): It maintains metadata and configuration settings.
Each server requires installation of mongodb. The installation instructions can be found here.
Following steps will help to deploy a sharded MongoDB cluster. The documentation is based on Ubuntu 20.04.5 LTS
- Set up of config servers (repeat the following steps in all servers)
- Create/edit the config file /etc/mongod.conf
net:
port: 27019 #same on all servers
bindIp: 127.0.0.1,<hostname(s)|ip address(es)>
replication:
replSetName: <replica set name>
sharding:
clusterRole: configsvr
- Create a systemd service by creating a file /etc/systemd/system/mongoserver.service
[Unit]
Description=MongoDB Config Server
Documentation=https://docs.mongodb.org/manual
After=network-online.target
Wants=network-online.target
[Service]
User=mongodb
Group=mongodb
EnvironmentFile=-/etc/default/mongod
ExecStart=/usr/bin/mongod --config /etc/mongod.conf
PIDFile=/var/run/mongodb/mongod.pid
# file size
LimitFSIZE=infinity
# cpu time
LimitCPU=infinity
# virtual memory size
LimitAS=infinity
# open files
LimitNOFILE=64000
# processes/threads
LimitNPROC=64000
# locked memory
LimitMEMLOCK=infinity
# total threads (user+kernel)
TasksMax=infinity
TasksAccounting=false
# Recommended limits for mongod as specified in
# https://docs.mongodb.com/manual/reference/ulimit/#recommended-ulimit-settings
[Install]
WantedBy=multi-user.target
- Reload the systemd:
sudo systemctl daemon-reload
- Restart the server:
sudo systemctl restart mongoserver
- Enable the service so that it gets restarted on reboot:
sudo systemctl enable mongoserver
- Initiate the config servers replica set. This can be achieved by connecting to any one of the config servers. This is to be done only once for the complete replica set.
- SSH to any one of the server belonging to the config servers replica set and execute mongosh
mongosh --host localhost --port 27019
Execute the following: (add the replica set name used before and add the server details in the members section)
rs.initiate(
{
_id: "<replica set name>",
configsvr: true,
members: [
{ _id : 0, host : "<host/ip>:27019" },
{ _id : 1, host : "<host/ip>:27019" },
{ _id : 2, host : "<host/ip>:27019" }
]
}
)
- Set up of Shard servers
- Each shard should be a replica set of at least 3 servers.
- Set up of replica set of single shard (repeat the following steps in all servers)
- Create/edit the config file /etc/mongod.conf
net:
port: 27018 #same on all servers
bindIp: 127.0.0.1,<hostname(s)|ip address(es)>
replication:
replSetName: <replica set name>
sharding:
clusterRole: shardsvr
- Create a systemd service by creating a file /etc/systemd/system/mongoserver.service
[Unit]
Description=MongoDB Shard Server
Documentation=https://docs.mongodb.org/manual
After=network-online.target
Wants=network-online.target
[Service]
User=mongodb
Group=mongodb
EnvironmentFile=-/etc/default/mongod
ExecStart=/usr/bin/mongod --config /etc/mongod.conf
PIDFile=/var/run/mongodb/mongod.pid
# file size
LimitFSIZE=infinity
# cpu time
LimitCPU=infinity
# virtual memory size
LimitAS=infinity
# open files
LimitNOFILE=64000
# processes/threads
LimitNPROC=64000
# locked memory
LimitMEMLOCK=infinity
# total threads (user+kernel)
TasksMax=infinity
TasksAccounting=false
# Recommended limits for mongod as specified in
# https://docs.mongodb.com/manual/reference/ulimit/#recommended-ulimit-settings
[Install]
WantedBy=multi-user.target
- Reload the systemd:
sudo systemctl daemon-reload
- Restart the server:
sudo systemctl restart mongoserver
- Enable the service so that it gets restarted on reboot:
sudo systemctl enable mongoserver
- Initiate the shard servers replica set. This can be achieved by connecting to any one of the servers of shard replica set. This is to be done only once for the complete replica set.
- SSH to any one of the server belonging to the config servers replica set and execute mongosh
mongosh --host localhost --port 27018
Execute the following: (add the replica set name used before and add the server details in the members section)
rs.initiate(
{
_id: "<replica set name>",
members: [
{ _id : 0, host : "<host/ip>:27018" },
{ _id : 1, host : "<host/ip>:27018" },
{ _id : 2, host : "<host/ip>:27018" }
]
}
)
- Repeat the same steps for each shard replica set but with different port.
- Set up of mongos query router server.
- Create/edit the config file /etc/mongod.conf
Comment the storage section and edit the following.
net:
port: 27017
bindIp: 127.0.0.1,<hostname(s)|ip address(es)> # Include public IP if required
sharding:
configDB: <config replica set name>/<config server ip>:27019,<config server ip>:27019 # at least one member of the replica set in <replSetName>/<host:port> format
- Create a systemd service by creating a file /etc/systemd/system/mongoserver.service
[Unit]
Description=MongoDB Query Server
Documentation=https://docs.mongodb.org/manual
After=network-online.target
Wants=network-online.target
[Service]
User=mongodb
Group=mongodb
EnvironmentFile=-/etc/default/mongod
# NOTE: command must be mongos
ExecStart=/usr/bin/mongos --config /etc/mongod.conf
PIDFile=/var/run/mongodb/mongod.pid
# file size
LimitFSIZE=infinity
# cpu time
LimitCPU=infinity
# virtual memory size
LimitAS=infinity
# open files
LimitNOFILE=64000
# processes/threads
LimitNPROC=64000
# locked memory
LimitMEMLOCK=infinity
# total threads (user+kernel)
TasksMax=infinity
TasksAccounting=false
# Recommended limits for mongod as specified in
# https://docs.mongodb.com/manual/reference/ulimit/#recommended-ulimit-settings
[Install]
WantedBy=multi-user.target
- Reload the systemd:
sudo systemctl daemon-reload
- Restart the server:
sudo systemctl restart mongoserver
- Enable the service so that it gets restarted on reboot:
sudo systemctl enable mongoserver
- Add shards to the cluster
That's it. The MongoDB cluster should be ready now.
Helpful references: