Tutorial

How To Back Up, Restore, and Migrate a MongoDB Database on Ubuntu 14.04

Published on April 15, 2016
How To Back Up, Restore, and Migrate a MongoDB Database on Ubuntu 14.04
Not using Ubuntu 14.04?Choose a different version or distribution.
Ubuntu 14.04

Introduction

MongoDB is one of the most popular NoSQL database engines. It is famous for being scalable, powerful, reliable and easy to use. In this article we’ll show you how to back up, restore, and migrate your MongoDB databases.

Importing and exporting a database means dealing with data in a human-readable format, compatible with other software products. In contrast, the backup and restore operations create or use MongoDB-specific binary data, which preserves not only the consistency and integrity of your data but also its specific MongoDB attributes. Thus, for migration its usually preferable to use backup and restore as long as the source and target systems are compatible.

Prerequisites

Before following this tutorial, please make sure you complete the following prerequisites:

Except otherwise noted, all of the commands that require root privileges in this tutorial should be run as a non-root user with sudo privileges.

Understanding the Basics

Before continue further with this article some basic understanding on the matter is needed. If you have experience with popular relational database systems such as MySQL, you may find some similarities when working with MongoDB.

The first thing you should know is that MongoDB uses json and bson (binary json) formats for storing its information. Json is the human-readable format which is perfect for exporting and, eventually, importing your data. You can further manage your exported data with any tool which supports json, including a simple text editor.

An example json document looks like this:

Example of json Format
{"address":[
    {"building":"1007", "street":"Park Ave"},
    {"building":"1008", "street":"New Ave"},
]}

Json is very convenient to work with, but it does not support all the data types available in bson. This means that there will be the so called ‘loss of fidelity’ of the information if you use json. For backing up and restoring, it’s better to use the binary bson.

Second, you don’t have to worry about explicitly creating a MongoDB database. If the database you specify for import doesn’t already exist, it is automatically created. Even better is the case with the collections’ (database tables) structure. In contrast to other database engines, in MongoDB the structure is again automatically created upon the first document (database row) insert.

Third, in MongoDB reading or inserting large amounts of data, such as for the tasks of this article, can be resource intensive and consume much of the CPU, memory, and disk space. This is something critical considering that MongoDB is frequently used for large databases and Big Data. The simplest solution to this problem is to run the exports and backups during the night or during non-peak hours.

Fourth, information consistency could be problematic if you have a busy MongoDB server where the information changes during the database export or backup process. There is no simple solution to this problem, but at the end of this article, you will see recommendations to further read about replication.

While you can use the import and export functions to backup and restore your data, there are better ways to ensure the full integrity of your MongoDB databases. To backup your data you should use the command mongodump. For restoring, use mongorestore. Let’s see how they work.

Backing Up a MongoDB Database

Let’s cover backing up your MongoDB database first.

An important argument to mongodump is --db, which specifies the name of the database which you want to back up. If you don’t specify a database name, mongodump backups all of your databases. The second important argument is --out which specifies the directory in which the data will be dumped. Let’s take an example with backing up the newdb database and storing it in the /var/backups/mongobackups directory. Ideally, we’ll have each of our backups in a directory with the current date like /var/backups/mongobackups/01-20-16 (20th January 2016). First, let’s create that directory /var/backups/mongobackups with the command:

  1. sudo mkdir /var/backups/mongobackups

Then our backup command should look like this:

  1. sudo mongodump --db newdb --out /var/backups/mongobackups/`date +"%m-%d-%y"`

A successfully executed backup will have an output such as:

Output of mongodump
2016-01-20T10:11:57.685-0500    writing newdb.restaurants to /var/backups/mongobackups/01-20-16/newdb/restaurants.bson
2016-01-20T10:11:57.907-0500    writing newdb.restaurants metadata to /var/backups/mongobackups/01-20-16/newdb/restaurants.metadata.json
2016-01-20T10:11:57.911-0500    done dumping newdb.restaurants (25359 documents)
2016-01-20T10:11:57.911-0500    writing newdb.system.indexes to /var/backups/mongobackups/01-20-16/newdb/system.indexes.bson

Note that in the above directory path we have used date +"%m-%d-%y" which gets the current date automatically. This will allow us to have the backups inside the directory /var/backups/01-20-16/. This is especially convenient when we automate the backups.

At this point you have a complete backup of the newdb database in the directory /var/backups/mongobackups/01-20-16/newdb/. This backup has everything to restore the newdb properly and preserve its so called “fidelity”.

As a general rule, you should make regular backups, such as on a daily basis, and preferably during a time when the server is least loaded. Thus, you can set the mongodump command as a cron job so that it’s run regularly, e.g. every day at 03:03 AM. To accomplish this open crontab, cron’s editor like this:

  1. sudo crontab -e

Note that when you run sudo crontab you will be editing the cron jobs for the root user. This is recommended because if you set the crons for your user, they might not be executed properly, especially if your sudo profile requires password verification.

Inside the crontab prompt insert the following mongodump command:

Crontab window
3 3 * * * mongodump --out /var/backups/mongobackups/`date +"%m-%d-%y"`

In the above command we are omitting the --db argument on purpose because typically you will want to have all of your databases backed up.

Depending on your MongoDB database sizes you may soon run out of disk space with too many backups. That’s why it’s also recommended to clean the old backups regularly or to compress them. For example, to delete all the backups older than 7 days you can use the following bash command:

  1. find /var/backups/mongobackups/ -mtime +7 -exec rm -rf {} \;

Similarly to the previous mongodump command, this one can be also added as a cron job. It should run just before you start the next backup, e.g. at 03:01 AM. For this purpose open again crontab:

  1. sudo crontab -e

After that insert the following line:

Crontab window
3 1 * * * find /var/backups/mongobackups/ -mtime +7 -exec rm -rf {} \;

Completing all the tasks in this step will ensure a good backup solution for your MongoDB databases.

Restoring and Migrating a MongoDB Database

By restoring your MongoDB database from a previous backup (such as one from the previous step) you will be able to have the exact copy of your MongoDB information taken at a certain time, including all the indexes and data types. This is especially useful when you want to migrate your MongoDB databases. For restoring MongoDB we’ll be using the command mongorestore which works with the binary backup produced by mongodump.

Let’s continue our examples with the newdb database and see how we can restore it from the previously taken backup. As arguments we’ll specify first the name of the database with the --db argument. Then with --drop we’ll make sure that the target database is first dropped so that the backup is restored in a clean database. As a final argument we’ll specify the directory of the last backup /var/backups/mongobackups/01-20-16/newdb/. So the whole command will look like this (replace with the date of the backup you wish to restore):

  1. sudo mongorestore --db newdb --drop /var/backups/mongobackups/01-20-16/newdb/

A successful execution will show the following output:

Output of mongorestore
2016-01-20T10:44:47.876-0500    building a list of collections to restore from /var/backups/mongobackups/01-20-16/newdb/ dir
2016-01-20T10:44:47.908-0500    reading metadata file from /var/backups/mongobackups/01-20-16/newdb/restaurants.metadata.json
2016-01-20T10:44:47.909-0500    restoring newdb.restaurants from file /var/backups/mongobackups/01-20-16/newdb/restaurants.bson
2016-01-20T10:44:48.591-0500    restoring indexes for collection newdb.restaurants from metadata
2016-01-20T10:44:48.592-0500    finished restoring newdb.restaurants (25359 documents)
2016-01-20T10:44:48.592-0500    done

In the above case we are restoring the data on the same server where the backup has been created. If you wish to migrate the data to another server and use the same technique, you should just copy the backup directory, which is /var/backups/mongobackups/01-20-16/newdb/ in our case, to the other server.

Conclusion

This article has introduced you to the essentials of managing your MongoDB data in terms of backing up, restoring, and migrating databases. You can continue further reading on How To Set Up a Scalable MongoDB Database in which MongoDB replication is explained.

Replication is not only useful for scalability, but it’s also important for the current topics. Replication allows you to continue running your MongoDB service uninterrupted from a slave MongoDB server while you are restoring the master one from a failure. Part of the replication is also the operations log (oplog), which records all the operations that modify your data. You can use this log, just as you would use the binary log in MySQL, to restore your data after the last backup has taken place. Recall that backups usually take place during the night, and if you decide to restore a backup in the evening you will be missing all the updates since the last backup.

Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.

Learn more about our products

About the authors
Default avatar
Toli

author


Default avatar
Tammy Fox

editor


Still looking for an answer?

Ask a questionSearch for more help

Was this helpful?
 
10 Comments


This textbox defaults to using Markdown to format your answer.

You can type !ref in this text area to quickly search our full set of tutorials, documentation & marketplace offerings and insert the link!

Thanks for the awesome tutorial. MongoDB is little bit complicated for me but this tutorial made everything easy.

Thanks for the tutorial.

I wanted to point out one issue with it since it took me a while to figure out why my cron jobs weren’t running.

The cron line for the dump above is:

3 3 * * * mongodump --out /var/backups/mongobackups/`date +"%m-%d-%y"`

BUT the % character has special meaning to cron and will result in that job not running. If you watch the syslog when the job runs you’d see the command logged as:

mongodump --out /var/backups/mongobackups/`date +"

The % chars need to be escaped (\%) like:

3 3 * * * mongodump --out /var/backups/mongobackups/`date +"\%m-\%d-\%y"`

and then the job should run properly.

I think deleting of old backups can by simplified by using delete flag:

3 1 * * * find /var/backups/mongobackups/ -mtime +7 -delete

hI

I tryied the mongodump command, as described above but i get this error message:

Failed: error getting collections for database ibyc_db: error running listCollections. Database: ibyc_db Err: not authorized on ibyc_db to execute command { listCollections: 1, cursor: {} }

could someone help please?

thx

Awesome tutorial, how would you go about setting up backups for a replication set? Do you set backup and cleanup cron jobs on each of the nodes (primary & secondary)?

Thank in advance for you input!

Since MongoDB 3.2 you can now archive and compress with 2 new command line options: https://www.mongodb.com/blog/post/archiving-and-compression-in-mongodb-tools

So you can now use this:

3 3 * * * mongodump --archive=/var/backups/mongobackups/`date '+%FT%H-%M-%S-%Z'`.archive --gzip

thank you… I was looking for hours to find this article

I’m concerning about the data duplication between backup and restore times. Does mongodb have a mechanism to detect and de-duplicate data ?

For anyone having problems removing the old backups, replace the cronjob given in the tutorial with this:

To remove backups older than 7 days at 3:01 am every day (just before main backup cronjob):

1 3 * * * find /var/backups/mongobackups/* -maxdepth 0 -ctime +7 -exec rm -fr {} +

For anyone having problems with the backup-removal cronjob, try this instead, worked for me:

1 3 * * * find /var/backups/mongobackups/* -maxdepth 0 -ctime +7 -exec rm -fr {} +

Try DigitalOcean for free

Click below to sign up and get $200 of credit to try our products over 60 days!

Sign up

Join the Tech Talk
Success! Thank you! Please check your email for further details.

Please complete your information!

Become a contributor for community

Get paid to write technical tutorials and select a tech-focused charity to receive a matching donation.

DigitalOcean Documentation

Full documentation for every DigitalOcean product.

Resources for startups and SMBs

The Wave has everything you need to know about building a business, from raising funding to marketing your product.

Get our newsletter

Stay up to date by signing up for DigitalOcean’s Infrastructure as a Newsletter.

New accounts only. By submitting your email you agree to our Privacy Policy

The developer cloud

Scale up as you grow — whether you're running one virtual machine or ten thousand.

Get started for free

Sign up and get $200 in credit for your first 60 days with DigitalOcean.*

*This promotional offer applies to new accounts only.