Challenges with Backup and Recovery for Scale-Out Applications
When you think big data, you think Hadoop, Cassandra, MongoDB, and other scale-out NoSQL databases. All of these applications share an important trait: they are scale-out applications, built to grow from a single node to hundreds of nodes. They dynamically distribute data between nodes, and they replicate data between nodes both to improve read performance and to guard against hardware failures.
So the prevailing thought is that these applications do not require backups, since the redundancy built into them protects against hardware failure. This is partly true, but there are other use cases that still require point-in-time copies of these applications:
- Inadvertent changes, data loss, or data corruption. When this happens, the loss or corruption is replicated to every copy of the data, resulting in permanent data loss.
- Test and development environments that require a copy of production. In industries such as financial services, different teams run different hypotheses against big data, so they need access to a point-in-time copy of production data.
But traditional backup solutions, and even storage-based LUN snapshots, do not serve these architectures well, for several reasons:
- These architectures distribute data across multiple nodes of the cluster, so a consistent view of the data is likewise spread across multiple nodes. As the cluster grows, the complexity of achieving a consistent backup grows with it. Taking a snapshot of a single node or a single LUN at a time therefore does not yield a usable backup.
- These architectures are also dynamic. When a new node is added or an existing node is removed, the application rebalances data between nodes to restore optimal performance, so it is important to stop the application from performing rebalancing tasks before taking a cluster snapshot.
- Incremental backups
  - Big data deployments tend to be large; unless native tools support capturing incremental changes, performing a full backup each time can be very expensive.
  - Support for incremental changes varies widely between application-specific tools, which makes developing a common solution challenging.
- Object-level restore
  - Since data records are distributed across nodes, restoring even a single object requires restoring the entire cluster as it existed at the time of the backup. This means the cluster configuration must be backed up along with the cluster data.
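To make the incremental-backup point concrete, a generic incremental pass can be approximated at the file level by comparing content checksums against a manifest saved from the previous run. This is a minimal, database-agnostic sketch; the function names and manifest format are ours, not part of any vendor tool:

```python
import hashlib
import os

def checksum(path, chunk=65536):
    """SHA-256 of a file, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

def changed_files(data_dir, manifest):
    """Return files that are new or modified since the previous
    manifest, plus the updated manifest for the next run."""
    current = {}
    for root, _, files in os.walk(data_dir):
        for name in files:
            path = os.path.join(root, name)
            current[os.path.relpath(path, data_dir)] = checksum(path)
    delta = [p for p, digest in current.items() if manifest.get(p) != digest]
    return delta, current
```

A real implementation would also have to handle deletions and capture the delta atomically with respect to writes; the point is only that without native support, this bookkeeping falls entirely on the backup tool.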
Applications such as Hadoop, MongoDB, and Cassandra provide either tools or APIs to support backup functionality. We will review each of these tools, their capabilities and their shortcomings.
MongoDB's standard tool is mongodump, which can take a dump of all data files on a node. This works well for a single node, but for a multi-node configuration the backup is a multi-step process:
- Stop the config server(s)
- Take a lock on the replica on each node
- Perform mongodump on each node
- Copy the data files to the backup server
- Unlock the replica on each node
- Restart the config servers
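The sequence above can be sketched as an ordered command plan. The hostnames, service names, and paths below are placeholders, and the plan is only built, not executed, to keep the shape of the procedure visible:

```python
def backup_plan(config_servers, data_nodes, backup_host):
    """Build the ordered command list for a cluster-wide mongodump
    backup. All hostnames, service names, and paths are
    illustrative placeholders."""
    plan = []
    # 1. Stop config servers so cluster metadata stops changing.
    for cs in config_servers:
        plan.append(f"ssh {cs} 'systemctl stop mongod-config'")
    # 2. Lock every replica so its data files stop changing.
    for node in data_nodes:
        plan.append(f"mongo --host {node} --eval 'db.fsyncLock()'")
    # 3. Dump each node while it is frozen.
    for node in data_nodes:
        plan.append(f"mongodump --host {node} --out /backup/{node}")
    # 4. Ship the dumps off-node.
    for node in data_nodes:
        plan.append(f"rsync -a /backup/{node} {backup_host}:/backups/")
    # 5. Unlock replicas, then restart config servers.
    for node in data_nodes:
        plan.append(f"mongo --host {node} --eval 'db.fsyncUnlock()'")
    for cs in config_servers:
        plan.append(f"ssh {cs} 'systemctl start mongod-config'")
    return plan
```

Even in this simplified form, the ordering constraints are strict: every lock must precede every dump, and every dump must precede the corresponding unlock, which is exactly why the process gets harder to get right as the cluster grows.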
Apart from the complexity of the backup process, the mongodump tool has some further limitations:
- Mongodump does not support incremental backups. MongoDB suggests using file-system-level snapshot functionality to capture incremental changes, so your ability to back up MongoDB incrementally depends on the capabilities of the underlying file system.
- A mongodump-based backup process requires taking a lock on each replica, which means the replica will not receive any new updates until it is unlocked. MongoDB can only tolerate an oplog's worth of changes between replicas: if a locked replica falls more than one oplog behind, MongoDB performs a complete resync of that replica when it is unlocked. So unless your backup process is efficient, every backup run may end up resynchronizing entire replicas.
- Mongodump only supports full restores. The current production cluster must be clobbered before data can be restored from backup media, and the cluster being restored must match the size of the cluster at the time the backup was taken.
- Object-level restores can be very tricky and resource intensive. To restore a subset of objects from backup media, you must recreate an identical cluster to the one that existed at the time of the backup, restore the backup data to it, and then retrieve the objects you need.
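The oplog-window risk described above can be estimated with simple arithmetic: a replica can stay locked only as long as the primary's write rate times the lock duration stays below the oplog capacity. A rough back-of-the-envelope helper (our own construction, not a MongoDB API):

```python
def max_safe_lock_seconds(oplog_size_mb, write_rate_mb_per_s, safety_factor=0.8):
    """Rough upper bound on how long a replica can stay locked for
    backup before it falls more than one oplog behind the primary
    and triggers a full resync on unlock. safety_factor leaves
    headroom for write bursts; all figures are estimates."""
    if write_rate_mb_per_s <= 0:
        raise ValueError("write rate must be positive")
    return (oplog_size_mb * safety_factor) / write_rate_mb_per_s
```

For example, a 1 GB oplog on a primary sustaining 2 MB/s of writes leaves only a few hundred seconds of safe lock time, which is easily exceeded when dumping and copying a large node.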
Cassandra is a highly scalable peer-to-peer database system. Data is replicated among multiple nodes across multiple data centers, so single-node or even multi-node failures can be recovered from the surviving nodes, as long as the data exists on those nodes.
Cassandra stores data as SSTables, where each SSTable is a file containing key/value pairs. Cassandra creates numerous SSTables over time. Though SSTables are immutable, Cassandra periodically compacts and merges these files to consolidate data and free up storage.
Cassandra uses Linux hard links to create snapshots. During a snapshot operation, Cassandra creates a hard link for each SSTable in a designated snapshot directory. As long as the SSTables are unchanged, the snapshot consumes no additional storage. During a compaction or merge operation, however, new SSTables are written while the snapshot's hard links keep the old ones on disk, so the system requires an equal amount of free space. Cassandra also supports incremental snapshots, creating hard links to the SSTables written since the last full snapshot operation.
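The hard-link mechanism is easy to demonstrate outside Cassandra. The sketch below mimics a snapshot by hard-linking every file in a data directory; the links share the same inode and data blocks as the originals, so the snapshot itself consumes no extra capacity until the source files are rewritten. Function and directory names are illustrative:

```python
import os

def snapshot_sstables(data_dir, snapshot_dir):
    """Mimic Cassandra's snapshot mechanism: hard-link every file
    in data_dir into snapshot_dir. Each link points at the same
    inode as its source, so no data is copied."""
    os.makedirs(snapshot_dir, exist_ok=True)
    for name in os.listdir(data_dir):
        src = os.path.join(data_dir, name)
        if os.path.isfile(src):
            os.link(src, os.path.join(snapshot_dir, name))
```

This is also why compaction inflates space usage under a snapshot: deleting a hard-linked SSTable from the data directory does not free its blocks while the snapshot's link to the same inode remains.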
Though Cassandra supports incremental snapshots, it still suffers from the same limitations as MongoDB's native backup tooling:
- Cassandra retains only one set of incremental backups, covering the changes made since the last snapshot. To leverage the feature, your backup process must actively copy the incremental backups elsewhere and delete them so that Cassandra can create new ones. Because the incrementals accumulate all changes since the last snapshot, their size grows day by day, forcing your backup process to take a full snapshot periodically to keep the incrementals manageable.
- Backup scripts usually compress and tar SSTables before uploading them to non-local storage. These operations consume system resources aggressively, putting additional strain on the cluster.
- Cassandra does not provide a restore utility, but the restore procedure is well documented. A restore can be performed per keyspace, which avoids a complete restore of the cluster. The process, however, is quite involved:
  a. Shut down Cassandra.
  b. Clear all files in the commit log directory (assuming the cluster shutdown flushed the commit logs).
  c. Remove all contents of the active keyspace.
  d. Copy the contents of the desired snapshot to the active keyspace.
  e. If the incremental backups feature is enabled, copy the contents of the latest incremental backup on top of the restored snapshot.
  f. Repeat steps c-e on all nodes.
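Steps c-e of this procedure amount to directory surgery on each node. They can be sketched as follows, assuming Cassandra is already stopped and the commit logs cleared; the paths and helper name are illustrative:

```python
import os
import shutil

def restore_keyspace(keyspace_dir, snapshot_dir, incremental_dir=None):
    """Replay the documented per-node restore steps for one keyspace:
    wipe the active keyspace, copy the snapshot back, then overlay
    the latest incremental backup if one exists."""
    # Step c: remove all contents of the active keyspace.
    if os.path.isdir(keyspace_dir):
        shutil.rmtree(keyspace_dir)
    # Step d: copy the snapshot contents into the active keyspace.
    shutil.copytree(snapshot_dir, keyspace_dir)
    # Step e: overlay the latest incremental backup, if enabled.
    if incremental_dir and os.path.isdir(incremental_dir):
        for name in os.listdir(incremental_dir):
            shutil.copy2(os.path.join(incremental_dir, name),
                         os.path.join(keyspace_dir, name))
```

Nothing here is transactional: a failure partway through leaves the keyspace in a mixed state, and the same sequence must be repeated correctly on every node, which is why an automated, verified restore path matters at scale.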
At Trilio, we are hard at work solving these problems. To learn more, please contact us.