Python, the versatile programming environment, has a variety of uses. This article will familiarise the reader with a few Python based tools that can be used for secured backup and recovery of data.
We keep data on portable hard disks, memory cards, USB Flash drives and similar media. Ensuring the long term preservation of this data through timely backups is very important. These drives are often corrupted by malicious programs or viruses, so the data on them should be protected using secure backup and recovery tools.
Popular tools for secured backup and recovery
For secured backup and recovery of data, it is always preferable to use performance-aware software tools and technologies, which can protect the data against any malicious or unauthenticated access. A few free and open source software tools which can be used for secured backup and recovery of data in multiple formats are: AMANDA, Bacula, Bareos, Clonezilla, FOG, Rsync, BURP, Duplicati, BackupPC, Mondo Rescue, Grsync, Areca Backup, etc.
Python as a high performance programming environment
Python is a widely used programming environment for almost every application domain, including Big Data analytics, wireless networks, cloud computing, the Internet of Things (IoT), security tools, parallel computing, machine learning, knowledge discovery, deep learning, NoSQL databases and many others. It is a free and open source programming language with built-in support for system programming, high level programming and networking. In addition, Python can interface with a wide range of data channels, whether live streams from social media or real-time satellite feeds. A number of other programming languages have been influenced by Python, including Boo, Cobra, Go, Groovy, Julia, OCaml, Swift, ECMAScript and CoffeeScript. Other programming environments that build on the base code and programming paradigm of Python are under development.
Python has a rich repository of packages for large applications and domains including image processing, text mining, systems administration, Web scraping, Big Data analysis, database applications, automation tools, networking, video processing, satellite imaging, multimedia and many others.
Python Package Index (PyPI): https://pypi.python.org/pypi
The Python Package Index (PyPI), which is also known as the Cheese Shop, is the repository of Python packages for different software modules and plugins developed as add-ons to Python. As of September 2017, there were more than 117,000 packages for different functionalities and applications in PyPI. This had risen to 123,086 packages by November 30, 2017.
The table in Figure 1 gives the statistics fetched from ModuleCounts.com, which maintains data about modules, plugins and software tools.
Python based packages for secured backup and recovery
As Python has assorted tools and packages for diverse applications, security and backup tools with extensive functionality are also available in PyPI. Descriptions of the key Python based tools that offer security and integrity during backup follow.
Rotate-Backups is a simplified command line tool that is used for backup rotation. It has multiple features including flexible rotations on particular timestamps and schedules.
The installation process is quite simple. Give the following command:
$ pip install rotate-backups
The usage is as follows (the table at the bottom of this page lists the options):
$ rotate-backups [Options]
The rotation approach in Rotate-Backups can be customised as strict rotation (enforcement of the time window) or relaxed rotation (no enforcement of time windows).
The timeline and schedules of the backup can be specified in the configuration file as follows:
# /etc/rotate-backups.ini:
[/backups/mylaptop]
hourly = 24
daily = 7
weekly = 4
monthly = 12
yearly = always
ionice = idle

[/backups/myserver]
daily = 7 * 2
weekly = 4 * 2
monthly = 12 * 4
yearly = always
ionice = idle

[/backups/myregion]
daily = 7
weekly = 4
monthly = 2
ionice = idle

[/backups/myxbmc]
daily = 7
weekly = 4
monthly = 2
ionice = idle
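To see how such a schedule reads as data, the configuration can be loaded with Python's standard configparser module. The following is a minimal sketch and not part of Rotate-Backups itself: the names retention_count and load_schedules are hypothetical, and the handling of values such as 7 * 2 is a simplification that assumes only whole-number multiplication is used.

```python
import configparser

# Two sections copied from the configuration shown above.
SAMPLE = """
# /etc/rotate-backups.ini:
[/backups/mylaptop]
hourly = 24
daily = 7
weekly = 4
monthly = 12
yearly = always
ionice = idle

[/backups/myserver]
daily = 7 * 2
weekly = 4 * 2
monthly = 12 * 4
yearly = always
ionice = idle
"""

FREQUENCIES = ("hourly", "daily", "weekly", "monthly", "yearly")


def retention_count(value):
    """Turn '7', '7 * 2' or 'always' into an int (None means keep always)."""
    value = value.strip()
    if value == "always":
        return None
    result = 1
    for factor in value.split("*"):
        result *= int(factor)  # int() tolerates surrounding whitespace
    return result


def load_schedules(text):
    """Map each backup directory to its {frequency: count} schedule."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    schedules = {}
    for directory in parser.sections():
        schedules[directory] = {
            freq: retention_count(parser[directory][freq])
            for freq in FREQUENCIES
            if freq in parser[directory]
        }
    return schedules


if __name__ == "__main__":
    for directory, schedule in load_schedules(SAMPLE).items():
        print(directory, schedule)
```

Each section name is the directory being rotated, and each frequency maps to the number of backups to keep, so the /backups/myserver section above keeps 14 daily backups.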
Bakthat is a command line tool for cloud based backups. It can compress, encrypt and upload files with a high degree of integrity and security. Its features include compression with tarfile, encryption using beefish, uploading of data to Amazon S3 and Glacier, a local inventory of backups kept in an SQLite database, sync based backups and many others.
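The compression step can be illustrated with the standard library alone. The sketch below is not Bakthat's implementation: it only shows a directory being packed with tarfile and a SHA-256 checksum being computed so that the archive can be verified later. The function names make_archive and sha256_of are hypothetical, and encryption and upload are left out because they need third-party libraries.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def make_archive(source_dir, dest_dir):
    """Compress source_dir into dest_dir/<name>.tgz and return the archive path."""
    source = Path(source_dir)
    archive = Path(dest_dir) / (source.name + ".tgz")
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps only the directory name, not its absolute path
        tar.add(source, arcname=source.name)
    return archive


def sha256_of(path, block_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read block by block."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work:
        source = Path(work) / "mydirectory"
        source.mkdir()
        (source / "notes.txt").write_text("hello backup\n")
        archive = make_archive(source, work)
        print(archive.name, sha256_of(archive))
```

Storing the checksum alongside the archive makes it possible to detect a corrupted copy before attempting a restore.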
Installation is as follows:
$ pip install bakthat
For source based installation, give the following commands:
$ git clone https://github.com/tsileo/bakthat.git
$ cd bakthat
$ sudo python setup.py install
For configuration with the options of security and cloud setup, give the command:
$ bakthat configure
Usage is as follows:
$ bakthat backup mydirectory
To supply the password through an environment variable, give the following command:
$ BAKTHAT_PASSWORD=mysecuritypassword bakthat backup mydocument
You can restore the backup as follows:
$ bakthat restore mybackup
$ bakthat restore mybackup.tgz.enc
For backing up a single file, type:
$ bakthat backup /home/mylocation/myfile.txt
To back up to Glacier on the cloud, type:
$ bakthat backup myfile -d glacier
To disable the password prompt, give the following command:
$ bakthat backup myfile --prompt no
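When these commands are run from a scheduled script, it can help to build them in Python. The following is a rough sketch that uses only the subcommand, flags and environment variable shown above. The helper names are hypothetical, and the sketch only prints the commands; actually running them assumes that bakthat is installed and configured.

```python
import os
import shlex


def bakthat_backup_command(path, destination=None, prompt=True):
    """Build the argument list for `bakthat backup` from the options shown above."""
    argv = ["bakthat", "backup", path]
    if destination:
        argv += ["-d", destination]
    if not prompt:
        argv += ["--prompt", "no"]
    return argv


def environment_with_password(password):
    """Copy the current environment and add BAKTHAT_PASSWORD to the copy."""
    env = dict(os.environ)
    env["BAKTHAT_PASSWORD"] = password
    return env


if __name__ == "__main__":
    print(shlex.join(bakthat_backup_command("myfile", destination="glacier")))
    print(shlex.join(bakthat_backup_command("myfile", prompt=False)))
    # To run a command for real (requires bakthat to be installed):
    # import subprocess
    # subprocess.run(bakthat_backup_command("mydocument"),
    #                env=environment_with_password("mysecuritypassword"),
    #                check=True)
```

Passing the password through a copied environment keeps it out of the argument list, which is visible to other users in process listings.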
BorgBackup (or just Borg, for short) is a deduplicating backup tool developed in Python, which can be used on its own or as part of a larger software framework. It provides an effective way to back up and recover data securely.
The key features of BorgBackup include the following:
- Space efficiency
- Higher speed and minimum delays
- Data encryption using 256-bit AES
- Dynamic compression
- Off-site backups
- Backups can be mounted as a file system
- Compatible with multiple platforms
To initialise a new repository, use the following command:
$ borg init -e repokey /PathRepository
To create a backup archive, use the command given below:
$ borg create /PathRepository::Saturday1 ~/MyDocuments
For another backup with deduplication, use the following command (its output is shown below it):
$ borg create -v --stats /path/to/repo::Saturday2 ~/Documents
------------------------------------------------------------------------------
Archive name: Saturday2
Archive fingerprint: 612b7c35c...
Time (start): Sat, 2017-11-27 14:48:13
Time (end):   Sat, 2017-11-27 14:48:14
Duration: 0.98 seconds
Number of files: 903
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:                6.85 MB              6.85 MB             30.79 kB
All archives:               13.69 MB             13.71 MB              6.88 MB

                       Unique chunks         Total chunks
Chunk index:                     167                  330
------------------------------------------------------------------------------
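The small deduplicated size of the second archive is easier to understand with a toy model. The sketch below is a deliberate simplification and not Borg's algorithm: it splits the data into fixed-size chunks and stores each distinct chunk once under its SHA-256 digest, whereas Borg uses variable-size, content-defined chunks along with compression and encryption. The ChunkStore class and the CHUNK_SIZE value are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose, so the demo data splits into several chunks


class ChunkStore:
    """Keeps every distinct chunk exactly once, keyed by its SHA-256 digest."""

    def __init__(self):
        self.chunks = {}  # hex digest -> chunk bytes

    def add_archive(self, data):
        """Store data and return (original_size, bytes_newly_stored)."""
        new_bytes = 0
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            key = hashlib.sha256(chunk).hexdigest()
            if key not in self.chunks:
                self.chunks[key] = chunk
                new_bytes += len(chunk)
        return len(data), new_bytes


if __name__ == "__main__":
    store = ChunkStore()
    print(store.add_archive(b"AAAABBBBCCCC"))      # (12, 12): everything is new
    print(store.add_archive(b"AAAABBBBCCCCDDDD"))  # (16, 4): only DDDD is new
```

The second call stores only the chunk that was not already present, which is the same effect that appears in the 'Deduplicated size' column of the Borg output.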
MongoDB Backup is a Python package for backing up the databases and collections of the MongoDB NoSQL database, without any issues of size. It connects directly to MongoDB, which listens on port 27017 by default, to back up instances and clusters.
Installation is as follows:
$ pip install mongodb-backup
The built-in help lists the commands and options that MongoDB Backup supports:
$ mongodbbackup --help
To take a backup of a single, standalone MongoDB instance, type:
$ mongodbbackup -p <port> --primary-ok <Backup-Directory>
To take a backup of a cluster, config server and shards, use the following command:
$ mongodbbackup --ms-url <MongoS-URL> -p <port> <Backup-Directory>
You can use any of these reliable packages available in Python to secure data and back it up, depending on the data that needs to be protected.