Backup Solutions
VERY IMPORTANT: URCF provides only very limited data backups, in the form of daily "snapshots" with retention of 3 months for home directories and 6 months for group directories. Please contact urcf-support@drexel.edu if you require data recovery. For more security, it is strongly suggested that you arrange your own backups to an off-site location.
This article lists several ways of backing up your data. Please contact urcf-support@drexel.edu to discuss possible backup solutions for your specific situation.
Drexel CrashPlan
Drexel has a CrashPlan service for under $100 per year per user (up to 4 PCs per user), with no storage limits. It also protects against ransomware, as it provides snapshots of data. Drexel's plan includes a Linux client, downloadable from https://software.drexel.edu/. This Linux client is available only to faculty and staff who have purchased CrashPlan through Drexel.
Please see: https://drexel.edu/it/computers-software/backup/
BackBlaze
BackBlaze has a cloud storage service called B2 that can be used as the backend for Duplicity and Restic (see below). Please see the BackBlaze documentation, as well as the details on setting up Duplicity with BackBlaze B2. Other backup applications, e.g. Duplicati, may also be used with B2.
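As a minimal sketch of using B2 as a Restic backend: Restic reads B2 credentials from the B2_ACCOUNT_ID and B2_ACCOUNT_KEY environment variables and addresses a B2 repository as b2:bucket:path. The function names, the RESTIC override variable, and any bucket or path you pass in are illustrative assumptions, not part of any URCF setup.

```shell
# Sketch only: bucket names, paths, and credentials are placeholders.
# RESTIC is an illustrative variable; override it to try the commands
# against a stub before touching real data.
RESTIC="${RESTIC:-restic}"

# Build a Restic repository string for a B2 bucket: b2:<bucket>:<path>
b2_repo() {
    printf 'b2:%s:%s\n' "$1" "$2"
}

# Back up one directory to the given B2 bucket and path.
# Assumes the repository was created once beforehand with:
#   restic -r b2:<bucket>:<path> init
b2_backup() {
    local bucket="$1" path="$2" source_dir="$3"
    # Restic needs both B2 credentials in the environment.
    if [ -z "${B2_ACCOUNT_ID:-}" ] || [ -z "${B2_ACCOUNT_KEY:-}" ]; then
        echo "B2_ACCOUNT_ID and B2_ACCOUNT_KEY must be set" >&2
        return 1
    fi
    "$RESTIC" -r "$(b2_repo "$bucket" "$path")" backup "$source_dir"
}
```

For example, after exporting the two credentials, "b2_backup my-bucket urcf/home ~/data" would run "restic -r b2:my-bucket:urcf/home backup ~/data" (my-bucket is a hypothetical bucket name).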
Dropbox
Dropbox[1] can be run in "headless" mode, i.e. without a graphical user interface and without GUI integration (Windows Explorer, Mac OS X Finder, Linux Nautilus). Follow the instructions for installing a private copy, using the 64-bit version: https://www.dropbox.com/install?os=lnx
[juser@proteusa01 ~]$ cd ~ && wget -O - "https://www.dropbox.com/download?plat=lnx.x86_64" | tar xzf -
This installs the Dropbox daemon (background process) into ~/.dropbox-dist/. Start the daemon:
[juser@proteusa01 ~]$ ~/.dropbox-dist/dropboxd &
Note that this program needs to be running to sync data. If the login node is ever rebooted, this program will need to be restarted. See below for a way to start the daemon every time you log in.
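One way to handle restarts by hand is a small function in ~/.bash_profile that starts the daemon only when it is not already running. This is a sketch under assumptions: DROPBOX_DIST and start_dropboxd_once are illustrative names, and the check assumes a running daemon shows the install directory in its command line.

```shell
# Sketch for ~/.bash_profile: start the Dropbox daemon at login,
# but only if it is installed and not already running.
# DROPBOX_DIST is an illustrative variable, not something Dropbox defines.
DROPBOX_DIST="${DROPBOX_DIST:-$HOME/.dropbox-dist}"

start_dropboxd_once() {
    # Nothing to do if the daemon was never installed.
    if [ ! -x "$DROPBOX_DIST/dropboxd" ]; then
        echo "dropboxd not found in $DROPBOX_DIST" >&2
        return 1
    fi
    # Assumption: a running daemon has the install directory in its
    # command line. pgrep -u restricts the search to your own processes,
    # and -f matches against the full command line.
    if pgrep -u "$(id -u)" -f "$DROPBOX_DIST/" >/dev/null 2>&1; then
        echo "dropboxd already running"
        return 0
    fi
    # nohup keeps the daemon from being stopped by the hangup at logout.
    nohup "$DROPBOX_DIST/dropboxd" >/dev/null 2>&1 &
    echo "dropboxd started"
}
```

Calling start_dropboxd_once near the end of ~/.bash_profile would then bring the daemon back on the first login after a reboot, without starting a second copy on later logins.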
Following the suggestion of Dropbox, you may download a Python script to control Dropbox. Place it in a directory in your PATH, and make sure it is executable (chmod +x dropbox.py). Then, to see the status of Dropbox sync:
[juser@proteusa01 ~]$ dropbox.py status
Syncing (2,222 files remaining)
Indexing 1,101 files...
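The status command can also be polled from a script, for instance to wait until a sync has finished before logging out. The following is a rough sketch: wait_for_sync and DROPBOX_CLI are illustrative names, the retry count and delay are arbitrary defaults, and it assumes the status text begins with "Up to date" once syncing is complete.

```shell
# DROPBOX_CLI is an illustrative variable naming the control script.
DROPBOX_CLI="${DROPBOX_CLI:-dropbox.py}"

# Poll the sync status until it reports "Up to date".
#   $1 = maximum number of status checks (default 60)
#   $2 = seconds to sleep between checks (default 10)
wait_for_sync() {
    local tries="${1:-60}" delay="${2:-10}" status=""
    while [ "$tries" -gt 0 ]; do
        status="$("$DROPBOX_CLI" status)"
        case "$status" in
            "Up to date"*)
                echo "sync complete"
                return 0
                ;;
        esac
        tries=$((tries - 1))
        sleep "$delay"
    done
    # Give up and report the last status seen.
    echo "timed out; last status: $status" >&2
    return 1
}
```

For example, "wait_for_sync 30 60" would check once a minute for up to half an hour.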
There is help available:
[juser@proteusa01 ~]$ dropbox.py help
Dropbox command-line interface
commands:
Note: use dropbox.py help <command> to view usage for a specific command.
 status      get current status of the dropboxd
 throttle    set bandwidth limits for Dropbox
 help        provide help
 puburl      get public url of a file in your dropbox's public folder
 stop        stop dropboxd
 running     return whether dropbox is running
 start       start dropboxd
 filestatus  get current sync status of one or more files
 ls          list directory contents with current sync status
 autostart   automatically start dropbox at login
 exclude     ignores/excludes a directory from syncing
 lansync     enables or disables LAN sync
 sharelink   get a shared link for a file in your dropbox
 proxy       set proxy settings for Dropbox
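Of the commands listed, exclude is useful for keeping large scratch or temporary directories out of the sync. A minimal sketch, assuming the "exclude add" and "exclude list" subcommands of the control script; dropbox_exclude and DROPBOX_CLI are illustrative names, and any directory you pass in is your own choice.

```shell
# DROPBOX_CLI is an illustrative variable naming the control script.
DROPBOX_CLI="${DROPBOX_CLI:-dropbox.py}"

# Exclude each given directory from syncing, then list the exclusions.
dropbox_exclude() {
    local dir
    for dir in "$@"; do
        # Stop at the first directory that cannot be excluded.
        "$DROPBOX_CLI" exclude add "$dir" || return 1
    done
    "$DROPBOX_CLI" exclude list
}
```

For example, "dropbox_exclude ~/Dropbox/scratch" would stop a hypothetical scratch directory from being synced.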
Restic
See: Restic
Duplicity + Duply
Duplicity[2] is open-source, Linux-based software. Duply[3] provides a friendlier front-end to Duplicity. Both are installed on the login nodes. Duplicity supports many remote file server protocols, including the following (see the Duplicity web site for an up-to-date list):
- acd_cli
- Amazon S3
- Backblaze B2
- Copy.com
- DropBox
- ftp
- GIO
- Google Docs
- Google Drive
- HSI
- Hubic
- IMAP
- local filesystem
- Mega.co
- Microsoft Azure
- Microsoft Onedrive
- par2
- Rackspace Cloudfiles
- rsync
- Skylabel
- ssh/scp
- SwiftStack
- Tahoe-LAFS
- WebDAV
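Whichever backend is chosen, the basic Duplicity workflow is the same: back up to a target URL, restore from it, and prune old backups. The sketch below assumes the classic command-line form (source and target with no explicit action); newer Duplicity releases may prefer an explicit action, so check the version installed. The function names, the DUPLICITY override variable, and the retention periods are illustrative assumptions. Duplicity reads the encryption passphrase from the PASSPHRASE environment variable.

```shell
# DUPLICITY is an illustrative variable; override it to test with a stub.
DUPLICITY="${DUPLICITY:-duplicity}"

# Incremental backup, forcing a new full backup once a month.
#   $1 = directory to back up, $2 = target URL
duplicity_backup() {
    local source_dir="$1" target_url="$2"
    "$DUPLICITY" --full-if-older-than 1M "$source_dir" "$target_url"
}

# Restore the most recent backup into a directory.
#   $1 = target URL, $2 = directory to restore into
duplicity_restore() {
    local target_url="$1" restore_dir="$2"
    "$DUPLICITY" restore "$target_url" "$restore_dir"
}

# Delete backups older than six months (--force actually deletes them).
duplicity_prune() {
    local target_url="$1"
    "$DUPLICITY" remove-older-than 6M --force "$target_url"
}
```

For example, with a hypothetical off-site server, "duplicity_backup ~/data sftp://juser@backup.example.org//backups/data" would back up ~/data over SFTP.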
Duplicati
Duplicati is open source software that is free to use. Similar to Duplicity, it provides multiple backends for backups. See the Duplicati official website.
Google Drive
Google Drive provides 15 GB for free, with more storage available for purchase. Unlike Dropbox, there is no official Linux sync software. However, there are several third party Linux sync agents. Two which provide a command-line interface are odeke-em/drive (open source) and InSync; there are instructions for running InSync in command line (headless mode).
odeke-em/drive
Please read the full instructions at https://github.com/odeke-em/drive
First, download the executable named drive_linux and put it in a directory in your search path, such as ~/bin. See Installing Software for Your Research Group for details.
This is NOT an automatic sync like Dropbox. You have to manually "pull" and "push" to sync files one way or the other.
Once you have the drive_linux executable downloaded and placed in ~/bin, proceed with the following.
We suggest creating a subdirectory based on the Google account you want to use.
[juser@proteusa01 ~]$ mkdir -p GoogleDrive/myname@gmail.com
[juser@proteusa01 ~]$ drive_linux init ~/GoogleDrive/myname@gmail.com
It will then print out a link that you should paste into your browser. The browser will then show a page asking for permission for "drive" to access your Google Drive; allow it. Then, the browser will show a string that you can paste into the terminal. Once that is done, you can "pull" all files from Google Drive into that directory by running "drive_linux pull". N.B. this pulls your entire Google Drive contents down by default.
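Because a bare pull fetches everything, it is often more practical to pull or push a single subdirectory. The sketch below assumes that the pull and push commands accept a path relative to the initialized directory; drive_pull_dir, drive_push_dir, DRIVE, and DRIVE_ROOT are illustrative names, and DRIVE_ROOT defaults to the example directory created above.

```shell
# DRIVE and DRIVE_ROOT are illustrative variables for this sketch.
DRIVE="${DRIVE:-drive_linux}"
DRIVE_ROOT="${DRIVE_ROOT:-$HOME/GoogleDrive/myname@gmail.com}"

# Pull one subdirectory from Google Drive into the local copy.
# The subshell keeps the caller's working directory unchanged.
drive_pull_dir() {
    ( cd "$DRIVE_ROOT" && "$DRIVE" pull "$1" )
}

# Push one local subdirectory up to Google Drive.
drive_push_dir() {
    ( cd "$DRIVE_ROOT" && "$DRIVE" push "$1" )
}
```

For example, "drive_push_dir results" would upload a hypothetical results subdirectory and leave the rest of the Drive untouched.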
InSync
InSync is a commercial client. Currently (25 May 2017), it costs $30 (one-time). This is an automatic sync
client like Dropbox. Their CentOS/RHEL Linux clients are in beta. Download an
"installer" rather than a "package" or "repository". This should be a
file named something like insync-portable_1.3.16.36155_amd64.tar.bz2
(the version numbers may vary). Extract it to some location, e.g.
[juser@proteusa01 ~]$ mkdir apps
[juser@proteusa01 ~]$ cd apps
[juser@proteusa01 apps]$ mv ~/Downloads/insync-portable_1.3.16.36155_amd64.tar.bz2 .
[juser@proteusa01 apps]$ tar xf insync-portable_1.3.16.36155_amd64.tar.bz2
Then, edit your ~/.bashrc to add the directory ~/apps/insync-portable to the PATH, and reload it:
[juser@proteusa01 ~]$ . ~/.bashrc
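The PATH edit can be written so that re-reading ~/.bashrc does not add the same directory twice. A minimal sketch of lines that could go in ~/.bashrc; path_prepend is an illustrative helper name, not a shell builtin.

```shell
# Prepend a directory to PATH unless it is already listed.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present: leave PATH alone
        *) PATH="$1:$PATH" ;;     # otherwise put it first
    esac
}

path_prepend "$HOME/apps/insync-portable"
export PATH
```

Wrapping PATH in colons before matching ensures that only whole entries match, so a directory such as ~/apps/insync-portable-old would not be mistaken for ~/apps/insync-portable.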
After that, follow the instructions in this detailed blog post.
InSync is also available for Windows and macOS.
Microsoft OneDrive
Drexel provides 5 TB (5000 GB) of cloud storage on Microsoft OneDrive. InSync (commercial software), mentioned above, now has MS OneDrive support. However, Drexel domain policy restrictions prevent this from working.
The workaround is to use OneDrive on a Windows machine, and sync your data between Picotte and the Windows machine. Or if the amount of data is small, you can run Firefox on Picotte to use the web interface for OneDrive.
Rclone
Rclone allows for manual syncing of data. CAUTION: OneDrive may arbitrarily throttle data transfers, which means syncs may not be timely.
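A manual sync with Rclone might look like the sketch below. It assumes a remote has already been set up with "rclone config"; the function names and the RCLONE override variable are illustrative, and the bandwidth cap and retry count are arbitrary values chosen with throttling in mind. Note that "rclone sync" makes the destination match the source, which includes deleting destination files that no longer exist in the source.

```shell
# RCLONE is an illustrative variable; override it to test with a stub.
RCLONE="${RCLONE:-rclone}"

# Show what a sync would change, without transferring anything.
#   $1 = local directory, $2 = remote path such as remote:dir
rclone_preview() {
    "$RCLONE" sync --dry-run "$1" "$2"
}

# One-way sync with a bandwidth cap and extra retries.
rclone_push() {
    "$RCLONE" sync --bwlimit 10M --retries 5 "$1" "$2"
}
```

For example, with a hypothetical remote named onedrive, "rclone_preview ~/data onedrive:urcf-backup" lists the pending changes, and "rclone_push" with the same arguments then carries them out.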
Duplicity using MS OneDrive
There is support for OneDrive in Duplicity. However, there does not seem to be support for OneDrive via institutional accounts such as Drexel's. Please watch this space for updates.
Other OneDrive clients for Linux
Besides InSync, here is an article detailing several options: https://medium.com/@glmdev/onedrive-sync-for-linux-ubuntu-2bcbf6777ee4. Again, Drexel domain policies would likely prevent these from working.
None of these solutions have been tested here on Picotte. If you have experience with using any of these, please contact urcf-support@drexel.edu.
References
[1] Dropbox website
[2] Duplicity website
[3] Duply website