Monday, 1 April 2013


OK, you say, I get it, we should back up our data! And then you make a half-hearted attempt and move on. But deep down you know about MTBF, the Mean Time Between Failures. The one thing we can say for certain about mechanical systems (such as your hard disk) is: IT WILL FAIL. The MTBF might give you the confidence that "my hard disk is likely to go on for another 3 years" and I sure hope it does. And when it does fail, often it's not completely dead and you can get much of your data off it...

So my suggestion is to keep your data fully replicated on multiple devices. To do this easily I find it best to have one single directory underneath which everything of importance goes. (Think about it: when your disk blows in 3 years your computer will be old and you will replace it with the shiniest new one that your budget allows. It will have a new version of Windows on it with new versions of the applications you use, with icons all looking different and in unusual places.) You don't need a backup of the Operating System and the Programs; you just need a backup of your data. [Yes, a list of the programs you use will be helpful! Perhaps you should go off and create just such a list in Evernote right now?]

I suggest you have one directory on the root of the filesystem (say c:\) named after the person. Example: c:\Tom

Then you need to consider what kind of data you have. Some of it is not sensitive, such as eBooks, mp3s and movies, whilst you may feel other data is more private in nature, like your salary slips, tax filings, accounts or contracts. You can grant and revoke permissions at a directory level, so I suggest you create the non-sensitive directories directly under the main directory and create a "Private" directory for the more sensitive stuff.
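If you like to script these things, the layout described above can be created in one go. This is only a minimal sketch in Python: the folder names in `PUBLIC_DIRS` and `PRIVATE_DIRS` and the function `create_layout` are my own hypothetical choices for illustration, so rename them to suit your data.

```python
from pathlib import Path

# Hypothetical folder names, chosen only for illustration.
PUBLIC_DIRS = ["eBooks", "Music", "Movies", "Pictures", "ToDo"]
PRIVATE_DIRS = ["Salary", "Tax", "Accounts", "Contracts"]


def create_layout(root):
    """Create the data tree under root and return the directories made."""
    root = Path(root)
    wanted = [root / name for name in PUBLIC_DIRS]
    # Sensitive folders all live under one "Private" directory so that
    # permissions can be set in a single place.
    wanted += [root / "Private" / name for name in PRIVATE_DIRS]
    for directory in wanted:
        # exist_ok=True makes the script safe to run more than once.
        directory.mkdir(parents=True, exist_ok=True)
    return wanted
```

Calling `create_layout(r"c:\Tom")` on a Windows machine would create the folders under c:\Tom. The script does not set any permissions; restricting access to the "Private" directory still has to be done separately.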


You need to decide where the pictures should go. You probably want to share them with friends and family, so they would more likely go into the main directory than the "Private" directory. (If Aunt Mathilda is sitting next to you, do you really want to be clicking around in the "Private" directory?)

I find it very useful to have a "ToDo" folder. This is supposed to be empty but will take all temporary stuff that you haven't filed properly yet.

The goal is to have all your data somewhere under c:\Tom and nothing on your Desktop, nothing in c:\documents and settings\local user\My Pictures\ and other crazy locations. This has an added advantage: some programmers seem to think that they can freely create junk files in your "Documents and Settings" folder. You have no idea what these files are and don't know if you can delete them without breaking anything. With your data in your own structure, programs can freely use those locations and you will just walk away from that pile of junk when you upgrade to your next computer.

Now you are ready to do something about your backups! In the simplest form you just copy the entire c:\Tom folder to an external hard disk. Buy a large one and call the copy something like "\Tom-Backup-2018-03-01" and the next one "\Tom-Backup-2018-04-01". This allows you to go back to an old backup if you discover a file was corrupted or you accidentally lost half the text of your thesis some time in March.
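The dated copy can be automated with a few lines of Python. Treat this as a sketch under assumptions rather than a finished backup tool: the function names `backup_name` and `make_dated_backup` are made up for this example, and the drive letter in the usage note is a placeholder for wherever your external disk shows up.

```python
import shutil
from datetime import date
from pathlib import Path


def backup_name(source, day):
    """Return a name like 'Tom-Backup-2018-03-01' for the source folder."""
    return f"{Path(source).name}-Backup-{day.isoformat()}"


def make_dated_backup(source, backup_drive, day=None):
    """Copy the whole source tree into a new dated folder on backup_drive."""
    day = day or date.today()
    target = Path(backup_drive) / backup_name(source, day)
    # copytree refuses to write into an existing folder, so a backup
    # made earlier on the same date is never silently overwritten.
    shutil.copytree(source, target)
    return target
```

On Windows, `make_dated_backup(r"c:\Tom", "e:\\")` would produce a folder such as e:\Tom-Backup-2018-03-01, assuming the external disk is mounted as e:. Every run makes a full copy, which is why the disk needs to be large.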

I own multiple computers and like to have the whole directory replicated to each machine (in the belief that not all disks will fail at the same time). The problem you run into is that between copies different files will be modified on each of the machines. You need clever software to figure out which files were modified on which machine so that the latest version can be copied over. My favourite software for this is Unison File Synchronizer. It works really fast on huge directories between two Linux machines and works well (but more slowly) when comparing two directories (one local, one remote) on Windows.
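To get a feel for the "which copy is newer" problem, here is a deliberately naive sketch that compares modification times in two directory trees. It is not how Unison works and is no substitute for it: the names `snapshot` and `plan_sync` are hypothetical, and the approach cannot tell a deleted file from a new one or notice that both copies were edited.

```python
from pathlib import Path


def snapshot(root):
    """Map each file's path (relative to root) to its modification time."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): p.stat().st_mtime
        for p in root.rglob("*")
        if p.is_file()
    }


def plan_sync(left_root, right_root):
    """Return (copy_to_right, copy_to_left) lists of relative paths."""
    left, right = snapshot(left_root), snapshot(right_root)
    to_right, to_left = [], []
    for rel in sorted(set(left) | set(right)):
        l, r = left.get(rel), right.get(rel)
        if r is None or (l is not None and l > r):
            # Missing on the right, or the left copy is newer.
            to_right.append(rel)
        elif l is None or r > l:
            # Missing on the left, or the right copy is newer.
            to_left.append(rel)
        # Equal timestamps: assume the files are already in sync.
    return to_right, to_left
```

The function only produces a plan and copies nothing, so you can inspect the two lists before acting on them. The blind spots listed above are exactly why I leave the real work to dedicated software.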

For backups I recommend Box Backup. This backup software looks for changes on the filesystem, encrypts them and uploads them to the backup server. Because it searches for the changes, it doesn't have to upload all 100 MB of a file, just the parts that actually changed. Because it stores the changes, it can reconstruct a file as it was several changes ago. Because it encrypts the data on the client, the person running the server can't decrypt the data. It runs in the background and figures out what to do completely on its own. The client is available for Linux and Windows whilst the server needs to run Linux. The downside is that it is difficult to set up (especially the bit with the cryptography keys). Also, most home users are throttled on the uplink of their internet connection, which makes backups very slow. At worst the Internet will seem slow because the page requests have to queue up behind large backup packets.
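The idea of uploading only the changed parts of a file can be illustrated with a small sketch. I have not looked at how Box Backup actually finds changes, so this is a generic fixed-block comparison and not its algorithm; `BLOCK_SIZE`, `block_hashes` and `changed_blocks` are names and values I picked for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def block_hashes(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return indices of blocks in new that differ from old or are new."""
    old_hashes = block_hashes(old, block_size)
    new_hashes = block_hashes(new, block_size)
    return [
        i for i, digest in enumerate(new_hashes)
        if i >= len(old_hashes) or digest != old_hashes[i]
    ]
```

Only the blocks whose indices are returned would need to be encrypted and uploaded. A weakness of fixed blocks is that inserting a single byte near the start of a file shifts every later block, so everything after it looks changed; tools such as rsync use rolling checksums to cope with that.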

Update on 19 May 2013: looks like an interesting alternative to Unison for directory synchronisation.
