Present Perfect

Home storage strategy

Filed under: Hacking — Thomas @ 2010-04-05 23:54

Drives fail.

Somehow they fail more often around me.

I used to swear by software RAID-1. I loved it because if either disk fails, the other disk still contains all the data and you can get to it and copy it off and so on.

Except that in reality other things get in the way when something fails. Drives may fail at the same time (because of a bad power supply, a power surge, a computer failure, ...). A particular low point was the time when my home server's RAID had a failing disk (which had all my backups on it), and I took out the drive, and added in a new drive to start copying, with both drives hanging out a little outside of the computer case, and at precisely that point a motherboard box dropped out of the shelves two meters higher, and landed with one of its points right on top of the good RAID drive, breaking it. I lost all my backups.

So I learned the hard way that most problems with RAID happen precisely when you need to physically manipulate the system because one of the drives failed.

Ever since, I've been wondering how I can do my storage and backup strategy better. I have some other ideas on how I want to back up my stuff from four computers in two different locations (three if you count 'laptop' as one of them), depending on what types of files they are (my pictures I need to make sure I don't lose, while most music I can re-rip if I really have to), but those should go in a different post (and if you know of any good descriptions of home backup approaches for all these files we seem to collect, please share!).

But it's clear that part of the solution should involve storing files across more than one computer, preferably in a transparent way. I thought about setting up one drive in one machine, then doing a nightly rsync or dirvish snapshot of that drive to the other machine, but with lots of big media files moving around it might not be the best solution.
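
As a sketch, the rsync variant of that idea would boil down to a single cron job mirroring the media drive to the other machine every night; the hostnames and paths here are just placeholders, not my actual layout:

# /etc/cron.d/media-mirror (hypothetical): push the media drive to the home
# server at 03:00; -aH preserves permissions and hardlinks, --delete makes
# the copy an exact mirror (which also means it happily propagates mistakes)
0 3 * * * thomas rsync -aH --delete /mnt/media/ homeserver:/mnt/mirror/media/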

So I was excited when I came across drbd, which implements a distributed file system that mirrors to a secondary disk over the network.

I got two new 2TB drives last week, installed them in my media server and home server (which took quite some fiddling, including blowing out a power supply which exploded in a green flash, sigh), and read through the pretty clear and decent docs, and after 50 hours of syncing empty disks, I now have this:

[root@davedina thomas]# cat /proc/drbd
version: 8.3.6 (api:88/proto:86-91)
GIT-hash: f3606c47cc6fcf6b3f086e425cb34af8b7a81bbf build by bachbuilder@, 2010-02-12 18:23:20

1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
ns:1905605948 nr:0 dw:4 dr:1905606717 al:0 bm:116305 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
[root@davedina thomas]# df --si /mnt/split/
Filesystem Size Used Avail Use% Mounted on
/dev/drbd1 2.0T 206M 1.9T 1% /mnt/split

Sweet! Now to actually copy some files and run some failover tests before trusting my important files to it...
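
For reference, the resource behind /dev/drbd1 comes down to a definition like the following in drbd.conf; I'm calling the resource 'split' here, and the second hostname, the backing disks and the addresses are placeholders rather than the literal config:

resource split {
  protocol C;
  on davedina {
    device    /dev/drbd1;
    disk      /dev/sdb1;         # backing partition on the 2TB drive (example)
    address   192.168.1.10:7789;
    meta-disk internal;
  }
  on mediaserver {               # placeholder name for the other machine
    device    /dev/drbd1;
    disk      /dev/sdb1;
    address   192.168.1.11:7789;
    meta-disk internal;
  }
}

The failover test itself is mostly a matter of swapping the Primary and Secondary roles by hand and checking that the files are still there on the other side:

# on the current primary (davedina): stop using the device and demote it
umount /mnt/split
drbdadm secondary split
# on the other node: promote it and mount the same filesystem
drbdadm primary split
mount /dev/drbd1 /mnt/split
cat /proc/drbd    # the ro: field should now show the roles swapped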

7 Comments »

  1. drbd is actually not a distributed filesystem but more a replicating block device :)

    Comment by fabian — 2010-04-06 @ 01:05

  2. You might check out rdiff-backup for backing up. Both RAID and drbd are more high-availability solutions than backups. So if you're not running a critical service, you might be happier with an actual backup tool.

    After all, disk failures are only one of many ways to lose data.

    Comment by js — 2010-04-06 @ 02:34
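
A rough sketch of the kind of rdiff-backup run js is suggesting (the host and paths below are made up):

# push /home/thomas to a remote machine, keeping reverse increments
rdiff-backup /home/thomas backuphost::/srv/backups/thomas
# restore the tree as it looked three days ago
rdiff-backup -r 3D backuphost::/srv/backups/thomas /tmp/thomas-restore
# prune increments older than six months
rdiff-backup --remove-older-than 6M backuphost::/srv/backups/thomas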

  3. RAID is a waste of time and money for home systems.

    It’s not good enough to reliably protect your data, so the primary use of RAID is for performance or availability reasons… neither of which is terribly important for a personal system. Getting a decent, automated-as-possible backup strategy is what is actually important, and thus is what you should spend your money on.

    Unfortunately people concentrate too much on maximizing storage or performance, and thus backup systems are much rarer and more difficult to set up.

    Duplicating data across multiple machines is a fantastic way to keep things backed up. It’s a lot better than trying to back everything up to Blu-ray or DVD discs and have them silently oxidize in some box somewhere, and having USB-attached storage is not much better than having a second internal drive as ‘backup’. If you have a drive go out on one system, that system is useless until you get a replacement drive and copy the data back over from the existing systems… so the natural tendency is to get that fixed and going as quickly as possible. Conversely, if you were relying on tape backups and your tape drive goes out… then the tendency is for the average person to put off the expense of replacing it for as long as possible.

    Unfortunately, dealing with disconnected changes (a distributed system) and keeping things fully automated is very troublesome.

    What is needed is a way to keep /home/ directories duplicated in an N-way fashion. Rsync works best one-way, things like Unison work well 2-way, and central repository control systems work in a many-to-one situation… but to truly be effective you need a many-to-many replication service that has some limited revision control features to deal with conflicts and disconnected usage. Making it easy to resolve problems while staying fully automated would be a huge challenge.

    I’ve tried using Git for doing that, but source code control systems only work really well for source code and text. Binary files and compressed files are not handled very well by anything.

    I envision something based around HTTPS, TLS, and WebDAV that is integrated into the desktop in a way that it tries to check with the other machines for changes during login and automatically syncs on logout. WebDAV because it’s pretty much universal in its support (so your files are accessible from many different OSes and by many different means), TLS is automatable, a bunch of other services like CalDAV and whatnot use WebDAV for syncing, HTTPS easily works through internet firewalls and blocked ports on home ISPs, there are anonymizing services like Tor, all sorts of things like ssh can be proxied through a web server, and a whole bunch of other fun stuff like that.

    Comment by nate — 2010-04-06 @ 07:20

  4. I used to have a backup server at home, which was used just for that purpose. I use rdiff-backup to copy everything and keep a history; it also works pretty well with remote systems.

    But in an effort to reduce electricity usage I changed my home network from always-on media-server, firewall, workstation and backup-server to just the workstation and on-demand media-server.

    So now I rely on RAID-1 in the machines, mainly because I find it annoying if a drive fails and I have to set up a machine again. And one (soon two) external USB drives, which are connected to the workstation and still used together with rdiff-backup to back up local and remote machines.

    It isn’t perfect, but I reduced my power usage and it is a lot less hassle and cheaper too.

    I would like to use an on-line solution, but I am not looking forward to upload 1T through my ADSL line.

    Comment by Christof — 2010-04-06 @ 10:50

  5. I like to keep my backups low-tech. No fancy replication setups or other complicated layers that can become problematic.
    I back up all my systems by rsnapshotting (over ssh) to a RAID-5 array (with hot spare), and every once in a while I rsync to a NAS drive.
    Ideally I would have two NASes, one of which I would keep off-site, and update each in turn, but I’m not that far yet :)

    Dieter

    Comment by Dieter@be — 2010-04-06 @ 11:24

  6. I’ve been investigating this myself for a long time, and finally ended up using OpenSolaris with ZFS. As was commented above, RAID is really about availability, not reliability — you typically don’t need the former at home.

    What ZFS gives you (and btrfs will on Linux when ready) is checksumming of your data. That’s your first line of defense, and it protects against data corruption (where RAID is completely useless). With increasing data densities on HDD platters these kinds of errors are becoming statistically significant.

    The second line of defense is redundancy, so when you detect an error (checksum or full drive) you don’t lose data. This can be either a local or a remote mirror, preferably both. I have two 1.5TB drives mirrored in my ZFS pool for convenience; this setup can automatically repair checksum errors, and I have fail-over in case one of them goes poof. I’m still in the process of building out remote replication in the family, most probably using http://crashplan.com — it can do local, remote and cloud syncing. Until that happens I’m somewhat covered by data on the now-offline disks that I never wiped after migrating to ZFS, and by the fact that most important new data first lands on my laptop, which I then regularly push to my home server.

    The server itself is an Atom board with 2 green drives and a separate system drive (a ZFS trade-off), consuming around 20 watts. I don’t keep it on all the time, but I certainly could… I wasn’t too thrilled about having to learn yet another OS, but it was actually simple to set up and doesn’t need much fiddling after that. I share over NFS, SMB and HTTP.

    Greg

    Comment by Greg — 2010-04-06 @ 14:13
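
The mirrored pool Greg describes takes only a couple of commands to set up in ZFS; the pool and device names below are placeholders:

# create a two-disk mirror and a filesystem for the important data
zpool create tank mirror c7t0d0 c7t1d0
zfs create tank/photos
# walk all data, verify checksums and repair bad blocks from the other half
zpool scrub tank
zpool status -v tank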

  7. I love the NSLU2, a small ARM device where you can install Debian. It works with a pendrive and USB disks that can be spun down. The rest of the backup tooling is rsync and cron. Awesome: no fans, low power consumption, cheap, and you can even run rtorrent, cups…

    Comment by guillem — 2010-04-08 @ 22:39
