Protecting Your Home Computer Data

Data Overflow

There is never enough storage. I'm sure when computer punch cards were all the rage, you would have to constantly add more storage cabinets to hold all the cards that you collected with those way-cool math programs that would print out pi to 500 places. Then it was mag-tape reels, floppies (8-inch, 5.25-inch, 3.5-inch), optical media (CDs/DVDs), and the brief popularity of the removable storage formats (remember SyQuest/Zip/Jaz drives? I still have some, in case I need to ummmm, boot System 7 on my Mac Quadra).

Your computer, when you bought it, came with a "massive" 80GB hard drive, then you bumped it to 320GB a couple of years later. Then maybe you bumped it again to 750GB. Then you started tacking on one or two external drives to handle backups or maybe your iTunes file overflow. Now we have hard drives coming out of our ears, with 2TB drives coming on the market. Now we run out of drive bays in the computer, or have a rat's nest of external hard drives and their cables strewn everywhere.

The stacks of older, smaller hard drives start to pile up, and are almost like floppies now. And why not? There's no point trying to keep enough drive enclosures around, even if you get some monster 12-drive chassis. You can get these external hard drive docks that let you plug in a bare drive or two (usually SATA drives only) and you can treat your hard drives just like enormous 250GB brick-sized floppies.

Then sooner or later, with all those hard drives in play, one is bound to give up the ghost and fail on you (see my adventure with the dreaded Drive-cicle). Even worse, your drives may be failing silently via bit rot, and you may never know it until one day you're trying to show your friends the pictures of that time when you passed out drunk and your friends drew obscene things all over your face and body, and suddenly your hard drive is telling you "ERROR: can't read file 'drunk_again.jpg'".

Protecting your Primary Storage

Over the years, storage vendors have taken a break from trying to sell us the next great media format and have been producing various external storage solutions to help make your primary data storage more robust (translation: keep all your game torrents, DVD rips, and lolcat^H^H^H^H^H^Hnudie pics safe and sound). Most common are the dual-drive external enclosures, which provide hardware-based mirroring of two drives via RAID-1. Variants include network-accessible RAID-1 devices (NAS + RAID) so that you can share the storage with multiple computers on your home network.

RAID Card

Also popular for more advanced users is adding a RAID controller card to get RAID-1 or RAID-5 for a set of internal drives inside your computer. RAID-5, of course, lets you group multiple drives (minimum of 3) together to form a larger redundant storage device that can survive the complete failure of any one of its drives at a given time. RAID-5 is often used for more upmarket purposes in enterprise environments and businesses, although it's not uncommon to see it in homes now. RAID-5 has been around a long time, but it's not without its disadvantages. Beyond the reach of most home setups are RAID-6 environments, which provide double parity (i.e., they can survive the total failure of 2 drives before you start to lose data). RAID-6 has generally been limited to businesses, due to the cost of products that support RAID-6 (i.e., more than you make in a year). Fortunately, there's ZFS, which we will get to.
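
If you're wondering how a RAID-5 set can lose a whole drive and shrug it off, the trick is parity. Here's a minimal Python sketch of the idea, assuming a toy 3-drive stripe with two data blocks and one XOR parity block. The block contents and names are made up for illustration, and real RAID-5 controllers also rotate the parity across all the drives, which this skips.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical "drives": two hold data, one holds parity.
data_a = b"lolcats!"
data_b = b"pictures"
parity = xor_blocks([data_a, data_b])

# Pretend the drive holding data_b died; rebuild it from the survivors.
rebuilt_b = xor_blocks([data_a, parity])
print(rebuilt_b == data_b)  # the rebuilt block matches the lost one
```

Lose any one of the three blocks and the other two are enough to recompute it. Lose two and you're out of luck, which is the gap that double parity (RAID-6) is meant to close.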

Drobo

An interesting alternative to typical RAID-5 solutions has been the Data Robotics Drobo product line, a very compact, (internally) cable-less, screwless 4-drive chassis with a proprietary RAID architecture in which (similar to RAID-5, but not RAID-5) you can lose 1 drive out of 4 and still be fine. You then just yank out the bad drive and put in a new one, and it will automatically rebuild the failed drive's contents on the new one.

In fact, you can actually lose even up to 3 drives(!) out of 4, and still be okay, IF the amount of your data on your Drobo is small enough. The Drobo is not using RAID-5 but rather a mixed grouping of RAID-1 and RAID-5. It's actually quite interesting, but I won't go into it. If you want to know the details, read about BeyondRAID and a ZDNet review by George Ou.

One of the nice Drobo features is that mixed-sized drives are supported without any fuss. RAID-x setups generally need to be using the same drive size across the entire array (larger drives get truncated down to the size of the smaller drives). The Drobo's primary advantage though is that it's dead simple to use. You just stuff it full of drives and plug it into your computer, with a little bit of initial configuration.

One caveat with Drobo is that if you have wildly varying drive sizes, say 80GB, 250GB, 750GB, and 1.0TB, the 1.0TB drive is going to be mostly reserved for redundancy. The amount of usable storage is roughly the sum of all the disks minus the size of the largest disk. So in this example you'd get about 1TB of usable space, with 700GB reserved for redundancy. However, with 4x1TB, you would see about 2.7TB of usable space. (These numbers are from Drobo's own web site, using their Drobolator interactive calculator.)
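
The rule of thumb above is easy to play with before you go drive shopping. Here's a rough Python sketch of it. The function names are my own invention, and this is only the "sum minus largest" approximation from this post, not Drobo's actual algorithm. I'm also assuming the calculator's 2.7TB figure for 4x1TB comes from reporting decimal gigabytes in binary units.

```python
def drobo_usable_gb(drive_sizes_gb):
    """Rule of thumb: the sum of all drives minus the largest one."""
    if len(drive_sizes_gb) < 2:
        raise ValueError("need at least two drives for any redundancy")
    return sum(drive_sizes_gb) - max(drive_sizes_gb)

def gb_to_tib(gb):
    """Convert decimal gigabytes to binary terabytes (TiB)."""
    return gb * 10**9 / 2**40

mixed = [80, 250, 750, 1000]   # the mixed-size example, in GB
matched = [1000] * 4           # four 1TB drives

print(drobo_usable_gb(mixed))                          # 1080
print(round(gb_to_tib(drobo_usable_gb(matched)), 2))   # 2.73
```

So the mixed bag gives you only a hair more usable space than a single 1TB drive, while four matched 1TB drives give you nearly three times that.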

I'm not specifically endorsing Drobo, as I own none of their products. But it's definitely an option, especially if you have more money than time!

Remember, though, that none of these solutions is a replacement for a true backup. These are all for the purpose of making your primary storage system more robust so that you can survive some disk failures without having to immediately rebuild from backups.

ZFS

Finally, to the good stuff. As luck would have it, UNIX vendor Sun Microsystems has been working on a new combined filesystem and volume management technology, called "ZFS" and first announced in late 2004, for their Solaris UNIX operating system. Even better for power users is that the open-source variant of Solaris, OpenSolaris, not only has ZFS functionality, but often has features ahead of the main Solaris 10 release, acting sort of as a testing and proving ground. So this filesystem that's worthy of enterprise data centers is also easily available for home use, with surprisingly little pain required.

There are lots of web sites that already cover the benefits of ZFS, so I won't try to repeat all of it. A good starting place is the Wikipedia entry and the ZFS overview from Sun.

ZFS and Mac OS X

While I use ZFS quite a bit at work, it wasn't until word got out that Mac OS X 10.5, aka "Leopard", was to have ZFS support included, albeit in very limited fashion, that I really started to think of ZFS for the home. Partly it's because I prefer Mac OS as my primary work environment. It's also because I have plenty of Mac hardware at home, while any x86 gear I have is far too old to consider installing OpenSolaris on.

Another thing that got my juices flowing was seeing this crazy 10-drive PowerMac G5 setup. I happen to still have a PowerMac G5, so this was right up my alley. I had already added one of those 3-bay drive cages (called the Sonnet G5 "Jive", you dig?) that bolt into the empty front portion of the PowerMac G5 chassis. Although I'm not crazy enough to stuff 10 drives into my Mac, 7 was not out of the question: 3 in the Jive, 2 in the PCI slot area, and 2 in the two standard drive bays. (Okay, okay, so I could NOT figure out how the heck to stuff 2 drives up above the optical drive!!)

But with 6 drives (7 minus 1 for the boot drive), that's enough to do some decent mirroring or RAID configuration. So that's the approach I'm taking for now. But before I shell out for some actual hard drives, I'm going to do some pre-visualization of my ZFS set up to see what configuration I may want (mirrored, single-parity, or double-parity), and how many spares, if any.
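
As a first stab at that pre-visualization, here's a small Python sketch that estimates usable space for a single 6-drive ZFS group under each layout. Big assumptions: all drives are the same size (the 1000GB figure is just a placeholder), mirrors are simple two-way pairs, and ZFS's own metadata overhead is ignored. "raidz1" and "raidz2" are the ZFS names for single-parity and double-parity; the function and variable names are hypothetical.

```python
PARITY_DRIVES = {"raidz1": 1, "raidz2": 2}

def usable_drives(layout, total_drives, spares=0):
    """Drives' worth of usable space, ignoring ZFS overhead."""
    active = total_drives - spares
    if layout == "mirror":
        # Two-way mirror pairs: each pair stores one drive's worth of data.
        if active < 2 or active % 2 != 0:
            raise ValueError("two-way mirrors need an even number of active drives")
        return active // 2
    parity = PARITY_DRIVES[layout]
    if active <= parity:
        raise ValueError("not enough active drives for " + layout)
    return active - parity

DRIVE_GB = 1000  # placeholder drive size, not a recommendation

for spares in (0, 1):
    for layout in ("mirror", "raidz1", "raidz2"):
        try:
            usable = usable_drives(layout, 6, spares) * DRIVE_GB
            print(f"{layout:7} spares={spares}: {usable} GB usable")
        except ValueError as err:
            print(f"{layout:7} spares={spares}: skipped ({err})")
```

The trade-off shows up right away: with no spares, mirroring leaves 3 drives' worth of space, single-parity leaves 5, and double-parity leaves 4, and every hot spare you set aside costs you one more.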

What follows will be some basic ZFS primer articles that are both for the reader and for myself to make sure I know what all is really involved in running a ZFS storage system.