Should I bother with raid

Dust0741@lemmy.world · 1 day ago

Should I bother with raid

MangoPenguin@lemmy.blahaj.zone · 7 hours ago

RAID means that if a drive fails you don’t have some downtime while your backups restore. It depends on how you feel about waiting for that.

Possibly linux@lemmy.zip · 2 hours ago

Also it is easier to hit replace

Boomkop3@reddthat.com · 5 hours ago

I want my personal system down until it is back in proper condition tbh

𝕽𝖚𝖆𝖎𝖉𝖍𝖗𝖎𝖌𝖍@midwest.social · 11 hours ago

RAID 1 is mirroring. If you accidentally delete a file, or it becomes corrupt (for reasons other than drive failure), RAID 1 will faithfully replicate that delete/corruption to both drives. RAID 1 only protects you from drive failure.
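
To make that concrete, here is a toy model. It is not how any real RAID implementation works internally; the `Mirror` class and the file name are made up purely for illustration. It shows a mirror surviving a drive failure but faithfully replicating a delete, while a separate point-in-time copy still has the file.

```python
class Mirror:
    """Toy RAID 1: every write and delete is applied to all healthy 'drives'."""

    def __init__(self):
        # Each drive is modelled as a dict of file name -> contents.
        self.drives = [{}, {}]

    def write(self, name, data):
        for drive in self.drives:
            if drive is not None:
                drive[name] = data

    def delete(self, name):
        # The delete is mirrored just like a write.
        for drive in self.drives:
            if drive is not None:
                drive.pop(name, None)

    def fail_drive(self, index):
        # A failed drive is modelled as None.
        self.drives[index] = None

    def read(self, name):
        # Serve the read from any healthy drive that has the file.
        for drive in self.drives:
            if drive is not None and name in drive:
                return drive[name]
        return None


mirror = Mirror()
mirror.write("photo.jpg", b"original")

# A backup is a separate point-in-time copy, kept somewhere else.
backup = dict(mirror.drives[0])

mirror.fail_drive(0)
survives_drive_failure = mirror.read("photo.jpg")  # still readable from drive 1

mirror.delete("photo.jpg")                         # accidental delete
after_delete = mirror.read("photo.jpg")            # gone from the mirror
restored = backup.get("photo.jpg")                 # only the backup still has it

print(survives_drive_failure, after_delete, restored)
```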

Implement backups before RAID. If you have an extra drive, use it for backups first.

There is only one case when it’s smart to use RAID on a machine with no backups, and that’s RAID 0 on a read-only server where the data is being replicated in from somewhere else. All other RAID levels only protect against drive failure, and not against the far more common causes of data loss: user- or application-caused data corruption.

whodatdair@lemmy.blahaj.zone · edit-2 4 hours ago

I know it’s not totally relevant but I once convinced a company to run their log aggregators with 75 servers and 15 disks in raid0 each.

We relied on the app layer to make sure there were at least 3 copies of the data and if a node’s array shat the bed the rest of the cluster would heal and replicate what was lost. Once the DC people swapped the disk we had automation to rebuild the disks and add the host back into the cluster.

It was glorious - 75 servers each splitting the read/write operations 1/75th and then each server splitting that further between 15 disks. Each query had the potential to have ~1100 disks respond in concert, each with a tiny slice of the data you asked for. It was SO fast.

𝕽𝖚𝖆𝖎𝖉𝖍𝖗𝖎𝖌𝖍@midwest.social · 4 hours ago

And that, kids, is a great use of RAID: under some other form of data redundancy.

Great story!

Mikelius@lemmy.ml · 14 hours ago

Raid 1 has saved my server a couple of times over from disaster. I make weekly cold backups, but I didn’t have to worry about it when my alert came in notifying me which drive went dead - just swap, rebuild, move along. So yeah I’d say it’s definitely worth it. Just don’t treat raid as a backup solution - and yes, continue to use an external cold storage backup solution as you mentioned. Fires, exploding power supplies, ransomware, etc don’t care if you’re using raid or not.

Possibly linux@lemmy.zip · 2 hours ago

It is also useful to stop silent corruption

Atemu@lemmy.ml · edit-2 16 hours ago

It depends on your uptime requirements.

According to Backblaze stats on similarly modern drives, you can expect about a 9% probability that at least one of those drives has died after 6 years. Assuming 1 week recovery time if any one of them dies, that’d be roughly 99.97% uptime (about 0.09 weeks of expected downtime over roughly 313 weeks).

If that’s too high of a probability for needing to run a (in case of AWS potentially very costly) restore, you should invest in RAID. Otherwise, that money is better spent on more backups.
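
As a rough check of that arithmetic, here is a minimal sketch. It assumes at most one failure event in the period and a fixed one-week outage, which is a simplification; the function and constant names are made up for illustration. Under those assumptions the expected uptime works out to roughly 99.97%.

```python
# Inputs taken from the figures quoted above; treat them as assumptions.
P_FAILURE = 0.09        # probability that at least one drive dies in the period
PERIOD_YEARS = 6
RECOVERY_WEEKS = 1      # assumed downtime per failure event

WEEKS_PER_YEAR = 365.25 / 7


def expected_uptime(p_failure: float, period_years: float, recovery_weeks: float) -> float:
    """Expected fraction of the period the system is up, assuming at most one failure."""
    period_weeks = period_years * WEEKS_PER_YEAR
    expected_downtime_weeks = p_failure * recovery_weeks
    return 1 - expected_downtime_weeks / period_weeks


uptime = expected_uptime(P_FAILURE, PERIOD_YEARS, RECOVERY_WEEKS)
print(f"Expected uptime: {uptime:.3%}")  # prints "Expected uptime: 99.971%"
```

Plug in your own restore time: a one-day restore instead of a one-week one cuts the expected downtime by a factor of seven.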

Admiral Patrick@dubvee.org · edit-2 1 day ago

I always do some level of RAID. If for no other reason, I’m not out of commission if a disk fails. When you’re working with multi TB, restoring from a backup can take a while. If rapid recovery from a disk failure is not a high priority for you, then you could probably do without RAID.

Either way, make sure you test your backups occasionally.

Another way to put it: With RAID, a disk failure is like your Check Engine light coming on. You can still drive, but you should address the problem as soon as you can. Without RAID, it’s like your engine has seized up and you have to tow it for repair and are without your car until it’s fixed.

Dust0741@lemmy.world · 1 day ago

Hmm that’s a good point.

AWS can also cost a good chunk if you restore un-optimally

BakedCatboy@lemmy.ml · 1 day ago

Keep in mind that if you set up raid using zfs or btrfs (idk how it works with other systems but that’s what I’ve used) then you also get scrubs which detect and fix bit rot and unrecoverable read errors. Without that or a similar system, those errors will go undetected and your backup system will back up those corrupted files as well.

Personally one of the main reasons I used zfs and now btrfs with redundancy is to protect irreplaceable files (family memories and stuff) from those kinds of errors, as I used to just keep stuff on a hard drive until I discovered loads of my irreplaceable vacation photos to be corrupted, including the backups which backed up the corruption.

If your files can be reacquired, then I don’t think it’s a big deal. But if they can’t, then I think having scrubs or integrity checks with redundancy so that issues can be repaired, as well as backups with snapshots to prevent errors or mistakes from messing up your backups, is a necessity. But it just depends on how much you value your files.

Atemu@lemmy.ml · 17 hours ago

Note that you do not need any sort of redundancy to detect corruption.

Redundancy only gains you the ability to have that corruption immediately and automatically repaired.

While this sounds nice in theory, you have no use for such auto repair if you have backups handy because you can simply restore that data manually using your backups in the 2 times in your lifetime that such corruption actually occurs.
(If you do not have backups handy, you should fix that before even thinking about RAID.)

It’s incredibly costly to have such redundancy at a disk level and you’re almost always better off using those resources on more backups instead if data security is your primary concern.
Downtime mitigation is another story but IMHO it’s hardly relevant for most home users.
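
To illustrate detection without redundancy: checksumming filesystems such as btrfs and ZFS verify checksums even on a single disk. For files that never change (photos, archives), the same idea can be sketched by hand with a checksum manifest. The following is a minimal sketch, not a replacement for a proper tool; the function names are made up for illustration, and it cannot tell bit rot apart from a legitimate edit.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Record a checksum for every file under root, keyed by relative path."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_mismatches(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths that are missing or whose contents changed."""
    bad = []
    for rel, expected in manifest.items():
        path = root / rel
        if not path.is_file() or sha256_of(path) != expected:
            bad.append(rel)
    return bad
```

Build the manifest once while the data is known to be good, store it outside the directory it describes, and re-run `find_mismatches` periodically; anything it reports gets restored from backup.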

beastlykings@sh.itjust.works · 2 hours ago

Can you explain this to me better?

I need to work on my data storage solution, and I knew about bit rot but thought the only solution was something like a zfs pool.

How do I go about manually detecting bit rot? Assuming I had perfect backups to replace the rotted files.

Is a zfs pool really that inefficient space wise?

Count042@lemmy.ml · 5 hours ago

“backups in the 2 times in your lifetime that such corruption actually occurs.”

What are you even talking about here? This line invalidates everything else you’ve said.

RedEye FlightControl@lemmy.world · 1 day ago

Yes yes yes yes yes

Raid1 that thing and sleep easier. Good on you for having a cold spare, and knowing to buy your drives at different locations/times to get different batches. Your head is in the right place! No reason to leave that data unprotected if you have the underlying tech and hardware.

kn33@lemmy.world · edit-2 1 day ago

It’s up to you. Things to consider:

Size of data
Recovery speed (Internet speed)
Recovery time objective
Recovery point objective (If you’re backing up once per day, is it okay to lose 23 hours of data when a disk fails?)

If your recovery objectives can be met with the anticipated data size and recovery speed, then you could do RAID 0 instead of RAID 1 to get higher speeds and capacity. Just know that if you do that, you better be on top of your backups because they will be needed eventually.
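
A back-of-the-envelope way to check the first three items against each other is sketched below. The 4 TB and 100 Mbit/s figures and the 80% link efficiency are hypothetical placeholders, and the function names are made up for illustration; swap in your own numbers.

```python
def restore_hours(data_tb: float, link_mbit_s: float, efficiency: float = 0.8) -> float:
    """Hours needed to pull data_tb terabytes over a link of link_mbit_s megabits/s.

    efficiency is an assumed fraction of the nominal link speed actually achieved.
    """
    data_bits = data_tb * 1e12 * 8
    effective_bits_per_s = link_mbit_s * 1e6 * efficiency
    return data_bits / effective_bits_per_s / 3600


def meets_rto(data_tb: float, link_mbit_s: float, rto_hours: float,
              efficiency: float = 0.8) -> bool:
    """True if a full restore fits within the recovery time objective."""
    return restore_hours(data_tb, link_mbit_s, efficiency) <= rto_hours


# Hypothetical example: 4 TB over a 100 Mbit/s connection.
hours = restore_hours(4, 100)
print(f"{hours:.0f} hours")      # prints "111 hours", i.e. more than four days
print(meets_rto(4, 100, 24))     # prints "False"
```

This only covers transfer time; provider-side retrieval delays and reinstalling the system come on top.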

sugar_in_your_tea@sh.itjust.works · 1 day ago

I absolutely would, for a few reasons:

restoring from backup is a last resort and involves downtime; swapping a disk is comparatively easier and less disruptive
it’s possible your backup solution fails, so having some redundancy is always good
read performance - not a major factor, but saturating a gigabit link is always nice

Atemu@lemmy.ml · 17 hours ago

Read perf would be the same or better if you didn’t add redundancy as you’d obviously use RAID0.

RAID is never in any way something that can replace a backup. If the backup cannot be restored, you didn’t have a backup in the first place. Test your backups.
If you don’t trust 1 backup, you should make a second backup rather than using RAID.

The one and only thing RAID has going for it is minimising downtime. For most home use-cases though, the 3rd 9 which this would provide is hardly relevant IMHO.

sugar_in_your_tea@sh.itjust.works · 17 hours ago

“Read perf would be the same or better if you didn’t add redundancy”

RAID 1 can absolutely be faster than a single disk for read perf, and on Linux it is tuned to be faster. It’s not why you’d use it, but it is a feature of RAID. Intuitively, since both disks have exactly the same data, each disk could read different things. Likewise, for writes, you don’t have to write at the same time, as long as they’re always correct (e.g. don’t flip the metadata segment until both have written the data), so you can even get a write boost.

If performance is all you care about, then yeah, go ahead and use RAID 0. But you do get a performance boost with mirroring as well.

Yes, a backup should be tested, but it shouldn’t be relied on. Internet can go down, services can have maintenance, etc, so it’s a lot better to never need it. If you can afford a mirror, it’s worth having.

Moonrise2473@feddit.it · 1 day ago

i was also thinking like this, then i had to restore everything from a backup when the ssd suddenly died. I wasted so much time setting everything back as before

Atemu@lemmy.ml · 17 hours ago

If you needed to spend any time “setting everything back as before”, you didn’t have a full backup.

Moonrise2473@feddit.it · 3 hours ago

the reason OP was thinking of doing this was to save disk space and avoid buying another hdd. If the backup is a 1:1 full disk image, then there’s almost no difference from the cost of raid1; the savings only come from setting exclusions, avoiding certain big files, and so on. In this case he’s talking about restic, which can restore data but makes it very hard to restore a full bootable linux system - stuff needs to be reinstalled

fuckwit_mcbumcrumble@lemmy.dbzer0.com · 1 day ago

Depends, how much do you value your data? Is it all DVD rips where you still have the DVDs? Nah you don’t really need raid. Are they precious family photos where your only backup copy is S3? Yeah I’d use raid for that, plus having a second copy stored elsewhere.

Plus as others have mentioned there’s checks on your data for bitrot, which absolutely does happen.

Atemu@lemmy.ml · 17 hours ago

RAID does not protect your data, it protects data uptime.

RAID cannot ensure integrity (i.e. bitrot protection). Its one and only purpose is to mitigate downtime.

fuckwit_mcbumcrumble@lemmy.dbzer0.com · 16 hours ago

ZFS or other software RAIDs can though. Does anyone still use hardware raid anyways?

maxprime@lemmy.ml · 1 day ago

RAID is a great backup alternative.

/s