RAID array filesystem recommendations
from Svinhufvud@sopuli.xyz to selfhosted@lemmy.world on 10 Nov 2024 15:09
https://sopuli.xyz/post/18991669

I am planning on creating a home server with either 2 (RAID1) or 3 (RAID5) HDDs as bulk storage and 1 SSD as bcache.

The question is, what file system should I use for the HDDs? I am thinking of ext4 or xfs, as I heard btrfs is not recommended for my use case for some reason.

Do you all have some advice to give on what file system to use, as well as some other tips?

#selfhosted


umami_wasbi@lemmy.ml on 10 Nov 2024 15:23 next collapse

I would just skip RAID, add all disks to a single BTRFS filesystem, and use the built-in profiles for (meta)data redundancy.

Caching I don’t know much about, though.

btrfs.readthedocs.io/en/latest/btrfs-device.html
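
As a rough sketch (device names and mount point made up), you’d create one filesystem, add the rest of the disks to it, and convert to a redundant profile:

mkfs.btrfs /dev/sdb
mount /dev/sdb /mnt/pool
btrfs device add /dev/sdc /dev/sdd /mnt/pool                    # grow the same filesystem onto the other disks
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/pool   # convert data and metadata to the raid1 profile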

Svinhufvud@sopuli.xyz on 10 Nov 2024 15:37 next collapse

Are there advantages to btrfs over plain RAID? I understand how RAID works, but btrfs for redundancy is foreign to me.

umami_wasbi@lemmy.ml on 10 Nov 2024 15:41 next collapse

I use BTRFS for snapshots and auto compression. Maybe that can be done with RAID plus LVM? AFAIK BTRFS redundancy is basically the same as traditional RAID, similar to using mdadm. Still, you would want a backup strategy instead of relying on disk redundancy alone. I learned that the hard way.

CondorWonder@lemmy.ca on 10 Nov 2024 16:01 collapse

BTRFS has RAID built into the file system - instead of using MD you use BTRFS profiles which tell the system how to handle data.

For instance

  • file system data (critical for the file system to function): raid1c3, which means 3 copies of core file system data on 3 different devices
  • user data: raid1 (so duplicating all your data on two different devices)

With this setup you could lose one device (of n; the total doesn’t matter) without losing any data, and still be able to boot and recover without too much hassle.

BTRFS does block checksums, can scan for bit rot and recover from it, and generally tries to keep your data safe. It technically supports raid5/6 for user data; the issue is around unclean shutdowns and a potential write hole where you could lose data. But if your system has a UPS and is on a relatively recent kernel, it’s not any more dangerous than MD raid5/6, as I understand it.
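
As a sketch, setting those profiles at creation time (assuming three blank disks; names made up) would be:

mkfs.btrfs -m raid1c3 -d raid1 /dev/sdb /dev/sdc /dev/sdd   # metadata kept in 3 copies, data in 2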

Limonene@lemmy.world on 10 Nov 2024 15:46 collapse

The man page at btrfs.readthedocs.io/en/latest/mkfs.btrfs.html says:

RAID5/6 has known problems and should not be used in production.

So those profiles have problems that are known, but not specified there.

But btrfs is safe on top of md-based raid1/5/6. It also has the advantage that you only need to encrypt one volume.
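
If you went that route, a minimal sketch (LUKS sandwiched between md and btrfs; names are examples):

cryptsetup luksFormat /dev/md0           # one encrypted volume covers the whole array
cryptsetup open /dev/md0 bigraid_crypt   # unlock it as /dev/mapper/bigraid_crypt
mkfs.btrfs /dev/mapper/bigraid_crypt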

Svinhufvud@sopuli.xyz on 10 Nov 2024 20:58 next collapse

Could you elaborate on btrfs on top of md raid?

This one seems the most likely solution for me.

Limonene@lemmy.world on 11 Nov 2024 14:28 collapse

Sure. First you set up a RAID5/6 array in mdadm. This is a purely software thing, which is built into the Linux kernel. It doesn’t require any hardware RAID system. If you have 3-4 drives, RAID5 is probably best, and if you have 5+ drives RAID6 is probably best.

If your 3 blank drives are sdb1, sdc1, and sdd1, run this:

mdadm --create --verbose /dev/md0 --level=5 -n 3 /dev/sdb1 /dev/sdc1 /dev/sdd1

This will create a block device called /dev/md0 that you can use as if it were a single large hard drive.

mkfs.btrfs /dev/md0

That will make the filesystem on the block device.

mkdir /mnt/bigraid
mount /dev/md0 /mnt/bigraid

This creates a mount point and mounts the filesystem.

To get it to mount every time you boot, add an entry for this filesystem in /etc/fstab
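
The entry might look like this (a sketch; using the UUID from blkid instead of the device name is more robust):

/dev/md0  /mnt/bigraid  btrfs  defaults  0  0

On Debian-style systems you’ll probably also want the array recorded in mdadm.conf so it reliably assembles at boot:

mdadm --detail --scan >> /etc/mdadm/mdadm.conf
update-initramfs -u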

Svinhufvud@sopuli.xyz on 11 Nov 2024 15:21 next collapse

Thanks for the info!

Svinhufvud@sopuli.xyz on 11 Nov 2024 16:06 collapse

Do you need to do some maintenance to keep the data in the array intact?

I read about some btrfs scrub commands and md checks and such, but I am unsure how often to run them, and what they actually do.

Atemu@lemmy.ml on 13 Nov 2024 01:03 next collapse

You should scrub your data regularly with btrfs. That’s just a means to verify the data is intact though; a way to detect corruption.
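
For example, from a monthly cron job or timer (mount point made up):

btrfs scrub start /mnt/bigraid    # re-read everything and verify checksums; repairs need a redundant btrfs profile
btrfs scrub status /mnt/bigraid   # progress and error counts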

You cannot really do anything actively to keep the data intact. Failure can and will happen. To keep your data safe, you must plan for failure to happen:

  • Expect a power surge to fry all your disks at the same time.
  • Expect your house to burn down or flood.
  • Expect to run the wrong command and instantly hose your entire array.
  • Expect your backup server to get ransomware’d.

Only if you effectively mitigate these dangers will your data stay safe.

Limonene@lemmy.world on 13 Nov 2024 17:58 collapse

In my system, the raid arrays seem to do periodic data scrubbing automatically. Maybe it’s something that’s part of Debian, or maybe it’s just a default kernel setting. I don’t think it helps much with data integrity – I think it helps more just by ensuring the continued functionality of the drives.

When it’s running, you can type cat /proc/mdstat to see the progress.

That command will also show you if there is a failing drive, so that you can replace it.
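
If you ever want to trigger a check by hand (md0 as the example array), the md sysfs interface does it:

echo check > /sys/block/md0/md/sync_action   # read the whole array and compare copies/parity
cat /proc/mdstat                             # watch its progress

On Debian I believe the automatic run comes from mdadm’s monthly cron job calling checkarray.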

umami_wasbi@lemmy.ml on 10 Nov 2024 18:31 collapse

Oops. Missed that part.

just_another_person@lemmy.world on 10 Nov 2024 15:35 next collapse

Btrfs still has some issues, but it’s not like it’s dangerous or anything.

XFS is going to give you more flexibility for managing volumes, better performance than ext4 across multiple disks, and more fault protection.

ext4 doesn’t really have any benefits in this race besides being stable, I suppose. An argument could be made that it might be slightly faster under LUKS.

ZFS is more complex, but a bit more flexible than XFS: it has CoW, snapshots, built-in encryption, and dynamic storage allocation.

farcaller@fstab.sh on 10 Nov 2024 15:36 next collapse

I would absolutely recommend a file system with snapshot capabilities for a home server. Any of a btrfs mirror, dm-raid (raid5) with btrfs on top, or zfs would work. The practical differences would be negligible at this scale, and you can just pick whatever you fancy.
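
With btrfs, for instance, a read-only point-in-time snapshot is a single cheap command (paths made up):

btrfs subvolume snapshot -r /mnt/data /mnt/data/.snapshots/2024-11-10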

alwayssitting@infosec.pub on 10 Nov 2024 15:37 next collapse

Personally I would go for ZFS with the SSD as an L2ARC. But among the options you listed, I would do BTRFS RAID1 if you’re only gonna use two HDDs, and mdadm RAID5 with BTRFS on top if using three.
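
The ZFS option would look roughly like this (device names made up; /dev/disk/by-id paths are sturdier in practice):

zpool create tank raidz1 /dev/sdb /dev/sdc /dev/sdd cache /dev/sde   # 3-disk raidz1 with the SSD as L2ARC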

possiblylinux127@lemmy.zip on 10 Nov 2024 17:07 next collapse

L2ARC will wear out an SSD faster than normal use would.

alwayssitting@infosec.pub on 10 Nov 2024 18:33 collapse

It will, yeah, although with modern SSDs it really isn’t a big problem. I’ve used a Samsung 840 EVO as L2ARC for 8 years now.

Svinhufvud@sopuli.xyz on 10 Nov 2024 18:03 collapse

What are the advantages of this over mdadm raid and bcache?

possiblylinux127@lemmy.zip on 10 Nov 2024 17:06 next collapse

ZFS for it all, and maybe btrfs if you are OK with its limitations.

jaypg@lemmy.jaypg.pw on 10 Nov 2024 18:46 next collapse

The BTRFS thing is cutting power or losing disks in the middle of a write, which can corrupt your data. If you don’t think that will be a problem then BTRFS is fine. I recommend ZFS personally, but it sounds like you want to use mdadm instead, so basically anything will work.

If you might need to shrink your filesystem later then avoid XFS. ext4 is relatively featureless but ol’ reliable. ZFS is good for long-term data integrity and protection. BTRFS is similar to ZFS. bcachefs is new, but like a swirl of ext4 and BTRFS. Just pick the one with the features you want.

Svinhufvud@sopuli.xyz on 10 Nov 2024 19:46 collapse

Power loss might happen, as I don’t have a UPS.

And when it comes to mdadm, it just happens to be the first and only redundancy tool I know. I am, however, open to learning and trying new things.

ZFS seems interesting, but I read that it requires quite a lot of RAM, and I was only going for 32 GB. Would that be enough?

KaninchenSpeed@sh.itjust.works on 11 Nov 2024 12:13 next collapse

ZFS doesn’t require lots of RAM; more RAM just improves the caching (ARC) it can do. You can set ZFS to use all otherwise-unused RAM as ARC, so it doesn’t interfere with other services running on the same PC. I ran ZFS with lots of VMs on an old office PC with 16 GB of RAM and it was still able to max out a 10 GbE NIC.
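
If you do want to bound it, the zfs_arc_max module parameter sets the ceiling, e.g. an arbitrary 16 GiB cap:

echo 17179869184 > /sys/module/zfs/parameters/zfs_arc_max                # takes effect immediately
echo "options zfs zfs_arc_max=17179869184" >> /etc/modprobe.d/zfs.conf   # persists across reboots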

jaypg@lemmy.jaypg.pw on 11 Nov 2024 18:24 next collapse

ZFS doesn’t require a lot of RAM, but it will use more if it’s available. 32 GB would be plenty for a home setup. I think my home file server has 24 or 32 GB of RAM and ZFS. If it’s important data then stick to what you know; there’s nothing wrong with mdadm.

ShortN0te@lemmy.ml on 10 Nov 2024 20:57 collapse

For ZFS, 1 GB of RAM for every TB of storage is recommended, but you can get by with way less.

Shimitar@feddit.it on 12 Nov 2024 07:16 collapse

Many suggest ZFS; I want to put in a word for ext4 instead. Solid, reliable, well proven. It does the job and works pretty well.

Been on ext4 on RAID1 for decades, ever since it went stable. Never had an issue, except when I borked it through my own mistakes.

It has maybe fewer features than ZFS, but it doesn’t need out-of-tree kernel modules or complex tools, and again: it’s solid, well proven, and very stable.

Edit: ext4 on top of Linux software raid (mdadm)