Hi everyone, I’ve been working on my homelab for a year and a half now, and I’ve tested several approaches to managing a NAS and selfhosted applications. My current setup is an old desktop computer that boots into Proxmox, which has two VMs:

  • TrueNAS Scale: manages storage, shares and replication.
  • Debian 12 w/ docker: for all of my selfhosted applications.

The applications connect to the TrueNAS storage via NFS. I have two identical HDDs as a mirror, another one with no redundancy (but it’s fine, because the data it contains is non-critical), and an external HDD that I want to use for replication, or for some other use I still haven’t decided on.

Now, the issue is the following. I’ve noticed that TrueNAS reports the HDDs as Unhealthy and has complained about checksum errors. It also turns out that it can’t run S.M.A.R.T. checks, because instead of using an HBA, I’m passing the entire HDDs by ID directly to the VM. I’ve read recently that passing virtualized disks to TrueNAS is discouraged, as data corruption can occur. And lately I was having trouble with a selfhosted instance of Gitea, where data (apparently) got corrupted, and git was throwing errors on fetch or pull. I don’t know if this is related or not.

Now the thing is, I have a very limited budget, so I’m not keen on buying a dedicated HBA just on a hunch. Is it really needed?

I mean, I know I could run TrueNAS directly, instead of using Proxmox, but I’ve found TrueNAS to be a pretty crappy Hypervisor (IMHO) in the past.

My main goal is to be able to manage the data that is used in selfhosted applications separately. For example, I want to be able to access Nextcloud’s files, even if the docker instance is broken. But maybe this is just an irrational fear, and I should instead back up the entire docker instances and hope for the best, or maybe I’m just misunderstanding how this works.

In any case, I have some data that I want to archive reliably, and I don’t want the docker apps to have too much control over it. That’s why I went with the current approach. It has also allowed for very granular control. But it’s also a bit more cumbersome, as every time I want to selfhost a new app, I need to configure datasets, permissions and mounting of NFS shares.

Is there a simpler approach to all this? Or should I just buy an HBA and continue with things as they are? If so, which one should I buy (considering a very limited budget)?

I’m thankful for any advice you can give and for your time. Have a nice day!

  • ikidd@lemmy.world
    2 months ago

    Yes. So my debian docker host has some datasets attached:

    mounted via fstab:

    and I specify that path as the datadir for NCAIO:

    Then when PBS calls a backup of that VM, all the datasets that Proxmox is managing for that backup take a snapshot, and that’s what’s backed up to PBS. Since it’s a snapshot, I can back up hourly if I want, and PBS deduplicates, so the backups aren’t using a lot of space.

    Other docker containers might have a mount that’s used as a bind mount inside the compose.yml to supply data storage.

    Also, I have more than one backup job running on PBS so I have multiple backups, including on removable USB drives that I swap out (I restart the PBS server to change drives so it automounts the ZFS volumes on those removable drives and is ready for the next backup).

    You could mount ZFS datasets you create in Proxmox as SMB shares in a sharing VM, and it would be handled the same.

    As for documentation, I’ve never really seen any for doing it this way, but it seems to work. I’ve done restores of entire container stacks this way, as well as walked the backups to individually restore files from PBS.

    If you try it and have any questions, ping me.

    • thelemonalex@lemmy.worldOP
      2 months ago

      Wow, that’s awesome. I think that’s actually the approach I’m going to go for. This way I don’t need to buy hardware, and I don’t need to work with TrueNAS anymore.

      Where you talk about “walking the backups”, do you mean that you can actually see the entire file structure of the container? I mean, I don’t know how virtual disks are stored on the dataset. Like, as far as I know, a VM’s virtual disk is just a file, right? So you’d have a ZFS dataset with a single file, for example? Could you then try and navigate the files inside this VM disk file, without the VM? Or did I misunderstand, and you’re mounting the dataset, somehow, directly inside the VM? Is that like a passthrough for datasets?

      In any case, thank you for sharing so much information and for offering help. I may take you up on that, as it seems that this is the approach that I feel most comfortable with.

      • ikidd@lemmy.world
        2 months ago

        So if I want a new container stack, I make a new Proxmox “disk” in the ZFS filesystem under the Hardware tab of the VM. This adds a “disk” to the VM when I reboot the VM (there are ways of refreshing the block devices online, but this is easier). I find the new block device and mount it in the VM at a subfolder of /stacks, which will be the new container stack location. I also add this mount point to fstab.
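
        A minimal sketch of what the fstab step could look like, written as a small Python helper that builds the line. Everything specific here is an assumption for illustration and not taken from the setup above: the ext4 filesystem, the `defaults` options, identifying the disk by filesystem UUID, and the helper name `fstab_entry`.

```python
from pathlib import PurePosixPath

STACKS_ROOT = PurePosixPath("/stacks")  # parent folder for all stack mounts


def fstab_entry(fs_uuid: str, stack_name: str, fs_type: str = "ext4") -> str:
    """Build one /etc/fstab line that mounts a stack's disk under /stacks.

    Hypothetical helper: the filesystem type and mount options are assumed.
    """
    if not fs_uuid or any(c.isspace() for c in fs_uuid):
        raise ValueError("filesystem UUID must be a non-empty token without whitespace")
    if (
        not stack_name
        or "/" in stack_name
        or stack_name in {".", ".."}
        or any(c.isspace() for c in stack_name)
    ):
        raise ValueError("stack name must be a single path component without whitespace")
    mount_point = STACKS_ROOT / stack_name
    # Fields: device, mount point, filesystem type, options, dump, fsck order.
    return f"UUID={fs_uuid} {mount_point} {fs_type} defaults 0 2"


if __name__ == "__main__":
    # Placeholder UUID; on a real system `blkid` reports the actual value.
    print(fstab_entry("0000aaaa-1111-2222-3333-444455556666", "nextcloud"))
```

        Mounting by UUID rather than by a device name like /dev/sdb avoids surprises if the block devices come up in a different order after another disk is added.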

        So now I have a mounted volume at /stacks/container-name. I put a docker-compose.yml in there, and all data that the stack will use will be subfolders of that folder with bind mounts in the compose file. When I back up, the ZFS dataset that contains everything in that compose stack is snapshotted and backed up as a point-in-time. If that stack has a postgres database, it and all the data it references are internally consistent because they were snapshotted together before the backup. If I restore the entire folder from backup, it just thinks it had a power outage, replays its journals in the database, and all’s well.
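
        The snapshot only covers what actually lives inside the stack folder, so it can be worth checking that no volume points elsewhere. A rough sketch, assuming the compose file uses short-form volume strings (`host:container`) and that they have already been read into a list; the function name is made up, and named volumes are treated as "outside" because Docker keeps them in its own data directory.

```python
from pathlib import PurePosixPath


def host_paths_outside_stack(stack_dir: str, volume_specs: list[str]) -> list[str]:
    """Return the volume specs whose host side is not inside stack_dir."""
    root = PurePosixPath(stack_dir)
    outside = []
    for spec in volume_specs:
        source = spec.split(":", 1)[0]  # host side of "host:container[:mode]"
        if source.startswith("/"):
            resolved = PurePosixPath(source)
        elif source == "." or source.startswith("./"):
            # Relative bind mounts are resolved against the compose folder.
            resolved = root / source
        else:
            # Named volumes, "~" paths and "../" paths all land elsewhere.
            outside.append(spec)
            continue
        if ".." in resolved.parts or not resolved.is_relative_to(root):
            outside.append(spec)
    return outside


if __name__ == "__main__":
    specs = ["./data:/var/www/html", "/mnt/media:/media", "pgdata:/var/lib/postgresql/data"]
    for spec in host_paths_outside_stack("/stacks/nextcloud", specs):
        print("not covered by the stack's snapshot:", spec)
```

        Anything this flags would need its own backup, or could be moved under the stack folder so it is captured with the rest.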

        So when you have a backup in PBS, from your Proxmox node you can access the backups via the filesystem browser on the left.

        When you go to that backup, you can choose to do a File Restore instead of restoring the entire VM. Here I am walking the storage for my nextcloud data within the backups, and I can walk this storage for all discrete backups.

        If I want to just restore a container, I will download that “partition” and transfer it to the docker VM. Then I bring down the container stack in question, blow out everything in that folder, and restore the contents of the download to the container folder. Start up the docker stack for that folder and it’s back to where it was. Alternatively, I could just restore individual files if I wanted.
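
        The restore steps above can be sketched as a short script. This is only an illustration under some assumptions: the downloaded backup has already been extracted to a plain directory, the stack is managed with `docker compose`, and the function name `restore_stack` is hypothetical. The command runner is a parameter so the logic can be tried without Docker.

```python
import shutil
import subprocess
from pathlib import Path


def restore_stack(stack_dir, restored_dir, run=subprocess.run):
    """Stop a compose stack, replace its folder contents, and start it again."""
    stack = Path(stack_dir)
    source = Path(restored_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"restored data not found: {source}")

    # 1. Bring the stack down so nothing writes to the folder during the swap.
    run(["docker", "compose", "down"], cwd=stack, check=True)

    # 2. Empty the folder but keep the folder itself, since it is a mount point.
    for entry in stack.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()

    # 3. Copy the restored contents into the now-empty stack folder.
    shutil.copytree(source, stack, symlinks=True, dirs_exist_ok=True)

    # 4. Start the stack again in the background.
    run(["docker", "compose", "up", "-d"], cwd=stack, check=True)
```

        One caveat with this sketch: `shutil.copytree` keeps file modes and timestamps but not ownership, so data written by a container user (a database, for example) may need its owner fixed after the copy.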