#Netdata has saved my butt - I'd only set it up to get temperature monitoring for my #homelab, but today I woke up to an alert notifying me that the #ZFS pool on one of my #Proxmox nodes has degraded.

Opening up my Proxmox web interface and checking the main "Summary" and "Disks" tabs, nothing suggested that anything was out of the ordinary - even the failed disk showed that it had passed its #SMART test. It was only when I dug deeper into the "ZFS" section that I could see that one of my #SiliconPower NVMe disks (avoid at all costs btw) had failed and the ZFS pool had degraded.

Fortunately I have an #ADATA SX8200 Pro NVMe lying around that I will use as the replacement disk. I've never done this outside of #TrueNAS's web UI, so I'll have to do so while referring to a guide. I'll also link to my guide on setting up Netdata, and will probably write up a guide on how to replace a failed disk in a ZFS pool on my repo after I've done so myself.

🔗 https://github.com/irfanhakim-as/homelab-wiki/blob/master/topics/proxmox.md#monitoring

▶️ https://youtu.be/IQA7aTezrVE

@irfan I’ve been running https://github.com/AnalogJ/scrutiny specifically for drive monitoring on everything available.