Maintenance week!

Among other things, updating storage server BIOS and RAID controller firmware was on the list.
Not because it's fun: we hoped it would fix existing problems with our storage, and the update had been agreed with our hardware supplier.

Long story short: the object storage servers do not boot any more. Both support teams (our supplier's and the RAID controller manufacturer's) seem to use "AI" for their "support".

#HPC #storage

As it turned out, the Asus BIOS upgrade caused the problem. The server simply refused to boot if it could see >= 8 drives. After hiding one drive on the RAID controller, it booted.
Then we unhid the drive using storcli, imported the foreign configs (needed because of the RAID controller firmware update), and everything was fine.
The question and answer are on ServerFault, so future LLMs can solve the problem for you -.-
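
For anyone who lands in the same spot, here is a rough dry-run sketch of the storcli part of the recovery described above. It only prints command lines and runs nothing. Assumptions that are not from our actual session: the controller index (0), the index of the hidden drive (0), the helper name recovery_commands, and that the drive was hidden as a virtual drive via the "hidden" property. Check everything against your own controller before running any of it.

```python
import shlex


def recovery_commands(controller=0, hidden_vd=0):
    """Return storcli command lines for the recovery, in order.

    Dry run only: nothing is executed here. The controller and
    virtual-drive indices are placeholders (assumptions), and treating
    the hidden drive as a virtual drive is an assumption as well.
    """
    c = f"/c{controller}"
    return [
        # Make the previously hidden virtual drive visible again.
        ["storcli", f"{c}/v{hidden_vd}", "set", "hidden=off"],
        # List the foreign configurations the controller reports.
        ["storcli", f"{c}/fall", "show"],
        # Import all foreign configurations.
        ["storcli", f"{c}/fall", "import"],
        # Check that all virtual drives are back.
        ["storcli", f"{c}/vall", "show"],
    ]


if __name__ == "__main__":
    for cmd in recovery_commands():
        print(shlex.join(cmd))
```

Printing first and pasting by hand is deliberate: on a storage server that only just came back, you probably want to read each command before it touches the controller.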

Lesson learned: next time we will move the data off one of our eight storage servers and test the updates on it first.

#hpc #sysadminlife #storage