arriving very late to the "how to pronounce #icinga" party
how late, you say? .wma late
Got the green light at work to begin migrating our Icinga monitoring from V1 to V2.
Going to be fun getting all ~1600 hosts converted and imported.
Thankfully we can run the two in parallel, so we won't lose any monitoring. The long-term plan, though, is to strip out NRPE and replace it with the Icinga 2 Agent.
You know what? The HTTP 500 errors were not the fault of the provider but of the Icinga API responses. When you delete a host group that you don't have permission to delete, the status is not 403 as it should be but 404 No objects found. What's the point of deleting a missing object? There is no way to know whether you have permission or not. This "security" feature is so annoying. If you create the resource again, you get a 500 error without an error message because the object already exists. I'm going to patch my fork to handle those use cases.
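A minimal sketch of how a client might interpret those status codes, written in Python for brevity rather than the Go of the actual library. The names `Outcome`, `interpret_delete` and `interpret_create` are hypothetical, and treating 404 on delete as "nothing to delete" is an assumption for illustration, not necessarily what the real patch does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    ok: bool   # True when the desired end state is (probably) reached
    note: str  # human-readable explanation for logs


def interpret_delete(status: int) -> Outcome:
    """Map the HTTP status of a DELETE request to an outcome."""
    if 200 <= status < 300:
        return Outcome(True, "deleted")
    if status == 404:
        # Ambiguous: the object is missing OR this API user may not see it.
        # Assumption: either way there is nothing left for us to delete.
        return Outcome(True, "not found (or not visible to this API user)")
    return Outcome(False, f"delete failed with HTTP {status}")


def interpret_create(status: int, exists_now: bool) -> Outcome:
    """Map the HTTP status of a create request, given a follow-up existence check."""
    if 200 <= status < 300:
        return Outcome(True, "created")
    if status == 500 and exists_now:
        # A bare 500 while the object is present most likely means "already exists".
        return Outcome(True, "already exists")
    return Outcome(False, f"create failed with HTTP {status}")
```

The 404 case stays ambiguous by design: the caller cannot tell a missing object from a forbidden one, so the sketch only records that in the note instead of pretending to know.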
Even if the project seems dead, I have created the pull requests anyway:
- https://github.com/lrsmith/go-icinga2-api/pull/19
- https://github.com/lrsmith/go-icinga2-api/pull/20
If there is any Icinga maintainer that would like to fork this library to make it official, I would love to contribute to this fork 🙏
It's the other way around. The one from openHPI is not maintained while the official one is. But the official one is missing the "zone" attribute for creating hosts, and it has no support for downtimes. So I have created and published a fork: https://registry.terraform.io/providers/jouir/icinga2/latest/docs
I knew I needed to create my own version to fix the issues we had in the past but I didn't remember the details.
After a whole day of switching thousands of servers to my new provider, the issues came to life. When the API times out, the plan fails but the host is created. On the second run, Terraform tries to create the host because it doesn't exist in the state, but Icinga returns a 500 Internal Server Error because the host already exists. The solution is to remove the host from Icinga, then apply again.
I will not do this for the last 52 plans in error with more or less 100 servers. No. I don't know if it's idiomatic Terraform but I don't care, I will run a GET before creation and create only if the host doesn't exist. Same for deletion. This will make life easier for our team when it has to deal with the flaky API at some point.
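The "GET before create, GET before delete" idea can be sketched roughly as follows. This is Python for brevity, not the provider's Go code; `HostAPI`, `ensure_host`, `ensure_absent` and the in-memory `FakeIcinga` are all hypothetical names made up for this example, and the fake only imitates the failure modes described above.

```python
from typing import Optional, Protocol


class HostAPI(Protocol):
    """Hypothetical minimal client interface, not a real library API."""
    def get_host(self, name: str) -> Optional[dict]: ...
    def create_host(self, name: str, attrs: dict) -> None: ...
    def delete_host(self, name: str) -> None: ...


def ensure_host(api: HostAPI, name: str, attrs: dict) -> bool:
    """Create the host only if a GET shows it is missing. Returns True if created."""
    if api.get_host(name) is not None:
        return False  # already there, e.g. left over from a timed-out run
    api.create_host(name, attrs)
    return True


def ensure_absent(api: HostAPI, name: str) -> bool:
    """Delete the host only if a GET shows it exists. Returns True if deleted."""
    if api.get_host(name) is None:
        return False  # nothing to delete
    api.delete_host(name)
    return True


class FakeIcinga:
    """In-memory stand-in that fails the way the posts describe."""
    def __init__(self) -> None:
        self.hosts: dict = {}

    def get_host(self, name: str) -> Optional[dict]:
        return self.hosts.get(name)

    def create_host(self, name: str, attrs: dict) -> None:
        if name in self.hosts:
            raise RuntimeError("500 Internal Server Error")
        self.hosts[name] = dict(attrs)

    def delete_host(self, name: str) -> None:
        if name not in self.hosts:
            raise LookupError("404 No objects found")
        del self.hosts[name]
```

Running `ensure_host` twice against the same `FakeIcinga` creates the host once and then does nothing, which is the behaviour a retried plan needs. Note that the check and the write are two separate requests, so a concurrent writer can still slip in between them.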
I will contribute back to the upstream project of course. If it's refused, no worries, I'll keep my own version.
I have finally managed to develop a Terraform provider locally to add a feature we have been missing in Icinga for a long time.
Unfortunately, the Icinga provider doesn't seem to be maintained so I'll patch the one we use, from openHPI, and hope it'll get merged. If not, we'll probably create another fork.
- https://github.com/Icinga/terraform-provider-icinga2/
- https://github.com/openHPI/terraform-provider-icinga2
This weekend, setting up Grafana, InfluxDB3 and Telegraf in the homelab was on the agenda.
Grafana and InfluxDB3 as containers, Telegraf natively.
The background is Icinga's decision to grant access to their RPM packages for RockyLinux, RHEL, etc. only for a fee of €5000 per year.
Until now I have always used Nagios monitoring tools at work.
Now I want to promote Grafana from an add-on for pretty visualizations in Icinga to a full-fledged monitoring tool, and later test alerting as well.
Soon I will also install Elasticsearch, Kibana and the Beats or the Elastic Agent to collect data, and build more panels and dashboards from that data source.
#observability #monitoring #homelab #linux #influxdb #docker #icinga