Note to self: while running pg_upgrade, do not kill a Postgres process running ALTER TABLE in a throwaway database on the assumption that it belongs to the application. It was pg_upgrade, not the app.
The "quick" rollback worked from a service standpoint (9 minutes of downtime), but it was an operational nightmare:
- outdated internal documentation
- very long Ansible loops
- unexpected errors when starting Debezium connectors that were stopped (why "stop/resume" and not "stop/start"?)
- pg_basebackup error: could not read COPY data: server closed the connection unexpectedly
And you, how's your week going so far?