Shit happens. When it does, it’s essential to document the steps and decisions made while solving the problem, so you can use postmortems to learn from incidents.
You don’t need to be fancy. A simple document (shared with everyone involved) that is updated as you go is enough. I like to keep timestamps of events to make it easier to review later.
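If you want the timestamps without typing them by hand, a tiny helper can do it. This is only a minimal sketch, assuming a plain-text file as the shared document; the names `format_entry`, `log_event`, and `incident.md` are made up for illustration, not part of any tool.

```python
from datetime import datetime, timezone


def format_entry(note, when=None):
    """Return one log line: a UTC timestamp followed by the note."""
    # Default to "now"; pass a timezone-aware datetime to record a past event.
    when = when or datetime.now(timezone.utc)
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%SZ")
    return f"{stamp} {note}"


def log_event(path, note, when=None):
    """Append a timestamped note to the incident document at `path`."""
    line = format_entry(note, when)
    # Append mode keeps earlier entries, so the file reads as a timeline.
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
    return line


if __name__ == "__main__":
    # Hypothetical file name and notes, for illustration only.
    log_event("incident.md", "Alert fired: API error rate above threshold")
    log_event("incident.md", "Decision: roll back the last deploy")
```

Using UTC avoids confusion when people in different time zones review the timeline later.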
Part of Tips for Software Engineers