Why Every SRE Team Needs an LLM-Powered Blameless Postmortem Bot

I’ve written hundreds of postmortems in my career. They all follow the same pattern:

  • Timeline
  • Root cause
  • What went well
  • What could have been better
  • Action items

 

After the 100th one, I wanted to check myself into an asylum (Not really, but it sure sounded good at the time). So I built a tool that eats the On-call scheduling timeline (Pagerduty) + Jira tickets + Datadog metrics + Slack thread and spits out a 90%-complete postmortem. Then the human just edits the juicy bits and can spend time on the post-mortem discussion, and not the monotonous preparations. 

I rolled it out and immediately saw the benefits. The result?

 

  • Postmortem prep went from taking 4–8 hours → ~45 minutes
  • Quality actually went up because the bot is brutally honest
  • People started writing “what went well” sections without me having to nag them

 

The prompt is gloriously savage: “You are a deeply cynical but fair SRE principal who has seen every possible failure mode. Write this postmortem in the style of a Netflix tech blog but with zero corporate fluff.” Guided by this prompt and some for structured output the post-mortems maintained a consistent structure and robust framework.

 

I plan to open-source the whole thing in 2026, and perhaps others will find it beneficial.

Until then, steal this idea. Your future self will thank you when the incident is over and you dont have to spend hours writing the preparation material.

Leave a Reply

Your email address will not be published. Required fields are marked *