The Product is Literally Debugging Itself

One of our employees ended up creating, completely by accident, a self-healing product. We figured this was a really good use case worth writing about, so I wanted to explain how it works, what it does, and what the actual benefits have been.
Before I get into it, let me set the scene.
Most companies have product analytics tools like PostHog, Amplitude, or Hotjar attached to their landing pages and backends. These track everything from what users clicked on and how long they spent on each page to what errors they saw, and far more. In theory, product managers are supposed to look at this data. But because there is so much of it, all in JSON, they rarely do much more than build usage funnels and find drop-off points. The data tells you what happened, not why.
The customer experience world has a different way of understanding what is going on. They use NPS (Net Promoter Score) for overall product sentiment and CSAT (Customer Satisfaction Score) for individual interactions. Both work by surveying the customer: asking how they feel about the product as a whole, or how a specific conversation with support went.
We took that concept and thought, what if we could use the data we were already getting from PostHog and create an implied NPS score for each session? No surveys. Just the data.
That is where our self-healing product started.
Implied NPS
Every session a user has on Cotera gets dumped into the data warehouse. Every error they saw, what they clicked on, how long they spent on each page, all the standard analytics. But instead of having a human look at that data, we pass it into an LLM with a large prompt that includes examples of ideal user sessions and what people usually do when they are successful on the product.
The LLM looks at the session, understands the routes the user took, and grades the output. If they signed up, had a long session with our AI agent, built a workflow, and spent more than 15 minutes, that is a nine. If they hit an error with a tool, got frustrated, and left, that is a one or a two.
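As a rough illustration of the grading step (a sketch, not the actual Cotera implementation), the snippet below assembles a prompt from example sessions and raw events, then validates the model's reply. The names `build_prompt`, `parse_grade`, and `SessionGrade`, the JSON reply format, and the 0 to 10 scale are all assumptions made for the example. The LLM call itself is left out.

```python
import json
from dataclasses import dataclass


@dataclass
class SessionGrade:
    score: int  # implied NPS-style score, assumed to be 0-10
    blurb: str  # short note on what went right and what went wrong


def build_prompt(session_events: list[dict], ideal_examples: list[str]) -> str:
    """Assemble a grading prompt from example sessions plus the raw events."""
    examples = "\n\n".join(ideal_examples)
    events = json.dumps(session_events, indent=2)
    return (
        "You grade product sessions on a 0-10 scale.\n"
        f"Examples of successful sessions:\n{examples}\n\n"
        f"Session events:\n{events}\n\n"
        'Reply with JSON only: {"score": <int 0-10>, "blurb": "<summary>"}'
    )


def parse_grade(raw_reply: str) -> SessionGrade:
    """Validate the model's reply before anything is stored."""
    data = json.loads(raw_reply)
    score = data["score"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 0 <= score <= 10:
        raise ValueError(f"score out of range: {score!r}")
    return SessionGrade(score=score, blurb=str(data["blurb"]).strip())
```

Validating the reply matters because everything downstream reads these scores. A malformed or out-of-range grade is easier to reject here than to untangle later.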
The trick here is that we never actually asked the user how they felt about the product. We did not reach out to them. We did not give them a banner on the page saying, would you recommend this to a friend? We just did it all in the background.
Because we know the route the user came in through, we already understand their intent. That lets the LLM figure out two things. One, was Cotera actually the right product for them? And two, if it was but it did not work properly, what actually happened?
So the LLM reads every single user session, scores the output, and then writes a blurb on what went right and what went wrong. And that gets stored back in the data warehouse.
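If the per-session scores are rolled up the same way survey responses are (an assumption, since the grading above is per session), the standard NPS formula applies: the percentage of promoters (9 or 10) minus the percentage of detractors (0 through 6). The function name `implied_nps` is made up for this sketch.

```python
def implied_nps(scores: list[int]) -> float:
    """Standard NPS: percent promoters (9-10) minus percent detractors (0-6)."""
    if not scores:
        raise ValueError("need at least one scored session")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)
```

For example, four sessions scored 9, 10, 7, and 2 give two promoters and one detractor, so `implied_nps([9, 10, 7, 2])` returns `25.0`.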
We call this layer of agents the context gatherers, or the CX AI agents. Their job is to observe, understand, and report.
The Product Managers
But what we really want is a self-improvement loop. This is the problem that always comes up with NPS. Cool, we know what customers are complaining about. What can we actually do about it? Usually companies know what the problems are but do not have the time to fix them.
So at the end of the week, a second agent takes the outputs from the first one, reads them all, every single session, and makes tickets. It actually makes tickets in Linear for stuff we need to fix. It looks at the failures, groups them into patterns, checks our codebase on GitHub, and decides what is important. If 40 sessions failed on the same thing, that gets a ticket. If two people had a weird edge case, it gets flagged but not ticketed.
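The ticket-or-flag decision can be sketched as a simple count over failure patterns. This assumes each failed session has already been labeled with a pattern name. The cutoff of 10 is a hypothetical value (the only numbers given above are the 40 and 2 examples), and `triage_patterns` is an illustrative name.

```python
from collections import Counter

TICKET_THRESHOLD = 10  # hypothetical cutoff, chosen only for this example


def triage_patterns(
    failure_labels: list[str], threshold: int = TICKET_THRESHOLD
) -> dict[str, list[str]]:
    """Group failures by pattern, then ticket the common ones and flag the rest."""
    counts = Counter(failure_labels)
    result: dict[str, list[str]] = {"ticket": [], "flag": []}
    # most_common() yields patterns from most to least frequent.
    for pattern, n in counts.most_common():
        result["ticket" if n >= threshold else "flag"].append(pattern)
    return result
```

With 40 sessions labeled `tool_timeout` and 2 labeled `odd_edge_case`, the first lands under `"ticket"` and the second under `"flag"`.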
And it is not just bugs. Sometimes the finding is that people are not using the product in the way that gets the most out of it. That means we have to change something about the product so that it steers people toward the right flow.
The first agent is basically a support employee. The second is a product manager. And the third, which comes next, is an engineer.
Skynet
The last layer is the software engineers, or what we call Skynet. We call it Skynet affectionately because it is borderline insane.
This layer takes the tickets from the product managers and figures out which ones are actual bugs. Not feature requests, not "rethink the flow," just bugs. A tool that keeps timing out because it takes longer than the timeout we set. A JSON response that does not validate against the schema. An API that has been changed under the hood. Anything where we objectively have to change something about the product to make it work properly.
Skynet takes that ticket, assigns it to itself, writes the code, and submits the PR for one of our human engineers to review.
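The hand-off between layers can be pictured as a routing step: bugs go to the engineering agent, and everything else waits for a human. This is only a sketch. The `Ticket` class and the `kind` labels are hypothetical, and in the real pipeline the bug-or-not call is made by an agent rather than read from a field.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    title: str
    kind: str  # hypothetical label: "bug", "feature_request", or "flow_rethink"


def route(tickets: list[Ticket]) -> tuple[list[Ticket], list[Ticket]]:
    """Split tickets into (for the engineering agent, for human review)."""
    for_agent = [t for t in tickets if t.kind == "bug"]
    for_humans = [t for t in tickets if t.kind != "bug"]
    return for_agent, for_humans
```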
Monday Morning
So suddenly we have three layers. We have our context-gathering support agents watching all of the user sessions, our product managers putting that into context, and our software engineering layer, Skynet, actually going through and fixing the bugs.
Every Monday morning, our team sits down and reviews everything. What did last week look like? What were the bugs? The expensive, thought-requiring tasks from the product management layer get assigned to real humans to investigate. And the AI agents that worked through the weekend and fixed bugs? We review those PRs over coffee.
We step into the week with a ton of momentum and data. I am not going to pat our team on the back too much, but I am really proud of them for coming up with this.
Why Would You Not Do This
The real question is, why would you not do this? You already have all of the data. All you need to do is wire it up to different layers of AI agents. Cotera is probably the perfect platform for the first and second layers, and the third is easily done through either Devin or Cursor depending on what you are using.
This is the first thing we have seen that ties together how a real company should actually work. Customer support interacting with product. Product interacting with engineering. Except here the agents work as a very small scrum team to identify issues and fix them before the humans even realize we are losing customers.
This has saved our engineering team tons of hours on bug fixes, which is real money. We also estimate that it has improved our net retention rate by about 25%. These are real dollars in a very real bank account, and it all happens for a fraction of the cost of a real team. It lets the humans do what they are good at, which is the stuff that requires deep thought, and lets the AI handle the bottom layer.