Harmonic Design runs retros at the end of every engagement. The stickies get written, people say the right things, and the next project starts with the same problems. The gap wasn't in the retros; it was in the journey from retro to change. A colleague and I had been working that gap for months: combing Miro boards, pulling transcripts, interviewing people to clarify ambiguous stickies, and grouping findings into 40 themes across 9 categories.
Then, in a studio Spotlight, we asked the whole team to rate every theme on two scales: severity (1–4) and frequency (1–7). Twenty-five people, forty themes, two scales: 2,000 possible cells. We got 821 actual ratings, about 41% of the grid. That dataset deserved real statistical handling. I rebuilt it as a synthetic dataset first (so I could use my own tools without exposing internal Harmonic data), then ran it through two AI surfaces to see what would happen.
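For readers who want to check the arithmetic, here is a minimal Python sketch of the grid size and of one way a synthetic stand-in could be generated. It is an illustration only, not the method I actually used: the names (`make_synthetic_ratings`, `SCALES`, and so on) are hypothetical, and the sketch assumes ratings are missing uniformly at random and drawn uniformly within each scale, which real survey data almost certainly is not.

```python
import random

# Figures taken from the text above.
N_RATERS, N_THEMES = 25, 40
SCALES = {"severity": (1, 4), "frequency": (1, 7)}  # inclusive bounds
N_OBSERVED = 821

# Every rater could rate every theme on every scale.
possible_cells = N_RATERS * N_THEMES * len(SCALES)
coverage = N_OBSERVED / possible_cells


def make_synthetic_ratings(n_observed, seed=0):
    """Fill n_observed randomly chosen cells with random in-range scores.

    Assumption: uniform missingness and uniform scores. This only
    reproduces the shape and sparsity of the grid, not real patterns.
    """
    rng = random.Random(seed)
    cells = [
        (rater, theme, scale)
        for rater in range(N_RATERS)
        for theme in range(N_THEMES)
        for scale in SCALES
    ]
    chosen = rng.sample(cells, n_observed)
    return {cell: rng.randint(*SCALES[cell[2]]) for cell in chosen}


ratings = make_synthetic_ratings(N_OBSERVED)
print(possible_cells, len(ratings), round(coverage, 4))
```

Keeping the ratings as a sparse mapping keyed by (rater, theme, scale) makes the missing cells explicit instead of filling them with a placeholder score, which matters once more than half of the grid is empty.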