Harmonic Design (internal) · 2026 · Service designer & analysis lead

What changes when the AI shares your environment

821 ratings across 40 themes analyzed

2 AI surfaces · same data, same goal

Live site · deployed to studio leadership

Client: Harmonic Design (internal)
Role: Service designer and analysis lead (with a colleague on upstream synthesis)
Team: Whitney Masulis with a colleague on synthesis · 25 studio members in the rating exercise
Duration: Started 2025; agentic AI experiment May 2026
Deliverables: Interactive HTML synthesis site (deployed) · two-round AI comparison · studio Spotlight demonstration
Outcome: Live deliverable available to studio leadership; demonstration now referenced in Harmonic's AI practice materials

This case started as an internal synthesis project — Harmonic's own retros, analyzed with the studio's own participation data. It became a live demonstration of something my colleagues needed to see. I ran the same dataset through two AI surfaces back to back: a regular chat window, then an agent with access to my filesystem. Same data, same goal, completely different working relationship. The case is about what actually changes when you move from one to the other — and why that shift matters more for service designers than any question about AI output quality.

Chapter 01

An internal synthesis project, and a question about what agents actually do differently

Harmonic Design runs retros at the end of every engagement. The stickies get written, people say the right things, and the next project starts with the same problems. The gap wasn't in the retros — it was in the journey from retro to change. A colleague and I had been working that gap for months: combing Miro boards, pulling transcripts, interviewing people to clarify ambiguous stickies, grouping findings into 40 themes across 9 categories.

Then, in a studio Spotlight, we asked the whole team to rate every theme on two scales: severity (1–4) and frequency (1–7). Twenty-five people, forty themes, two scales: 2,000 possible cells. We got 821 actual ratings, about 41% of those cells. That's a dataset that deserved real statistical handling. I rebuilt it as a synthetic dataset first (so I could use my own tools without exposing internal Harmonic data), then ran it through two AI surfaces to see what would happen.
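The arithmetic behind those numbers is worth spelling out, because it shows how sparse the grid is. The sketch below uses only the figures stated above; the variable names are illustrative.

```python
# Figures taken from the text: 25 raters, 40 themes, 2 scales, 821 ratings.
RATERS = 25
THEMES = 40
SCALES = 2
RATINGS_RECEIVED = 821

possible_cells = RATERS * THEMES * SCALES           # one cell per rater x theme x scale
coverage = RATINGS_RECEIVED / possible_cells        # share of cells actually filled
avg_per_theme_scale = RATINGS_RECEIVED / (THEMES * SCALES)

print(f"possible cells: {possible_cells}")
print(f"coverage: {coverage:.2%}")
print(f"average ratings per theme per scale: {avg_per_theme_scale:.1f}")
```

With roughly ten ratings behind a typical theme-and-scale average, any per-theme number rests on a small sample, which is one reason the handling matters.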

image: Miro Spotlight rating board — grid of individual pairwise matrix frames, one per studio member, most stamped DONE. Source: data-viz-miro-spotlight.png

The rating board from the studio Spotlight — 25 individual matrices. The DONEs are satisfying.

Chapter 02

Round one: chatbot as helpdesk — useful, slow, me as the bridge

I started the way most people start. Chat tab open, spreadsheet open, copying and pasting back and forth. Within a couple of hours I had a bubble chart, two heatmaps, a filtered list of the seven critical themes, and an outline for a presentation. Useful. Entirely produced by my keystrokes.

Three moments from that round show what this working mode is actually like. First, the bubble chart was wrong: Sheets was reading the vote-count column as a data series instead of the size dimension. The chatbot walked me through the menu changes; I clicked them. Second, one bubble remained hidden behind another no matter what the chatbot suggested. It couldn't reach into my chart. Third, the chatbot started building a story I had to stop: a heatmap framed around Leads "absorbing all the pain." I asked why we were showing count instead of average. Count was driven by participation rate, not severity, so a role with more raters would look like it carried more pain. That catch wasn't the AI. That catch was my own critical reasoning running fast enough to notice before the story set.
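To make the count-versus-average point concrete, here is a minimal sketch with made-up numbers (not the studio's data). The "Designer" role and the theme ID are assumptions for illustration. The role with more raters dominates a count-based view even though its ratings are milder.

```python
from collections import defaultdict
from statistics import mean

# Made-up ratings for illustration only: (role, theme, severity on the 1-4 scale).
# "Lead" has more raters than "Designer", so it contributes more rows.
ratings = [
    ("Lead", "T1", 2), ("Lead", "T1", 2), ("Lead", "T1", 3),
    ("Lead", "T1", 2), ("Lead", "T1", 2), ("Lead", "T1", 3),
    ("Designer", "T1", 4), ("Designer", "T1", 4),
]

# Group severity values by role.
by_role = defaultdict(list)
for role, _theme, severity in ratings:
    by_role[role].append(severity)

# Compare the two candidate heatmap values side by side.
summary = {
    role: {"count": len(vals), "mean_severity": round(mean(vals), 2)}
    for role, vals in by_role.items()
}
for role, stats in summary.items():
    print(role, stats)
```

In this toy data a heatmap colored by count would put Leads on top (6 ratings versus 2), while a heatmap colored by mean severity would put Designers on top (4 versus about 2.33). Same rows, opposite story.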

What I came out with: a bubble chart that wasn't quite telling a story, two heatmaps in a Google Sheet, and a text document outlining what a presentation might look like. It felt like having a patient helpdesk on the other end of a chat window who could answer any question about Sheets but couldn't actually open Sheets.

image: round 1 bubble chart, the crummy version with the CR2 bubble hidden behind another bubble. Source: data-viz-bubble-v1.png

Can you see the dark blue CR2 bubble? Can you??

Chapter 03

Round two: agent inside my environment — directives, not menu walkthroughs

For round two I used Claude Code from my terminal, with access to my filesystem, my repos, and a deploy target. Same CSV, same goal. The first thing I noticed was that I stopped describing chart types and started describing what I wanted to learn.

Audience is leadership. Use whatever charts and structure best answer those questions. End with discussion questions, not conclusions. Don't fabricate data. Show me what you compute and where it comes from. When you pick a visualization, briefly explain why you chose it.

That last sentence was the one that mattered. Asking the model to justify each visualization gave me something to push back against — and a running tutorial on what chart shape fits what question. The agent read the CSV directly, computed aggregations in Python, built an interactive HTML site, deployed it to a public URL, and copied both files to my laptop. I never opened a spreadsheet. I never wrote a formula. I have still never opened a Python file.
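For readers who want a feel for what "computed aggregations in Python" can look like, here is a minimal sketch. It is not the agent's actual code, which this case study does not show. The column names, IDs, category names, and sample rows are all invented, and blank cells stand in for ratings that were never given.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Stand-in for the real file; column names and values are invented.
SAMPLE_CSV = """rater,theme,category,severity,frequency
r01,T01,Handoffs,3,5
r02,T01,Handoffs,4,
r03,T01,Handoffs,,6
r01,T02,Scoping,2,3
r02,T02,Scoping,1,4
"""

def aggregate_by_theme(csv_text):
    """Return per-theme rating counts and means for severity and frequency."""
    severity = defaultdict(list)
    frequency = defaultdict(list)
    category = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        theme = row["theme"]
        category[theme] = row["category"]
        # Blank cells are skipped rather than treated as zero.
        if row["severity"].strip():
            severity[theme].append(int(row["severity"]))
        if row["frequency"].strip():
            frequency[theme].append(int(row["frequency"]))
    result = {}
    for theme in category:
        sev, freq = severity[theme], frequency[theme]
        result[theme] = {
            "category": category[theme],
            "n_severity": len(sev),
            "mean_severity": mean(sev) if sev else None,
            "n_frequency": len(freq),
            "mean_frequency": mean(freq) if freq else None,
        }
    return result

themes = aggregate_by_theme(SAMPLE_CSV)
for theme, stats in themes.items():
    print(theme, stats)
```

Keeping the rating counts next to each mean is a deliberate choice: with a sparse grid, a reader should be able to see how many ratings sit behind every average.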

The prompts that changed the deliverable weren't "make a bubble chart." They were: "The role-average bars aren't telling me anything. Show me what categories each role rated highest, and whether those overlap." "The frequency scale is bucketed labels — an average of 3.5 means 'between most projects and once per project,' not 'daily friction.' Make the prose reflect that." That last edit would have been an afternoon with the chatbot. With the agent, it was one pass across nine sections in seconds.
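The frequency-scale point can be sketched as a small helper that turns an average on a labeled scale back into words. This is an illustration, not what the agent wrote. Only the two labels quoted above come from the text, their positions at levels 3 and 4 are an assumption, and the remaining labels are placeholders.

```python
import math

# Hypothetical frequency labels. Only "most projects" and "once per project"
# appear in the text, and placing them at levels 3 and 4 is an assumption.
# The other entries are placeholders.
FREQUENCY_LABELS = {
    1: "level 1", 2: "level 2",
    3: "most projects", 4: "once per project",
    5: "level 5", 6: "level 6", 7: "level 7",
}

def describe_average(avg, labels=FREQUENCY_LABELS):
    """Describe an average on an ordinal scale using its neighboring labels."""
    low, high = min(labels), max(labels)
    if not low <= avg <= high:
        raise ValueError(f"average {avg} is outside the {low}-{high} scale")
    lower, upper = math.floor(avg), math.ceil(avg)
    if lower == upper:
        return f"about '{labels[lower]}'"
    return f"between '{labels[lower]}' and '{labels[upper]}'"

print(describe_average(3.5))
print(describe_average(4.0))
```

The helper never reports the average as if it were a measured quantity. It only names the buckets the average falls between, which is the correction the prompt asked for in the prose.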

image: round 2 output — the clean interactive bubble chart, or a screenshot of the deployed site at pain-points.whitney-masulis.workers.dev showing the full deliverable

The agent-built deliverable: interactive bubble chart, critical-7 cards, role × category grid, discussion questions. Deployed to a public URL. I never opened a spreadsheet.

Chapter 04

What's actually different — and why the bottleneck shift matters for service designers

The difference isn't that the agent produces better outputs, though it often does. The difference is where the bottleneck lives. With the chatbot, the limit is execution: how well I can drive Sheets, how cleanly I can move output between the AI's window and my work. I am the bridge. With the agent, that bridge collapses. We share the same environment. The bottleneck shifts from execution to articulation — how clearly I can describe what I want, what I notice when something's wrong, and when to push back.

That is a different muscle. And it is closer to the muscle service designers already have. Briefing. Critiquing. Naming what's missing. Articulating the shape of a good answer before you have it. The transition from round one to round two isn't a technical leap. It's a framing shift about where your attention goes.

I did not write the HTML. I wrote the brief, made the synthesis judgments, sharpened the language, and decided what to cut. That division of labor is the point.

The reframe

I expected the agent to produce better outputs

I expected that running the same data through an agent would produce better charts than running it through a chatbot. It did, but that's not the finding. The finding is that the working relationship changed. With the chatbot, I was describing tools and menus. With the agent, I was describing problems and criteria. The shift wasn't about the model being smarter. It was about what the working mode asks of the person using it. An agent asks for the kind of articulation that design practice builds: briefing, critiquing, naming what's missing. That's why this is a note to my colleagues, not a note to data analysts.

What stays behind

The move from round one to round two scales

The same rough pattern — a tagged synthesis deliverable feeding an agentic analyst that produces an interactive output — could shorten the back half of a research engagement substantially. The constraints on client work are real: data sensitivity, NDAs, audience appetite for "an AI made the charts." Those constraints are manageable. The underlying move scales.

Inside Harmonic, the gap between "I have read about agents" and "I am using them in my actual work" is the one I most want my colleagues to cross. The thing that gets you across isn't technical skill. It's a willingness to describe what good looks like in your own words, and to keep pushing until you get there. Service designers already do that work. The model can now meet them in their environment to do it together.