Product updates

Getting the names right

Dylan de Heer

In a meeting transcript, getting the names right is the difference between a summary you can trust and a summary that is quietly lying to you.

If the model thinks Alex said something that Maria actually said, that line ends up in Alex's column. The summary credits the wrong person. The follow-up email goes to the wrong inbox. The decision gets pinned on someone who didn't make it. And if you weren't in the meeting, or you were and it was long and tiring, you'd never spot the mistake.

So we put real time into the work of figuring out, from the audio alone, which voice belongs to which person. It has a slightly dry technical name, speaker diarisation, and a lot more nuance than the name suggests.

What goes wrong with the obvious approach

Two patterns come up enough in real meetings to be worth solving.

The first is the meeting with two people where one of them sounds much quieter. Usually that's a remote participant on a worse microphone. A simple approach will sometimes split the louder person into "speaker A" and "speaker B" and miss the quieter one entirely. Or it will go the other way and merge both into one person.

The second is the meeting that starts with two people and adds a third partway through. By the time the third person joins, the model has already decided who is who, and the new voice gets classified as a louder version of someone who's already there.

Either way, the summary that comes out the other side is wrong about who said what, and it is the kind of wrong that stays invisible until you find yourself confused by it.

What we changed

There is no single trick that solves this. Instead, there is a stack of careful judgments, each of which fixes a small piece.

Weeve adjusts how strictly it groups voices based on how the meeting was captured. A microphone in a quiet room gets different treatment from a Zoom call where the laptop is recording everyone through its speakers. When the model wants to collapse two voices into one because they sound similar, Weeve will sometimes push back and try a different grouping. When two consecutive moments of speaking obviously belong to the same person with a short pause in between, Weeve stitches them together.
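To make two of those judgments concrete, here is a minimal sketch in Python. It is an illustration only, not Weeve's code and not FluidAudio's API: the Segment type, the capture-source names, the threshold values, and the half-second gap are all assumptions picked for the example. It shows a grouping strictness that depends on how the audio was captured, and a pass that stitches together consecutive turns from the same speaker when the pause between them is short.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str   # label assigned by the diarisation step
    start: float   # seconds from the start of the recording
    end: float


# Hypothetical strictness values per capture source (assumed, not measured).
GROUPING_THRESHOLDS = {
    "room_microphone": 0.70,
    "laptop_speakers": 0.60,
}


def grouping_threshold(capture_source: str, default: float = 0.65) -> float:
    """Pick how strictly to group voices for a given capture source."""
    return GROUPING_THRESHOLDS.get(capture_source, default)


def stitch(segments: list[Segment], max_gap: float = 0.5) -> list[Segment]:
    """Merge consecutive segments from the same speaker separated by a short pause."""
    merged: list[Segment] = []
    for seg in sorted(segments, key=lambda s: s.start):
        last = merged[-1] if merged else None
        if last and last.speaker == seg.speaker and seg.start - last.end <= max_gap:
            # Same speaker, short pause: extend the previous segment.
            last.end = max(last.end, seg.end)
        else:
            # Copy so the caller's segments are never mutated.
            merged.append(Segment(seg.speaker, seg.start, seg.end))
    return merged


turns = [
    Segment("A", 0.0, 2.0),
    Segment("A", 2.3, 4.0),   # 0.3 s pause, same speaker: stitched
    Segment("B", 4.1, 5.0),
    Segment("A", 8.0, 9.0),   # different speaker in between: kept separate
]
stitched = stitch(turns)
```

In this sketch the first two turns collapse into one segment running from 0.0 to 4.0 seconds, while the later turn from the same speaker stays separate because another speaker talks in between.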

None of those decisions are dramatic on their own. Together they are the difference between speaker labels that are background noise and speaker labels you can trust.

What this changes for you

If you've been using Weeve for a while, you'll notice the change first in long meetings. Names are sharper. The summary credits the right person more often. The list of decisions that Weeve surfaces, which depends entirely on knowing who actually made the call, is meaningfully more accurate.

The open-source library we build this on top of is called FluidAudio. It's a well-made piece of work and the kind of contribution the Apple development community is better for. We're contributing back where we can.

Weeve

Your work woven together.

Support
Stay in the loop

Get updates on new features and company updates.

© 2026 Weeve. All rights reserved.
