the noise suppression model I am rolling out has a ~10% failure rate with regard to perfect suppression of dog barks, but a ~50% failure rate with regard to perfect suppression of cat meows. this, I believe, aligns with customer expectations
this is funny, but not really a joke; these are absolutely metrics we track. also, barking dogs are generally louder, more distracting, and more common in a call than meowing cats
@halcy my only expectation is that there are little snippets in the transcript for when they come through the noise suppression, like:

1️⃣ ok team how do we feel about this plan?
🐈 meow~
2️⃣ agreed

very important business. if the cat has approved it, it’s good to go; the cat’s approval should go on the record
@0x47df I'm tempted to spend 2 dollars of my employer's money to put the cat-sounds test set through the transcription tooling to see what we get
@halcy wait, 2 dollars of time? how many cat sounds have you collected? I guess it makes sense for training the model
@halcy but how does it perform when I’m talking while continuously crinkling a plastic chip packet? Or while I’m taking a Teams meeting on a laptop on a train platform while some monster diesel train direct from hell screams past me?

@nickappleton surprisingly, these are comparatively easy scenarios that it generally performs okay on, though they are probably rare. We don't have chips or trains in the test set, but we have "vacuum cleaner" (20% of cases less than perfectly suppressed) and "sirens" (25%), which should be a good approximation.

the real trouble, for the most part, is interfering speakers; humans sound a lot like other humans, annoyingly

@halcy does it help if I sing? :) what’s your latency?

@nickappleton I don't have singing validation-set numbers for this model; it was doing fine on emotional speech data, though, which is usually similar

we introduce 10ms of latency; it's a prod RTC model, and more would be intolerable

@halcy We’re still using 20ms for most of ours. Getting good performance with such low lookahead still blows my mind. Nice one 👍