Why removing 'um' from a recording is harder than it sounds
Linguists have a word for the ums, uhs, ers, and elongated versions (ummmm, uhhhhh) that pad spoken English: disfluencies.
I don't record a lot of voice audio, but a few friends do, and they tell me editing those out by hand is miserable. So I built erm to do it.
uvx erm input.wav
That's the whole interface for the common case. It writes a cleaned .wav and a JSON cut list next to the input. This post walks through how it works, because the obvious approach doesn't sound very good and most of the code is the stuff that fixes that.
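For orientation, here is a minimal sketch of what the "obvious approach" might look like: take a list of time spans to cut, drop those samples, and butt-join what remains. This is not erm's code. The function name `remove_spans` and the `(start_seconds, end_seconds)` cut format are assumptions made for illustration, and erm's real JSON cut list may be shaped differently.

```python
from typing import List, Tuple


def remove_spans(
    samples: List[float],
    sample_rate: int,
    cuts: List[Tuple[float, float]],
) -> List[float]:
    """Drop each (start_s, end_s) span and butt-join the remaining audio.

    Hypothetical helper: the cut format is an assumption, not erm's schema.
    """
    kept: List[float] = []
    cursor = 0  # index of the next sample that has not been handled yet
    for start_s, end_s in sorted(cuts):
        # Convert seconds to sample indices, clamped to the signal length
        # and to the cursor so overlapping cuts never re-add samples.
        start = max(cursor, min(len(samples), round(start_s * sample_rate)))
        end = max(start, min(len(samples), round(end_s * sample_rate)))
        kept.extend(samples[cursor:start])  # keep audio before the cut
        cursor = end                        # skip the cut itself
    kept.extend(samples[cursor:])           # keep the tail after the last cut
    return kept


if __name__ == "__main__":
    demo = [float(i) for i in range(10)]  # 1 second of fake audio at 10 Hz
    print(remove_spans(demo, 10, [(0.2, 0.4), (0.7, 0.8)]))
```

A hard join like this places two samples side by side that were never adjacent in the recording. In general audio editing, that kind of discontinuity can be audible as a click, which may be part of why the naive version doesn't sound good.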
Article preview — originally published by Hacker News. Full story at the source.