Cool. Though I’m still waiting for a good open-source, locally runnable speech-to-text model (one that doesn’t need 64GB of RAM for itself alone) that isn’t awful and that I can use to generate subtitles from less-than-perfect audio. Apparently one doesn’t exist, as even YouTube’s transcription isn’t great (though Apple Podcasts’ transcription is actually really good by comparison).
Looks like Dia already runs with just 10GB:
https://github.com/nari-labs/dia/