Whishper: a complete transcription suite. (github.com)
from pluja@lemmy.world to selfhosted@lemmy.world on 28 Aug 2023 05:44
https://lemmy.world/post/3992624

Hi everyone!

A few days ago I released Whishper, a new version of a project I’ve been working on for about a year now.

It’s a self-hosted audio transcription suite: you can transcribe audio to text, generate subtitles, translate subtitles, and edit them, all from one UI and 100% locally (it even works offline).

I hope you like it. Check out the website for self-hosting instructions: whishper.net

#selfhosted

morethanevil@lemmy.fedifriends.social on 28 Aug 2023 06:03

I saw your project on Codeberg before, back when it was called Whisper+. Since Whisper+ it has not worked for me anymore: I uploaded a file and the transcription did not start. The old version worked. I have not tried Whisper+ again for months.

Maybe I’ll give it another try. Can I use bind mounts, or are special permissions needed? Anyway, thanks for your work.

pluja@lemmy.world on 28 Aug 2023 08:31

Whisper+ had some problems, that’s why I rewrote everything. This new version should fix almost everything (there may still be some bugs I haven’t found).

If you take a look at the docker-compose file, you’ll see it is already using bind mounts. The only special permission needed is for the LibreTranslate models folder, because LibreTranslate runs as non-root user 1032.

webghost0101@sopuli.xyz on 28 Aug 2023 06:10

Does this need to connect to OpenAI or does it function fully independently? It’s for offline use.

pcouy@lemmy.pierre-couy.fr on 28 Aug 2023 06:47

The readme mentions “transcription time on CPU” so it’s probably running locally

pluja@lemmy.world on 28 Aug 2023 08:28

No, it’s completely independent: it does not rely on any third-party APIs or anything else. It can function entirely offline once the models have been downloaded.

midas@ymmel.nl on 28 Aug 2023 06:13

Awesome, will give this a try.

micha@lemmy.sdf.org on 28 Aug 2023 07:00

Congratulations on the launch and thanks for making this open-source! Not sure if this supports searching through all transcriptions yet, but that’s what I’d find really helpful. E.g. search for a keyword in all podcast episodes.

pluja@lemmy.world on 28 Aug 2023 08:32

That’s a great idea! I’ll attempt to implement that feature when I find some time to work on it.

rikudou@lemmings.world on 28 Aug 2023 07:37

Nice, congrats!

<img alt="a meme with a photo of Richmond Valentine from Kingsman, the bottom text says whishper" src="https://i.imgur.com/vxeQgIL.jpg">

ares35@kbin.social on 28 Aug 2023 08:58

How does Whisper do at transcribing technical documents, like for lawyers, doctors, engineers and whatnot? Or speakers with heavy accents?

pluja@lemmy.world on 28 Aug 2023 09:04

Whisper models have a very good WER (word error rate) for languages like Spanish, English, French… and it improves further if you use the English-only models. Check out this page on the docs:

whishper.net/reference/models/#languages-and-accu…
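For readers unfamiliar with the metric: WER is usually computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model's output, divided by the number of words in the reference. The snippet below is only a minimal sketch of that textbook definition; the function name `word_error_rate` and the simple lowercase-and-split normalization are assumptions for illustration, not how Whisper or Whishper evaluates its models.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    # Naive normalization (an assumption): lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Classic dynamic-programming Levenshtein distance, computed over words.
    # prev[j] holds the distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

A lower value is better, and the result can exceed 1.0 when the output contains many extra words.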

orizuru@lemmy.sdf.org on 28 Aug 2023 08:59

Congrats, and thank you for releasing this!

Maybe there’s a couple of personal projects I could use it for…

Axiochus@lemmy.world on 28 Aug 2023 10:05

Oh, awesome! Does it do speaker detection? That’s been one of my main gripes with Whisper.

pluja@lemmy.world on 28 Aug 2023 10:10

Unfortunately, not yet. Whisper per se is not able to do that. Currently, there are a few viable solutions for integration, and I’m looking at this one, but all the solutions I know about need a GPU for this.

jherazob@kbin.social on 28 Aug 2023 11:33

VERY understandable, requiring a GPU would limit its application and spread. I hope a good GPU-less solution is found eventually.

fmstrat@lemmy.nowsci.com on 28 Aug 2023 11:59

How does it compare to github.com/guillaumekln/faster-whisper?

I’ve been using Faster Whisper for a while locally, and it’s worked out better than raw Whisper and benchmarks really well. Just curious if there are any reasons to switch.

pluja@lemmy.world on 28 Aug 2023 12:59

Whishper uses faster-whisper in the backend.

Simply put, it is a complete UI for Faster-Whisper with extra features like transcription translation, editing, download options, etc…
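To give a rough idea of what the subtitle step involves: faster-whisper reports each transcribed segment with a start time, an end time, and its text, and an SRT file is just a numbered list of those segments with formatted timestamps. The sketch below uses only the Python standard library and plain `(start, end, text)` tuples as a stand-in for real segments; the helper names `srt_timestamp` and `segments_to_srt` are hypothetical and this is not Whishper's actual code.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) tuples as the contents of an SRT file."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Made-up segments standing in for real transcription output.
    demo = [(0.0, 1.5, " Hello everyone!"), (1.5, 3.0, " Welcome back.")]
    print(segments_to_srt(demo))
```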

fmstrat@lemmy.nowsci.com on 29 Aug 2023 02:46

Nice! Thanks.

tvcvt@lemmy.ml on 28 Aug 2023 17:50

This is excellent timing for me. I was just taking a break from working on setting up whisper.cpp with a web front end to transcribe interviews. This is a much nicer package than I ever had a chance of pulling together. Nice work!

Railcar8095@lemm.ee on 28 Aug 2023 18:15

Massive kudos. I had the need for something like this in the past and it would have been a blessing. Surely it will be for somebody else.

optissima@lemmy.world on 29 Aug 2023 00:30

I am looking for open source live transcription software. Does this offer that, or is it only file-based?

obinice@lemmy.world on 29 Aug 2023 04:51

I’ve been looking for a tool to do this for YEARS, my god! Years!!! ❤️❤️

UberMentch@lemmy.world on 29 Aug 2023 17:51

Would love to deploy this, but unfortunately I’m running server equipment that apparently doesn’t support MongoDB 5 (error message: “MongoDB 5.0+ requires a CPU with AVX support, and your current system does not appear to have that!”). Tried deploying with both 4.4.18 and 4.4.6 and can’t get it to work. If anybody has some recommendations, I’d appreciate hearing them!

Edit: Changing the processor type in my Proxmox environment to host fixed my issue.
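For anyone hitting the same error, a quick way to see whether the machine (or VM) exposes AVX on Linux is to look at the `flags` line in `/proc/cpuinfo`. The script below is a minimal sketch assuming an x86 Linux system; the helper names `cpu_flags` and `has_avx` are made up for illustration. Inside a virtual machine the result reflects what the hypervisor exposes to the guest, which is presumably why switching the CPU type to host helped here.

```python
from pathlib import Path
from typing import Set


def cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the CPU feature flags listed in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # The first "flags" entry is enough: flags repeat for every core.
            return set(value.split())
    return set()


def has_avx(cpuinfo_text: str) -> bool:
    """True if the 'avx' flag is present in the given cpuinfo text."""
    return "avx" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # Linux only
    if cpuinfo.exists():
        print("AVX supported:", has_avx(cpuinfo.read_text()))
    else:
        print("/proc/cpuinfo not found; this check only works on Linux.")
```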

pluja@lemmy.world on 30 Aug 2023 05:36

I’m glad you were able to solve the problem. Here is the reply I gave to another user with the same problem:

Didn’t know about this problem. I’ll try to add a MariaDB alternative database option soon.

Konraddo@lemmy.world on 29 Aug 2023 18:07

Just tried this out but couldn’t get it to work until downgrading mongo to 4.4.6 because my NAS doesn’t have AVX support. But then, mongo stays unhealthy. No idea why.

pluja@lemmy.world on 30 Aug 2023 05:34

Didn’t know about this problem. I’ll try to add a MariaDB alternative database option soon to solve this.

crazygoat@lemmy.world on 24 Dec 2023 20:18

This is also a good sound-to-text converter and a good AI transcription service.