WikiPlus
Audio · 2 tools

Audio

100% private processing

All processing happens on your device using WebAssembly. Nothing is uploaded, which makes these tools well suited to sensitive recordings.

The WikiPlus audio transcriber converts speech in audio files into time-stamped text using a browser-embedded Whisper model. Drop an MP3, WAV, M4A, or OGG into the page, pick the source language from over 90 supported options, and watch the transcript assemble paragraph-by-paragraph as inference progresses. Then copy to clipboard, export as TXT or SRT, and move on. The model downloads once and runs entirely locally, so podcasts, meeting recordings, and interview audio stay fully private.
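As an illustration of the SRT export step, here is a minimal sketch of how time-stamped segments can be turned into SRT text. The `Segment` type and both function names are hypothetical and are not taken from the WikiPlus code; only the SRT layout itself (a counter, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, the text, then a blank line) is standard.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed span (hypothetical shape, times in seconds)."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        blocks.append(f"{index}\n{timing}\n{seg.text.strip()}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [Segment(0.0, 2.5, " Hello "), Segment(2.5, 4.0, "World")]
    print(to_srt(demo))
```

A plain TXT export would simply join the `text` fields and drop the timing lines.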

Every tool on this page runs entirely inside your browser. None of your files are uploaded to our servers or stored for later, and no account is required. Files are processed on your own device using WebAssembly modules and the open-source libraries that power each utility, so confidential recordings stay confidential. Once the page and any required model have loaded, most tools will still finish their job even if you disconnect from the internet. Pick the utility you need below and start working straight away.

Frequently asked questions

How long can an audio file be?
Practically unlimited. Whisper processes audio in 30-second chunks, so 2-hour podcasts or 3-hour interviews work fine. Expect roughly real-time processing: a 60-minute file takes about 60 minutes on a mid-range laptop. Longer files use more memory.
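To make the chunking concrete, the sketch below splits a recording into consecutive 30-second windows and counts them. It assumes 16 kHz mono input, which is what Whisper models expect, and simple non-overlapping windows; real implementations may shift or overlap windows based on predicted timestamps. The function name is hypothetical.

```python
SAMPLE_RATE = 16_000   # Whisper models expect 16 kHz mono audio
CHUNK_SECONDS = 30     # window length described in the answer above


def chunk_bounds(
    total_samples: int,
    sample_rate: int = SAMPLE_RATE,
    chunk_seconds: int = CHUNK_SECONDS,
) -> list[tuple[int, int]]:
    """Return (start, end) sample indices for consecutive windows.

    The final window is shorter when the audio length is not an exact
    multiple of the window size.
    """
    chunk = sample_rate * chunk_seconds
    return [
        (start, min(start + chunk, total_samples))
        for start in range(0, total_samples, chunk)
    ]


if __name__ == "__main__":
    two_hours = 2 * 60 * 60 * SAMPLE_RATE
    print(len(chunk_bounds(two_hours)), "chunks for a 2-hour file")
```

Because each window is processed on its own, file length mainly affects total run time, while the whole decoded recording still has to fit in memory.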
Will Whisper transcribe music and singing?
Partially. Whisper targets speech, so instrumental music produces empty or garbled output. Lyrics in clearly sung vocals can be transcribed, but quality varies — for music lyric extraction, dedicated tools perform better.
Does the transcriber translate between languages?
Yes. Whisper has a built-in translation mode that outputs English regardless of source language. The transcriber exposes this as a toggle — handy for making non-English podcasts, meetings, or interviews searchable in English.
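As a rough sketch of what such a toggle might map to, the helper below builds a small options dictionary. The `task` and `language` keys mirror the decoding options in the reference openai-whisper Python package; whether the WikiPlus transcriber uses the same names internally is an assumption, and `build_decode_options` is a hypothetical helper.

```python
def build_decode_options(source_language: str, translate_to_english: bool) -> dict[str, str]:
    """Map a UI toggle to Whisper-style decoding options.

    "transcribe" keeps the output in the source language, while
    "translate" asks the model to emit English text instead.
    """
    return {
        "language": source_language,
        "task": "translate" if translate_to_english else "transcribe",
    }


if __name__ == "__main__":
    # A Spanish interview, made searchable in English.
    print(build_decode_options("es", translate_to_english=True))
```

Translation runs in one direction only: the output is English regardless of the source language.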