Question 1

How long can an audio file be?

WikiPlus · Accepted Answer

Practically unlimited. Whisper processes audio in 30-second chunks, so 2-hour podcasts and 3-hour interviews work fine. Expect roughly real-time processing: a 60-minute file takes about 60 minutes on a mid-range laptop. Longer files also use more memory.
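To make the 30-second chunking concrete, here is a minimal sketch that splits a file's duration into chunk boundaries. It is only an illustration of the arithmetic: the function name chunk_spans is hypothetical, and Whisper's real windowing (padding, seeking on timestamps) is more involved than this.

```python
import math

CHUNK_SECONDS = 30.0  # chunk length stated in the answer above


def chunk_spans(duration_s: float, chunk_s: float = CHUNK_SECONDS) -> list[tuple[float, float]]:
    """Return (start, end) times in seconds that cover the whole file.

    Hypothetical helper for illustration; not part of Whisper's API.
    """
    if duration_s <= 0:
        return []
    count = math.ceil(duration_s / chunk_s)
    # The last chunk is clipped to the end of the file.
    return [(i * chunk_s, min((i + 1) * chunk_s, duration_s)) for i in range(count)]


spans = chunk_spans(7200.0)  # a 2-hour podcast
print(len(spans))            # 240
```

A 2-hour file therefore amounts to 240 chunks, which is why length alone is not a hard limit, only a matter of processing time.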

Question 2

Will Whisper transcribe music and singing?

WikiPlus · Accepted Answer

Partially. Whisper targets speech, so instrumental music produces empty or garbled output. Clearly sung lyrics can be transcribed, but quality varies; for lyric extraction, dedicated tools perform better.

Question 3

Does the transcriber translate between languages?

WikiPlus · Accepted Answer

Yes, but only into English. Whisper has a built-in translation mode that outputs English regardless of the source language. The transcriber exposes this as a toggle, which is handy for making non-English podcasts, meetings, or interviews searchable in English.
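For anyone running Whisper directly rather than through the transcriber, the same switch is available as the --task option of the open-source whisper command-line tool. The sketch below only builds the command; it assumes the openai-whisper package is installed, and the function name, the translate_to_english flag, and the file name are hypothetical stand-ins for the transcriber's toggle.

```python
def whisper_command(audio_path: str, translate_to_english: bool, model: str = "small") -> list[str]:
    """Build a whisper CLI invocation (hypothetical helper, not the transcriber's code).

    --task translate outputs English text; --task transcribe keeps the source language.
    """
    task = "translate" if translate_to_english else "transcribe"
    return ["whisper", audio_path, "--model", model, "--task", task]


# Example: an interview in another language, translated to English.
cmd = whisper_command("interview.mp3", translate_to_english=True)
print(" ".join(cmd))
```

Passing the resulting list to subprocess.run(cmd, check=True) would run it, provided the whisper executable is on the PATH.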
