POST /v1/replicate/stt
curl --request POST \
  --url http://localhost:9000/api/v1/replicate/stt \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form 'url=<string>' \
  --form 'task=<string>' \
  --form batch_size=123 \
  --form 'timestamp=<string>'
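
The curl call above can also be prepared from Python. The following is a minimal sketch using only the standard library, not an official client: `encode_multipart` and `prepare_request` are hypothetical helper names, the video link is a placeholder, and the host and port are copied from the curl example. It handles text fields only (uploading the `audio` file would need an extra file part), and it builds the request without sending it.

```python
import uuid
import urllib.request

# Endpoint copied from the curl example; adjust host and port for your deployment.
STT_URL = "http://localhost:9000/api/v1/replicate/stt"


def encode_multipart(fields):
    """Encode plain text fields as a multipart/form-data body.

    Returns the body bytes and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")
        lines.append(str(value))
    lines.append(f"--{boundary}--")
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def prepare_request(token, fields, url=STT_URL):
    """Build (but do not send) the POST request with a bearer token."""
    body, content_type = encode_multipart(fields)
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": content_type,
        },
    )


req = prepare_request(
    "<token>",
    {
        "url": "https://example.com/video",  # placeholder video link
        "task": "transcribe",
        "batch_size": 64,
        "timestamp": "chunk",
    },
)
# urllib.request.urlopen(req) would send the request; it is left out so the
# sketch runs without a server.
```

Generating the boundary in code and putting it into the `Content-Type` header keeps the header and the body consistent, which is the part curl handles automatically for `--form`.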

Query

model
string
default:
"turian/insanely-fast-whisper-with-video"
required

The model to be used.
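
Because `model` is a query parameter rather than a form field, it goes into the URL. A small sketch of how that might look, assuming the host and port from the curl example; `build_stt_url` is a hypothetical helper name:

```python
from urllib.parse import urlencode

# Host and port taken from the curl example; the default model is the one
# documented for this parameter.
BASE_URL = "http://localhost:9000/api/v1/replicate/stt"
DEFAULT_MODEL = "turian/insanely-fast-whisper-with-video"


def build_stt_url(model=DEFAULT_MODEL, base_url=BASE_URL):
    """Append the `model` query parameter, percent-encoding the slash."""
    return f"{base_url}?{urlencode({'model': model})}"


print(build_stt_url())
# http://localhost:9000/api/v1/replicate/stt?model=turian%2Finsanely-fast-whisper-with-video
```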

Body

audio
file
required

Audio file. Either this or url must be provided.

url
string

Video URL for yt-dlp to download the audio from. Either this or audio must be provided.
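
Since one of `audio` or `url` has to be present, a client may want to check this before sending anything. The sketch below is one way to do that; `pick_source` is a hypothetical helper, and rejecting the case where both are supplied is an assumption of this sketch, as the reference does not describe how the server treats it.

```python
from typing import Optional, Tuple


def pick_source(audio_path: Optional[str] = None,
                url: Optional[str] = None) -> Tuple[str, str]:
    """Return the form field name and value for the input source."""
    if audio_path and url:
        # Assumption: behaviour with both fields is undocumented, so this
        # sketch refuses instead of guessing which one takes precedence.
        raise ValueError("provide either audio or url, not both")
    if audio_path:
        return ("audio", audio_path)
    if url:
        return ("url", url)
    raise ValueError("either audio or url must be provided")


print(pick_source(url="https://example.com/video"))
# ('url', 'https://example.com/video')
```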

task
string
default:
"transcribe"

Task to perform: transcribe, or translate to another language (default: transcribe).

batch_size
int
default:
64

Number of batches to compute in parallel. Reduce this if you run into out-of-memory (OOM) errors (default: 64).

timestamp
string
default:
"chunk"

Whisper supports both chunk-level and word-level timestamps (default: chunk).
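
The optional fields and their defaults can be collected and checked on the client side before the request is built. This is only a sketch: `build_options` is a hypothetical helper, the accepted `task` values are taken from the description above, and `"word"` as the second `timestamp` value is an assumption, since the reference names only `"chunk"` explicitly.

```python
ALLOWED_TASKS = {"transcribe", "translate"}
# "chunk" is the documented default; "word" is assumed from the description.
ALLOWED_TIMESTAMPS = {"chunk", "word"}


def build_options(task="transcribe", batch_size=64, timestamp="chunk"):
    """Validate the optional fields and return them as form-ready strings."""
    if task not in ALLOWED_TASKS:
        raise ValueError(f"task must be one of {sorted(ALLOWED_TASKS)}")
    if timestamp not in ALLOWED_TIMESTAMPS:
        raise ValueError(f"timestamp must be one of {sorted(ALLOWED_TIMESTAMPS)}")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(batch_size, bool) or not isinstance(batch_size, int) or batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    return {"task": task, "batch_size": str(batch_size), "timestamp": timestamp}


# A smaller batch size, as suggested when memory is tight.
print(build_options(batch_size=16))
# {'task': 'transcribe', 'batch_size': '16', 'timestamp': 'chunk'}
```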