POST /v1/replicate/musicgeneration
curl --request POST \
  --url http://localhost:9000/api/v1/replicate/musicgeneration \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form 'prompt=<string>' \
  --form 'model_version=<string>' \
  --form duration=123 \
  --form continuation=true \
  --form continuation_start=123 \
  --form continuation_end=123 \
  --form multi_band_diffusion=true \
  --form 'normalization_strategy=<string>' \
  --form top_k=123 \
  --form top_p=123 \
  --form temperature=123 \
  --form classifier_free_guidance=123 \
  --form 'output_format=<string>'

Query

model
string
default:
"meta/musicgen"
required

The model to be used.

Body

prompt
string
required

A description of the music you want to generate.

model_version
string
default:
"stereo-melody-large"

The model version to use for generation.

input_audio
file

An audio file that will influence the generated music. If continuation is True, the generated music will be a continuation of the audio file. Otherwise, the generated music will mimic the audio file’s melody.

duration
int
default:
"8"

Duration of the generated audio in seconds.

continuation
boolean
default:
"False"

If True, generated music will continue from input_audio. Otherwise, generated music will mimic input_audio’s melody.

continuation_start
int
default:
"0"

Start time of the audio file to use for continuation.

continuation_end
int
default:
"0"

End time of the audio file to use for continuation. If -1 or None, will default to the end of the audio clip.
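The interaction between continuation_start and continuation_end can be sketched client-side as follows. This is only an illustration of the documented "-1 or None means end of clip" rule; the helper name `resolve_continuation_window`, the `clip_length` argument, and the assumption that all times are whole seconds are hypothetical and not part of the API.

```python
from typing import Optional, Tuple


def resolve_continuation_window(
    start: int, end: Optional[int], clip_length: int
) -> Tuple[int, int]:
    """Return the (start, end) window of the input clip used for continuation.

    Hypothetical helper: times are assumed to be whole seconds.
    """
    # Documented rule: -1 or None falls back to the end of the audio clip.
    if end is None or end == -1:
        end = clip_length
    # Any other value is passed through unchanged; the reference does not
    # describe further validation, so none is attempted here.
    return start, end
```

For a 30-second clip, `resolve_continuation_window(5, -1, 30)` yields the window from second 5 to the end of the clip.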

multi_band_diffusion
boolean
default:
"False"

If True, the EnCodec tokens will be decoded with MultiBand Diffusion. Only works with non-stereo models.

normalization_strategy
string
default:
"peak"

Strategy for normalizing audio.

top_k
int
default:
"250"

Reduces sampling to the k most likely tokens.

top_p
float
default:
"0.0"

Reduces sampling to tokens with cumulative probability of p. When set to 0 (default), top_k sampling is used.
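To make the relationship between top_k and top_p concrete, here is a minimal sketch of the two filtering strategies over a toy token distribution. It follows the documented rule that top_k applies only when top_p is 0; the function name `filter_tokens` and the dictionary representation are assumptions for illustration, and the endpoint's actual sampler may differ in detail.

```python
from typing import Dict


def filter_tokens(
    probs: Dict[str, float], top_k: int = 250, top_p: float = 0.0
) -> Dict[str, float]:
    """Keep a subset of candidate tokens and renormalize their probabilities."""
    # Rank candidates from most to least likely.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    if top_p > 0.0:
        # Nucleus (top-p): keep the smallest prefix whose cumulative
        # probability reaches top_p.
        kept, cumulative = [], 0.0
        for token, p in ranked:
            kept.append((token, p))
            cumulative += p
            if cumulative >= top_p:
                break
    else:
        # top_p == 0 (the default): fall back to the k most likely tokens.
        kept = ranked[:top_k]
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}
```

Sampling would then draw from the returned distribution instead of the full one.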

temperature
float
default:
"1.0"

Controls the ‘conservativeness’ of the sampling process. Higher temperature means more diversity.
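The effect of temperature can be illustrated with the usual temperature-scaled softmax. This is a generic sketch, not the endpoint's implementation; `softmax_with_temperature` is a hypothetical helper operating on a plain list of logits.

```python
import math
from typing import List


def softmax_with_temperature(
    logits: List[float], temperature: float = 1.0
) -> List[float]:
    """Convert logits to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

A temperature below 1.0 concentrates probability on the most likely token, while a temperature above 1.0 flattens the distribution and so produces more diverse samples.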

classifier_free_guidance
int
default:
"3"

Increases the influence of inputs on the output. Higher values produce lower-variance outputs that adhere more closely to inputs.
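For intuition, classifier-free guidance is commonly formulated as an extrapolation from an unconditioned prediction toward a conditioned one. The sketch below shows that common formulation on plain lists of logits; the reference does not state how the endpoint applies the value, so the function `apply_guidance` and its arguments are assumptions.

```python
from typing import List


def apply_guidance(
    conditional: List[float], unconditional: List[float], scale: float = 3.0
) -> List[float]:
    """Blend conditioned and unconditioned logits using a guidance scale."""
    if len(conditional) != len(unconditional):
        raise ValueError("logit lists must have the same length")
    # scale == 1 returns the conditioned logits unchanged; larger values
    # push the result further in the direction suggested by the inputs.
    return [u + scale * (c - u) for c, u in zip(conditional, unconditional)]
```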

output_format
string
default:
"mp3"

Output format for generated audio.
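The parameters above could be assembled into a request roughly as follows. This sketch only builds the URL and the form fields and does not send anything; `build_request`, `BASE_URL`, and `DEFAULTS` are hypothetical names, the host is copied from the curl sample, and the lowercase encoding of booleans mirrors that sample rather than any documented requirement.

```python
from typing import Dict, Tuple
from urllib.parse import urlencode

# Host and path taken from the curl sample; adjust for your deployment.
BASE_URL = "http://localhost:9000/api/v1/replicate/musicgeneration"

# Default values as listed in this reference.
DEFAULTS = {
    "model_version": "stereo-melody-large",
    "duration": 8,
    "continuation": False,
    "continuation_start": 0,
    "continuation_end": 0,
    "multi_band_diffusion": False,
    "normalization_strategy": "peak",
    "top_k": 250,
    "top_p": 0.0,
    "temperature": 1.0,
    "classifier_free_guidance": 3,
    "output_format": "mp3",
}


def build_request(
    prompt: str, model: str = "meta/musicgen", **overrides
) -> Tuple[str, Dict[str, str]]:
    """Return the request URL (with the model query) and the text form fields."""
    if not prompt:
        raise ValueError("prompt is required")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    fields = {**DEFAULTS, **overrides, "prompt": prompt}
    # Form values travel as text; booleans are lowercased as in the curl sample.
    form = {
        key: str(value).lower() if isinstance(value, bool) else str(value)
        for key, value in fields.items()
    }
    url = f"{BASE_URL}?{urlencode({'model': model})}"
    return url, form
```

The returned pair can be handed to any HTTP client that supports multipart form uploads, together with the Authorization header and, if needed, the input_audio file.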