audio-diffusion-ddim-256
README (from Hugging Face)
tags:
- audio
- spectrograms
datasets:
- teticio/audio-diffusion-256
Denoising Diffusion Implicit Model (DDIM) trained on teticio/audio-diffusion-256 to generate 256x256 mel spectrograms, each corresponding to 5 seconds of audio. The code to convert audio to spectrograms and back, along with scripts for training and inference, can be found at https://github.com/teticio/audio-diffusion.
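As a rough illustration of how spectrogram width relates to audio length, the sketch below computes the duration covered by one spectrogram from the number of time frames, the hop length, and the sample rate. The values used (256 frames, hop length 512, sample rate 22050 Hz) are assumptions made for this example, not figures taken from this page; the settings actually used by the model are stored in mel/mel_config.json.

```python
def spectrogram_duration_seconds(x_res: int, hop_length: int, sample_rate: int) -> float:
    """Approximate length of audio covered by x_res spectrogram time frames.

    Each frame advances hop_length samples, so the spectrogram spans
    x_res * hop_length samples in total.
    """
    return x_res * hop_length / sample_rate


# Assumed values for illustration only; check mel/mel_config.json for the real ones.
X_RES = 256          # spectrogram width in time frames
HOP_LENGTH = 512     # samples between consecutive frames (assumption)
SAMPLE_RATE = 22050  # samples per second (assumption)

duration = spectrogram_duration_seconds(X_RES, HOP_LENGTH, SAMPLE_RATE)
print(f"{duration:.2f} s")
```

Under these assumed settings a single spectrogram spans a little under 6 seconds, which is in the same range as the duration quoted above. Different hop lengths or sample rates would change the result proportionally.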
Model Files
README.md (395.0 B)
mel/mel_config.json (193.0 B)
model_index.json (250.0 B)
scheduler/scheduler_config.json (278.0 B)
unet/config.json (849.0 B)
unet/model_state.pdparams (433.6 MB)
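The layout above follows the usual pipeline convention in which model_index.json names the components and each component lives in its own subfolder. The sketch below shows one way to list those component folders from an index. The sample index contents and the helper name component_dirs are hypothetical, written for illustration; the real model_index.json may contain different keys and class names.

```python
import json


def component_dirs(index: dict) -> list[str]:
    """Return the component subfolder names declared in a pipeline index.

    Keys starting with "_" are treated as metadata; the remaining keys are
    assumed to map a subfolder name to a [library, class_name] pair.
    """
    return sorted(
        name
        for name, value in index.items()
        if not name.startswith("_") and isinstance(value, list) and len(value) == 2
    )


# Hypothetical index contents, not copied from the actual model_index.json.
SAMPLE_INDEX = json.loads("""
{
  "_class_name": "ExamplePipeline",
  "mel": ["example_lib", "Mel"],
  "scheduler": ["example_lib", "ExampleScheduler"],
  "unet": ["example_lib", "ExampleUNet"]
}
""")

dirs = component_dirs(SAMPLE_INDEX)
print(dirs)
```

With the sample index, the helper reports the three subfolders that also appear in the file list (mel, scheduler, unet). To inspect the real model, load the downloaded model_index.json with json.load instead of the sample string.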