
# audio-diffusion-ddim-256
---


## README ([from Hugging Face](https://huggingface.co/teticio/audio-diffusion-ddim-256))

---
tags:
- audio
- spectrograms
datasets:
- teticio/audio-diffusion-256
---
Denoising Diffusion Implicit Model (DDIM) trained on teticio/audio-diffusion-256 to generate 256x256 mel spectrograms, each corresponding to 5 seconds of audio. The code to convert audio to spectrograms and back, along with scripts for training and inference, is available at https://github.com/teticio/audio-diffusion.
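To make the link between spectrogram width and clip length concrete, here is a minimal sketch of the arithmetic. It assumes a hop length of 512 samples and a sample rate of 22050 Hz; both values are illustrative assumptions, not taken from this page, and the actual settings are in `mel/mel_config.json`. The helper name `spectrogram_duration_seconds` is hypothetical.

```python
def spectrogram_duration_seconds(x_res: int, hop_length: int, sample_rate: int) -> float:
    """Approximate audio duration covered by a spectrogram with x_res time frames.

    Each time frame advances by hop_length samples, so the clip spans roughly
    x_res * hop_length samples (window padding at the edges is ignored).
    """
    return x_res * hop_length / sample_rate


# Assumed values for illustration only; check mel/mel_config.json for the real ones.
ASSUMED_HOP_LENGTH = 512
ASSUMED_SAMPLE_RATE = 22050

# 256 is the spectrogram width stated in the model description.
duration = spectrogram_duration_seconds(256, ASSUMED_HOP_LENGTH, ASSUMED_SAMPLE_RATE)
print(f"{duration:.2f} s")
```

Under these assumed settings the result lands between 5 and 6 seconds, which is in line with the clip length quoted above. Different hop lengths or sample rates would change it proportionally.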



## Model Files

- [README.md](https://paddlenlp.bj.bcebos.com/models/community/teticio/audio-diffusion-ddim-256/README.md) (395.0 B)

- [mel/mel_config.json](https://paddlenlp.bj.bcebos.com/models/community/teticio/audio-diffusion-ddim-256/mel/mel_config.json) (193.0 B)

- [model_index.json](https://paddlenlp.bj.bcebos.com/models/community/teticio/audio-diffusion-ddim-256/model_index.json) (250.0 B)

- [scheduler/scheduler_config.json](https://paddlenlp.bj.bcebos.com/models/community/teticio/audio-diffusion-ddim-256/scheduler/scheduler_config.json) (278.0 B)

- [unet/config.json](https://paddlenlp.bj.bcebos.com/models/community/teticio/audio-diffusion-ddim-256/unet/config.json) (849.0 B)

- [unet/model_state.pdparams](https://paddlenlp.bj.bcebos.com/models/community/teticio/audio-diffusion-ddim-256/unet/model_state.pdparams) (433.6 MB)

