AudioLDM: Text-to-Audio Generation with Latent Diffusion Models
GPTKB entity
Statements (31)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:academic_journal, gptkb:text-to-audio_generation_model |
| gptkbp:assesses | Foley sound generation, audio captioning, text-to-audio retrieval |
| gptkbp:author | gptkb:Mark_D._Plumbley, gptkb:Qiuqiang_Kong, gptkb:Wenwu_Wang, Yuxuan Wang, Yong Xu, Haohe Liu |
| gptkbp:citation | 100+ |
| gptkbp:contribution | proposes a latent diffusion model for text-to-audio generation |
| gptkbp:field | gptkb:machine_learning, generative models, audio synthesis |
| gptkbp:input | text prompt |
| gptkbp:output | audio waveform |
| gptkbp:publicationYear | 2023 |
| gptkbp:publishedIn | gptkb:arXiv |
| gptkbp:relatedTo | gptkb:Stable_Diffusion, text-to-image generation |
| gptkbp:repository | https://github.com/haoheliu/AudioLDM |
| gptkbp:trainer | gptkb:Clotho, AudioCaps, ESC-50 |
| gptkbp:url | https://arxiv.org/abs/2301.12503 |
| gptkbp:uses | latent diffusion models |
| gptkbp:bfsParent | gptkb:AudioLDM |
| gptkbp:bfsLayer | 8 |
| https://www.w3.org/2000/01/rdf-schema#label | AudioLDM: Text-to-Audio Generation with Latent Diffusion Models |