AudioLDM: Text-to-Audio Generation with Latent Diffusion Models
GPTKB entity
Statements (31)
Predicate | Object |
---|---|
gptkbp:instanceOf | gptkb:academic_journal, text-to-audio generation model |
gptkbp:assesses | Foley sound generation, audio captioning, text-to-audio retrieval |
gptkbp:author | gptkb:Mark_D._Plumbley, gptkb:Qiuqiang_Kong, gptkb:Wenwu_Wang, Yuxuan Wang, Yong Xu, Haohe Liu |
gptkbp:citation | 100+ |
gptkbp:contribution | proposes a latent diffusion model for text-to-audio generation |
gptkbp:field | gptkb:machine_learning, generative models, audio synthesis |
https://www.w3.org/2000/01/rdf-schema#label | AudioLDM: Text-to-Audio Generation with Latent Diffusion Models |
gptkbp:input | text prompt |
gptkbp:output | audio waveform |
gptkbp:publicationYear | 2023 |
gptkbp:publishedIn | gptkb:arXiv |
gptkbp:relatedTo | gptkb:Stable_Diffusion, text-to-image generation |
gptkbp:repository | https://github.com/haoheliu/AudioLDM |
gptkbp:trainer | gptkb:Clotho, AudioCaps, ESC-50 |
gptkbp:url | https://arxiv.org/abs/2301.12503 |
gptkbp:uses | latent diffusion models |
gptkbp:bfsParent | gptkb:AudioLDM |
gptkbp:bfsLayer | 7 |