SeamlessAlign

GPTKB entity

Statements (16)
Predicate Object
gptkbp:instanceOf multilingual speech and text alignment dataset
gptkbp:contains 270,000 hours of aligned speech and text
gptkbp:createdBy gptkb:Meta_AI
gptkbp:hostedBy gptkb:Hugging_Face
https://www.w3.org/2000/01/rdf-schema#label SeamlessAlign
gptkbp:license CC BY-NC 4.0
gptkbp:officialWebsite https://ai.facebook.com/research/publications/seamlessalign-a-massive-multilingual-aligned-corpus-for-speech-and-text/
gptkbp:releaseDate 2023
gptkbp:supportsLanguage 100+ languages
gptkbp:type speech-text pairs
gptkbp:type text-text pairs
gptkbp:usedFor training machine translation models
gptkbp:usedFor training speech translation models
gptkbp:usedIn gptkb:SeamlessM4T
gptkbp:bfsParent gptkb:SeamlessM4T
gptkbp:bfsLayer 6