New bombshell from Asian researchers: YuE: Open Music Foundation Models for Full-Song Generation
Only a few days ago, an r/LocalLLaMA user was ready to give away a kidney for this.
YuE is an open-source project by HKUST tackling the challenge of generating full-length songs from lyrics (lyrics2song). Unlike existing models limited to short clips, YuE can produce 5-minute songs with coherent vocals and accompaniment. Key innovations include:
- A semantically enhanced audio tokenizer for efficient training.
- Dual-token technique for synced vocal-instrumental modeling.
- Lyrics-chain-of-thoughts for progressive song generation.
- Support for diverse genres, languages, and advanced vocal techniques (e.g., scatting, death growl).
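
For a rough intuition of what "dual-token" modeling could look like, here is a toy Python sketch. It assumes (this is my guess, not something stated above) that the vocal and accompaniment tracks are each tokenized per frame and then interleaved into one sequence, so a single model sees both tracks in sync. The function names `interleave_tracks` and `split_tracks` are made up for illustration and are not taken from the YuE code.

```python
from typing import List, Tuple


def interleave_tracks(vocal: List[int], accomp: List[int]) -> List[int]:
    """Merge two per-frame token streams into one sequence: v0, a0, v1, a1, ...

    Assumption: both tracks have exactly one token per frame.
    """
    if len(vocal) != len(accomp):
        raise ValueError("tracks must have the same number of frames")
    merged: List[int] = []
    for v, a in zip(vocal, accomp):
        merged.extend((v, a))  # keep the two tokens of a frame adjacent
    return merged


def split_tracks(merged: List[int]) -> Tuple[List[int], List[int]]:
    """Undo interleave_tracks: even positions are vocal, odd are accompaniment."""
    if len(merged) % 2 != 0:
        raise ValueError("interleaved sequence must have even length")
    return merged[0::2], merged[1::2]


if __name__ == "__main__":
    vocal_tokens = [1, 2, 3]        # hypothetical vocal token ids
    accomp_tokens = [10, 20, 30]    # hypothetical accompaniment token ids
    sequence = interleave_tracks(vocal_tokens, accomp_tokens)
    print(sequence)
    print(split_tracks(sequence))
```

The point of keeping each frame's two tokens next to each other is that the tracks can be separated again after generation without losing alignment.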
Check out the GitHub repo for demos and model checkpoints.