Zero-shot voice cloning and natural-language-driven voice design across multiple languages and dialects
16GB RAM recommended. 25GB+ storage recommended.
macOS 15+: Supports both Intel and M-series chips.
Windows 10/11: Intel/AMD GPUs supported, NVIDIA GPU recommended.
Note: For NVIDIA GPUs, install a recent driver.
June 2, 2026 Update: Macs with Apple M-series chips can now run MLX-optimized AI models, which use unified memory and GPU/NPU acceleration for significantly faster content generation. Users who installed an earlier version can click "Reinstall" to get the MLX version.
Qwen3-TTS is an open-source text-to-speech (TTS) model series developed by Alibaba Cloud's Qwen Team. More than a text reader, it is a speech system that conveys emotion and clones voices. It is well suited to content creators, developers, and anyone looking to build a unique AI persona.
Highlights:
Technology & Team: Developed by the Qwen Team at Alibaba, a group known for its LLM and multimodal research.