Discover amazing AI tools and applications
An upgraded TTS system featuring multilingual support, real-time style switching and efficient inference
Generate 1-minute videos quickly with only 6GB of VRAM.
A node-based user interface for Stable Diffusion.
Turn Text into Realistic Podcasts with Multi-Speaker, Multilingual & Emotional Speech
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
Zero-shot voice cloning, supports multiple languages, allows voice parameter control
Animate Digital Humans' Lip Movements
Automatically handles video translation, subtitle generation and dubbing
Supporting Mandarin, English and Cantonese, with natural speech synthesis and zero-shot voice cloning
Supporting 600+ languages, voice design, voice cloning, natural speech, and ultra-fast inference
Zero-shot voice cloning and natural-language-driven voice design across multiple languages and dialects
Zero-shot cloning and emotional control across 23 languages
An ultra-fast, lightweight open-source music model, delivering commercial-grade audio on local hardware with less than 4GB VRAM
Clones voices from short audio and generates natural speech.
Turns lyrics into songs in seconds and generates music from style keywords.
Upgraded model architecture with greatly improved sound quality and song integrity; supports ultra-long durations and multilingual creation
A speech synthesis system that generates multi-speaker conversations with voice cloning and multilingual support.
Clone a voice in 5 seconds: GPT-SoVITS enables multilingual AI speech.
A lightweight and easy-to-use multi-terminal personal AI assistant, supporting multi-channel connection and custom skills
Multilingual support for 52 languages/dialects and exceptional robustness in song and contextual transcription
An open-source fast-thinking multilingual translation model supporting 33 languages
Multilingual emotional expression and real-time streaming generation for human-like natural speech, with low-resource deployment across scenarios.
Multilingual speech recognition, emotion & audio event detection—efficient and accurate
Multilingual real-time and offline recognition, easy to use and efficient
Zero-shot voice cloning with emotional expression capabilities
Real-time talking-head framework, high-fidelity, long-duration stable audio-visual synchronization
An open-source translation model, 33 languages + 5 dialects, accurate & flexible
A joint zero-shot SVS project supporting multilingual synthesis and dual-mode control
A desktop client for ChatGPT, Claude and other LLMs
Turns static portraits into video/audio-driven 3D models in real time
Generates high-quality classical music, with generation by period, composer and instrumentation
Generates depth-aware 3D panoramas and scene models from single images
An all-in-one local AI assistant supporting cross-platform operation and mobile remote control
A desktop graphical tool developed by ValueCell-ai based on OpenClaw, featuring one-click installation
PartPacker enables part-level 3D object generation from single-view images
Supporting high-quality TTS and zero-shot voice cloning with extremely high timbre similarity
A large model runner with a visual interface.
Supporting 17 languages, with accurate dialect and low-volume recognition
Generation and understanding, featuring high-fidelity song synthesis and controllable structural creation
Taming Bad Noise for Effective Video Object Removal
Removing hard-coded subtitles from videos and text watermarks from images with lossless resolution
An open-source enterprise AI assistant integrating RAG pipelines, multi-modal interaction, and workflow orchestration.
Add perfectly fitting foley sounds to silent videos.
Get up and running with large language models
A workflow automation platform supporting no-code/code dual-mode building
Generates pixel-aligned high-fidelity 3D models with PBR textures from a single image
A security-first workflow automation tool with enhanced reliability and efficiency, supporting visual drag-and-drop operation
An easy-to-deploy, extensible open-source AI chatbot
A smart-searchable library of 2000+ ready-to-use n8n automation workflows
A lightweight, efficient open-source document parser that accurately converts PDFs, images, and e-books into Markdown/JSON
Zero-shot voice cloning and expressive editing of emotion, style, and paralinguistic cues
Making local speech-to-text and translation easy.
Zero-shot voice cloning, emotion expression and streaming inference
An ultra-lightweight TTS tool supporting multilingual speech synthesis and zero-shot voice cloning on an ordinary CPU
A free video editor featuring drag-and-drop simplicity, rich effects, and 4K video export
Easily displays technical and metadata information of audio and video files
A self-hosted machine translation tool supporting multi-language translation, usable offline with controllable data privacy
Automatically generates HD short videos with script, footage, voiceover, subtitles and music
Creates high-quality, realistic sound effects from text or video; only English prompts are supported