AI Applications | Jianshanxing Technology

IndexTTS 2

An upgraded TTS system featuring multilingual support, real-time style switching and efficient inference

FramePack

Generate 1-minute videos quickly with as little as 6GB of VRAM.

ComfyUI

A node-based user interface for Stable Diffusion.

SoulX-Podcast

Turn Text into Realistic Podcasts with Multi-Speaker, Multilingual & Emotional Speech

LatentSync

Animate Digital Humans' Lip Movements

VoxCPM2

Supporting Mandarin, English and Cantonese, with natural speech synthesis and zero-shot voice cloning

GPT-SoVITS

Clone a voice in 5 seconds; GPT-SoVITS enables multilingual AI speech.

pyVideoTrans

Automatically handles video translation, subtitle generation and dubbing

Qwen3-TTS

Zero-shot voice cloning and natural-language-driven voice design across multiple languages and dialects

Spark-TTS

Zero-shot voice cloning with multilingual support and voice parameter control

Chatterbox TTS

Zero-shot cloning and emotional control across 23 languages

ACE-Step 1.5

An ultra-fast, lightweight open-source music model, delivering commercial-grade audio on local hardware with less than 4GB of VRAM

MoneyPrinterTurbo

Automatically generates HD short videos with script, footage, voiceover, subtitles and music

OmniVoice

Supporting 600+ languages, voice design, voice cloning, natural speech, and ultra-fast inference

VibeVoice TTS

Highly expressive, long-form, multi-speaker conversational audio generation

ACE-Step 1.5 XL

Upgraded model architecture with greatly improved sound quality and song integrity, supports ultra-long duration and multilingual creation

Woosh

Creates high-quality, realistic sound effects from text or video; only English prompts are supported

PilotTTS

Voice cloning with text-tag control over 11 emotions, 4 paralinguistic sounds (like laughter/breathing), and 14 Chinese dialects

dots.tts

Featuring ultra-realistic 48kHz zero-shot voice cloning with rich emotional and lifelike expressiveness

ACE-Step 1

Turns lyrics into songs in seconds and generates music from style keywords.

FireRedTTS2

A speech synthesis system that generates multi-speaker conversations with voice cloning and multilingual support.

Qwen3-ASR

Multilingual support for 52 languages/dialects and exceptional robustness in song and contextual transcription

HY-MT 2

An open-source fast-thinking multilingual translation model supporting 33 languages

CosyVoice

Multilingual emotional expression and real-time streaming generation, enabling human-like natural speech with low-resource deployment across scenarios.

VoxCPM

Clones voices from short audio and generates natural speech.

SenseVoice

Multilingual speech recognition plus emotion and audio event detection, efficient and accurate

IndexTTS 1

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

FunASR

Multilingual real-time and offline speech recognition, easy to use and efficient

SoulX-FlashHead

A real-time talking-head framework with high-fidelity, stable long-duration audio-visual synchronization

HY-MT 1.5

An accurate, flexible open-source translation model supporting 33 languages and 5 dialects

SoulX-Singer

A joint zero-shot singing voice synthesis (SVS) project supporting multilingual synthesis and dual-mode control

MOSS-TTS 1.5

An expressive open-source text-to-speech model supporting 31 languages, featuring stable zero-shot voice cloning and precise inline pause control

Chatbox AI

A desktop client for ChatGPT, Claude and other LLMs

LivePortrait

Turns static portraits into video/audio-driven 3D models in real time

NotaGen

Generates high-quality classical music, with control over period, composer and instrumentation

LobsterAI

An all-in-one local AI assistant supporting cross-platform operation and mobile remote control

ClawX

A desktop graphical tool developed by ValueCell-ai based on OpenClaw, featuring one-click installation

LM Studio

A large model runner with a visual interface.

GLM-ASR

Supporting 17 languages, with accurate dialect and low-volume recognition

HeartMuLa

Music generation and understanding, featuring high-fidelity song synthesis and controllable structural creation

MOSS-TTS-Nano

An ultra-lightweight TTS tool, supporting multilingual speech synthesis and zero-shot voice cloning on an ordinary CPU

MiniMax-Remover

Taming Bad Noise for Effective Video Object Removal

PartPacker

PartPacker enables part-level 3D object generation from single-view images

MOSS-SoundEffect-v2.0

Turns a simple description into a high-fidelity sound effect up to 30 seconds long, perfect for video voiceovers and game asset creation.

VSR

Removes hard-coded subtitles from videos and text watermarks from images without loss of resolution

DreamCube

Generates depth-aware 3D panoramas and scene models from single images

LongCat-AudioDiT

Supporting high-quality TTS and zero-shot voice cloning with extremely high timbre similarity

ThinkSound

Add perfectly fitting foley sounds to silent videos.

Ollama

Get up and running with large language models

n8n 1.x

A workflow automation platform supporting no-code/code dual-mode building

n8n 2.x

A security-first workflow automation tool with enhanced reliability and efficiency, supporting visual drag-and-drop operation

AstrBot

An easy-to-deploy, extensible open-source AI chatbot

n8n Workflows

A smart-searchable library of 2000+ ready-to-use n8n automation workflows

MinerU

A lightweight, efficient open-source document parser that accurately converts PDFs, images, and e-books into Markdown/JSON

F5-TTS

Zero-shot voice cloning with emotional expression

Step-Audio-EditX

Zero-shot voice cloning and expressive editing of emotion, style, and paralinguistic cues

Whisper-WebUI

Making local speech-to-text and translation easy.

GLM-TTS

Zero-shot voice cloning, emotional expression and streaming inference

Pixal3D

Generates pixel-aligned high-fidelity 3D models with PBR textures from a single image

MaxKB

An open-source enterprise AI assistant integrating RAG pipelines, multi-modal interaction, and workflow orchestration.

OpenShot

A free video editor featuring drag-and-drop simplicity, rich effects, and 4K video export

MediaInfo

Easily displays technical and metadata information of audio and video files

LibreTranslate

A self-hosted machine translation tool supporting multi-language translation, usable offline with controllable data privacy

TripoSplat

Rapidly generates lightweight, interactive 3D Gaussian scenes from a single image, compatible with mainstream 3D software

QwenPaw

Seamless integration with chat apps like WeChat, DingTalk, and Feishu for multi-agent collaborative workflows

CoPaw

A lightweight and easy-to-use multi-terminal personal AI assistant, supporting multi-channel connection and custom skills