360 Secure SkillHub

Text To Speech

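An AI text-to-speech tool built on the inference.sh CLI. It offers several models, including DIA TTS and Kokoro, suited to scenarios such as conversation and podcasts, and supports voice cloning, expressive speech, multi-speaker dialogue, and long-form audio generation. Typical applications include video voiceovers, audiobook production, podcast generation, accessibility services, and IVR voice prompts. Users generate natural-sounding audio by passing text and parameters to CLI commands.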

2040 downloads
0 favorites
Last updated: 2026-05-17 11:31:07
Category: Creation
Quality: Excellent
Dependencies: Code execution / Third-party API
Use case: File conversion
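360 Zhinao (360智脑)

360 Behavioral Defense-in-Depth Security Audit Engine

Declaration-alignment security model + five-layer defense-in-depth audit architecture

Audit result: Safe
Matrix assessment result: BENIGN
Confidence: HIGH
OpenClaw LLM verdict: suspicious
OpenClaw static source scan: Not scanned
VirusTotal verdict: suspicious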
Metadata

name: text-to-speech
description: Convert text to natural speech with DIA TTS, Kokoro, Chatterbox, and more via inference.sh CLI. Models: DIA TTS (conversational), Kokoro TTS, Chatterbox, Higgs Audio, VibeVoice (podcasts). Capabilities: text-to-speech, voice cloning, multi-speaker dialogue, podcast generation, expressive speech. Use for: voiceovers, audiobooks, podcasts, accessibility, video narration, IVR, voice assistants. Triggers: text to speech, tts, voice generation, ai voice, speech synthesis, voice over, generate speech, ai narrator, voice cloning, text to audio, elevenlabs alternative, voice ai, ai voiceover, speech generator, natural voice
allowed-tools: Bash(infsh *)

Text-to-Speech

Convert text to natural speech via inference.sh CLI.

Quick Start

# Install CLI
curl -fsSL https://cli.inference.sh | sh && infsh login

# Generate speech
infsh app run infsh/kokoro-tts --input '{"text": "Hello, welcome to our product demo."}'

Install note: The install script only detects your OS and architecture, downloads the matching binary from dist.inference.sh, and verifies its SHA-256 checksum. It requires no elevated permissions and starts no background processes. Manual installation and verification are also available.
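
For readers who take the manual route, the checksum step can be sketched as below. This is a minimal illustration, not the installer's actual logic: `verify_sha256` is a hypothetical helper, and both the binary path and the expected digest must come from the official release information, which this page does not list.

```shell
#!/bin/sh
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
# Usage: verify_sha256 <file> <expected-hex-digest>
verify_sha256() {
  file=$1
  # Normalize the expected digest to lowercase hex before comparing.
  expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | awk '{print $1}')
  else
    # macOS and some BSDs ship shasum instead of sha256sum.
    actual=$(shasum -a 256 "$file" | awk '{print $1}')
  fi
  if [ "$actual" = "$expected" ]; then
    echo "checksum OK"
  else
    echo "checksum MISMATCH" >&2
    return 1
  fi
}
```

The function returns a non-zero status on mismatch, so it can gate an install step with `verify_sha256 ./infsh "$digest" && ...`, where `./infsh` and `$digest` stand in for the downloaded binary and the published checksum.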

Available Models

| Model       | App ID            | Best For                   |
|-------------|-------------------|----------------------------|
| DIA TTS     | infsh/dia-tts     | Conversational, expressive |
| Kokoro TTS  | infsh/kokoro-tts  | Fast, natural              |
| Chatterbox  | infsh/chatterbox  | General purpose            |
| Higgs Audio | infsh/higgs-audio | Emotional control          |
| VibeVoice   | infsh/vibevoice   | Podcasts, long-form        |
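
The table can be read as a simple lookup from use case to app ID. A minimal sketch follows; `pick_tts_app` and its use-case keywords are hypothetical names chosen for illustration, while the app IDs are the ones listed in the table.

```shell
#!/bin/sh
# Hypothetical helper: map a use-case keyword to an app ID from the table.
# Unknown keywords fall back to the general-purpose model.
pick_tts_app() {
  case $1 in
    conversational|expressive) echo "infsh/dia-tts" ;;
    fast|natural)              echo "infsh/kokoro-tts" ;;
    emotional)                 echo "infsh/higgs-audio" ;;
    podcast|long-form)         echo "infsh/vibevoice" ;;
    *)                         echo "infsh/chatterbox" ;;
  esac
}

# Example (requires the infsh CLI, so it is left commented out):
# infsh app run "$(pick_tts_app podcast)" --input input.json
```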

Browse All Audio Apps

infsh app list --category audio

Examples

Basic Text-to-Speech

infsh app run infsh/kokoro-tts --input '{"text": "Welcome to our tutorial."}'

Conversational TTS with DIA

infsh app sample infsh/dia-tts --save input.json

# Edit input.json:
# {
#   "text": "Hey! How are you doing today? I'm really excited to share this with you.",
#   "voice": "conversational"
# }

infsh app run infsh/dia-tts --input input.json
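
An inline `--input '...'` argument breaks as soon as the text contains a single quote (as in "I'm" above), which is one reason to go through a file. A minimal sketch of writing that file from shell variables follows. `json_escape` and `write_tts_input` are hypothetical helpers; they escape only backslashes and double quotes (not newlines or other control characters), and the `text` and `voice` field names are taken from the example above rather than from a published schema.

```shell
#!/bin/sh
# Escape backslashes and double quotes so a string can sit inside JSON quotes.
# Limitation: newlines, tabs, and other control characters are not handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Write {"text": ..., "voice": ...} to a file.
# Usage: write_tts_input <file> <text> <voice>
write_tts_input() {
  printf '{"text": "%s", "voice": "%s"}\n' \
    "$(json_escape "$2")" "$(json_escape "$3")" > "$1"
}

# Example (the run step requires the infsh CLI, so it is left commented out):
# write_tts_input input.json "Hey! I'm really excited to share this." conversational
# infsh app run infsh/dia-tts --input input.json
```

For text with line breaks or arbitrary Unicode control characters, a real JSON encoder is the safer choice than this sed-based escape.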

Long-form Audio (Podcasts)

infsh app sample infsh/vibevoice --save input.json

# Edit input.json with your podcast script
infsh app run infsh/vibevoice --input input.json

Expressive Speech with Higgs

infsh app sample infsh/higgs-audio --save input.json

# Edit input.json:
# {
#   "text": "This is absolutely incredible!",
#   "emotion": "excited"
# }

infsh app run infsh/higgs-audio --input input.json

Use Cases

  • Voiceovers: Product demos, explainer videos
  • Audiobooks: Convert text to spoken word
  • Podcasts: Generate podcast episodes
  • Accessibility: Make content accessible
  • IVR: Phone system voice prompts
  • Video Narration: Add narration to videos

Combine with Video

Generate speech, then use the resulting audio to drive a talking-head video:

# 1. Generate speech
infsh app run infsh/kokoro-tts --input '{"text": "Your script here"}' > speech.json

# 2. Use the audio URL with OmniHuman for avatar video
infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "<audio-url-from-step-1>"
}'
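
Step 2 needs the audio URL from `speech.json`. The shape of that output is not shown on this page, so the sketch below rests on a loud assumption: the saved output contains the audio file's URL as a quoted `http(s)://` string, and the first such URL is the one wanted. `first_url` is a hypothetical helper; if the actual field name is known, a precise JSON query is the better choice.

```shell
#!/bin/sh
# Hypothetical helper: print the first http(s) URL found in a file.
# Assumption: the first URL in the saved output is the generated audio.
# Usage: first_url <file>
first_url() {
  grep -oE 'https?://[^"[:space:]]+' "$1" | head -n 1
}

# Example (requires the infsh CLI, so it is left commented out):
# audio_url=$(first_url speech.json)
# infsh app run bytedance/omnihuman-1-5 --input "{
#   \"image_url\": \"https://portrait.jpg\",
#   \"audio_url\": \"$audio_url\"
# }"
```

It is worth inspecting `speech.json` once by hand to confirm which URL the helper picks up before wiring it into a pipeline.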

Related Skills

# Full platform skill (all 150+ apps)
npx skills add inference-sh/skills@inference-sh

# AI avatars (combine TTS with talking heads)
npx skills add inference-sh/skills@ai-avatar-video

# AI music generation
npx skills add inference-sh/skills@ai-music-generation

# Speech-to-text (transcription)
npx skills add inference-sh/skills@speech-to-text

# Video generation
npx skills add inference-sh/skills@ai-video-generation

Browse all apps: infsh app list
