ChatTTS: A generative speech model designed for everyday dialogue scenarios, such as LLM assistants. It produces natural, expressive speech with fine-grained control over prosodic features such as laughter, pauses, and interjections.
claude-video-vision: A Claude Code plugin that lets Claude watch and understand videos. It extracts frames with ffmpeg and transcribes audio through one of several backends (Gemini, local Whisper, or OpenAI). Claude receives the frames as images and the audio as timestamped transcriptions, so the plugin serves as a perception layer.
Provide voice output for LLM assistants in dialogue scenarios
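As a rough illustration of what fine-grained prosodic control can look like in practice, the sketch below marks up an assistant reply with inline control tokens before it is handed to a speech model. The token spellings `[laugh]` and `[uv_break]` are assumptions based on the ChatTTS project's published examples and should be checked against the installed version; `mark_prosody` is a hypothetical helper, not part of ChatTTS, and the synthesis call itself is deliberately left out.

```python
# Assumed control tokens; verify the spellings against the ChatTTS version in use.
LAUGH = "[laugh]"
PAUSE = "[uv_break]"


def mark_prosody(phrases, laugh_after=frozenset()):
    """Join phrases with pause tokens and add a laugh after selected phrases.

    `phrases` is a list of text fragments; `laugh_after` holds the zero-based
    indexes of the fragments that should be followed by a laugh token.
    This is a hypothetical helper for illustration only.
    """
    marked = []
    for index, phrase in enumerate(phrases):
        text = phrase.strip()
        if index in laugh_after:
            text = f"{text} {LAUGH}"
        marked.append(text)
    # A pause token separates each pair of neighbouring phrases.
    return f" {PAUSE} ".join(marked)


reply = mark_prosody(
    ["Well", "that went better than expected"],
    laugh_after={1},
)
print(reply)
```

The marked-up string would then be passed to the model's inference call in place of the plain reply text.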
Analyze a video file by providing its path and optionally asking a specific question
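The frame-extraction step can be sketched as follows. This is a minimal illustration of sampling frames with ffmpeg at a fixed rate, not the plugin's actual implementation: the function names, the output file pattern, and the default sampling rate are all assumptions made for the example. Only the standard ffmpeg options `-i` and `-vf fps=...` are used.

```python
from pathlib import Path


def build_frame_command(video_path, out_dir, fps=1):
    """Build an ffmpeg argument list that samples `fps` frames per second.

    Hypothetical helper: the output pattern frame_%04d.png is an assumption
    chosen for this example.
    """
    pattern = str(Path(out_dir) / "frame_%04d.png")
    return ["ffmpeg", "-i", str(video_path), "-vf", f"fps={fps}", pattern]


def frame_timestamp(frame_number, fps=1):
    """Approximate timestamp in seconds for a 1-based sampled frame number."""
    if frame_number < 1:
        raise ValueError("frame numbers start at 1")
    return (frame_number - 1) / fps


command = build_frame_command("talk.mp4", "frames", fps=2)
print(" ".join(command))
print(frame_timestamp(3, fps=2))
```

To actually extract the frames, the list can be passed to `subprocess.run(command, check=True)`, which requires ffmpeg to be installed and the output directory to exist. The approximate timestamps give a way to line frames up with timestamped transcription segments.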