claude-video-vision
Give Claude the ability to watch and understand videos — Claude Code plugin with frame extraction and multimodal audio analysis
A Claude Code plugin that gives Claude the ability to watch and understand videos. It extracts frames via ffmpeg and processes audio through one of several backends (Gemini, local Whisper, or OpenAI). Claude receives the frames as images and the audio as timestamped transcriptions, so the plugin acts as its perception layer.
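To make the frame extraction step concrete, here is a minimal Python sketch of how frames could be sampled with ffmpeg. It is not the plugin's actual code: the function names (`build_ffmpeg_command`, `frame_timestamp`, `extract_frames`), the one-frame-per-second default, and the `frame_%04d.jpg` naming pattern are all assumptions made for illustration. Only the ffmpeg options themselves (`-i`, `-vf fps=...`, `-hide_banner`, `-loglevel`) are standard.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path: str, out_dir: str, fps: float = 1.0) -> list[str]:
    """Build an ffmpeg command that samples `fps` frames per second as JPEGs.

    The output naming pattern is a hypothetical choice, not the plugin's.
    """
    pattern = str(Path(out_dir) / "frame_%04d.jpg")
    return [
        "ffmpeg", "-hide_banner", "-loglevel", "error",
        "-i", video_path,
        "-vf", f"fps={fps}",  # the fps filter resamples to a fixed frame rate
        pattern,
    ]


def frame_timestamp(index: int, fps: float) -> float:
    """Approximate timestamp (seconds) of the 1-indexed sampled frame."""
    return (index - 1) / fps


def extract_frames(video_path: str, out_dir: str, fps: float = 1.0) -> list[Path]:
    """Run ffmpeg and return the extracted frame paths in order."""
    # Fail loudly if ffmpeg is missing instead of producing no frames.
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_command(video_path, out_dir, fps), check=True)
    return sorted(Path(out_dir).glob("frame_*.jpg"))
```

Calling `extract_frames("demo.mp4", "frames", fps=1.0)` would write one JPEG per second of video into `frames/`, and `frame_timestamp` gives a rough time for each frame so it can be lined up with a timestamped transcript. The explicit `shutil.which` check is a design choice in this sketch: it turns a missing ffmpeg install into a clear error.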
Comments
- River White, May 24, 2026
The multimodal audio integration works, but processing longer videos can take some time.
- Emerson Patel, May 6, 2026
This completely changes how I debug UI tests. Claude can actually see where the selector failed.
- Justice Thompson, Apr 20, 2026
Perfect for feeding Claude visual context from UI recordings to debug frontend glitches.
- Parker Davis, Apr 19, 2026
Make sure you have ffmpeg installed globally, otherwise the frame extraction will fail silently.