Vision / Multimodal (2026)

Vision and multimodal AI tools for image understanding, image generation, and processing of multiple input modalities.

178 tools

Name | Tagline | Category | Stars | License | Updated
© 2026 AgentIndex.app|Built by a 10-year iOS Developer.

Not affiliated with Anthropic, OpenAI or Microsoft.

ragflow [Most Popular, Trending] | RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs | Vision / Multimodal | ★ 81.5k | Apache-2.0 | 2026-05-29
n8n [Most Popular, Trending] | Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations. | Vision / Multimodal | ★ 190.2k | NOASSERTION | 2026-05-29
Claude Flow [Most Popular, Trending] | The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms. | Vision / Multimodal | ★ 56.4k | MIT | 2026-05-29
ToolJet [Most Popular, Trending] | ToolJet is the open-source foundation of ToolJet AI - the AI-native platform for building internal tools, dashboards, business applications, workflows and AI agents 🚀 | Vision / Multimodal | ★ 37.9k | AGPL-3.0 | 2026-05-29
Open Interpreter [Most Popular, Trending] | A natural language interface for computers. Lets LLMs run code (Python, Javascript, Shell, etc.) locally on your machine. | Vision / Multimodal | ★ 63.7k | AGPL-3.0 | 2026-05-17
ruflo [Most Popular, Multi-Agent] | 🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade architecture, distributed swarm intelligence, RAG integration, and native Claude Code / Codex Integration | Vision / Multimodal | ★ 56.4k | MIT | 2026-05-29
UI-TARS-desktop [Most Popular, Trending] | The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra | Vision / Multimodal | ★ 35.7k | Apache-2.0 | 2026-05-18
GPT Researcher [Most Popular, Trending] | An LLM agent that conducts deep research (local and web) on any given topic and generates a long report with citations. | Vision / Multimodal | ★ 27.4k | Apache-2.0 | 2026-05-28
Flowise [Most Popular, Trending] | Build AI Agents, Visually | Vision / Multimodal | ★ 53.2k | NOASSERTION | 2026-05-29
chrome-devtools-mcp [Most Popular, Trending] | Chrome DevTools for coding agents | Vision / Multimodal | ★ 42.3k | Apache-2.0 | 2026-05-28
playwright-mcp [Most Popular, Trending] | Playwright MCP server | Vision / Multimodal | ★ 33.2k | Apache-2.0 | 2026-05-28
Meshroom [Most Popular, Trending] | Node-based Visual Programming Toolbox | Vision / Multimodal | ★ 12.8k | NOASSERTION | 2026-05-29
mcp-sequentialthinking-tools [Trending, Essential] | 🧠 An adaptation of the MCP Sequential Thinking Server to guide tool usage. This server provides recommendations for which MCP tools would be most effective at each stage. | Vision / Multimodal | ★ 584 | MIT | 2026-05-29
Cherry Studio [Most Popular, Trending] | A powerful desktop client for multiple LLMs. Supports local and cloud models. | Vision / Multimodal | ★ 46.6k | AGPL-3.0 | 2026-05-29
BettaFish [Most Popular, Trending] | Multi-agent opinion analysis assistant. Breaks information bubbles and predicts trends. | Vision / Multimodal | ★ 41.1k | GPL-2.0 | 2026-05-24
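XHS-Downloader [Most Popular, Trending] | XiaoHongShu (RedNote) link extraction and post collection tool: extracts links to an account's published, favorited, liked, and album posts; extracts post and user links from search results; collects post information; extracts post download URLs; downloads watermark-free post files | Vision / Multimodal | ★ 11.3k | GPL-3.0 | 2026-05-29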
rllm [Trending] | Democratizing Reinforcement Learning for LLMs | Vision / Multimodal | ★ 5.6k | Apache-2.0 | 2026-05-28
ir-sim [Trending] | A Python-based lightweight robot simulator designed for navigation, control, and reinforcement learning | Vision / Multimodal | ★ 1.1k | MIT | 2026-05-26
mcp-server-chart [Trending] | 🤖 A visualization MCP server and skills containing 25+ visual charts built with @antvis. Used for chart generation and data analysis. | Vision / Multimodal | ★ 4.1k | MIT | 2026-05-06
inspector [Trending] | Visual testing tool for MCP servers | Vision / Multimodal | ★ 9.9k | MIT | 2026-05-29
1–20 / 178
…