Vision and multimodal AI tools for image understanding, generation, and processing multiple input modalities.
178 tools
| Name | Tagline | Category | Stars | License | Updated |
|---|---|---|---|---|---|
| ragflow (Most Popular, Trending, ···) | RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs | Vision / Multimodal | ★ 81.5k | Apache-2.0 | 2026-05-29 |
| n8n (Most Popular, Trending, ···) | Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations. | Vision / Multimodal | ★ 190.2k | NOASSERTION | 2026-05-29 |
| Claude Flow (Most Popular, Trending, ···) | The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms. | Vision / Multimodal | ★ 56.4k | MIT | 2026-05-29 |
| ToolJet (Most Popular, Trending, ···) | ToolJet is the open-source foundation of ToolJet AI - the AI-native platform for building internal tools, dashboards, business applications, workflows and AI agents 🚀 | Vision / Multimodal | ★ 37.9k | AGPL-3.0 | 2026-05-29 |
| Open Interpreter (Most Popular, Trending, ···) | A natural language interface for computers. Lets LLMs run code (Python, JavaScript, Shell, etc.) locally on your machine. | Vision / Multimodal | ★ 63.7k | AGPL-3.0 | 2026-05-17 |
| ruflo (Most Popular, Multi-Agent, ···) | 🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade architecture, distributed swarm intelligence, RAG integration, and native Claude Code / Codex Integration | Vision / Multimodal | ★ 56.4k | MIT | 2026-05-29 |
| UI-TARS-desktop (Most Popular, Trending, ···) | The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra | Vision / Multimodal | ★ 35.7k | Apache-2.0 | 2026-05-18 |
| GPT Researcher (Most Popular, Trending, ···) | An LLM agent that conducts deep research (local and web) on any given topic and generates a long report with citations. | Vision / Multimodal | ★ 27.4k | Apache-2.0 | 2026-05-28 |
| Flowise (Most Popular, Trending, ···) | Build AI Agents, Visually | Vision / Multimodal | ★ 53.2k | NOASSERTION | 2026-05-29 |
| chrome-devtools-mcp (Most Popular, Trending, ···) | Chrome DevTools for coding agents | Vision / Multimodal | ★ 42.3k | Apache-2.0 | 2026-05-28 |
| playwright-mcp (Most Popular, Trending, ···) | Playwright MCP server | Vision / Multimodal | ★ 33.2k | Apache-2.0 | 2026-05-28 |
| Meshroom (Most Popular, Trending, ···) | Node-based Visual Programming Toolbox | Vision / Multimodal | ★ 12.8k | NOASSERTION | 2026-05-29 |
| mcp-sequentialthinking-tools (Trending, Essential) | 🧠 An adaptation of the MCP Sequential Thinking Server to guide tool usage. This server provides recommendations for which MCP tools would be most effective at each stage. | Vision / Multimodal | ★ 584 | MIT | 2026-05-29 |
| Cherry Studio (Most Popular, Trending, ···) | A powerful desktop client for multiple LLMs. Supports local and cloud models. | Vision / Multimodal | ★ 46.6k | AGPL-3.0 | 2026-05-29 |
| BettaFish (Most Popular, Trending) | Multi-agent opinion analysis assistant. Breaks information bubbles and predicts trends. | Vision / Multimodal | ★ 41.1k | GPL-2.0 | 2026-05-24 |
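| XHS-Downloader (Most Popular, Trending) | XiaoHongShu (RedNote) link extraction and post collection tool: extracts links to an account's published, favorited, liked, and album posts; extracts post and user links from search results; collects XiaoHongShu post information; extracts post download URLs; downloads watermark-free post files | Vision / Multimodal | ★ 11.3k | GPL-3.0 | 2026-05-29 |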
| rllm (Trending) | Democratizing Reinforcement Learning for LLMs | Vision / Multimodal | ★ 5.6k | Apache-2.0 | 2026-05-28 |
| ir-sim (Trending) | A Python-based lightweight robot simulator designed for navigation, control, and reinforcement learning | Vision / Multimodal | ★ 1.1k | MIT | 2026-05-26 |
| mcp-server-chart (Trending) | 🤖 A visualization MCP server and skills with 25+ chart types built on @antvis, for chart generation and data analysis. | Vision / Multimodal | ★ 4.1k | MIT | 2026-05-06 |
| inspector (Trending) | Visual testing tool for MCP servers | Vision / Multimodal | ★ 9.9k | MIT | 2026-05-29 |