AgentIndex

Awesome-GUI-Agent

Active · ★ 1.2k · Updated 2025-08-17
★ Trending

💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.

The repository collects papers, projects, and resources on multi-modal Graphical User Interface (GUI) agents. Its goal is a comprehensive overview of the work needed to develop digital assistants that can interact with screens.

#Multimodal AI #GUI Agents #Web Automation #Mobile Automation #Research Resources #Web Browsing #Data Analysis
01

Features

01 Curated collection of papers, projects, and resources.
02 Focus on multi-modal GUI agents.
03 Actively maintained and open for contributions.
04 Categorized sections for datasets, models, surveys, and projects.
05 Includes an associated tool (Awesome-Paper-Agent) for easy contribution submission.
02

Compatibility

Web Interfaces · Web · Verified via docs
Mobile GUIs · Mobile · Verified via docs
Desktop GUIs · Desktop · Verified via docs
03

Use cases

↳ Researchers seeking cutting-edge work and benchmarks in GUI agent development.
↳ Developers looking for relevant projects and resources to build or enhance GUI automation tools.
↳ Contributors who want to share new papers, projects, or insights within the multi-modal GUI agent community.
04

Alternatives

ragflow · ★ 81.5k
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs.

n8n · ★ 190.2k
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.

mindsdb · ★ 39.2k
Federated Query Engine for AI - The only MCP Server you'll ever need.

GitHub MCP Server · ★ 30.3k
GitHub's official MCP Server. Allows AI agents to interact directly with your GitHub repositories (read files, search code, issues).

Brave Search MCP · ★ 86.5k
Allow your AI Agent to search the real-time internet using Brave Search API. Essential for getting up-to-date information.

awesome-n8n-templates · ★ 22.6k
Supercharge your workflow automation with this curated collection of n8n templates! Instantly connect your favorite apps (like Gmail, Telegram, Google Drive, Slack, and more) with ready-to-use, AI-powered automations. Save time, boost productivity, and unlock the true potential of n8n in just a few clicks.

dagster · ★ 15.6k
An orchestration platform for the development, production, and observation of data assets.

genai-toolbox · ★ 15.4k
MCP Toolbox for Databases is an open source MCP server for databases.

Stats
GitHub Stars: ★ 1.2k
Last commit: 9mo ago
Status: Active
License: —
Category: Vision / Multimodal
Trend (30d): +0k ↑ 4.0%