gemini-skill
Active · ★ 822 · Updated 2026-05-29
★ Vision / Multimodal · ★ Browser Automation
A Gemini drawing MCP server and skill that works through the browser. It can be used in OpenClaw or any agent that supports MCP.
Gemini Skill automates interactions with Google Gemini's web interface using the Chrome DevTools Protocol (CDP). It supports AI image generation, multi-turn conversations, image uploading and extraction, session management, and an MCP server for integration with AI clients. The system uses a daemon architecture to manage browser processes efficiently.
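As background on CDP (Chrome DevTools Protocol): a client drives the browser by sending JSON commands over a WebSocket, each carrying an id, a method, and params. The sketch below only builds such messages and is not taken from the gemini-skill source. The helper name makeCdpCommand, the target URL, and the evaluated expression are assumptions for illustration; Page.navigate and Runtime.evaluate are standard CDP methods.

```javascript
// Minimal sketch: build Chrome DevTools Protocol (CDP) command messages.
// makeCdpCommand is a hypothetical helper, not part of gemini-skill.
let nextId = 0;

function makeCdpCommand(method, params = {}) {
  // Each command carries a unique numeric id so its response can be matched to it.
  nextId += 1;
  return JSON.stringify({ id: nextId, method, params });
}

// Assumed target URL and expression, chosen only for illustration.
const navigate = makeCdpCommand("Page.navigate", { url: "https://gemini.google.com/" });
const evaluate = makeCdpCommand("Runtime.evaluate", {
  expression: "document.title",
  returnByValue: true,
});

console.log(navigate);
console.log(evaluate);
```

In a real client these strings would be sent over the browser's debugging WebSocket, and each reply would be matched back to its request by id.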
#automation #drawing #gemini #mcp #mcp-client #mcp-server #mcp-servers #openclaw
01 Features
01 AI image generation with prompt and full-size download
02 Multi-turn text dialogue with Gemini
03 Image upload for reference-based generation
04 Image extraction from conversations (base64 and CDP full-size)
05 Session management (new, temp, model switch, navigate history)
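To illustrate the base64 extraction path in the feature list, here is a minimal sketch of turning a base64 data URL into raw bytes in Node. It assumes extracted images arrive as data URLs, which the listing does not confirm. The helper decodeDataUrl and the sample payload are hypothetical, not gemini-skill's API.

```javascript
// Minimal sketch: decode a base64 data URL into raw bytes.
// decodeDataUrl is a hypothetical helper, not part of gemini-skill.
function decodeDataUrl(dataUrl) {
  // Capture the MIME type and the base64 payload.
  const match = /^data:([^;,]+);base64,(.+)$/.exec(dataUrl);
  if (!match) {
    throw new Error("not a base64 data URL");
  }
  return { mimeType: match[1], bytes: Buffer.from(match[2], "base64") };
}

// "aGVsbG8=" is base64 for the ASCII text "hello", standing in for image data.
const sample = "data:image/png;base64,aGVsbG8=";
const { mimeType, bytes } = decodeDataUrl(sample);
console.log(mimeType, bytes.length);
```

The resulting Buffer could then be written to disk with fs.writeFileSync, using a file extension chosen from the MIME type.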
02 Compatibility
Windows: verified via docs
macOS: verified via docs
Linux: verified via docs
03 Quick start
$ git clone https://github.com/WJZ-P/gemini-skill.git
$ cd gemini-skill
$ npm install
04 Use cases
↳ Automatically generate game-style emojis through AI dialogue
↳ Conduct multi-turn conversations with Gemini for information retrieval
↳ Upload a reference image to generate a new variant using Gemini
05 Alternatives
CopilotKit ★ 31.8k
React UI + elegant infrastructure for AI Copilots, AI chatbots, and in-app AI agents. The Agentic Frontend.
mcp-chrome ★ 11.8k
Chrome MCP Server is a Chrome extension-based Model Context Protocol (MCP) server that exposes your Chrome browser functionality to AI assistants like Claude, enabling complex browser automation, content analysis, and semantic search.
budibase ★ 28.0k
Create business apps and automate workflows in minutes. Supports PostgreSQL, MySQL, MariaDB, MSSQL, MongoDB, REST API, Docker, K8s, and more 🚀 No code / Low code platform.
Comments
- Jesse Chen · May 23, 2026
Works surprisingly well on Node 18+ setups. The browser automation side of it is remarkably stable.
- Oakley Zhang · Apr 9, 2026
Used this to let an LLM agent draw mockups directly in a headless browser while iterating on UI feedback.
- Justice Garcia · Mar 13, 2026
Is there support for rendering SVG outputs directly, or does it always go through the canvas element?