Cradle
Active · ★ 2.5k · Updated 2024-11-07
The Cradle framework is a first attempt at General Computer Control (GCC). It aims to let agents handle any computer task by providing strong reasoning abilities, self-improvement, and skill curation in a standardized general environment with minimal requirements.
Cradle is a framework that enables foundation models to control computers using human-like interfaces, taking screenshots as input and generating keyboard/mouse actions as output. It supports controlling various games and software, allowing AI agents to perform complex tasks across different applications.
#AI Agents · #General Computer Control · #LLM · #Computer Vision · #Human-Computer Interaction · #Image Generation
01
Features
01 Enables foundation models for general computer control.
02 Utilizes a unified human-like interface: screenshots as input, keyboard and mouse as output.
03 Supports diverse applications including popular games (e.g., RDR2, Stardew Valley) and productivity software (e.g., Chrome, Outlook).
04 Provides a modular framework designed for easy adaptation and migration to new game and software environments.
05 Integrates with various Large Language Model (LLM) APIs, including OpenAI, Azure OpenAI, and Claude.
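The screenshot-in, keyboard/mouse-out interface described in the features above can be pictured as a simple observe-decide-act loop. The sketch below is only an illustration under stated assumptions: `Screenshot`, `Action`, `capture_screen`, `choose_action`, and `run_episode` are hypothetical names invented here, not part of Cradle's actual API, and the capture and policy functions are placeholders standing in for real screen grabbing and a foundation-model call.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Screenshot:
    """Hypothetical stand-in for a captured frame (not a Cradle type)."""
    width: int
    height: int
    pixels: bytes


@dataclass
class Action:
    """Hypothetical keyboard or mouse action (not a Cradle type)."""
    kind: str      # "key" or "click"
    payload: dict  # e.g. {"key": "w"} or {"x": 10, "y": 20}


def capture_screen(step: int) -> Screenshot:
    # Placeholder: a real agent would grab the actual display here.
    return Screenshot(width=4, height=4, pixels=bytes([step % 256]) * 16)


def choose_action(shot: Screenshot) -> Action:
    # Placeholder policy: a real agent would send the screenshot to a
    # foundation model and parse its reply into an action.
    if shot.pixels[0] % 2 == 0:
        return Action("key", {"key": "w"})
    return Action("click", {"x": shot.width // 2, "y": shot.height // 2})


def run_episode(
    steps: int,
    capture: Callable[[int], Screenshot] = capture_screen,
    policy: Callable[[Screenshot], Action] = choose_action,
    execute: Optional[Callable[[Action], None]] = None,
) -> List[Action]:
    """Observe, decide, act for a fixed number of steps; return the action log."""
    log: List[Action] = []
    for step in range(steps):
        shot = capture(step)    # input: screenshot
        action = policy(shot)   # decision: model or placeholder policy
        if execute is not None:
            execute(action)     # output: keyboard/mouse event
        log.append(action)
    return log


if __name__ == "__main__":
    for action in run_episode(3):
        print(action)
```

Keeping capture, policy, and execution as swappable callables is one way to mirror the modular design the listing mentions: adapting to a new game or application would, under this sketch, mean supplying different functions rather than changing the loop.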
02
Compatibility
OpenAI · API Supported · Verified via docs
Azure OpenAI · API Supported · Verified via docs
Anthropic Claude · API Supported · Verified via docs
AWS RESTful API for Claude · API Supported · Verified via docs
VS Code · IDE Supported · Verified via docs
PyCharm · IDE Supported · Verified via docs
03
Quick start
$ pip install -r requirements.txt
04
Use cases
↳ Automating gameplay in complex video games.
↳ Performing in-game management and progression tasks.
↳ Controlling and interacting with various desktop applications for productivity or creative tasks.
05
Alternatives
ragflow · ★ 81.5k
RAGFlow is an open-source Retrieval-Augmented Generation (RAG) engine that combines RAG with Agent capabilities to provide a context layer for LLMs.
n8n · ★ 190.2k
Fair-code workflow automation platform with native AI capabilities. Combines visual building with custom code, self-hosted or cloud, with 400+ integrations.
Context7 · ★ 56.4k
MCP Server that provides up-to-date code documentation for LLMs and AI code editors.
GitHub MCP Server · ★ 30.3k
GitHub's official MCP Server. Allows AI agents to interact directly with your GitHub repositories (read files, search code, issues).
Brave Search MCP · ★ 86.5k
Allows AI agents to search the real-time internet using the Brave Search API, which is useful for getting up-to-date information.