AgentIndex icon
AgentIndex
ToolsCategoriesTrendingNewCompare
Submit Tool
Home/
Compare/
on-policy vs mini-swe-agent
on-policy logo
on-policy
★ 2.0k
vs
mini-swe-agent logo
mini-swe-agent
★ 4.7k

on-policy vs mini-swe-agent

on-policy: This repository implements MAPPO, a multi-agent variant of PPO, widely used in cooperative multi-agent games and research. It provides robust implementations for various multi-agent environments like StarCraft II, Hanabi, and Google Research Football, along with detailed training scripts and hyperparameter guidance.; mini-swe-agent: Mini-SWE-agent is a lightweight, 100-line AI agent designed to solve GitHub issues and more, offering a simplified yet performant alternative to larger coding agents. It focuses on minimalism, high performance on benchmarks like SWE-bench, and easy deployment across various environments.

01

TL;DR

on-policy logoChoose on-policy if…

Research and experimentation in cooperative multi-agent reinforcement learning

mini-swe-agent logoChoose mini-swe-agent if…

Researchers for benchmarking, fine-tuning, or RL experiments without bloat

02

Side-by-Side Comparison

Field
on-policy logoon-policy
mini-swe-agent logomini-swe-agent
Category
LLM Infra
LLM Infra
Stars
★ 2.0k
★ 4.7k
License
MIT
—
Updated
1y ago
6d ago
Open Source
Yes
Yes
Website
↗ Visit
↗ Visit
GitHub
↗ GitHub
↗ GitHub
Tags
Multi-Agent Reinforcement Learning, PPO, MAPPO
AI Agent, Python, Software Engineering
03

Features

on-policy logoon-policy
01Implementation of MAPPO (Multi-Agent PPO)
02Support for diverse multi-agent environments (e.g., StarCraft II, Hanabi)
03Ready-to-use training scripts for various scenarios
04Detailed hyperparameter guidance and updated results
05Default support for shared policy among agents
mini-swe-agent logomini-swe-agent
01Minimal code (approx. 100 lines of Python)
02High performance (>74% on SWE-bench verified benchmark)
03Easy deployment and sandboxing (Docker, Podman, Singularity)
04Utilizes only Bash tools, avoiding complex tool-calling interfaces
05Linear history for simplified debugging and fine-tuning
04

Use Cases

on-policy logoon-policy
↳Research and experimentation in cooperative multi-agent reinforcement learning
↳Benchmarking and evaluating PPO's effectiveness in MARL scenarios
↳Training AI agents for popular multi-agent games like StarCraft II and Hanabi
mini-swe-agent logomini-swe-agent
↳Researchers for benchmarking, fine-tuning, or RL experiments without bloat
↳Developers who want to own, understand, and modify their AI tools
↳Engineers needing a trivial-to-sandbox and deployable solution anywhere
05

Best For

on-policy logoon-policy
TrendingReinforcement LearningMulti-Agent AI
mini-swe-agent logomini-swe-agent
TrendingHidden Gem
FAQ

FAQ

What is the difference between on-policy and mini-swe-agent?
Both on-policy and mini-swe-agent are in the LLM Infra category. on-policy has 2.0k stars, while mini-swe-agent has 4.7k stars.
Which is better, on-policy or mini-swe-agent?
The best choice depends on your use case. Choose on-policy if Research and experimentation in cooperative multi-agent reinforcement learning, and mini-swe-agent if Researchers for benchmarking, fine-tuning, or RL experiments without bloat.
Is on-policy free or open source?
Yes, on-policy is open source on GitHub (MIT).
Is mini-swe-agent free or open source?
Yes, mini-swe-agent is open source on GitHub.
→

Related

Alternatives to on-policy →Alternatives to mini-swe-agent →on-policy details →mini-swe-agent details →
© 2026 AgentIndex.app|Built by a 10-year iOS Developer.
QYSGitHubBuy me a coffee ☕

Browse by Category

Code AssistantWorkflow AutomationRAG / Knowledge BaseMulti-AgentBrowser AutomationLLM InfraDev ToolingObservability

Not affiliated with Anthropic, OpenAI or Microsoft.