on-policy: This repository implements MAPPO, a multi-agent variant of PPO, widely used in cooperative multi-agent games and research. It provides implementations for multi-agent environments such as StarCraft II, Hanabi, and Google Research Football, along with detailed training scripts and hyperparameter guidance.
rllm: rLLM is an open-source framework for post-training language agents using reinforcement learning. It lets users build, train, and deploy custom agents and environments for real-world workloads.
Research and experimentation in cooperative multi-agent reinforcement learning.
Training powerful coding models for tasks like code generation and bug fixing.