evidra: Evidra records intent, outcome, and refusal for every infrastructure mutation, enabling risk assessment, behavioral signal detection, and reliability scoring. It provides an MCP server and a CLI for integration with agents, pipelines, and scripts.; Auto-claude-code-research-in-sleep: Auto-claude-code-research-in-sleep (ARIS) is a set of custom Claude Code skills for autonomous ML research workflows. It orchestrates cross-model collaboration: Claude Code executes research tasks while an external LLM (such as GPT-5.4) critically reviews the work. The system can autonomously discover ideas, run experiments, and write and refine research papers, so that researchers can wake up to ready-to-submit results.
Agent benchmarking with real infrastructure scenarios
Explore new research areas and discover novel ideas through literature surveys and brainstorming.