mcp-security-hub: This repository provides production-ready, Dockerized Model Context Protocol (MCP) servers for a wide array of offensive security tools. It enables AI assistants like Claude to interact with over 175 security tools for tasks such as vulnerability scanning, binary analysis, and web security assessments.
AgentBench: AgentBench is a comprehensive benchmark for evaluating Large Language Models (LLMs) as agents across diverse environments, now featuring a function-calling version integrated with AgentRL. It provides a containerized setup for various tasks like OS interaction, database operations, and web shopping, enabling robust and reproducible agent evaluation.
Conduct network reconnaissance to identify active hosts, services, and web technologies.
Systematically benchmark the performance of various LLM-based agents.
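
As a rough illustration of what systematic benchmarking can look like, the sketch below runs several agents over the same task set and reports a success rate per agent and per environment. It does not use AgentBench's actual API: the `Agent` and `Task` types, the `benchmark` function, the exact-match scoring rule, and the toy agents are all hypothetical stand-ins chosen for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical types (not from AgentBench): an agent maps a prompt to an answer,
# and a task is (environment name, prompt, expected answer).
Agent = Callable[[str], str]
Task = Tuple[str, str, str]


def benchmark(agents: Dict[str, Agent], tasks: List[Task]) -> Dict[str, Dict[str, float]]:
    """Return the success rate of each agent in each environment."""
    total: Dict[str, int] = defaultdict(int)
    for env, _, _ in tasks:
        total[env] += 1

    scores: Dict[str, Dict[str, float]] = {}
    for name, agent in agents.items():
        passed: Dict[str, int] = defaultdict(int)
        for env, prompt, expected in tasks:
            # Assumed scoring rule: exact match after trimming whitespace.
            if agent(prompt).strip() == expected:
                passed[env] += 1
        scores[name] = {env: passed[env] / total[env] for env in total}
    return scores


# Toy stand-ins for real LLM-backed agents and environments.
tasks: List[Task] = [
    ("os", "print the word hi", "hi"),
    ("os", "print the word bye", "bye"),
    ("db", "how many rows are in table t?", "3"),
]
agents: Dict[str, Agent] = {
    "always_hi": lambda prompt: "hi",
    "always_3": lambda prompt: " 3\n",
}

scores = benchmark(agents, tasks)
for name, per_env in scores.items():
    print(name, per_env)
```

Running every agent against an identical task list is what keeps the comparison fair. A real harness would replace exact-match scoring with each environment's own success check.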