chinese-llm-benchmark: ReLE Benchmark (formerly CLiB) is a continuously updated evaluation of Chinese large language models, covering more than 337 commercial and open-source LLMs. It offers multi-dimensional capability assessments across a range of domains, along with comprehensive rankings and a large defect library that can guide model improvement; llama-cpp-agent: llama-cpp-agent is a Python framework for interacting with LLMs running via llama.cpp. It provides a unified interface for chat, structured function calls, and JSON-formatted output, including with models that were not explicitly fine-tuned for function calling. Developers can define tools and callable functions that the agent invokes directly, which makes it practical for building local agentic workflows without cloud dependencies.
Comparing and selecting the best-performing LLMs for specific applications.
Building local agentic pipelines with open-source LLMs.
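
As a rough illustration of the structured function-calling idea described for llama-cpp-agent, the sketch below shows how a JSON-formatted model reply could be mapped onto a registered Python function. It does not use llama-cpp-agent's actual API: the `TOOLS` registry, the `tool` decorator, the `dispatch` helper, and the `{"function": ..., "arguments": ...}` reply shape are all assumptions made for this example, and the model output is a hard-coded string rather than a real llama.cpp response.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical registry of callable tools (illustrative only, not the
# llama-cpp-agent API).
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function under its own name so the agent can call it."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def add(a: float, b: float) -> float:
    """Example tool: add two numbers."""
    return a + b


def dispatch(model_output: str) -> Any:
    """Parse a JSON function call and invoke the matching registered tool.

    Assumes the model was constrained to emit an object of the form
    {"function": "<name>", "arguments": {...}}.
    """
    call = json.loads(model_output)
    name = call["function"]
    args = call.get("arguments", {})
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**args)


if __name__ == "__main__":
    # Stand-in for text generated by a local model.
    simulated_reply = '{"function": "add", "arguments": {"a": 2, "b": 3}}'
    print(dispatch(simulated_reply))  # prints 5
```

In a real pipeline the hard-coded reply would come from the local model, typically with its output constrained to valid JSON so that parsing does not fail on models that were never fine-tuned for function calling.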