chinese-llm-benchmark: ReLE Benchmark (formerly CLiB) is a continuously updated evaluation of Chinese large language models, covering more than 337 commercial and open-source LLMs. It offers multi-dimensional capability assessments across various domains, comprehensive rankings, and a large defect library to support model improvement; Pydantic AI: Pydantic AI is a Python agent framework for building production-grade generative AI applications with ergonomics and type safety similar to FastAPI. It is model-agnostic and integrates deeply with the Pydantic ecosystem, with a focus on reliability and developer experience.
Comparing and selecting the best performing LLMs for specific applications.
Building production-grade Generative AI applications and workflows.