chinese-llm-benchmark

Active·★ 6.1k·Updated 2026-05-23

★ Trending · ★ Essential

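ReLE Benchmark: a continuously updated capability evaluation of Chinese AI large language models. It currently includes 335 models, covering commercial models such as chatgpt, gpt-5.2, o4-mini, Google gemini-3-pro, Claude-4.5, Baidu Wenxin ERNIE-X1.1, ERNIE-5.0-Thinking, qwen3-max, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as open-source models such as kimi-k2, ernie4.5, minimax-M2, deepseek-v3.2, qwen3-2507, llama4, Zhipu GLM-4.6, gemma3, and mistral. Beyond leaderboards, it also provides a defect library of more than 2 million LLM failure cases, so the community can study, analyze, and improve large models.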

ReLE Benchmark (formerly CLiB) provides a continuously updated evaluation for Chinese AI large language models, covering over 337 commercial and open-source LLMs. It offers multi-dimensional capability assessments across various domains, along with comprehensive rankings and a large defect library for model improvement.

#LLM Evaluation #Chinese LLMs #AI Benchmark #Model Ranking #Defect Analysis #Data Analysis #Communication

01

Features

01 Extensive coverage of 337+ commercial and open-source Chinese LLMs.

02 Multi-dimensional evaluation across 7 main domains and ~300 sub-dimensions.

03 Provides detailed ranking lists for various capabilities and specific domains.

04 Offers a large defect library with over 2 million LLM flaws for research and improvement.

05 Supports customized model selection and free evaluation services for private models.

02

Compatibility

OpenAI (GPT series) · Supported · Verified via docs

Google (Gemini series) · Supported · Verified via docs

Anthropic (Claude series) · Supported · Verified via docs

Baidu (ERNIE series) · Supported · Verified via docs

Alibaba (Qwen series) · Supported · Verified via docs

DeepSeek · Supported · Verified via docs

03

Use cases

↳ Comparing and selecting the best-performing LLMs for specific applications.

↳ Identifying weaknesses and improving the capabilities of large language models.

↳ Benchmarking private or custom LLMs against public models for performance and cost optimization.

04

Alternatives

mindsdb ★ 39.2k

Federated Query Engine for AI - The only MCP Server you'll ever need

Brave Search MCP ★ 86.5k

Allow your AI Agent to search the real-time internet using Brave Search API. Essential for getting up-to-date information.

Claude Flow ★ 56.4k

The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms.

CopilotKit ★ 31.8k

React UI + elegant infrastructure for AI Copilots, AI chatbots, and in-app AI agents. The Agentic Frontend.

awesome-n8n-templates ★ 22.6k

Supercharge your workflow automation with this curated collection of n8n templates! Instantly connect your favorite apps, like Gmail, Telegram, Google Drive, Slack, and more, with ready-to-use, AI-powered automations. Save time, boost productivity, and unlock the true potential of n8n in just a few clicks.

dagster ★ 15.6k

An orchestration platform for the development, production, and observation of data assets.

genai-toolbox ★ 15.4k

MCP Toolbox for Databases is an open source MCP server for databases.

mcp-chrome ★ 11.8k

Chrome MCP Server is a Chrome extension-based Model Context Protocol (MCP) server that exposes your Chrome browser functionality to AI assistants like Claude, enabling complex browser automation, content analysis, and semantic search.

Comments

Rebel Brown · May 22, 2026
The reliable agent design scales well from prototype to production. Good documentation, reduces onboarding time.

Taylor Zhang · May 3, 2026
The clean approach to agent memory is more reliable than alternatives. Would recommend for clean use cases.

Robin Brown · Mar 29, 2026
The robust agent design scales well from prototype to production. Runs fine on Python 3.11.

Sam Jackson · Mar 14, 2026
The solid approach to agent memory is more reliable than alternatives. The maintainers are responsive to issues.

Stats

GitHub Stars: ★ 6.1k

Last commit: 1w ago

Status: Active

License: —

Category: RAG / Knowledge Base

Trend (30d): +0.2k (↑ 4.6%)

© 2026 AgentIndex.app | Built by a 10-year iOS Developer.

Not affiliated with Anthropic, OpenAI or Microsoft.