FunASR
Active · ★ 16.6k · MIT · Updated 2026-05-29
★ Most Popular · ★ Voice / Speech · ★ LLM Infra
Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.
FunASR is a fundamental end-to-end speech recognition toolkit. It offers industrial-grade speech recognition, described as 170x faster than Whisper, supports more than 50 languages, and integrates speaker diarization, emotion detection, and streaming.
#asr #audio #chinese #emotion-recognition #mcp-server #mcp-servers #multilingual-asr #openai-compatible-api
01
Features
01 Extremely fast (170x faster than Whisper)
02 Supports 50+ languages
03 Built-in speaker diarization
04 Emotion detection
05 Streaming ASR and vLLM acceleration
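
To give the speed figure some scale, here is a small arithmetic sketch. It assumes the 170x figure is read as a multiple of realtime (one second of compute handles 170 seconds of audio), which is how the tagline phrases it. If the figure is instead meant relative to Whisper, the absolute numbers would differ. The function name `processing_seconds` is hypothetical and not part of FunASR.

```python
def processing_seconds(audio_seconds: float, speedup: float = 170.0) -> float:
    """Wall-clock seconds needed to transcribe `audio_seconds` of audio
    when the recognizer runs `speedup` times faster than realtime.

    The default of 170 is the listing's headline figure, taken as an
    assumption about realtime throughput rather than a measured value.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


one_hour = 60 * 60  # seconds of audio
print(f"{processing_seconds(one_hour):.1f} s")  # about 21.2 s for one hour of audio
```

Actual throughput depends on hardware, model choice, and batch size, so treat the result as an upper-bound estimate.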
02
Compatibility
PyTorch: verified via docs
GPU (CUDA): verified via docs
CPU: verified via docs
Docker: verified via docs
03
Quick start
$ pip install funasr
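
After running the install command, a quick way to confirm the package landed in the active environment is to query its installed version. This is a minimal sketch using only the Python standard library. The helper name `installed_version` is hypothetical, and the sketch assumes the distribution is registered under the same name used in the pip command (`funasr`).

```python
from importlib import metadata


def installed_version(distribution: str):
    """Return the installed version string for `distribution`,
    or None if it is not installed in the current environment."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Prints the version if the install succeeded, otherwise a short notice.
print(installed_version("funasr") or "funasr is not installed")
```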
04
Use cases
↳ Meeting transcription with speaker labels, timestamps, and punctuation
↳ Deployment as an OpenAI-compatible API server
↳ Integration with AI agents (e.g., Claude, LangChain, Dify, AutoGen)
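
For the OpenAI-compatible server use case, a client request might be built as sketched below. This uses only the Python standard library and constructs the request without sending it. Several details are assumptions rather than facts from this listing: the base URL `http://localhost:8000`, the route `/v1/audio/transcriptions` (the path OpenAI's own transcription API uses), the form field names `file` and `model`, and the placeholder model name. Check the server's documentation for the real values.

```python
import urllib.request
import uuid


def build_transcription_request(base_url: str, audio_bytes: bytes,
                                filename: str, model: str) -> urllib.request.Request:
    """Build a multipart/form-data POST in the shape of OpenAI's
    audio transcription endpoint. Nothing is sent over the network here."""
    boundary = uuid.uuid4().hex

    # Plain-text form field carrying the model name.
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode("utf-8")

    # File field carrying the raw audio bytes.
    file_part = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8") + audio_bytes + b"\r\n"

    # Closing boundary terminates the multipart body.
    closing = f"--{boundary}--\r\n".encode("utf-8")

    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/transcriptions",  # assumed route
        data=model_part + file_part + closing,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Placeholder audio bytes and a hypothetical model name, for illustration only.
req = build_transcription_request(
    "http://localhost:8000", b"\x00\x01\x02", "meeting.wav", "hypothetical-model"
)
print(req.get_method(), req.full_url)
```

To send the request, pass `req` to `urllib.request.urlopen` once a server is actually running. OpenAI's API returns JSON with a `text` field, and a compatible server would be expected to do the same.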
Comments
- Peyton Davis, May 3, 2026
Language coverage beyond English is a meaningful differentiator.
- Alex Rivera, Apr 3, 2026
Active development from Alibaba's speech research team, keeps improving.
- Sterling Lewis, Mar 26, 2026
170x realtime speech recognition across 50+ languages is genuinely industrial-grade.
- Blake Martinez, Mar 24, 2026
Good for teams building speech-enabled AI applications that need production ASR.