vmlx
vMLX - Cont Batch, Prefix, Paged, KV Cache Quant, VL - Powers MLX Studio. Image gen/edit, OpenAI/Anth
vMLX is a local AI inference engine for Apple Silicon Macs that runs LLMs, VLMs, and image generation models. It provides OpenAI and Anthropic compatible APIs with advanced features like continuous batching, prefix caching, KV cache quantization, speculative decoding, and tool calling. No cloud or API keys required, ensuring data privacy.
Features
Compatibility
Quick start
Use cases
Alternatives
Related searches
Comments
- DDylan GarciaMar 31, 2026
Setup was straightforward, seamless config and running in minutes. Integrates well with existing cont setups.
- AAvery AndersonMar 12, 2026
Used this for reliable automation — reliable under load — image gen/edit, openai/anth. The maintainers are responsive to issues.
- EEmerson GarciaMar 11, 2026
Setup was straightforward, reliable config and running in minutes — image gen/edit, openai/anth. Integrates well with existing batch setups.
- PPeyton DavisFeb 28, 2026
Used this for clean automation — reliable under load. Integrates well with existing vmlx setups.