houtini-lm
Local or Cloud LLM support via MCP for your AI assistant with Houtini-LM - uses OpenAPI API for LM Studio, Cloud API and Ollama Compatibility. Save tokens by offloading some grunt work for your API - our tool description helps claude decide what work to assign and why.
Houtini LM connects Claude Code to a local LLM server or any OpenAI-compatible API, offloading bounded tasks to reduce token costs. It provides tools, performance tracking, and model routing for efficient delegation. Claude remains the orchestrator for complex reasoning, while cheap local models handle grunt work.
Features
Compatibility
Quick start
Use cases
Alternatives
Related searches
Comments
- MMarlowe WilsonMay 3, 2026
Switching between local and cloud LLMs via MCP without changing agent code is very useful
- DDrew GarciaApr 21, 2026
Local inference fallback when cloud APIs are slow or expensive works transparently
- RRiley WhiteMar 28, 2026
Good for keeping costs down on routine tasks while using cloud for complex ones
- DDrew PatelMar 5, 2026
The OpenAPI integration means any local model with a REST endpoint just works