How We Rebuilt Tool Calling: From 75 Schemas to Smart Selection
Anthropic's advanced tool use research showed us the path. We adapted it for every provider — and for local models that can't even call tools natively.
Opinionated technical deep dives into agent architecture, local LLMs, and prompt economics.
Dumping 200 MCP tool schemas into a local model's context is a great way to ensure it never answers your question. Here is how we fixed the "amnesiac expert" problem.
AI doesn't return clean JSON — it streams chaotic fragments of text, tool calls, and errors in real time. Here's what it took to render that smoothly, and why it still breaks.
When you build a local-first AI app, you hit a brutal hardware constraint almost immediately: the context window. Every token is precious. Here's how we're managing it.
Tool calling through native cloud APIs is easy. Getting a local Llama-3 model to execute shell commands requires the dark arts of prompt engineering and a lot of regex.
Everyone wants an infinite context window. The math says no. Here's how we compress history to keep the system prompt from getting lost in the middle.
We love JSON mode because it makes agents reliable. But forcing a model to think in JSON burns 3x more tokens than letting it reason in plain text.
If you aren't using your own API key, you're trusting a middleman with your codebase. Here's how the big three handle your data differently depending on how you access them.