Insights

Lessons from shipping AI in production.

Real-world notes on voice agents, RAG, autonomous systems, and the trade-offs that matter when AI meets production.

Agents·Jun 15, 2026

The Week a Government Cut Off Anthropic's Best Model

Fable 5 was suspended under US export controls on June 12, two legacy models were retired on June 15, and the SDK shipped model fallback for every failure mode. Same week: the risk and the answer. Here's how to harden your stack.

7 min read

Agents·Jun 8, 2026

The Anthropic SDK Middleware: Stop Writing Your Own Tracing Wrappers

The Anthropic SDK shipped a native middleware API, the agent SDK pushed 10 releases in 7 days, and Nuxt 4.4.7 is a security hotfix. Quarterly dependency reviews are now too slow for production AI.

7 min read

Agents·Jun 1, 2026

Opus 4.8 and Dynamic Workflows: Claude Code Just Got an Orchestration Layer

Claude Code v2.1.154 shipped Opus 4.8 with dynamic workflows — background multi-agent orchestration at scale. Here's what actually changed and what it means for teams building agents.

6 min read

Agents·May 25, 2026

The Week Anthropic Claimed the Full Stack

Project Glasswing went to public beta. Stainless — the company behind all Anthropic SDKs — was acquired. Seven agent SDK releases in four days. The platform era is here.

7 min read

Agents·May 18, 2026

Six Releases in Eleven Days: What Google's Pre-I/O Sprint Signals

@google/genai shipped Agent and Environment APIs today — days before Google I/O. The SDK velocity tells you what's coming before the keynote does.

6 min read

Agents·May 16, 2026

Agent Infrastructure Is Hardening — What to Own, What to Delegate

Claude Agent SDK jumped to 0.3.x, Remote Agents went live, Gemini SDK shipped four versions in eight days. The agent infrastructure layer is moving.

6 min read

Lessons·May 10, 2026

Building AI Worldbuilding Pipelines: 4 Novels, 4 Albums, 7 Champions

Most AI agencies build products. We build worlds you can read, listen to, and walk through. Codemachia is our 7-sovereign-AI transmedia universe: four published novels (~297,000 words), four music albums (52 tracks), seven champions, 46 Codex fragments, bilingual EN/FR. Here's the discipline.

14 min read

Platforms·May 9, 2026

Choosing Your LLM in 2026 — Claude, Gemini, Mistral, OpenAI by Use-Case

Don't pick on benchmarks. Pick by use-case. Here is the decision tree we run for every new AI product, with the model we actually ship for each task.

12 min read

Agents·May 9, 2026

Forced Tool Calling — How to Kill the Almost-Right Sentence in Production Chatbots

The failure mode that takes down most production conversational agents isn't hallucination — it's the sentence that sounds confident and is almost right. Here is the architecture that fixes it.

11 min read

Lessons·May 9, 2026

Build an LLM Eval Pipeline in 2 Days, Not 2 Weeks

Most teams ship AI features without evals, so every PR is a coin flip. A small eval set built right takes two days and pays back forever. Here is the minimum viable version.

11 min read

Platforms·May 9, 2026

Multi-Tenancy Patterns for AI SaaS — Org Isolation, Quotas, Billing

Most SaaS products try to be multi-tenant from day one. Most get it wrong. Here are the patterns that actually ship: org isolation, per-org quotas, role-based collaboration, and the migration to schema-per-tenant when you outgrow a shared DB.

12 min read

Platforms·May 9, 2026

Prompt Caching with Claude — What `cache_control: ephemeral` Actually Saves

Anthropic prompt caching can cut your bill by 80–95% on the right shapes. It can also do nothing at all if you mis-order your blocks. The patterns, the pitfalls, and the numbers from production.

10 min read

Voice AI·Apr 22, 2026

ElevenLabs vs Vapi vs Retell — Voice AI Platform Comparison 2026

Side-by-side comparison of the three leading voice AI platforms in 2026 — latency, languages, pricing, integrations, and what we ship in production at Ikki.

9 min read

RAG·Apr 8, 2026

RAG vs Agentic — How to Choose, and How to Ship RAG When You Need It (2026)

Most teams reach for RAG by default. Most don't need it. Here's how to decide between RAG and an agentic + tool-call architecture, and how to ship RAG correctly when it's the right call.

12 min read

Voice AI·Mar 25, 2026

Voice AI Agent Cost: How Much Does It Really Cost in 2026?

Real-world numbers from voice AI projects we shipped: build cost, monthly run cost, hidden expenses, and how to avoid common pricing traps.

8 min read

Lessons·Mar 12, 2026

Lessons from Shipping AI Products

What we learned shipping voice agents, RAG platforms, fintech engines, civic AI, and immersive web — the patterns that worked, the ones that didn't, and the things nobody told us.

11 min read

Platforms·Feb 12, 2026

Why We Chose Nuxt 4 for AI Products in 2026

After shipping AI products to production, here's the architecture we converged on — Nuxt 4 + Fastify + MongoDB — and why it beats Next.js, Astro, and SvelteKit for our use case.

9 min read
