Anvat / Writing

Field notes from the LLM gateway.

Pricing math, Claude Code playbooks, model comparisons, and postmortems — written for developers shipping with frontier models in production.

Jun 6, 2026 · 7 min read · coding-agents / tutorial
How to build a production coding agent in 2026 (architecture + cost guide)
Practical guide to building a coding agent that ships — tool design, model routing, prompt caching, cost control, and the patterns that actually scale from prototype to production.
Read post →
Jun 6, 2026 · 7 min read · claude / pricing
Cheap Claude API in 2026: four legitimate ways to cut the bill (and three you should avoid)
Real strategies for cutting your Anthropic Claude API spend without sacrificing quality — prompt caching, batch API, model routing, and discounted gateways. Plus the dodgy resellers you should walk away from.
Read post →
Jun 6, 2026 · 7 min read · pricing / claude
Claude API pricing in 2026: a complete breakdown (Opus 4.8, Sonnet 4.6, Haiku 4.5)
The full Anthropic Claude API price list for 2026 — every model, input/output rates, prompt caching discounts, batch API savings, and how to cut the bill by ~50% with a discounted gateway.
Read post →
Jun 6, 2026 · 6 min read · claude-code / tutorial
Setting up Claude Code with a custom base URL (ANTHROPIC_BASE_URL guide for 2026)
Step-by-step guide to pointing Claude Code at a custom Anthropic-compatible gateway. Two environment variables, one verification command, common gotchas, and how to enable gateway model discovery.
Read post →
Jun 6, 2026 · 6 min read · claude-code / mcp
Claude Code MCP setup guide: every server worth installing in 2026
Practical setup guide for Claude Code's Model Context Protocol — which MCP servers actually improve productivity, how to configure them safely, and the security model you need to understand.
Read post →
Jun 6, 2026 · 6 min read · claude-code / cursor
Claude Code vs Cursor in 2026: which AI coding tool actually wins (and when to use both)
An honest comparison of Claude Code and Cursor — autonomous terminal agent vs AI-native IDE. Real pricing, real workflows, and why most production developers run both side by side.
Read post →
Jun 6, 2026 · 8 min read · claude / mythos
Claude Mythos explained: Project Glasswing, access status, and what it means for the API market (June 2026)
What Claude Mythos actually is, how Project Glasswing works, the benchmark scores vs Opus 4.6, who has access, the $25/$125 pricing, and what the gated-frontier model strategy means for buyers.
Read post →
Jun 6, 2026 · 7 min read · claude / security
Using Claude for security research in 2026: what works, what doesn't, what's locked behind Glasswing
Practical guide to AI-assisted security research with Claude — which workflows produce real findings, the Opus 4.8 ceiling for public-tier work, and what the gated Mythos-class capability changes.
Read post →
Jun 6, 2026 · 7 min read · claude / skills
Claude Skills: the open standard 32+ AI tools now support (2026 guide)
Anthropic's Agent Skills shipped October 2025, went open standard December 2025, and are now supported by Cursor, Codex CLI, Gemini CLI, VS Code, JetBrains Junie, and 32+ other tools. The practical guide.
Read post →
Jun 6, 2026 · 9 min read · cursor / antigravity
Cursor 3.0 vs Antigravity 2.0 vs Devin Desktop — the 2026 AI IDE war
Three AI IDEs are competing for the same $20/month seat: Cursor 3.0 with parallel agents, Google Antigravity 2.0 with Browser Subagent, and Devin Desktop (formerly Windsurf, acquired by Cognition). Honest comparison.
Read post →
Jun 6, 2026 · 7 min read · deepseek / open-source
DeepSeek V4 Pro deep-dive: 1M context, 80.6% SWE-Bench, and the 1/30th cost claim (2026)
DeepSeek V4 Pro hits 80.6% SWE-Bench Verified (within 0.2 points of Claude Opus 4.6) and 93.5 LiveCodeBench (highest of any model) at $1.74/$3.48 per MTok. The full benchmark breakdown, plus when to actually use it.
Read post →
Jun 6, 2026 · 9 min read · regulation / eu-ai-act
EU AI Act enforcement starts August 2, 2026 — the GPAI builder's guide
The EU AI Office gets enforcement powers over general-purpose AI providers on August 2, 2026. Fines up to 3% of global turnover. What changes, what to ship before then, and what it means for AI buyers.
Read post →
Jun 6, 2026 · 6 min read · gemini / google
Gemini 3.5 Pro launch tracker (June 2026) + Flash benchmark deep-dive
Gemini 3.5 Flash is shipping now and already beats Gemini 3.1 Pro on agentic and coding benchmarks. Pro lands in June. Full benchmark table, regression points, and what to do until Pro arrives.
Read post →
Jun 6, 2026 · 5 min read · gpt-5 / openai
GPT-5.6 release tracker: iris-alpha, the Codex leak, and what's actually known (June 2026)
Everything verifiable about GPT-5.6 — the iris-alpha codename, the Codex log leak, Polymarket's 89% June-30 odds, the 1.5M context rumor, and what builders should plan around.
Read post →
Jun 6, 2026 · 7 min read · gpt-5 / claude-opus
GPT-5 vs Claude Opus 4.8: honest comparison for 2026 production workloads
Side-by-side comparison of GPT-5 and Claude Opus 4.8 — pricing, benchmarks, agentic reliability, context window, multimodal. The decision matrix for picking between OpenAI and Anthropic's flagship models.
Read post →
Jun 6, 2026 · 4 min read · claude / mythos
Claude Mythos benchmark tracker — Opus 4.8 vs Mythos deep-dive (live)
Living comparison of Claude Mythos vs Opus 4.8 on every published benchmark. Currently sparse — Anthropic has released limited public numbers. Updated as data lands.
Read post →
Jun 6, 2026 · 7 min read · gateway / openrouter
OpenRouter alternatives in 2026: an honest comparison (LiteLLM, Portkey, Helicone, Anvat)
A no-bullshit guide to the LLM gateway landscape in 2026. When OpenRouter is the right pick, when it isn't, and which alternative fits your actual workload — with real pricing and the supply-chain footnotes nobody else mentions.
Read post →
Jun 6, 2026 · 8 min read · claude / caching
Prompt caching deep-dive: how to cut Claude input costs by 90% (2026 guide)
Everything you need to know about Anthropic prompt caching — cache_control semantics, 5-minute vs 1-hour TTL, what to cache and what not to, and the cache-hit math that lets a single optimisation cut your bill 60-80%.
Read post →
Jun 6, 2026 · 7 min read · claude / vertex
Vertex AI vs AWS Bedrock vs Anthropic direct: the real differences for Claude in 2026
Side-by-side comparison of accessing Claude through Google Vertex AI, AWS Bedrock, and Anthropic direct API — pricing, latency, region availability, feature parity, and when each makes sense.
Read post →
Jun 6, 2026 · 7 min read · claude-opus / security
How Claude Opus 4.8 found the four-year-old Zcash zero-knowledge proof bug (May 2026)
A security researcher used Claude Opus 4.8 to discover a critical soundness bug in Zcash's Orchard circuit that had been live since 2022. What it found, how the workflow worked, and what it means for AI-assisted security research.
Read post →
