Claude Mythos benchmark tracker — Opus 4.8 vs Mythos deep-dive (live) · Anvat Blog

Claude Mythos is Anthropic's most capable model to date. It is gated behind Project Glasswing and accessible only to ~150 vetted organizations as of June 2026. Public benchmark numbers are sparse and will stay sparse until Anthropic moves Mythos toward general availability. This page is the tracker: what's verified, what's claimed, and where Mythos stands relative to the publicly available Opus 4.8.

Published numbers (verified)

What Anthropic has stated publicly about Mythos in the April 7 launch announcement and the Glasswing expansion materials:

Benchmark	Mythos	Opus 4.8 (public)	Delta
SWE-Bench Verified	93.9%	81.5%	+12.4
CyberGym (vulnerability reproduction)	83.1%	~62%	~+21
Agentic security tasks (internal benchmark)	Anthropic-reported "substantially higher"	baseline	—

That's the entire publicly verifiable set as of June 6, 2026. Everything else is claim-without-numbers, leaked guesses, or extrapolation.

Pricing (verified)

Mythos (partner pricing via Glasswing): $25 input / $125 output per MTok
Opus 4.8 (public): $5 input / $25 output per MTok (Anthropic dropped the Opus tier from $15/$75 to $5/$25 with the 4.5 release — current Opus 4.8 still ships at this rate)
Opus 4.8 via Anvat: $3.50 input / $17.50 output per MTok (30% off the new $5/$25 list)

So Mythos is now 5× the list price of the current public Opus 4.8, and ~7.1× the price of Opus 4.8 through Anvat. The Opus-tier price drop (introduced with the 4.5 release and carried over to 4.8) has tilted this comparison sharply since December 2025.

For a typical agent turn (5K input + 800 output), the per-request comparison is:

Model	Per request	Per million requests
Mythos (list)	$0.225	$225K
Opus 4.8 (Anthropic list)	$0.045	$45K
Opus 4.8 (Anvat)	$0.032	$32K
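The per-request figures above follow directly from the per-MTok prices in the pricing section. Here is a minimal sketch of that arithmetic, assuming the same 5K-input / 800-output turn; the `Pricing` class and `cost_per_request` helper are illustrative names, not part of any SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_mtok: float   # USD per million input tokens
    output_per_mtok: float  # USD per million output tokens


# Prices copied from the pricing section of this page.
PRICES = {
    "Mythos (list)": Pricing(25.00, 125.00),
    "Opus 4.8 (Anthropic list)": Pricing(5.00, 25.00),
    "Opus 4.8 (Anvat)": Pricing(3.50, 17.50),
}


def cost_per_request(p: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the given token counts."""
    total = input_tokens * p.input_per_mtok + output_tokens * p.output_per_mtok
    return total / 1_000_000


if __name__ == "__main__":
    for name, pricing in PRICES.items():
        cost = cost_per_request(pricing, input_tokens=5_000, output_tokens=800)
        print(f"{name}: ${cost:.4f} per request, "
              f"${cost * 1_000_000:,.0f} per million requests")
```

The Anvat row works out to $0.0315 per request ($31.5K per million), which the table rounds up. Swap in your own token counts to see how the gap moves for output-heavy workloads.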

Where Mythos is reported to lead

Per Anthropic's launch materials and partner reports (no benchmark publication yet):

Multi-step agentic security tasks — the headline use case for Glasswing partners.
Long-horizon coding — workflows that need to run 10+ tool calls without losing track of state.
Adversarial reasoning — finding novel attack patterns rather than recognizing known ones.

What's NOT yet shown publicly:

Humanity's Last Exam (Mythos number not published)
ARC-AGI-2 (no number)
MRCR long-context recall (no number)
HMMT math (no number)
Multimodal benchmarks (no number)

Where Opus 4.8 still likely wins

Even without Mythos numbers, the structural factors that favor Opus 4.8 today:

Cost. ~80% cheaper than Mythos at list ($5/$25 vs. $25/$125 per MTok), ~86% cheaper through Anvat.
Availability. Public, no waiting list.
Ecosystem. Every existing Claude Code, Cursor, and MCP integration works against Opus 4.8 today. Mythos integrations are per-partner, with separate access agreements.
Predictability. Opus 4.8 is in active production at thousands of teams; failure modes are documented. Mythos behavior is preview-stage even for Glasswing partners.

What we expect to update

When Anthropic releases Mythos benchmark numbers — likely tied to a GA announcement or the GA-equivalent "Mythos 1" public release — we'll fill in:

Per-benchmark Mythos scores
Public access path + pricing tier(s)
Anvat support timeline (we will add Mythos to the gateway on day 0, as soon as API access is publicly available)

Watch this page for the update.

The honest framing for builders

Until Mythos is publicly available, Opus 4.8 is the right default for the workloads Mythos was designed for. The Zcash bug discovery from May 29, 2026 happened on publicly available Opus 4.8, not Mythos. The capability ceiling on Opus 4.8 is already high enough to surface real security findings in production protocols. Mythos raises the ceiling further, but the practical workflow doesn't change.

If your team needs Mythos-class capability today and you can pass Glasswing vetting, apply via the Claude Console. If you can't or don't want to wait, Opus 4.8 through Anvat at 30% off is the path that ships now.

Run Opus 4.8 at 30% off list while you wait for Mythos

Anvat is Anthropic-compatible. Day-0 Mythos support when the public API drops. Two env vars to switch: same SDK, lower bill.
