Claude Mythos is Anthropic's most capable model to date — gated behind Project Glasswing, accessible only to ~150 vetted organizations as of June 2026. Public benchmark numbers are sparse and will stay sparse until Anthropic moves Mythos toward general availability. This page is the tracker — what's verified, what's claimed, and where it stands relative to publicly-available Opus 4.8.
Published numbers (verified)
What Anthropic has stated publicly about Mythos in the April 7 launch announcement and the Glasswing expansion materials:
| Benchmark | Mythos | Opus 4.8 (public) | Delta |
|---|---|---|---|
| SWE-Bench Verified | 93.9% | 81.5% | +12.4 |
| CyberGym (vulnerability reproduction) | 83.1% | ~62% | ~+21 |
| Agentic security tasks (internal benchmark) | Anthropic-reported "substantially higher" | baseline | — |
That's the entire publicly verifiable set as of June 6, 2026. Everything else is claim-without-numbers, leaked guesses, or extrapolation.
Pricing (verified)
- Mythos (partner pricing via Glasswing): $25 input / $125 output per MTok
- Opus 4.8 (public): $15 input / $75 output per MTok
- Opus 4.8 via Anvat: $10.50 input / $52.50 output per MTok (30% off)
So Mythos is ~67% more expensive at list than Opus 4.8 — and ~138% more expensive than Opus 4.8 through Anvat.
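As a quick check on those percentages, here is a minimal sketch that recomputes them from the per-MTok prices listed above. The names `PRICES` and `premium_pct` are illustrative only and are not part of any SDK.

```python
# Input prices in USD per million tokens, copied from the pricing list above.
# Output prices are 5x the input price in every tier, so the ratios are identical.
PRICES = {"mythos": 25.00, "opus_list": 15.00, "opus_anvat": 10.50}


def premium_pct(price: float, baseline: float) -> float:
    """Percent by which `price` exceeds `baseline`."""
    return (price / baseline - 1.0) * 100.0


print(round(premium_pct(PRICES["mythos"], PRICES["opus_list"])))   # 67
print(round(premium_pct(PRICES["mythos"], PRICES["opus_anvat"])))  # 138
```

Because input and output prices scale by the same factor within each tier, the premium is the same regardless of the input/output token mix.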
For a typical agent turn (5K input + 800 output), the per-request comparison is:
| Model | Per request | Per million requests |
|---|---|---|
| Mythos (list) | $0.225 | $225K |
| Opus 4.8 (Anthropic list) | $0.135 | $135K |
| Opus 4.8 (Anvat) | ~$0.095 | ~$94.5K |
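The per-request figures follow directly from the per-MTok prices. The sketch below shows the arithmetic for the 5K-input, 800-output turn assumed above; the `Pricing` class and `TIERS` mapping are hypothetical helpers written for this page, not part of any Anthropic or Anvat SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD prices per million tokens (MTok)."""
    input_per_mtok: float
    output_per_mtok: float

    def request_cost(self, input_tokens: int, output_tokens: int) -> float:
        """Cost in USD of one request with the given token counts."""
        total = (input_tokens * self.input_per_mtok
                 + output_tokens * self.output_per_mtok)
        return total / 1_000_000


# Prices copied from the pricing list on this page.
TIERS = {
    "Mythos (list)": Pricing(25.00, 125.00),
    "Opus 4.8 (Anthropic list)": Pricing(15.00, 75.00),
    "Opus 4.8 (Anvat)": Pricing(10.50, 52.50),
}

for name, pricing in TIERS.items():
    cost = pricing.request_cost(input_tokens=5_000, output_tokens=800)
    print(f"{name}: ${cost:.4f} per request, ${cost * 1_000_000:,.0f} per million requests")
```

With these inputs the unrounded Anvat figure is $0.0945 per request, or $94,500 per million requests. Swap in your own token counts to model a different agent turn; output-heavy workloads shift the totals more than input-heavy ones because output tokens cost 5x as much in every tier.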
Where Mythos is reported to lead
Per Anthropic's launch materials and partner reports (no benchmark publication yet):
- Multi-step agentic security tasks — the headline use case for Glasswing partners.
- Long-horizon coding — workflows that need to run 10+ tool calls without losing track of state.
- Adversarial reasoning — finding novel attack patterns rather than recognizing known ones.
What's NOT yet shown publicly:
- Humanity's Last Exam (Mythos number not published)
- ARC-AGI-2 (no number)
- MRCR long-context recall (no number)
- HMMT math (no number)
- Multimodal benchmarks (no number)
Where Opus 4.8 still likely wins
Even without further Mythos numbers, several structural factors favor Opus 4.8 today:
- Cost. ~40% cheaper than Mythos at list ($15 vs. $25 input per MTok), ~58% cheaper through Anvat ($10.50 vs. $25).
- Availability. Public, no waiting list.
- Ecosystem. Every existing Claude Code, Cursor, and MCP integration works with Opus 4.8 today. Mythos integrations are per-partner, with separate access agreements.
- Predictability. Opus 4.8 is in active production at thousands of teams; failure modes are documented. Mythos behavior is preview-stage even for Glasswing partners.
What we expect to update
When Anthropic releases Mythos benchmark numbers — likely tied to a GA announcement or the GA-equivalent "Mythos 1" public release — we'll fill in:
- Per-benchmark Mythos scores
- Public access path + pricing tier(s)
- Anvat support timeline (we will add Mythos to the gateway on day 0, as soon as API access is publicly available)
Watch this page for the update.
The honest framing for builders
Until Mythos is publicly available, Opus 4.8 is the right default for the workloads Mythos was designed for. The Zcash bug discovery from May 29, 2026 happened on publicly available Opus 4.8, not Mythos. The capability ceiling on Opus 4.8 is already high enough to surface real vulnerabilities in production protocols. Mythos raises the ceiling further, but the practical workflow doesn't change.
If your team needs Mythos-class capability today and you can pass Glasswing vetting, apply via the Claude Console. If you can't pass vetting, or don't want to wait, Opus 4.8 through Anvat at 30% off is the path that ships now.
Related coverage
- Claude Mythos & Project Glasswing explained
- How Opus 4.8 found the Zcash ZK proof bug
- AI-assisted security research patterns
- Opus 4.8 model spec
- LLM benchmark comparison (all major models)
Run Opus 4.8 at 30% off list while you wait for Mythos
Anvat is Anthropic-compatible. Day-0 Mythos support when public API drops. Two env vars to switch — same SDK, lower bill.
Try free →