Using Claude for security research in 2026: what works, what doesn't, what's locked behind Glasswing

The release of Claude Opus 4.8 in early 2026 and the Mythos Preview gating in April have made AI-assisted security research a real discipline rather than a demo. This guide covers what's actually working — workflows that have produced shipped findings — and the ceiling of what publicly available models can do today.

The state of public-tier capability

What you can do with a normal API key in mid-2026:

Protocol audits (consensus rules, cryptographic protocols, smart contracts): proven on the Zcash Orchard bug found with Opus 4.8 in May 2026.
Code-pattern vulnerability detection at scale: the approach Project Glasswing partners have used on the gated tier to surface 23K+ candidate findings, 6K+ confirmed severe.
Patch generation for confirmed findings: Mythos partners now routinely use the model to write fixes for what it surfaces, and the same workflow runs on the public tier.
Pre-release security review of new feature PRs: same workflow as code review, with security-specific framing.
Penetration-test scenario planning: attack-surface enumeration, prioritisation, exploit sequence sketching.

What's still hard for public-tier models:

End-to-end autonomous offensive ops. The Mythos Preview gate exists because that's where the capability cliff is. Public Opus 4.8 can co-pilot offensive research; Mythos can drive much more of it autonomously.
Live-target red-team operations without supervision. Models hallucinate target state, mis-read responses, take side paths. Production red-team work still needs a human in the loop.
Novel cryptographic primitive analysis from first principles. Public models are excellent at finding implementation bugs against a known spec, less reliable at proving novel primitives.

The workflow patterns that work

Pattern 1: Spec-vs-implementation diff

The single most productive pattern. Load the formal spec + the implementation, ask the model to enumerate every spec constraint and report whether the implementation enforces it precisely.

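import Anthropic from "@anthropic-ai/sdk";

// Reads ANTHROPIC_API_KEY from the environment by default.
const client = new Anthropic();

const review = await client.messages.create({
  model: "claude-opus-4-8",
  max_tokens: 4096, // required by the Messages API; size to the expected report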
  system: [
    {
      type: "text",
      text: AUDIT_SYSTEM_PROMPT,
      cache_control: { type: "ephemeral" },
    },
    {
      type: "text",
      text: protocolSpec, // RFC / paper / written spec
      cache_control: { type: "ephemeral" },
    },
    {
      type: "text",
      text: implementationCode,
      cache_control: { type: "ephemeral" },
    },
  ],
  messages: [{
    role: "user",
    content: `For each invariant the spec requires, walk through how the
implementation enforces it. Flag any check that is loose,
ambiguous, or absent.`,
  }],
});

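This is the pattern that found the Zcash bug. The cached prefix (system prompt + spec + code) makes follow-up questions cheap: while the cache entry is live, calls that reuse the identical prefix are billed at the cache-read rate instead of the full input rate.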
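One way to keep the prefix stable across follow-up questions is to build every request from the same system blocks and vary only the user message. The sketch below is a minimal illustration of that idea: `cachedBlock`, `buildAuditRequests`, the sample strings and the `max_tokens` value are all assumptions made for the example, not part of the SDK or of the original audit.

```typescript
// Hypothetical helper types and names, for illustration only.
type CachedTextBlock = {
  type: "text";
  text: string;
  cache_control: { type: "ephemeral" };
};

type AuditRequest = {
  model: string;
  max_tokens: number;
  system: CachedTextBlock[];
  messages: { role: "user"; content: string }[];
};

function cachedBlock(text: string): CachedTextBlock {
  return { type: "text", text, cache_control: { type: "ephemeral" } };
}

// Builds one request per question. Every request shares the same system
// blocks, so only the final user message differs between calls.
function buildAuditRequests(
  systemPrompt: string,
  spec: string,
  code: string,
  questions: string[],
): AuditRequest[] {
  const system = [cachedBlock(systemPrompt), cachedBlock(spec), cachedBlock(code)];
  return questions.map((question): AuditRequest => ({
    model: "claude-opus-4-8", // model ID as used in this article
    max_tokens: 4096, // assumed output budget
    system,
    messages: [{ role: "user", content: question }],
  }));
}

const requests = buildAuditRequests(
  "You are auditing an implementation against its spec.",
  "SPEC: every transfer amount must be non-negative.",
  "function transfer(amount) { /* ... */ }",
  [
    "Which spec invariants are checked loosely?",
    "Which spec invariants are never checked?",
  ],
);
```

Each element of `requests` can be passed to `client.messages.create`. On the second and later calls, the response's `usage.cache_read_input_tokens` field should show whether the prefix was actually served from cache.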

Pattern 2: Diff-based PR review

For an engineering team, this is the pattern with the best defects-found-per-dollar:

const review = await client.messages.create({
  model: "claude-opus-4-8",
  max_tokens: 4096, // required by the Messages API
  system: SECURITY_REVIEW_PROMPT,
  messages: [{
    role: "user",
    content: `Review this diff with a security lens. Focus on:
- Auth bypass paths
- Injection vectors (SQL, command, path traversal)
- Race conditions in concurrent paths
- Missing input validation
- Cryptographic primitive misuse
 
<diff>
${prDiff}
</diff>`,
  }],
});

Run it as a GitHub Action on every PR. Two months of dogfooding suggest it catches roughly one genuine security-relevant issue per 50 PRs reviewed. At $0.05-0.20 per PR on Opus 4.8, the ROI is large.
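To put a rough number on that ROI, the arithmetic can be sketched as follows. The per-PR cost range and the one-in-50 hit rate are the estimates quoted above; the function and constant names are made up for illustration.

```typescript
// Rough estimates from the dogfooding figures, not measured guarantees.
const PRS_PER_FINDING = 50;

// Expected model spend to surface one real issue, in USD.
function costPerFinding(costPerPrUsd: number, prsPerFinding: number): number {
  return costPerPrUsd * prsPerFinding;
}

const lowEstimateUsd = costPerFinding(0.05, PRS_PER_FINDING);
const highEstimateUsd = costPerFinding(0.2, PRS_PER_FINDING);
```

With these inputs the range works out to about $2.50 to $10 of Opus spend per issue caught, before counting the reviewer time needed to confirm each one.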

Pattern 3: Attack-surface enumeration

For a new service or feature, ask the model to enumerate the attack surface from a threat-model perspective:

const enumeration = await client.messages.create({
  model: "claude-opus-4-8",
  max_tokens: 4096, // required by the Messages API
  messages: [{
    role: "user",
    content: `You are a senior security architect. The system below
is being introduced. Enumerate every plausible attack surface,
ranked by impact × likelihood, and propose a minimum-viable
mitigation for each.
 
<system-design>
${design}
</system-design>`,
  }],
});

Output: a 20-50 item ranked threat model in 30 seconds. Useful as the first draft of a STRIDE analysis, not the final word.
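The model returns prose by default, so the ranking is only as consistent as its wording. If you ask for numeric scores and parse them into objects, you can re-rank locally and check the ordering yourself. A minimal sketch, assuming a hypothetical `Threat` shape with 1 to 5 scores (neither the shape nor the scale comes from the API):

```typescript
// Hypothetical structure for one parsed threat-model entry.
type Threat = {
  surface: string;
  impact: number; // assumed scale: 1 (low) to 5 (high)
  likelihood: number; // assumed scale: 1 (low) to 5 (high)
};

// Returns a new array sorted by impact × likelihood, highest first.
function rankThreats(threats: Threat[]): Threat[] {
  return [...threats].sort(
    (a, b) => b.impact * b.likelihood - a.impact * a.likelihood,
  );
}

const ranked = rankThreats([
  { surface: "admin API without rate limiting", impact: 3, likelihood: 4 },
  { surface: "unauthenticated webhook endpoint", impact: 5, likelihood: 4 },
  { surface: "verbose error pages", impact: 2, likelihood: 3 },
]);
```

Here the webhook endpoint (score 20) sorts ahead of the admin API (12) and the error pages (6). Keeping the scores explicit also makes it easy for a human reviewer to override one and re-sort.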

Pattern 4: Triage queue prioritisation

When you have a large queue of dependency CVEs or static-analysis findings, route the queue through the model for severity assessment in your specific context:

const triage = await client.messages.create({
  model: "claude-sonnet-4-6", // cheap enough for batch
  max_tokens: 2048, // required by the Messages API; size to the batch
  system: TRIAGE_SYSTEM_PROMPT,
  messages: [{
    role: "user",
    content: `For each finding, classify as:
- urgent (live exploit risk in our deployed config)
- important (deferred but real)
- noise (not exploitable in our usage)
- uncertain (cannot be classified from the information given)

<findings>${batch}</findings>`,
  }],
});

Sonnet 4.6 handles this well enough at one-fifth the Opus cost. Escalate anything it marks "uncertain" to Opus 4.8.
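The escalation step is a simple split once the labels are parsed. The sketch below assumes the first-pass output has already been turned into objects with an `id` and a `label`; the type names, the `splitForEscalation` helper and the sample IDs are illustrative, not part of any SDK.

```typescript
// Labels the first-pass model may return, including the escalation label.
type TriageLabel = "urgent" | "important" | "noise" | "uncertain";

// Hypothetical shape of one parsed triage result.
type TriagedFinding = { id: string; label: TriageLabel };

// Separates findings that need a second pass on the stronger model
// from findings whose first-pass label is accepted as-is.
function splitForEscalation(findings: TriagedFinding[]): {
  escalate: TriagedFinding[];
  settled: TriagedFinding[];
} {
  const escalate = findings.filter((f) => f.label === "uncertain");
  const settled = findings.filter((f) => f.label !== "uncertain");
  return { escalate, settled };
}

const { escalate, settled } = splitForEscalation([
  { id: "finding-1", label: "urgent" },
  { id: "finding-2", label: "uncertain" },
  { id: "finding-3", label: "noise" },
]);
```

Only the `escalate` list goes back out as a second request on Opus 4.8, which keeps the expensive model on the small fraction of items the cheaper one could not settle.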

What models to use for what

Task	Recommended	Why
Protocol audit (deep)	Opus 4.8	Needs the strongest reasoning + dense context handling
PR security review	Opus 4.8 (cached)	Catch rate matters more than cost per call
Triage / classification at scale	Sonnet 4.6	Cost dominates, quality is sufficient
Threat-model first draft	Opus 4.8	Quality of reasoning shows
Bulk static-analysis filter	Haiku 4.5	Cheapest tier, sufficient for noise filtering
Exploit construction	Opus 4.8	Requires multi-step code reasoning
Novel cryptography analysis	Hold for Mythos	Public tier insufficient — wait for GA
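If several of these workflows run from one codebase, the table can live in code as a typed lookup so that a model change is a one-line edit. This is a sketch only: the task keys are invented for the example, and the Haiku model ID is an assumption that follows the naming pattern of the other two IDs used in this article.

```typescript
// Hypothetical task keys mirroring the rows of the table above.
type SecurityTask =
  | "protocol-audit"
  | "pr-review"
  | "triage"
  | "threat-model"
  | "static-analysis-filter"
  | "exploit-construction"
  | "novel-crypto-analysis";

// null means "no public-tier model recommended yet".
const MODEL_FOR_TASK: Record<SecurityTask, string | null> = {
  "protocol-audit": "claude-opus-4-8",
  "pr-review": "claude-opus-4-8",
  "triage": "claude-sonnet-4-6",
  "threat-model": "claude-opus-4-8",
  "static-analysis-filter": "claude-haiku-4-5", // assumed model ID
  "exploit-construction": "claude-opus-4-8",
  "novel-crypto-analysis": null,
};

const triageModel = MODEL_FOR_TASK["triage"];
```

Using `Record<SecurityTask, ...>` means the compiler flags any task that is added to the union without a model assigned.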

What Glasswing changes (and doesn't)

The Mythos Preview + Project Glasswing program ships defensive AI to ~150 vetted organizations. For everyone else — the 99.9% of security teams not in Glasswing — what's accessible publicly is Opus 4.8. The capability gap is real (25% better on CyberGym for Mythos), but Opus 4.8 is already enough to find live four-year-old bugs in production cryptography.

Three things Glasswing doesn't change for the average team:

You can do AI-assisted security research today. Opus 4.8 is public. The Zcash bug is the proof.
The methodology transfers across model generations. Spec-vs-implementation diff, diff-based PR review, and attack-surface enumeration will all work on whatever Mythos-class GA model arrives next.
The economics already justify it. Anvat customers running these workflows are paying ~$200-600/month in Opus tokens to drive ~$50K-5M of avoided incident cost.

What's coming

Three trends to watch over the next 6 months:

Mythos-class general availability. Anthropic has committed to shipping it "once safeguards are ready." Probable: late June or July 2026, initially gated to security-specific products.
Sector-specific defensive coalitions. Glasswing's pattern will replicate in finance, energy, telecoms. Expect 3-5 sector-specific defensive AI programs by year-end.
AI-driven offensive parity. Independent researchers are at the capability level that was nation-state-only in 2023. Plan defense assuming the same uplift exists for offense.

Where to start

If you've never run an AI-assisted security workflow:

Get an API key. Anvat issues one in 30 seconds, $2 free credit, no card required. Same wire format as direct Anthropic at 30% off list.
Pick one codebase you own and the spec it implements. Even internal docs count as a spec.
Run the spec-vs-implementation diff pattern above. Budget 30 minutes and $5 of tokens for the first audit.
Validate findings manually. The model will surface real bugs and also fluent-sounding non-bugs. Triage before disclosing.

The Zcash bug was found by one researcher with one API key. The research methodology is publicly documented. The capability is sitting in your account waiting to be used.
