DeepSeek V4 Flash (model ID: deepseek-v4-flash)
Ultra-cheap open-weight tier — $0.14/MTok input, 1M context, 13B active params
DeepSeek V4 Flash is the smaller sibling of V4-Pro: 284B total parameters with 13B active per token, the same 1M-token context, and the same MIT license. At $0.14 input / $0.28 output per MTok, it is the cheapest open-weight model in the frontier tier, and self-hosters can run it on a single 8× H100 node. It is the right pick when V4-Pro quality is overkill or when you need very high request throughput at the lowest possible per-request cost.
Pricing
| Rate | List price | Anvat effective | Savings |
|---|---|---|---|
| Input | $0.14 | $0.10 | 30% |
| Output | $0.28 | $0.20 | 30% |
All prices are per million tokens (MTok). List = provider direct. Anvat effective = 30% discount applied.
Pricing verified 2026-06 · See full Anvat pricing
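To make the per-MTok rates concrete, here is a minimal sketch of the arithmetic for a single request. The rates are copied from the table above; the token counts (1,000 input, 500 output) are an assumed example of a short chat turn, not a figure from the provider. Note that the table's effective rates look rounded, since 30% off $0.14 is $0.098.

```typescript
// Per-MTok rates in USD, copied from the pricing table above.
interface Rates {
  inputPerMTok: number;
  outputPerMTok: number;
}

const LIST_RATES: Rates = { inputPerMTok: 0.14, outputPerMTok: 0.28 };
const ANVAT_RATES: Rates = { inputPerMTok: 0.10, outputPerMTok: 0.20 };

// Cost in USD of one request with the given token counts.
function requestCostUsd(inputTokens: number, outputTokens: number, rates: Rates): number {
  const perMTokTotal = inputTokens * rates.inputPerMTok + outputTokens * rates.outputPerMTok;
  return perMTokTotal / 1e6; // rates are quoted per million tokens
}

// Assumed example turn: 1,000 prompt tokens and 500 completion tokens.
const listCost = requestCostUsd(1000, 500, LIST_RATES);
const anvatCost = requestCostUsd(1000, 500, ANVAT_RATES);

console.log(`list: $${listCost.toFixed(5)}  Anvat: $${anvatCost.toFixed(5)}`);
```

Under these assumptions the list-price cost comes to roughly $0.00028, in line with the "~$0.0003 typical chat turn" figure quoted in the benchmarks. Multiply by your expected request volume to estimate monthly spend.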
Strengths
- Cheapest frontier-tier model — $0.14/MTok input
- 1M-token context standard, same as V4-Pro
- Runs on a single 8× H100 node for self-hosters
- Same MIT license + open-weight provenance as V4-Pro
- Strong on routine coding tasks (LiveCodeBench in the 80s)
- Same wire format / SDK story as V4-Pro — easy to swap up/down
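The single-node claim can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is an estimate under stated assumptions, not a deployment guide: it assumes 80 GB of memory per H100 and weights stored at either 1 byte (FP8) or 2 bytes (BF16) per parameter, neither of which the source specifies. It also ignores KV cache and activation memory, which matter a great deal at 1M-token context.

```typescript
// From the model description: 284B total parameters. All of them must be
// resident in memory even though only 13B are active per token.
const TOTAL_PARAMS = 284e9;

// Assumptions, not stated in the source.
const GPUS_PER_NODE = 8;
const MEMORY_PER_GPU_GB = 80; // assumed 80 GB H100 variant

// Weight footprint in GB (1 GB = 1e9 bytes) for a given storage precision.
function weightFootprintGb(params: number, bytesPerParam: number): number {
  return (params * bytesPerParam) / 1e9;
}

const nodeCapacityGb = GPUS_PER_NODE * MEMORY_PER_GPU_GB;
const fp8WeightsGb = weightFootprintGb(TOTAL_PARAMS, 1);  // assumed FP8 storage
const bf16WeightsGb = weightFootprintGb(TOTAL_PARAMS, 2); // assumed BF16 storage

console.log(`node capacity: ${nodeCapacityGb} GB`);
console.log(`FP8 weights:   ${fp8WeightsGb} GB, headroom ${nodeCapacityGb - fp8WeightsGb} GB`);
console.log(`BF16 weights:  ${bf16WeightsGb} GB, headroom ${nodeCapacityGb - bf16WeightsGb} GB`);
```

Under these assumptions the weights fit on one node at either precision, but the headroom left for long-context KV cache is much tighter at BF16 than at FP8.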
Where it underperforms
- Trails V4-Pro materially on hardest reasoning + agentic tasks
- Not competitive with frontier-tier closed models on highest-difficulty workloads
- Text-only (same as V4-Pro)
Use cases this model is the right pick for
- Very high-volume classification, extraction, summarisation
- First-pass router target before escalating to V4-Pro or Opus 4.7
- Batch processing at scale where per-request cost dominates
- Self-hosted deployments with single-node H100 budget
- Translation + multilingual content generation at volume
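The first-pass router pattern from the list above can be sketched as follows. This is an illustrative sketch rather than a prescribed implementation: the label set, the escalation rule (escalate when the reply is not a recognised label), and the escalation model ID `deepseek-v4-pro` are all hypothetical, and the `Complete` function is a stand-in you would implement as a thin wrapper around your SDK's chat completion call.

```typescript
// Injected by the caller: sends a prompt to a model and returns the reply text.
type Complete = (model: string, prompt: string) => Promise<string>;

const FLASH_MODEL = "deepseek-v4-flash";
const ESCALATION_MODEL = "deepseek-v4-pro"; // hypothetical ID; check the catalog

// Hypothetical label set for a support-ticket classifier.
const ALLOWED_LABELS = ["billing", "technical", "account", "other"];

interface RoutedResult {
  label: string;
  model: string;      // the model that produced the accepted label
  escalated: boolean; // true if the first pass was rejected
}

// Returns the normalised label, or null if the reply is not a known label.
function parseLabel(raw: string): string | null {
  const label = raw.trim().toLowerCase();
  return ALLOWED_LABELS.includes(label) ? label : null;
}

async function classifyWithEscalation(ticket: string, complete: Complete): Promise<RoutedResult> {
  const prompt =
    `Classify this support ticket as one of: ${ALLOWED_LABELS.join(", ")}. ` +
    `Reply with the label only.\n\nTicket: ${ticket}`;

  // First pass on the cheap model.
  const first = parseLabel(await complete(FLASH_MODEL, prompt));
  if (first !== null) {
    return { label: first, model: FLASH_MODEL, escalated: false };
  }

  // The cheap model gave an unusable answer: retry once on the larger model.
  const second = parseLabel(await complete(ESCALATION_MODEL, prompt));
  return { label: second ?? "other", model: ESCALATION_MODEL, escalated: true };
}
```

Because the completion function is injected, the routing logic can be unit-tested with a fake and reused with any OpenAI-compatible client. In practice the escalation rule is the main design choice: a stricter check escalates more often and raises cost, a looser one keeps more traffic on the cheaper model.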
Benchmarks
| Category | Benchmark | Result |
|---|---|---|
| Coding | LiveCodeBench Pass@1 | ~82 (competitive with last-gen frontier) |
| Agentic | Cost per request | ~$0.0003 per typical chat turn |
| Reasoning | MRCR 1M recall | 78.7 |
Benchmark numbers self-reported by provider; verify against the latest publisher documentation before quoting.
Quickstart
Same wire format as direct provider APIs — your existing SDK code keeps working. Point at api.anvat.app/v1 and use your Anvat key.
```typescript
import OpenAI from "openai";

// Point the OpenAI SDK at Anvat's endpoint and authenticate with your Anvat key.
const client = new OpenAI({
  baseURL: "https://api.anvat.app/v1",
  apiKey: process.env.ANVAT_API_KEY,
});

const response = await client.chat.completions.create({
  model: "deepseek-v4-flash",
  max_tokens: 500,
  messages: [
    { role: "user", content: "Classify this support ticket: 'My subscription renewed twice this month.'" },
  ],
});

// Print the model's reply.
console.log(response.choices[0].message.content);
```
Try DeepSeek V4 Flash: 30% off list
Same model, same quality, same wire format — at the discounted Anvat effective rate. $2 free credit on signup, no card required.