DeepSeek · 1M context · deepseek-v4-flash

DeepSeek V4 Flash

Ultra-cheap open-weight tier — $0.14/MTok input, 1M context, 13B active params

DeepSeek V4 Flash is the smaller sibling of V4 Pro — 284B total parameters with 13B active per token, also 1M-context, also MIT-licensed. At $0.14 input / $0.28 output per MTok, it's the cheapest open-weight model in the frontier tier and runs on a single 8× H100 node for self-hosters. The right pick when V4-Pro quality is overkill or when you need very high request throughput at the lowest possible per-request cost.

Pricing

Rate	List price	Anvat effective	Savings
Input	$0.14	$0.10	30%
Output	$0.28	$0.20	30%
All prices are per million tokens (MTok). List = provider direct. Anvat effective = list price less the 30% discount, rounded to the nearest cent.

Pricing verified 2026-06 · See full Anvat pricing
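
To make the per-MTok rates concrete, here is a minimal sketch of the per-request arithmetic. The rates are copied from the table above; the `estimateCostUSD` helper and the 1,000-input / 500-output token turn are illustrative assumptions, not measured traffic.

```typescript
// Per-million-token (MTok) rates in USD, copied from the pricing table.
interface Rates {
  inputPerMTok: number;
  outputPerMTok: number;
}

const LIST_RATES: Rates = { inputPerMTok: 0.14, outputPerMTok: 0.28 };
const ANVAT_RATES: Rates = { inputPerMTok: 0.10, outputPerMTok: 0.20 };

// Cost of one request: (tokens * rate per MTok) / 1,000,000, summed over input and output.
function estimateCostUSD(inputTokens: number, outputTokens: number, rates: Rates): number {
  return (inputTokens * rates.inputPerMTok + outputTokens * rates.outputPerMTok) / 1_000_000;
}

// Hypothetical chat turn: 1,000 input tokens and 500 output tokens.
const listCost = estimateCostUSD(1_000, 500, LIST_RATES);   // about $0.00028
const anvatCost = estimateCostUSD(1_000, 500, ANVAT_RATES); // about $0.00020
console.log(listCost.toFixed(5), anvatCost.toFixed(5));
```

Under that assumed turn size, the list-price result lands in the same range as the ~$0.0003 per chat turn quoted under Benchmarks. Real costs depend on your actual prompt and completion lengths.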

Strengths

Cheapest frontier-tier model — $0.14/MTok input
1M-token context standard, same as V4-Pro
Runs on a single 8× H100 node for self-hosters
Same MIT license + open-weight provenance as V4-Pro
Strong on routine coding tasks (LiveCodeBench in the 80s)
Same wire format / SDK story as V4-Pro — easy to swap up/down

Where it underperforms

Trails V4-Pro materially on the hardest reasoning and agentic tasks
Not competitive with frontier-tier closed models on the highest-difficulty workloads
Text-only (same as V4-Pro)

Use cases this model is the right pick for

Very high-volume classification, extraction, summarisation
First-pass router target before escalating to V4-Pro or Opus 4.7
Batch processing at scale where per-request cost dominates
Self-hosted deployments with single-node H100 budget
Translation + multilingual content generation at volume
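
The first-pass router pattern from the list above could look roughly like the sketch below. Everything here is illustrative: the `Complete` function type, the `UNSURE` sentinel, and the `deepseek-v4-pro` model id are assumptions (this page only gives the id `deepseek-v4-flash`), so substitute your own escalation signal and target model.

```typescript
// A completion function: takes a model id and a prompt, resolves to the reply text.
// It is injected so the routing logic can be exercised without network access.
type Complete = (model: string, prompt: string) => Promise<string>;

const FLASH_MODEL = "deepseek-v4-flash";
// Assumption: the larger model's id is not stated on this page; replace with the real one.
const ESCALATION_MODEL = "deepseek-v4-pro";

// Hypothetical convention: the prompt instructs the model to answer "UNSURE"
// when it cannot decide, and that reply triggers escalation.
function needsEscalation(reply: string): boolean {
  return reply.trim().toUpperCase().startsWith("UNSURE");
}

// Try the cheap model first; call the larger model only when the first reply asks for it.
async function routeWithEscalation(
  complete: Complete,
  prompt: string,
): Promise<{ model: string; reply: string }> {
  const first = await complete(FLASH_MODEL, prompt);
  if (!needsEscalation(first)) {
    return { model: FLASH_MODEL, reply: first };
  }
  const second = await complete(ESCALATION_MODEL, prompt);
  return { model: ESCALATION_MODEL, reply: second };
}
```

In practice `complete` would wrap a `client.chat.completions.create` call and return the first choice's message content. A sentinel string is the simplest escalation signal; a validation step on the structured output is a common alternative.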

Benchmarks

Category	Benchmark	Result
coding	LiveCodeBench Pass@1	~82 (in the 80s, competitive with last-gen frontier)
agentic	Cost per request	~$0.0003 typical chat turn
reasoning	MRCR 1M recall	78.7

Benchmark numbers self-reported by provider; verify against the latest publisher documentation before quoting.

Quickstart

Same wire format as direct provider APIs — your existing SDK code keeps working. Point at api.anvat.app/v1 and use your Anvat key.

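import OpenAI from "openai";

// The OpenAI SDK works unchanged; only the base URL and API key differ.
const client = new OpenAI({
  baseURL: "https://api.anvat.app/v1",
  apiKey: process.env.ANVAT_API_KEY,
});

const response = await client.chat.completions.create({
  model: "deepseek-v4-flash",
  max_tokens: 500, // upper bound on output tokens for this request
  messages: [
    { role: "user", content: "Classify this support ticket: 'My subscription renewed twice this month.'" },
  ],
});

// Print the model's reply text.
console.log(response.choices[0].message.content);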
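
For the high-volume classification and batch use cases, requests are usually sent with a cap on how many are in flight at once. The helper below is a minimal sketch of that idea; `mapWithConcurrency` is a hypothetical name, and the right limit depends on rate limits that this page does not state.

```typescript
// Run `worker` over every item with at most `limit` calls in flight.
// Results are returned in the same order as the input items.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item, shared by all runners

  // Each runner repeatedly claims the next index and processes it until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  // Start between 1 and `limit` runners, never more than there are items.
  const width = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: width }, () => runner()));
  return results;
}
```

A call such as `mapWithConcurrency(tickets, 8, (t) => classify(t))` would then process a ticket array eight at a time, where `classify` is your own wrapper around the chat completion request above. This sketch rejects as soon as any single request fails, so add retries inside the worker if partial failures are expected.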

Try DeepSeek V4 Flash — 30% off list

Same model, same quality, same wire format — at the discounted Anvat effective rate. $2 free credit on signup, no card required.
