DeepSeek · 1M context · deepseek-v4-flash

DeepSeek V4 Flash

Ultra-cheap open-weight tier — $0.14/MTok input, 1M context, 13B active params

DeepSeek V4 Flash is the smaller sibling of V4 Pro — 284B total parameters with 13B active per token, also 1M-context, also MIT-licensed. At $0.14 input / $0.28 output per MTok, it's the cheapest open-weight model in the frontier tier and runs on a single 8× H100 node for self-hosters. The right pick when V4-Pro quality is overkill or when you need very high request throughput at the lowest possible per-request cost.

Pricing

Rate     List price   Anvat effective   Savings
Input    $0.14        $0.10             30%
Output   $0.28        $0.20             30%
All prices per million tokens (MTok). List = provider direct. Anvat effective = 30% discount applied.
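As a rough illustration of how the per-MTok rates in the table translate into a bill, the sketch below prices a made-up workload at both rates. The workload size (one million requests of 1,000 input and 500 output tokens each) and the names `Rate` and `costUSD` are assumptions for the example, not part of any Anvat or DeepSeek API.

```typescript
// Per-million-token prices in USD, copied from the pricing table.
interface Rate { input: number; output: number }

const LIST: Rate = { input: 0.14, output: 0.28 };
const ANVAT: Rate = { input: 0.10, output: 0.20 };

// Cost in USD for a given number of input and output tokens.
function costUSD(rate: Rate, inputTokens: number, outputTokens: number): number {
  return (inputTokens * rate.input + outputTokens * rate.output) / 1_000_000;
}

// Hypothetical workload: 1M requests, each 1,000 input + 500 output tokens.
const requests = 1_000_000;
const listTotal = costUSD(LIST, 1_000 * requests, 500 * requests);
const anvatTotal = costUSD(ANVAT, 1_000 * requests, 500 * requests);

console.log(`list:  $${listTotal.toFixed(2)}`);  // list:  $280.00
console.log(`anvat: $${anvatTotal.toFixed(2)}`); // anvat: $200.00
```

A single request of that size works out to about $0.00028 at list price. Real token counts vary widely by workload, so treat these figures as an order-of-magnitude guide.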

Pricing verified 2026-06.

Strengths

  • Cheapest frontier-tier model — $0.14/MTok input
  • 1M-token context standard, same as V4-Pro
  • Runs on a single 8× H100 node for self-hosters
  • Same MIT license + open-weight provenance as V4-Pro
  • Strong on routine coding tasks (LiveCodeBench in the 80s)
  • Same wire format / SDK story as V4-Pro — easy to swap up/down
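The single-node claim can be sanity-checked with back-of-envelope arithmetic. The sketch below assumes the 80 GB H100 variant and compares two common weight precisions; the page does not say which precision the released weights use. It counts weights only, and it counts all 284B parameters rather than the 13B active ones, since a mixture-of-experts model typically needs every expert resident in GPU memory.

```typescript
// Back-of-envelope weight memory for a 284B-parameter model on one node.
const TOTAL_PARAMS = 284e9;        // total parameters, from the model description
const GPUS = 8;
const GB_PER_GPU = 80;             // assumption: 80 GB H100 variant
const nodeGB = GPUS * GB_PER_GPU;  // 640 GB of GPU memory in total

// Memory needed for the weights alone at a given precision.
function weightGB(bytesPerParam: number): number {
  return (TOTAL_PARAMS * bytesPerParam) / 1e9;
}

// Assumed precisions for illustration only.
for (const [name, bytes] of [["BF16", 2], ["FP8", 1]] as const) {
  const gb = weightGB(bytes);
  console.log(`${name}: ${gb} GB weights, ${nodeGB - gb} GB left for KV cache and activations`);
}
// BF16: 568 GB weights, 72 GB left for KV cache and activations
// FP8: 284 GB weights, 356 GB left for KV cache and activations
```

Under these assumptions the weights fit at either precision, but the BF16 case leaves little headroom for long-context KV cache, so an 8-bit or smaller format is the more plausible fit for 1M-token serving.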

Where it underperforms

  • Trails V4-Pro materially on hardest reasoning + agentic tasks
  • Not competitive with frontier-tier closed models on highest-difficulty workloads
  • Text-only (same as V4-Pro)

Use cases this model is the right pick for

  • Very high-volume classification, extraction, summarisation
  • First-pass router target before escalating to V4-Pro or Opus 4.7
  • Batch processing at scale where per-request cost dominates
  • Self-hosted deployments with single-node H100 budget
  • Translation + multilingual content generation at volume
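The first-pass router pattern from the list above could look roughly like the following sketch. Everything here is illustrative: the `Complete` type, the `route` function, the `ESCALATE` sentinel convention, and the `deepseek-v4-pro` model ID are assumptions for the example, so check the provider's model list for the real identifier. The completion call is injected so the routing logic stays independent of any particular SDK.

```typescript
// A completion function: takes a model ID and a prompt, returns the text reply.
type Complete = (model: string, prompt: string) => Promise<string>;

const CHEAP_MODEL = "deepseek-v4-flash";
const STRONG_MODEL = "deepseek-v4-pro"; // hypothetical ID, verify before use

// Hypothetical convention: the cheap model replies with this token when unsure.
const ESCALATE = "ESCALATE";

async function route(
  complete: Complete,
  task: string,
): Promise<{ model: string; answer: string }> {
  // First pass: ask the cheap model, with an explicit way to decline.
  const prompt = `${task}\n\nIf you are not confident in your answer, reply with exactly ${ESCALATE}.`;
  const first = (await complete(CHEAP_MODEL, prompt)).trim();
  if (first !== ESCALATE) {
    return { model: CHEAP_MODEL, answer: first };
  }
  // Second pass: send the original task to the stronger model.
  const second = await complete(STRONG_MODEL, task);
  return { model: STRONG_MODEL, answer: second.trim() };
}
```

In practice `complete` would wrap a chat-completions call against the same base URL for both models. Self-reported confidence is a weak signal, so a production router would more likely escalate on a validator, a schema check, or a task-type classifier.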

Benchmarks

  • Coding · LiveCodeBench Pass@1: ~82 (in the 80s; competitive with last-gen frontier)
  • Agentic · Cost per request: ~$0.0003 for a typical chat turn
  • Reasoning · MRCR 1M recall: 78.7

Benchmark numbers self-reported by provider; verify against the latest publisher documentation before quoting.

Quickstart

Same wire format as direct provider APIs — your existing SDK code keeps working. Point at api.anvat.app/v1 and use your Anvat key.

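// Run as an ES module (top-level await); requires the `openai` npm package.
import OpenAI from "openai";

// Point the standard OpenAI client at the Anvat endpoint.
const client = new OpenAI({
  baseURL: "https://api.anvat.app/v1",
  apiKey: process.env.ANVAT_API_KEY, // set this in your environment
});

const response = await client.chat.completions.create({
  model: "deepseek-v4-flash",
  max_tokens: 500, // cap on generated tokens
  messages: [
    { role: "user", content: "Classify this support ticket: 'My subscription renewed twice this month.'" },
  ],
});

// Print the model's reply text.
console.log(response.choices[0].message.content);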

Try DeepSeek V4 Flash — 30% off list

Same model, same quality, same wire format — at the discounted Anvat effective rate. $2 free credit on signup, no card required.
