How do I see Claude Code costs in real time?

Install a Stop hook that logs every session's token spend to .claude/state/costs.jsonl. Then tail -f the file while you work. The hook reads transcript stats and writes a JSON line with timestamp, model, input/output tokens, and estimated dollar cost. Open-source reference: cost-tracker.sh at https://github.com/ianymu/claude-verify-before-stop's parent pack.

What is the biggest cause of Claude Code overspending?

Three patterns drive most overspend: (1) re-reading the same files repeatedly because context was compacted and the model can't remember it read them, (2) unnecessary deep exploration when a focused grep would suffice, (3) running long sessions on Opus when Sonnet or Haiku would do for routine edits. The cost-tracker hook surfaces all three so you can see them and intervene.

How much can a Stop hook actually save on Claude Code?

In our production usage across 14 parallel projects over 6 months, surfacing real-time cost via cost-tracker.sh combined with verify-before-stop (which catches lies-of-completion and reduces retry sessions) cut total Opus spend by approximately 40%. The bulk of the savings came from catching runaway sessions early — when costs.jsonl showed a session crossing $20 with no visible progress, we'd intervene rather than let it continue for another hour.

Should I use Claude Sonnet instead of Opus for cost reasons?

Yes for routine tasks. Anthropic's Sonnet 4.6 is approximately 5x cheaper than Opus 4.7 at $3/$15 per million tokens vs Opus's $15/$75. The right pattern is to use Opus for the planning phase (high-judgement work) and switch to Sonnet for the implementation phase (mechanical edits). Claude Code supports model switching mid-session via /model command.

Published 2026-05-20 · 7 min read · By Ian Mu

Why does Claude Code burn through Opus credits so fast? — and how to cut 40% with one hook

Q: Why is Claude Code so expensive?

Claude Code uses Claude Opus models which are priced at $15/million input tokens and $75/million output tokens. A typical agentic coding session can read tens of thousands of tokens (file context) and generate thousands of tokens (code + tool calls + analysis). Over a multi-hour session, this compounds quickly. A single 6-hour session can easily consume $20-40 of Opus tokens without obvious feedback.

My Claude Code spend was averaging $30-100/day on Opus across 14 parallel projects. Most of it was invisible until the invoice arrived. Then I built a cost-tracker.sh Stop hook that surfaces token spend in real time to .claude/state/costs.jsonl. Combined with workflow patterns it caught, my Opus spend dropped roughly 40% in the first month. This article walks through why Claude Code is expensive, the three patterns that drive most overspend, and the exact hook that surfaces them.

TL;DR

Add a Stop hook that writes {session_id, model, input_tokens, output_tokens, est_cost_usd} to .claude/state/costs.jsonl on every turn. Then tail -f .claude/state/costs.jsonl | jq . in a side terminal while you work. You'll see exactly when a session crosses your budget threshold and you can intervene before it crosses again.

Why is Claude Code so expensive?

Three reasons compound:

1. Opus pricing. Claude Opus 4.7 is priced at $15/million input tokens and $75/million output tokens. A 30,000-token system prompt + project context + transcript read once costs $0.45 in input alone. Multiply by the dozens of turns in a session.

2. Agentic loops re-read everything. Unlike a single-shot chat completion, an agentic coding session reads files, calls tools, gets results, reads more files, decides what to do next. Each tool round-trip can grow context by 5-30% (the assistant message + tool result get appended to the next turn's input).

3. Compaction loses context, model re-reads. When the conversation grows beyond model context limits, Claude Code summarizes (compacts) and starts fresh. But the model now needs to re-read files it already read. Long sessions can double or triple effective token cost via compaction-driven re-reads.

The three patterns that drive most overspend

Pattern 1: Compaction-driven re-reads

Symptom: a 4-hour session uses 3x the tokens of a 1-hour session that did the same work split into 4 sittings. Cause: each compaction event forces re-reads.

Mitigation: explicit checkpoint hook (force-progress-update.sh) that writes a progress.json file every 5 actions. After compaction, the model can read the small progress file instead of re-reading source code.

Pattern 2: Deep exploration when grep would suffice

Symptom: model reads 20 files when it only needed to find 1 line in 1 file. Cause: it's exploring vs. searching.

Mitigation: a custom system prompt or PreToolUse(Read) hook that suggests grep-first. We didn't fully solve this with a hook — the cleanest fix is a CLAUDE.md instruction: "Before reading any file, grep for the relevant symbol/string first."

Pattern 3: Opus for routine edits

Symptom: model uses Opus for mechanical work (rename a variable, add a typehint, fix a typo). Cost: $0.20 per turn vs $0.04 on Sonnet.

Mitigation: switch model mid-session. Use /model sonnet when leaving planning phase. Or have the cost-tracker emit a warning when an Opus turn cost > some threshold for low-complexity work.

The cost-tracker.sh hook

This is a Stop hook (fires when Claude finishes a turn). It reads the transcript file Claude Code passes via stdin, extracts the model + token counts, computes estimated cost, and appends to a JSONL log.

#!/bin/bash
# cost-tracker.sh
# Logs every Claude Code session's spend to .claude/state/costs.jsonl

INPUT=$(cat)
COSTS_FILE=".claude/state/costs.jsonl"
mkdir -p .claude/state

python3 - "$INPUT" <<'PY'
import json, sys, os
from datetime import datetime, timezone, timedelta

CST = timezone(timedelta(hours=-4))
now = datetime.now(CST).isoformat()

raw = sys.argv[1].strip() if len(sys.argv) > 1 else ''
if not raw: sys.exit(0)
try:
    data = json.loads(raw)
except: sys.exit(0)

session_id = data.get('session_id', 'unknown')
transcript = data.get('transcript_path', '')

# Estimate cost based on Anthropic published pricing
PRICING = {
    'opus':   {'in': 15.00, 'out': 75.00},  # per 1M tokens
    'sonnet': {'in':  3.00, 'out': 15.00},
    'haiku':  {'in':  0.80, 'out':  4.00},
}

# Walk the transcript file (if accessible) to sum tokens
input_tokens = output_tokens = 0
model_guess = 'opus'
try:
    with open(transcript) as f:
        for line in f:
            try:
                msg = json.loads(line)
                u = msg.get('usage', {})
                input_tokens += u.get('input_tokens', 0)
                output_tokens += u.get('output_tokens', 0)
                m = msg.get('model', '')
                if 'opus' in m.lower(): model_guess = 'opus'
                elif 'sonnet' in m.lower(): model_guess = 'sonnet'
                elif 'haiku' in m.lower(): model_guess = 'haiku'
            except: continue
except: pass

price = PRICING.get(model_guess, PRICING['opus'])
est_cost = (input_tokens * price['in'] + output_tokens * price['out']) / 1_000_000

entry = {
    'ts': now,
    'session_id': session_id,
    'model': model_guess,
    'input_tokens': input_tokens,
    'output_tokens': output_tokens,
    'est_cost_usd': round(est_cost, 4),
}
with open('.claude/state/costs.jsonl', 'a') as f:
    f.write(json.dumps(entry, ensure_ascii=False) + '\n')
PY

exit 0

Wire into .claude/settings.json:

{
  "hooks": {
    "Stop": [{
      "matcher": "*",
      "hooks": [
        { "type": "command", "command": "bash .claude/hooks/cost-tracker.sh" }
      ]
    }]
  }
}

How to watch your burn in real time

In a second terminal:

tail -f .claude/state/costs.jsonl | jq .

Output:

{
  "ts": "2026-05-20T15:43:21-04:00",
  "session_id": "abc123",
  "model": "opus",
  "input_tokens": 142000,
  "output_tokens": 8200,
  "est_cost_usd": 2.745
}

You'll see costs scroll past as Claude works. When a turn crosses your threshold (e.g., >$1 for a single turn), you intervene.

Weekly budget reconciliation

Quick one-liner to see today's spend by session:

jq -s 'group_by(.session_id)[] | {session: .[0].session_id, total: map(.est_cost_usd) | add}' \
  .claude/state/costs.jsonl

Or this week's total:

jq -s 'map(.est_cost_usd) | add' .claude/state/costs.jsonl

Most weeks I'd see one or two sessions accounting for >40% of total spend. That's where the workflow intervention focused — adjusting the prompts that started those runaway sessions.

Other hooks that compound the savings

cost-tracker.sh tells you what's happening. These others prevent the patterns it surfaces:

verify-before-stop.sh — eliminates retry sessions from lies-of-completion (these are pure waste tokens)
force-progress-update.sh — checkpoint every 5 actions, survives compaction without re-reads
pre-compact-diary.sh — preserves WIP state before compaction so the next session starts informed
enforce-autoplan.sh — blocks code-write until a plan exists, eliminating "let me try this and see" loops
block-secrets.sh — saves you a separate-incident cost (a leaked API key can cost more than a month of Opus)

The free verify-before-stop.sh is on GitHub (MIT). The full 6-hook pack with install script is at landing-ianymu.vercel.app ($19 lightning / $49 regular, 30-day money-back).

How to fix Claude Code's "all tests passing" lies — the verify-before-stop article
Claude Code Hooks Cheat Sheet
awesome-claude-code-hooks — curated list of hooks

Questions? Email ian.y.mu@gmail.com. Open issues / PRs welcome.