17 Ways to Stop Hitting Your Claude Limit
The $20/month Claude plan is enough. But only if you stop making these costly token mistakes.
Most people burning through Claude’s limit every week aren’t using too much AI.
They’re using it wrong.
Claude re-reads your entire conversation before every single reply. Message 1 costs almost nothing. Message 30? Claude is re-reading 29 exchanges before it even touches your new question. That’s where your tokens go.
Fix these 17 habits. Your $20/month plan becomes more than enough.
The Token Cost Reality Check
Before the fixes — understand what’s actually burning your limit
| What You Upload/Do | Token Cost | Better Alternative | Alternative Cost |
|---|---|---|---|
| 1-page PDF | 1,500–3,000 tokens | Same text as .md file | Under 200 tokens |
| Full screenshot (1000×1000) | ~1,300 tokens | Tight cropped screenshot | Under 100 tokens |
| 500-word prompt | 500 tokens (per message!) | “Ask me questions” prompt | ~30 tokens |
| 3 separate messages | 3× full context reload | Batch in 1 message | 1× context reload |
| 20-message session | ~105,000 tokens | Restart at 15–20 msgs | Fresh start (~0 tokens) |
| 30-message session | ~232,000 tokens | Summarize + new session | ~2,000 tokens |
| Same PDF in 5 chats | 15,000+ tokens | Use Projects (upload once) | Zero re-uploads |
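The session rows in the table follow from one fact: Claude re-reads the whole conversation before each reply, so total cost grows roughly quadratically with message count. Here is a back-of-the-envelope sketch of that model; the ~500 tokens per exchange is my illustrative assumption, not an official Anthropic figure.

```python
def session_cost(messages: int, tokens_per_exchange: int = 500) -> int:
    """Total input tokens processed across a session.

    Reply i re-reads all i exchanges so far, so the total is the
    triangular number 1 + 2 + ... + n, times the per-exchange size.
    """
    return tokens_per_exchange * messages * (messages + 1) // 2

print(session_cost(20))  # 105000, matching the 20-message row
print(session_cost(30))  # 232500, roughly the 30-message row
```

Because the cost is triangular, the last ten messages of a 30-message session cost more than the first twenty combined, which is why restarting at 15–20 messages pays off.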
File & Upload Mistakes
How you upload files is one of the biggest token drains. Fix this first.
Uploading a PDF directly to Claude is one of the most expensive habits you can have. A single PDF page costs 1,500 to 3,000 tokens. A 10-page document? That’s potentially 30,000 tokens, before Claude has even started working.
Screenshots are almost as bad. A full 1000×1000 image burns roughly 1,300 tokens. DOCX and PPTX files carry hidden metadata bloat you can’t even see.
Open a new Google Doc (doc.new in your browser). Paste the relevant text. Download it as a .md (Markdown) file. Upload that instead.
Same information. Fraction of the cost. If you do need a screenshot, crop it tightly to only the section that matters; that can cut the cost from roughly 1,300 tokens to under 100.
Claude reads your entire context — including every uploaded file on every single message. If you dump 50 files into Cowork “just in case,” every reply in that session burns tokens for all 50 files, whether you referenced them or not.
Only include files this specific task needs. For quick tasks like drafting emails, include zero folders. Treat your context like RAM, not a filing cabinet. If it’s not needed for this task, don’t include it.
For ongoing projects, use Claude Projects (covered in Way 13) to manage files smartly.
Your about-me or CLAUDE.md file is loaded before every single session in Cowork and Claude Code. If it’s 22,000 words, that’s thousands of tokens burned before you’ve typed a single question. And Anthropic warns that bloated instruction files make Claude ignore your actual instructions anyway.
Trim your personal context file to under 2,000 words. Keep only essential context: your role, tone, key preferences, and recurring guidelines. Move task-specific instructions to individual prompts, not your global file.
Prompt Mistakes
How you write prompts changes how many tokens get consumed.
A 500-word prompt costs 500 tokens on message 1. But Claude re-reads your entire conversation before every reply. By message 10, that prompt has been processed 10 times. And you’re paying for it every single time.
Research consistently shows that short prompts (150–300 words) outperform long ones, and they're cheaper.
Use this 30-word template instead: "I want to [task] to [success criteria]. Ask me questions using AskUserQuestion."
Let Claude ask you clarifying questions. Clicking an option costs nearly zero tokens. Typing paragraphs of instructions costs a lot. Let Claude pull context from you instead of pushing walls of text.
Every time you send a new message, Claude re-reads your entire conversation from the very beginning. Three separate messages means three full context reloads. You’re paying triple for no reason.
Batch multiple tasks into one message. Instead of three separate asks, send: "Summarize this, list the key points, suggest a headline."
One message. One context reload. Same output.
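The arithmetic behind batching is simple. This sketch assumes an existing history of ~10,000 tokens (an illustrative figure): three separate follow-ups each re-read the whole history, while one batched message re-reads it once.

```python
def followup_cost(history_tokens: int, asks: int, batched: bool) -> int:
    """Input tokens spent re-reading the history for `asks` requests."""
    reloads = 1 if batched else asks
    return history_tokens * reloads

history = 10_000  # e.g. a 20-exchange conversation
print(followup_cost(history, 3, batched=False))  # 30000
print(followup_cost(history, 3, batched=True))   # 10000
```

Three asks in one message costs one reload instead of three: a two-thirds saving on the history alone, before counting the output.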
Every time you type “No, I meant…” or “Actually, can you change that…” you’re adding to the conversation history. Claude re-reads all of it. A 2,000-token report redone from scratch adds 2,000 output tokens and compounds your context.
Click Edit on your original message. Fix the prompt. Regenerate. The history gets replaced, not added to. Claude re-reads a corrected prompt, not a pile of back-and-forth.
This single habit alone saves thousands of tokens per week.
When one section is wrong, most people say “redo this” and Claude regenerates the entire 2,000-token output from scratch. That’s 2,000 output tokens wasted on content that was already correct.
Be surgical: "Only redo section 3. Keep everything else. No commentary. Just the output."
Claude targets exactly what needs changing. The rest stays. You save the tokens for what actually matters.
If you’re writing a new prompt every time you do the same type of task, you’re burning tokens reinventing the wheel. You’re also missing out on prompt caching, which significantly reduces costs on repeated patterns.
Build a prompt library. Keep the same structure, swap only the variable part. For example: "Write a [format] about [topic] for [audience] in [tone]."
Stable prompt structures get cached. Cached prompts cost up to 90% less to process than uncached ones.
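A prompt library can be as small as one stable template with swappable variables. This is a minimal sketch (the template wording is the article's example): keeping the fixed wording byte-identical across uses is what lets caching recognize the stable prefix.

```python
TEMPLATE = "Write a {format} about {topic} for {audience} in {tone}."

def build_prompt(format: str, topic: str, audience: str, tone: str) -> str:
    """Fill the stable template; only the variables change per task."""
    return TEMPLATE.format(format=format, topic=topic,
                           audience=audience, tone=tone)

print(build_prompt("LinkedIn post", "token budgeting",
                   "solo founders", "a direct, practical tone"))
```

Swap the variables, never the scaffolding: edit the fixed wording and you've created a new, uncached prompt.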
Model Selection Mistakes
Using the wrong model is like using a Ferrari to pop to the shop.
Opus is Claude’s flagship model — built for deep reasoning, complex coding, and long-horizon tasks. Using it for grammar checks, simple summaries, or quick rewrites is like hiring a senior engineer to sort your emails. You’re paying premium rates for tasks that don’t need them.
Match the model to the task: Haiku for quick, simple tasks. Sonnet for most everyday work — it’s fast, capable, and cost-efficient. Opus for deep work: complex debugging, architectural decisions, long-horizon reasoning. Switch deliberately, not by default.
Anthropic confirmed that file creation (spreadsheets, documents, presentations) uses significantly more of your usage limit than regular chat messages. If you're jumping into Cowork to build things before your plan is clear, you're burning your heaviest tokens on exploration.
Plan in Chat first. Outline the structure. Confirm what you want. Then move to Cowork to build it once you know exactly what the output should be.
Chat is lightest. Cowork is heaviest. Don’t pay Cowork prices for Chat-level planning work.
Session Management Mistakes
Long sessions are the silent token killers. Here’s how to manage them.
Claude re-reads your entire conversation history before every single reply. A 20-message session burns around 105,000 tokens. A 30-message session? Approximately 232,000 tokens, almost entirely spent on re-reading past messages, not on your actual question.
One developer tracked his usage: 98.5% of tokens went to re-reading old messages. Only 1.5% went to the actual answer.
Every 15–20 messages: ask Claude to summarize the key points and decisions made. Copy that brief. Open a fresh session. Paste the summary as context. Start clean.
You asked about a LinkedIn post. Then a proposal. Now you’re asking about your content strategy and Claude is re-reading the LinkedIn post and proposal context every time you ask a new question. Dead context is dead tokens.
New topic = new chat. Always. This is one of the simplest habits to build. When you switch topics, start a fresh conversation. The unrelated history stops burning your limit immediately.
If you upload the same 15-page PDF to 4 different chats, you've burned 90,000–180,000 tokens on a document you could have managed once. Anthropic confirmed that content reused inside a Project doesn't count the same way as a fresh upload.
Use Claude Projects. Upload the file once. Every chat inside that project references it without re-burning tokens on each upload. One upload. Persistent access. No repetition.
Settings & Configuration Mistakes
Default settings are not optimised for token savings. Change these.
Web search, MCP connectors, and other tools all add tokens to every message, even when you're not using them. Anthropic explicitly warns that tools are token-intensive. Having them enabled by default means you're paying the overhead constantly.
A web search result alone returns roughly 2,000 tokens of data.
Default everything off. Go to your Search and Tools settings. Turn features on only when you need them for a specific task — not as a blanket account setting. Filtered retrieval = fewer results = fewer wasted tokens.
Without Personal Preferences configured, every chat starts with 3–5 wasted setup messages explaining your tone, style, and context. You’re paying tokens to reintroduce yourself to Claude every single session.
Go to Settings → Personal Preferences. Set your tone (pick “Concise” or write a custom one). Define your style once. It applies to every future conversation automatically. One setup = permanent savings every session going forward.
If you’re manually triggering the same report, analysis, or briefing every week, you’re spending tokens on the same overhead every time; plus the cognitive load of remembering to do it.
Use /schedule in Claude. Set it once: "Every Monday at 7am, create my weekly briefing." Wake up to a finished document. Zero manual tokens spent on the trigger and setup — just the output.
If you spend 5 messages trying to get Claude to generate an image or pull real-time search results, you’ve just burned tokens on a dead end. Claude can’t generate images natively. Real-time search has limits. Tokens spent on impossible tasks are tokens you never get back.
Know your tools. Images → Gemini or Midjourney. Real-time search → Grok or Perplexity. Deep reasoning, writing, coding, analysis → Claude. Use the right tool for the right task. Every tool has a lane. Stay in yours.
The Complete 17-Fix Summary
Want a Growth System That Actually Works?
BK helps founders and businesses build AI-powered SEO + GEO systems that generate traffic, leads, and revenue — not just rankings.