Tokens are the currency of AI. Most developers waste them without realizing it: in every session, every prompt, every repeated instruction.
Bloated instruction files and Claude re-reading the same context over and over silently cost you more than you think.
What if we could make Claude Code stop talking and start solving?
For example:
You ask AI a question and, instead of long paragraphs filled with explanations, it replies:
“Install dependencies, run code, done”
It tells you just what you need. No paragraphs, no explanations, no fluff. This is the idea behind Caveman.
In simple terms:
It is a style of communication where the AI strips language down to the essentials: short, direct, almost primitive, but with nothing important lost.
Installation of Caveman Claude
Choose your agent, whether you use Claude Code, Codex, Copilot, Gemini CLI, or anything else, and follow the steps below:
| Agent | Install |
|---|---|
| Claude Code | |
| Codex | Clone the repo. Go to |
| Gemini CLI | |
| Cursor | |
| Windsurf | |
| Copilot | |
| Cline | |
| Any other | |
You can find the clone link in the repo.
Using Caveman Claude
After installation, trigger it with /caveman in Claude Code or $caveman in Codex. You can also use a plain prompt such as:
“talk like caveman”
“less tokens please”
“caveman mode”
To stop, use the prompt “stop caveman” or “normal mode”.
Caveman has three compression levels:
| Level | Trigger | What it does |
|---|---|---|
| Lite | | Drop filler, keep grammar. Professional but no fluff |
| Full | | Default caveman. Drop articles, fragments, full grunt |
| Ultra | | Maximum compression. Telegraphic. Abbreviate everything |
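To get a feel for the difference between Lite and Full, here is a toy sketch. It is not how the plugin works internally. The `compress` helper and both word lists are hypothetical, made up purely for illustration.

```python
# Toy illustration only: the word lists and the compress() helper are
# hypothetical and are not taken from the Caveman plugin.
FILLER = {"just", "really", "basically", "actually", "simply", "please"}
ARTICLES = {"a", "an", "the"}


def compress(text: str, level: str = "full") -> str:
    """Drop filler words (lite) and additionally drop articles (full)."""
    drop = set(FILLER)
    if level == "full":
        drop |= ARTICLES
    # Compare each word without trailing punctuation, case-insensitively.
    kept = [w for w in text.split() if w.lower().strip(".,!?") not in drop]
    return " ".join(kept)


sentence = "You should just install the dependencies and then run the code."
print(compress(sentence, "lite"))
print(compress(sentence, "full"))
```

Lite keeps the sentence grammatical and only removes the filler, while Full also drops the articles and reads more like a grunt.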
Caveman Skills
Here are four slash commands that teach Claude to speak less and say more, built for developers who value speed over ceremony. Let’s break down each one.
/caveman-commit
If you struggle to write good commit messages, /caveman-commit writes them for you. The messages are:
Under 50 characters
Focused on why the change was made, not just what
Example:
You built an app and fixed some bugs, and would normally write “Built a caveman app and fixed some known errors and bugs”.
You will get "feat: built a caveman app with bug fixes" instead.

It does not commit automatically, though. It only generates the commit message and hands it to you.
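Since the command only hands you a message, you may want to sanity-check it before committing. Here is a minimal sketch of such a check. The `check_subject` helper and the list of type prefixes are my own assumptions, loosely following the Conventional Commits style seen in the example above. They are not part of Caveman.

```python
import re

# Assumed prefix list in the Conventional Commits style; adjust to your team's rules.
PREFIX = re.compile(r"^(feat|fix|docs|refactor|test|chore)(\([\w-]+\))?: ")


def check_subject(subject: str, limit: int = 50) -> list[str]:
    """Return a list of problems found in a commit subject line."""
    problems = []
    if len(subject) >= limit:
        problems.append(f"subject is {len(subject)} chars, want under {limit}")
    if not PREFIX.match(subject):
        problems.append("missing type prefix such as 'feat: ' or 'fix: '")
    return problems


print(check_subject("feat: built a caveman app with bug fixes"))
print(check_subject("Built a caveman app and fixed some known errors and bugs"))
```

An empty list means the subject passes both checks.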
/caveman-help
If you forget which commands and tools are available in your project, this help command generates a quick-reference cheat sheet of every mode, skill, and command in your Claude setup: an instant overview.

/caveman-review
If review comments are long and slow to read, this command forces Claude to write short, one-line review comments pinned to exact line numbers.

Here I asked it to review my calculator.py application, and it told me which lines have bugs and errors.
/caveman:compress <filepath>
Every time we start a new Claude Code session, Claude reads the CLAUDE.md file from the beginning. This file contains all your project's rules, preferences, and instructions.
So if our CLAUDE.md file is long, Claude reads all of it in every single session.
This is where /caveman:compress does significant work. It rewrites your entire file into a much shorter version that Claude still understands just as well, but that uses far fewer tokens.

For example, suppose your CLAUDE.md has 1000 tokens, compression halves it to 500 tokens, and you use Claude 20 times a day:
| | Tokens per day |
|---|---|
| Before compress | 1000 × 20 = 20,000 tokens/day |
| After compress | 500 × 20 = 10,000 tokens/day |
| Saved | 10,000 tokens/day |
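You can plug in your own numbers with a few lines of Python. This is only a back-of-the-envelope sketch. The `daily_tokens` helper is a hypothetical name, and the 500-token figure is the assumed post-compression size from the example, not a measured result.

```python
def daily_tokens(file_tokens: int, sessions_per_day: int) -> int:
    """Tokens spent per day just re-reading the instruction file."""
    return file_tokens * sessions_per_day


before = daily_tokens(1000, 20)  # original file size from the example
after = daily_tokens(500, 20)    # assumed size after compression
saved = before - after

print(f"before: {before:,} tokens/day")
print(f"after:  {after:,} tokens/day")
print(f"saved:  {saved:,} tokens/day")
```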
Testing Caveman Claude
I put Caveman Claude to the test in my Claude Code setup to see if it truly delivers on token efficiency.
To monitor the token reduction, I first installed a monitoring tool called ccusage using npm.
In Terminal 1 I opened Claude Code to run the prompts, and in Terminal 2 I opened ccusage to monitor token usage in real time.
Before starting the test, I took a screenshot of the token usage to compare before and after, as you can see below:

Next, I went to Terminal 1 and made sure Claude Code was in normal mode, with no Caveman plugin active.
Then I ran a prompt and Claude Code gave me the output. Switching to Terminal 2, I could clearly see the change in input and output tokens.

The screenshot above shows the token usage in normal mode.
Then I turned Caveman mode on and ran the same prompt again. After getting the output, I switched to the monitoring terminal and got the result below:

I used the default (Full) mode here.
Ultra mode reduces token usage even further.

Now, let us compare the token usage in Normal vs Caveman mode:
Normal mode = 3,641 - 2,984 = 657
Caveman mode = 4,023 - 3,641 = 382
Caveman mode ( Ultra ) = 4,351 - 4,023 = 328
As you can see, Full mode cut token usage by ~42% compared to normal, and Ultra mode pushed that to ~50%.
The official repo claims a reduction of up to ~75% of output tokens.
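The percentages for this test can be reproduced from the four counter readings used in the subtractions above. The sketch below takes the difference between consecutive readings and compares each mode against normal. The variable names are mine.

```python
# Cumulative token counts read from the monitor, in the order the runs happened.
readings = [2984, 3641, 4023, 4351]
modes = ["normal", "full", "ultra"]

# Tokens used by each run = difference between consecutive readings.
usage = {
    mode: later - earlier
    for mode, earlier, later in zip(modes, readings, readings[1:])
}


def reduction(mode: str) -> float:
    """Fraction of tokens saved relative to the normal-mode run."""
    return 1 - usage[mode] / usage["normal"]


for mode in modes:
    print(f"{mode}: {usage[mode]} tokens, {reduction(mode):.0%} less than normal")
```

Keep in mind this is a single prompt run once per mode, so treat the numbers as a rough indication and not a benchmark.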
When Caveman Works and When It Doesn't
Where Caveman shines:
You already know how to read error messages and debug issues on your own, so you don't need Claude to walk you through every step.
You run Claude Code in tight loops: planning, fixing, committing repeatedly
Token costs matter to you
You want Claude as a senior collaborator, not a teacher
Where it falls short:
You are still learning. Caveman skips explanations and jumps to solutions, which leads to copy-paste learning with no real understanding.
When something breaks, Caveman won't explain the error in detail. It expects you to figure out the "why" from the logs yourself.
If your team has junior engineers, Ultra mode responses can feel confusing: too compressed for someone still learning the basics.
Rule of thumb: If you would be comfortable debugging a failing pod without Claude explaining what a pod is, Caveman is for you.
