Tokens are the currency of AI. Most developers waste them without realizing it: in every session, every prompt, every repeated instruction.

Bloated instruction files and Claude re-reading the same context over and over silently and constantly cost you more than you think.

What if we could make Claude Code stop talking and start solving?

For example:

You ask AI a question. Instead of long paragraphs filled with explanations, we can make it reply like this:

Install dependencies. Run code. Done.

It tells you just what you need: no paragraphs, no explanations, no fluff. This is the idea behind Caveman.

In simple terms:

It is a style of communication where AI strips language down to the essentials: short, direct, almost primitive, but with nothing important lost.

Installation of Caveman Claude

Find your agent below, whether you use Claude Code, Codex, Copilot, Gemini CLI, or anything else, and follow the matching install step:

Agent | Install
Claude Code | claude plugin marketplace add JuliusBrussee/caveman && claude plugin install caveman@caveman
Codex | Clone the repo, go to /plugins, search for "Caveman", and install
Gemini CLI | gemini extensions install https://github.com/JuliusBrussee/caveman
Cursor | npx skills add JuliusBrussee/caveman -a cursor
Windsurf | npx skills add JuliusBrussee/caveman -a windsurf
Copilot | npx skills add JuliusBrussee/caveman -a github-copilot
Cline | npx skills add JuliusBrussee/caveman -a cline
Any other | npx skills add JuliusBrussee/caveman

You can check the repo for the clone link.

Using Caveman Claude

After installation, trigger it with any of the following:

  • /caveman in Claude Code, or $caveman in Codex

  • talk like caveman

  • less tokens please

  • caveman mode

To turn it off, use the prompt “stop caveman” or “normal mode”.

Caveman has three compression levels:

Level | Trigger | What it does
Lite | /caveman lite | Drop filler, keep grammar. Professional but no fluff
Full | /caveman full | Default caveman. Drop articles, fragments, full grunt
Ultra | /caveman ultra | Maximum compression. Telegraphic. Abbreviate everything
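
To get a feel for the difference between Lite and Full, here is a toy sketch in Python. It is only an illustration of what each level drops, not how the plugin works: the word lists and the compress() helper are hypothetical, and Ultra is left out because abbreviation cannot be captured by a simple word filter.

```python
# Toy illustration only: the word lists and compress() are hypothetical
# and are not part of the Caveman plugin.
FILLER = {"just", "really", "basically", "actually", "simply"}
ARTICLES = {"a", "an", "the"}


def compress(text: str, level: str = "full") -> str:
    """Drop filler words (lite), and additionally articles (full)."""
    if level not in ("lite", "full"):
        raise ValueError(f"unsupported level: {level}")
    drop = set(FILLER)
    if level == "full":
        drop |= ARTICLES
    # Compare case-insensitively and ignore surrounding punctuation.
    kept = [w for w in text.split() if w.lower().strip(".,!?") not in drop]
    return " ".join(kept)


sentence = "You should just install the dependencies and then run the code."
print(compress(sentence, "lite"))
# You should install the dependencies and then run the code.
print(compress(sentence, "full"))
# You should install dependencies and then run code.
```

Lite keeps the sentence grammatical, while Full already reads like a terse instruction.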

Caveman Skills

Here are four slash commands that teach Claude to speak less and say more, built for developers who value speed over ceremony. Let’s break down each one.

  1. /caveman-commit

If you struggle to write good commit messages, /caveman-commit automatically writes commit messages that:

  • Are under 50 characters

  • Focus on why the change was made, not just what changed

Example:

Say you built an app and fixed some bugs, and you would normally write: “Built a caveman app and fixed some known errors and bug fixes”

You will get: "feat: built a caveman app with bug fixes"

Note that it does not commit automatically. It simply generates the commit message and hands it to you.
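
Since the command only hands you a message, you can sanity-check it before committing. The following is a minimal sketch, assuming the "under 50 characters" rule above and a type prefix like the "feat:" in the example. The check_subject() helper and its prefix list are hypothetical and are not part of Caveman.

```python
import re

# Hypothetical checker: the limit and prefix list are assumptions based on
# the description above, not the plugin's actual rules.
MAX_SUBJECT_LEN = 50
PREFIX = re.compile(r"^(feat|fix|docs|refactor|test|chore): \S")


def check_subject(subject: str) -> list[str]:
    """Return a list of problems found in a commit subject line."""
    problems = []
    if len(subject) >= MAX_SUBJECT_LEN:
        problems.append(
            f"subject is {len(subject)} chars, want under {MAX_SUBJECT_LEN}"
        )
    if not PREFIX.match(subject):
        problems.append("subject lacks a type prefix such as 'feat: '")
    return problems


print(check_subject("feat: built a caveman app with bug fixes"))  # []
print(check_subject("Built a caveman app and fixed some known errors and bug fixes"))
```

The generated message from the example passes both checks, while the long hand-written one fails on length and on the missing prefix.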

  2. /caveman-help

If you forget which commands and tools are available in your project, this help command generates a quick-reference cheat sheet of every mode, skill, and command in your Claude setup, giving you an instant overview.

  3. /caveman-review

Long review comments are slow to read and easy to get stuck on.

This review command forces Claude to write short, one-line review comments pinned to exact line numbers.

Here I asked it to review my calculator.py application, and it told me which lines have bugs and errors.

  4. /caveman:compress <filepath>

Every time we start a new Claude Code session, Claude reads the CLAUDE.md file from the beginning. This file contains all your project's rules, preferences, and instructions.

So if our CLAUDE.md file is long, Claude reads all of it in every single session.

This is where /caveman:compress does significant work: it rewrites your entire file into a much shorter version that Claude still understands just as well, but that uses far fewer tokens.

For example:

Suppose your CLAUDE.md has 1,000 tokens, compression cuts it to 500, and you start 20 Claude sessions a day:

Stage | Tokens per day
Before compress | 1000 × 20 = 20,000 tokens/day
After compress | 500 × 20 = 10,000 tokens/day
Saved | 10,000 tokens/day
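
The same arithmetic as a small Python sketch, so you can plug in your own numbers. The 1,000-token file, the 500-token compressed size, and the 20 sessions per day are the illustrative figures from the example, not measurements. The daily_tokens() helper is a made-up name for this sketch.

```python
def daily_tokens(file_tokens: int, sessions_per_day: int) -> int:
    """Tokens spent per day re-reading the instruction file."""
    return file_tokens * sessions_per_day


# Illustrative figures from the example above.
before = daily_tokens(1000, 20)
after = daily_tokens(500, 20)

print(f"before: {before:,} tokens/day")        # before: 20,000 tokens/day
print(f"after:  {after:,} tokens/day")         # after:  10,000 tokens/day
print(f"saved:  {before - after:,} tokens/day")  # saved:  10,000 tokens/day
```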

Testing Caveman Claude

I put Caveman Claude to the test in my Claude Code setup to see if it truly delivers on token efficiency.

To monitor the token reduction, I first installed a monitoring tool called ccusage using npm.

In Terminal 1, I opened Claude Code to run the prompts. In Terminal 2, I opened ccusage to monitor token usage in real time.

Before starting the test, I took a screenshot of the token usage as a baseline for the before-and-after comparison:

Next, I went to Terminal 1 and made sure Claude Code was in normal mode, with no Caveman plugin active.

Then I ran a prompt and Claude Code gave me its output. I switched to Terminal 2, where I could clearly see the input and output token counts change.

The difference between the two readings is the token usage for normal mode.

Then I turned Caveman mode on and ran the same prompt again. After getting the output, I switched to the monitoring terminal and took another reading.

I used the default (Full) mode here.

I then repeated the prompt in Ultra mode, which reduces token usage further.

Now, let us compare the token usage in Normal vs Caveman mode:

Normal mode = 3,641 - 2,984 = 657 tokens

Caveman mode (Full) = 4,023 - 3,641 = 382 tokens

Caveman mode (Ultra) = 4,351 - 4,023 = 328 tokens

As you can see, Full mode cut token usage by ~42% compared to normal. Ultra mode pushes that to ~50%.
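
These percentages follow directly from the counter readings. Here is a short Python sketch that reproduces them. The readings are the four values from the test above, while the variable names are only for illustration.

```python
# Cumulative counter readings from the test: baseline, then after the
# normal, Full, and Ultra runs.
readings = [2984, 3641, 4023, 4351]
labels = ["normal", "full", "ultra"]

# Tokens used by each run = difference between consecutive readings.
deltas = {
    label: later - earlier
    for label, earlier, later in zip(labels, readings, readings[1:])
}

baseline = deltas["normal"]
for label in ("full", "ultra"):
    reduction = 100 * (1 - deltas[label] / baseline)
    print(f"{label}: {deltas[label]} tokens, {reduction:.1f}% fewer than normal")
# full: 382 tokens, 41.9% fewer than normal
# ultra: 328 tokens, 50.1% fewer than normal
```

Keep in mind this is a single prompt per mode, so treat the numbers as a rough indication, not a benchmark.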

The official repo claims a reduction of up to ~75% of output tokens.

When Caveman Works and When It Doesn't

Where Caveman shines:

  • You already know how to read error messages and debug issues on your own, so you don't need Claude to walk you through every step

  • You run Claude Code in tight loops: planning, fixing, and committing repeatedly

  • Token costs matter to you

  • You want Claude as a senior collaborator, not a teacher

Where it falls short:

  • You are still learning: Caveman skips explanations and jumps to solutions, which leads to copy-paste learning with no real understanding

  • When something breaks, Caveman won't explain the error in detail; it expects you to figure out the "why" from the logs yourself

  • If your team has junior engineers, Ultra mode responses can feel confusing and too compressed for someone still learning the basics

Rule of thumb: If you would be comfortable debugging a failing pod without Claude explaining what a pod is, Caveman is for you.
