Writings on code, AI, and engineering

Can You Make an LLM Reason Better by Forcing It to Think Like a Caveman?

A benchmark of compressed chain-of-thought prompting on GSM8K with Qwen3-235B: 60% fewer tokens at the same accuracy.

LLM AI Research Prompt Engineering