Chain-of-Thought
[!IMPORTANT] Key Insight: LLMs do not “think” before they speak. They think while they speak. If you ask for a direct answer to a complex problem, the model must commit to the result in its very first output tokens, with only a single forward pass of computation per token. This often leads to errors.
Chain-of-Thought (CoT) is a technique that forces the model to generate intermediate reasoning steps before giving the final answer. This allows the model to decompose complex problems into simpler, manageable parts.
1. The Problem: System 1 vs System 2
In psychology, Kahneman’s System 1 is fast and intuitive, while System 2 is slow and analytical.
- Standard Prompting is like System 1: Immediate, gut-reaction answers.
- CoT Prompting induces System 2: Deliberate, step-by-step reasoning.
Failure Case: The Bat and Ball Problem
Consider this classic riddle:
“A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?”
Standard Prompt (Zero-Shot):
Q: A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?
A: 10 cents. (Incorrect)
The model (and many humans) intuitively jumps to 10 cents because $1.10 - $1.00 looks like $0.10. But if the ball cost $0.10, the bat would cost $1.10, and the total would be $1.20, not $1.10.
2. Zero-Shot CoT
The simplest way to enable reasoning is to append a magic phrase to your prompt:
“Let’s think step by step.”
Zero-Shot CoT Prompt:
Q: A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost? Let’s think step by step.
A: Let’s denote the cost of the ball as x. Then the cost of the bat is x + 1.00. The total cost is x + (x + 1.00) = 1.10. This means 2x + 1.00 = 1.10. Subtracting 1.00 from both sides, we get 2x = 0.10. Dividing by 2, we get x = 0.05. So the ball costs 5 cents. (Correct)
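The algebra in the model’s answer can be checked numerically; a minimal sketch:

```python
# Verify the bat-and-ball algebra: x + (x + 1.00) = 1.10  =>  x = 0.05
total, difference = 1.10, 1.00
ball = (total - difference) / 2
bat = ball + difference

assert abs(ball - 0.05) < 1e-9          # the ball costs 5 cents
assert abs(ball + bat - total) < 1e-9   # prices sum to $1.10
print(f"ball = ${ball:.2f}, bat = ${bat:.2f}")  # ball = $0.05, bat = $1.05
```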
3. Why It Works
Models compute token by token.
- Computation Time: By forcing the model to output “The bat costs $1.00…”, you are giving it more “time” (tokens) to process the logic before it commits to the final number.
- State Management: The intermediate tokens serve as a “scratchpad”. The model attends to its own previous reasoning steps to generate the next step.
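The scratchpad idea can be mimicked in ordinary code: each intermediate result is written out and then read back when producing the next step, instead of computing everything in one shot. A toy sketch (the `solve_with_scratchpad` helper is illustrative, not a real model):

```python
def solve_with_scratchpad(steps):
    """Solve a multi-step arithmetic problem by emitting intermediate 'tokens'.

    Each step only sees the scratchpad written so far, mirroring how a model
    attends to its own previous reasoning tokens to generate the next one.
    """
    scratchpad = []
    value = 0
    for op, operand in steps:
        value = value + operand if op == "+" else value * operand
        scratchpad.append(f"{op}{operand} -> {value}")  # write the step down
    return value, scratchpad

# Roger's tennis balls: start at 0, add 5, add 2 * 3 = 6
result, trace = solve_with_scratchpad([("+", 5), ("+", 6)])
print(result, trace)  # 11 ['+5 -> 5', '+6 -> 11']
```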
4. Few-Shot CoT
For production systems, relying on “Let’s think step by step” can be unstable. Instead, use Few-Shot CoT by providing examples of the reasoning process you want.
Prompt:
Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 2 * 3 = 6 tennis balls. 5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?
A:
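In code, Few-Shot CoT is just careful prompt assembly. A minimal sketch using the exemplar above (the helper name is our own):

```python
# One worked (question, reasoned answer) pair; add more for stability.
FEW_SHOT_EXEMPLARS = [
    (
        "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
        "Each can has 3 tennis balls. How many tennis balls does he have now?",
        "Roger started with 5 balls. 2 cans of 3 tennis balls each is "
        "2 * 3 = 6 tennis balls. 5 + 6 = 11. The answer is 11.",
    ),
]

def build_few_shot_cot_prompt(question, exemplars=FEW_SHOT_EXEMPLARS):
    """Prepend worked Q/A pairs so the model imitates the reasoning style."""
    parts = [f"Q: {q}\nA: {a}" for q, a in exemplars]
    parts.append(f"Q: {question}\nA:")  # trailing "A:" invites the reasoning
    return "\n\n".join(parts)

prompt = build_few_shot_cot_prompt(
    "The cafeteria had 23 apples. If they used 20 to make lunch "
    "and bought 6 more, how many apples do they have?"
)
print(prompt)
```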
5. Code Implementation
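A provider-agnostic sketch of Zero-Shot CoT in practice. `call_llm` is a placeholder for whatever client you use (OpenAI, Anthropic, a local model); `extract_answer` assumes the convention that the reasoning trace ends with “The answer is X.”, as in the few-shot exemplar above.

```python
import re

ZERO_SHOT_COT_SUFFIX = "Let's think step by step."

def make_cot_prompt(question: str) -> str:
    """Zero-Shot CoT: append the trigger phrase to the question."""
    return f"Q: {question} {ZERO_SHOT_COT_SUFFIX}\nA:"

def extract_answer(completion: str):
    """Pull the final answer out of a reasoning trace.

    Assumes the trace ends with 'The answer is X.' -- the convention
    established by the few-shot exemplar.
    """
    match = re.search(r"The answer is\s+([^.\n]+)", completion)
    return match.group(1).strip() if match else None

# call_llm is a stub -- wire in your provider's client here.
# def call_llm(prompt: str) -> str: ...

completion = (
    "They started with 23 apples, used 20, leaving 23 - 20 = 3. "
    "Then they bought 6 more: 3 + 6 = 9. The answer is 9."
)
print(extract_answer(completion))  # 9
```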
6. CoT Simulator
Test the difference between Standard Prompting and Chain-of-Thought on a logic puzzle.
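A comparison like this can be sketched by generating both prompt variants for the same puzzle and sending each to the model. The harness below builds the pair; the model call is left as a stub:

```python
def build_prompt_pair(question: str) -> dict:
    """Build the two prompts a standard-vs-CoT comparison needs."""
    return {
        "standard": f"Q: {question}\nA:",
        "cot": f"Q: {question} Let's think step by step.\nA:",
    }

pair = build_prompt_pair(
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 "
    "more than the ball. How much does the ball cost?"
)
for style, prompt in pair.items():
    print(f"--- {style} ---\n{prompt}\n")
    # response = call_llm(prompt)  # stub: plug in your model client
```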