⛓️ Chain-of-Thought Prompting

What Is It?

Chain-of-thought (CoT) prompting is a technique used in prompt engineering to enhance the reasoning capabilities of large language models (LLMs). Instead of simply asking for an answer, CoT prompting guides the model to provide intermediate reasoning steps that lead to the final answer.

Imagine how you might solve a math problem by writing down each step. Similarly, CoT prompting encourages the model to “think aloud,” breaking down complex tasks into simpler, sequential steps. This approach helps the model tackle more complicated questions that require logical thinking or multi-step reasoning.

This technique was introduced by Wei et al. (2022) and has proven remarkably effective at enabling LLMs to handle complex problems more accurately.

How Does It Work?

CoT prompting works by including examples in your prompt where each question is followed by detailed reasoning steps and then the answer. Here’s how it functions:

  1. Provide Examples with Reasoning: In your prompt, include one or more examples where a question is followed by step-by-step reasoning leading to the answer.
  2. Pattern Recognition: The LLM recognizes the pattern of providing reasoning before the answer from the examples you’ve given.
  3. Model Response: When presented with a new question, the model follows the established pattern and generates its own chain of thought, leading to the answer.

This method leverages the model’s ability to identify and replicate patterns from the prompt. Interestingly, sometimes even a single example is enough for the model to catch on and provide detailed reasoning for subsequent questions.
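The few-shot pattern above is ultimately just string assembly. The sketch below (function and variable names are illustrative, not from any library) builds a CoT prompt from (question, reasoning, answer) demonstrations, using the odd-numbers example from later in this article:

```python
# Minimal sketch: assembling a few-shot chain-of-thought prompt.
# The demonstration content is taken from this article; build_cot_prompt
# is an illustrative helper, not part of any library.

def build_cot_prompt(examples, question):
    """Join (question, reasoning, answer) demonstrations with a new question."""
    parts = []
    for q, reasoning, answer in examples:
        parts.append(f"Q: {q}\nA: {reasoning} The answer is {answer}.")
    parts.append(f"Q: {question}\nA:")  # model continues from the trailing "A:"
    return "\n\n".join(parts)

examples = [
    (
        "The odd numbers in this group add up to an even number: 4, 8, 9, 15, 12, 2, 1.",
        "Adding all the odd numbers (9, 15, 1) gives 25.",
        "False",
    ),
]
prompt = build_cot_prompt(
    examples,
    "The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1.",
)
print(prompt)
```

The trailing `A:` invites the model to continue in the demonstrated reasoning-then-answer format.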

When Do You Use It?

You would use Chain-of-Thought prompting when:

  • Dealing with Complex Tasks: If the task requires multi-step reasoning, logical deductions, or calculations, CoT prompting can help the model produce a correct answer by working through the steps.
  • Improving Accuracy: By breaking down the problem, the model is less likely to make the mistakes it might make if it tried to produce the answer in one leap.
  • Encouraging Transparency: When you want to see how the model arrived at its answer, CoT prompting reveals its thought process, which can be valuable for verification and learning purposes.
  • Limited Examples: If you don’t have many examples to provide, CoT prompting can still be effective, even with minimal demonstrations.

Practical Examples

Let’s explore how Chain-of-Thought prompting can improve the model’s performance through examples.

Without Chain-of-Thought Prompting

Prompt:

I went to the market and bought 10 apples. I gave 2 apples to the neighbor and 2 to the repairman. I then went and bought 5 more apples and ate 1. How many apples was I left with?

Model Output:

11 apples

Explanation:

The model provided an incorrect answer because it likely didn’t process all the steps needed to solve the problem accurately.

With Chain-of-Thought Prompting

By encouraging the model to “think step by step,” we can help it arrive at the correct answer.

Prompt:

I went to the market and bought 10 apples. I gave 2 apples to the neighbor and 2 to the repairman. I then went and bought 5 more apples and ate 1. How many apples was I left with? Let’s think step by step.

Model Output:

  1. Started with 10 apples.
  2. Gave 2 to the neighbor: 10 - 2 = 8 apples.
  3. Gave 2 to the repairman: 8 - 2 = 6 apples.
  4. Bought 5 more apples: 6 + 5 = 11 apples.
  5. Ate 1 apple: 11 - 1 = 10 apples.
  6. Answer: You have 10 apples remaining.

Explanation:

By prompting the model to think step by step, it accurately processes each stage of the problem and arrives at the correct answer.

In a System Prompt

You can incorporate Chain-of-Thought prompting into the system prompt to set the behavior for the entire interaction.

System Prompt:

You are a helpful assistant who solves problems by explaining each step of your reasoning before providing the final answer.

User Prompt:

The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1. A:

Model Output:

Adding all the odd numbers (15, 5, 13, 7, 1) gives 41. The answer is False.

Explanation:

The model follows the instruction to explain its reasoning before giving the answer.
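In code, this system-prompt placement maps onto the common chat-message format of role/content pairs. The sketch below only builds the message list; the actual client call is omitted, and the structure assumes an OpenAI-style chat API:

```python
# Sketch: placing the CoT instruction in a system message using the common
# chat-message format (role/content dicts). The client call itself is stubbed
# out; only the message structure is shown.

messages = [
    {
        "role": "system",
        "content": (
            "You are a helpful assistant who solves problems by explaining "
            "each step of your reasoning before providing the final answer."
        ),
    },
    {
        "role": "user",
        "content": (
            "The odd numbers in this group add up to an even number: "
            "15, 32, 5, 13, 82, 7, 1. A:"
        ),
    },
]
print([m["role"] for m in messages])
```

Because the reasoning instruction lives in the system message, every user turn in the conversation inherits the step-by-step behavior.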

In a User Prompt

Alternatively, you can include examples directly in your user prompt to guide the model.

User Prompt:

The odd numbers in this group add up to an even number: 4, 8, 9, 15, 12, 2, 1.
A: Adding all the odd numbers (9, 15, 1) gives 25. The answer is False.
The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1.
A:

Model Output:

Adding all the odd numbers (15, 5, 13, 7, 1) gives 41. The answer is False.

Explanation:

By providing an example that demonstrates the chain of thought, the model follows suit and provides reasoning before giving the answer.

Zero-Shot Chain-of-Thought

Zero-Shot Chain-of-Thought is a variant where you guide the model to think step by step without providing examples. This technique was introduced by Kojima et al. in 2022 and involves appending a simple prompt like “Let’s think step by step” to your question.

Prompt:

I went to the market and bought 10 apples. I gave 2 apples to the neighbor and 2 to the repairman. I then went and bought 5 more apples and ate 1. How many apples was I left with? Let’s think step by step.

Model Output:

  1. Started with 10 apples.
  2. Gave 2 to the neighbor: 10 - 2 = 8 apples.
  3. Gave 2 to the repairman: 8 - 2 = 6 apples.
  4. Bought 5 more apples: 6 + 5 = 11 apples.
  5. Ate 1 apple: 11 - 1 = 10 apples.
  6. Answer: You have 10 apples remaining.

Explanation:

Even without previous examples, the model provides detailed reasoning steps, leading to the correct answer.
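The zero-shot trigger can be captured in a one-line helper that appends the phrase from Kojima et al. (2022) to any question. The function name here is illustrative:

```python
# Minimal sketch: appending the zero-shot CoT trigger phrase
# ("Let's think step by step") to an arbitrary question.
# add_cot_trigger is an illustrative helper, not a library function.

def add_cot_trigger(question, trigger="Let's think step by step."):
    """Return the question with the zero-shot CoT trigger appended."""
    return f"{question.rstrip()} {trigger}"

q = (
    "I went to the market and bought 10 apples. I gave 2 apples to the neighbor "
    "and 2 to the repairman. I then went and bought 5 more apples and ate 1. "
    "How many apples was I left with?"
)
print(add_cot_trigger(q))
```

This is the entire mechanism: no demonstrations are needed, only the trigger phrase at the end of the prompt.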

Automatic Chain-of-Thought (Auto-CoT)

Creating effective CoT prompts can sometimes be time-consuming, especially when crafting diverse and accurate examples. Automatic Chain-of-Thought (Auto-CoT) is an approach proposed by Zhang et al. (2022) to automate this process.

How It Works:

  1. Question Clustering:
    • Purpose: To ensure diversity in the examples.
    • Method: Automatically cluster questions from a dataset into groups based on similarity.
  2. Demonstration Sampling:
    • Purpose: To select representative questions that cover different aspects of the task.
    • Method: Choose a question from each cluster, then use the model’s zero-shot CoT capability (prompting it with “Let’s think step by step”) to generate a reasoning chain for it.

Benefits:

  • Reduces Manual Effort: Eliminates the need to hand-craft examples.
  • Improves Diversity: Diverse examples help the model generalize better to new questions.
  • Mitigates Mistakes: While auto-generated reasoning may contain errors, the diversity and volume of examples help balance out potential inaccuracies.
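The two Auto-CoT stages can be sketched as a toy pipeline. Note the heavy simplifications: the real method of Zhang et al. (2022) clusters questions by sentence embeddings, whereas this sketch clusters by a crude first-word key, and the model call is stubbed out. All names are illustrative:

```python
# Toy sketch of the two Auto-CoT stages. Real Auto-CoT clusters questions
# using sentence embeddings; here a naive first-word key stands in for
# similarity, and the model call is a stub.

from collections import defaultdict

def cluster_questions(questions):
    """Stage 1 (question clustering): group questions by a crude key."""
    clusters = defaultdict(list)
    for q in questions:
        clusters[q.split()[0].lower()].append(q)
    return list(clusters.values())

def sample_demonstrations(clusters, generate):
    """Stage 2 (demonstration sampling): take one question per cluster and
    generate its reasoning chain with the zero-shot CoT trigger."""
    demos = []
    for cluster in clusters:
        question = cluster[0]
        reasoning = generate(f"{question} Let's think step by step.")
        demos.append((question, reasoning))
    return demos

questions = [
    "If a container holds 30 liters and you remove 15, how much remains?",
    "If a tank holds 8 liters and you add 2, how much is in it?",
    "How many apples are left after eating 3 of 10?",
]
fake_generate = lambda prompt: "(model reasoning would appear here)"
demos = sample_demonstrations(cluster_questions(questions), fake_generate)
for q, r in demos:
    print(q, "->", r)
```

Swapping `fake_generate` for a real model call and the first-word key for an embedding-based clustering would turn this outline into the actual pipeline.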

Example of Auto-CoT Prompt:

Let’s think step by step to solve the following question.

Question: If a container holds 30 liters, and you remove 15 liters, then add 10 liters, how much liquid is in the container now?

Model’s Reasoning:

  1. Starting with 30 liters.
  2. Removing 15 liters: 30 - 15 = 15 liters.
  3. Adding 10 liters: 15 + 10 = 25 liters.
  4. Answer: The container now holds 25 liters.

Explanation:

By using Auto-CoT, you can quickly generate reasoning steps for a variety of questions, which can then be used as examples in your prompts. However, LLMs are not calculators, so they can still make errors on more complicated arithmetic!
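One practical consequence of that caveat: the numeric steps of a chain of thought can be re-checked in ordinary code. The sketch below mirrors the container example with a hypothetical list of extracted arithmetic steps; the format and names are illustrative:

```python
# Minimal sketch: re-checking a chain of thought's arithmetic in code, since
# LLMs can slip on calculation. The expression list mirrors the container
# example above; the "a op b" format is an illustrative assumption.

steps = ["30 - 15", "15 + 10"]   # arithmetic extracted from the model's chain
expected = [15, 25]              # the results the model claimed

for expr, want in zip(steps, expected):
    a, op, b = expr.split()
    got = int(a) + int(b) if op == "+" else int(a) - int(b)
    assert got == want, f"model arithmetic error in step {expr!r}"

print("all reasoning steps check out; final value:", expected[-1], "liters")
```

This kind of lightweight verification is one reason CoT's transparency is valuable: each intermediate step is an explicit, checkable claim.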