Prompt chaining

using the output from an LLM for another prompt (and so on)
2026-01-09 12:59
// updated 2026-01-09 17:01

Prompt chaining refers to linking multiple prompts and their outputs in a leapfrog style:

  • prompt 1
  • -> output 1
  • -> prompt 2 (with output 1 as part of prompt 2)
  • -> output 2
  • -> prompt 3 (with output 2 as part of prompt 3)
  • ...and so on...
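
In code, that leapfrog pattern looks roughly like this (a minimal sketch: llm here is a stand-in for whatever completion call you use, such as the get_completion function defined below):

# sketch only: each new prompt embeds the previous output
output_1 = llm(prompt_1)
output_2 = llm(f"{prompt_2}\n\nPrevious answer:\n{output_1}")
output_3 = llm(f"{prompt_3}\n\nPrevious answer:\n{output_2}")
# ...and so on...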

Setup

Let's connect to the OpenAI API and create a basic chat completion function:

# app.py

import os
from openai import OpenAI
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

def get_completion(prompt):
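    # send a single-turn chat request and return the text of the first choice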
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],        
    )
    return response.choices[0].message.content
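
As a quick sanity check (assuming the openai package is installed and OPENAI_API_KEY is set in your environment), you can call get_completion on its own:

# quick test of the helper by itself
print(get_completion("Say hello in five words or fewer."))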

Chaining chat completions

We can then create content creation functions, each of which uses the chat completion function, get_completion:

# app.py (continued from last snippet)

def summarize(original_prompt, original_response):
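  # build a follow-up prompt that embeds the first prompt and response, then send it to the LLM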
  
  chained_prompt = f'''
  
    <context>
      Here is the original prompt:
      {original_prompt}
      Below is the original response: 
      {original_response}
    </context>
    
    <request>
      You are a summarizing agent for busy people. Write a summary of the context with 3 bullet points (no more than 25 words each), each giving a direct response to the prompt.
    </request>
    
    <guidelines>
    - Begin the summary with a line that summarizes the prompt
    - Do not put any more information in each bullet point than what the prompt asks for, e.g. if the prompt asks for hotels, just list the hotel's name in each bullet point!
    </guidelines>
    
  '''
  return get_completion(chained_prompt)

def prompt_chain(prompt):
  original_response = get_completion(prompt)
  summary = summarize(prompt, original_response)
  return summary

Let's look at the prompt_chain function first:

  • this outlines, at a higher level, what happens:
    • the user can make any prompt they want
    • the LLM will respond to that prompt via a call to get_completion
    • then, we grab that result and feed it into a summarize function

Now, let's look at the summarize function:

  • we create a chained_prompt that
    • takes in the original prompt and original response
    • uses the original prompt and response in those XML-like tags to create a new prompt
      • the <context> + <request> + <guidelines> tags let us be clearer about what we want from the LLM
    • calls get_completion by feeding in the new prompt to get a new response

Notice how we can keep chaining content creation functions together, each one calling get_completion, for as many steps as we wish! For example, we could take the value of summary and feed it into yet another prompt, as sketched below - the chain never has to end!
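
For instance, here is one hypothetical extra link (a sketch only: the translate function and its prompt wording are made up for illustration, not part of the original app.py) that feeds summary into another prompt:

# hypothetical extra link in the chain (not in the original app.py)

def translate(summary, language):

  chained_prompt = f'''

    <context>
      Here is a summary:
      {summary}
    </context>

    <request>
      Translate the summary into {language}, keeping the bullet-point format.
    </request>

  '''
  return get_completion(chained_prompt)

# e.g. translated = translate(summary, "Spanish")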

Runtime

Anyway, with our finite chain, let's create the runtime sequence:

# app.py (continued from last snippet)

prompt = "What are some good but economical hotels to book to stay in before a cruise that begins in Auckland, New Zealand?"
print(prompt_chain(prompt))

Then, we run it from the command line:

% python3 app.py

The output would then look something like this:

Economical hotels in Auckland for a cruise:
* Ibis Auckland Ellerslie
* Novotel Auckland Ellerslie
* Quest Auckland Serviced Apartments

...instead of the much longer response the LLM would give by default!