Prompt chaining
Prompt chaining refers to linking multiple prompts and their outputs in a leapfrog style (see the sketch after this list):
- prompt 1
- -> output 1
- -> prompt 2 (with output 1 as part of prompt 2)
- -> output 2
- -> prompt 3 (with output 2 as part of prompt 3)
- ...and so on...
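In code, the leapfrog pattern boils down to a loop in which each output is interpolated into the next prompt. This is a minimal, hypothetical sketch: call_llm and run_chain are made-up names standing in for whatever completion function and driver you end up writing (we build a concrete version in the next sections), and each template is assumed to contain a {previous} slot for the prior output:
# sketch.py (hypothetical, for illustration only)
def run_chain(call_llm, first_prompt, templates):
    # call_llm: any function that takes a prompt string and returns the LLM's reply
    output = call_llm(first_prompt)             # prompt 1 -> output 1
    for template in templates:                  # prompt 2, prompt 3, ...
        next_prompt = template.format(previous=output)
        output = call_llm(next_prompt)          # output 2, output 3, ...
    return output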
Setup
Let's connect to the OpenAI API and create a basic chat completion function:
# app.py
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

def get_completion(prompt):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
Chaining chat completions
We can then create several content creation functions, each of which will use the chat completion function, get_completion:
# app.py (continued from last snippet)
def summarize(original_prompt, original_response):
    chained_prompt = f'''
    <context>
    Here is the original prompt:
    {original_prompt}
    Below is the original response:
    {original_response}
    </context>
    <request>
    You are a summarizing agent for busy people. Write a summary of the context with 3 bullet points (no more than 25 words each), each with a direct response to the prompt.
    </request>
    <guidelines>
    - Begin the summary with a line that summarizes the prompt
    - Do not insert any more information in each bullet point than what the prompt asks for, e.g. if the prompt asks for hotels, just list the hotel's name in each bullet point!
    </guidelines>
    '''
    return get_completion(chained_prompt)

def prompt_chain(prompt):
    original_response = get_completion(prompt)
    summary = summarize(prompt, original_response)
    return summary
Let's look at the prompt_chain function first:
- this outlines, at a higher level, what happens:
  - the user can make any prompt they want
  - the LLM responds to that prompt via a call to get_completion
  - then, we grab that result and feed it into the summarize function
Now, let's look at the summarize function:
- we create a chained_prompt that
  - takes in the original prompt and original response
  - uses the original prompt and response inside XML-like tags to create a new prompt
  - the tags <context> + <request> + <guidelines> allow us to be clearer about what we want from the LLM
- we then call get_completion, feeding in the new prompt to get a new response
Notice how we can keep chaining content creation functions together, calling get_completion in each one, as many times as we wish! For example, we could take the value of summary and feed that into yet another prompt (sketched below); the chain can go on as long as we like!
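As a sketch of how the chain could keep going, here is a hypothetical extra link, translate_summary, that takes the summary produced by prompt_chain and feeds it into one more prompt (the function names and the target language are made up purely for illustration):
# app.py (hypothetical continuation, for illustration only)
def translate_summary(summary, language="French"):
    chained_prompt = f'''
    <context>
    {summary}
    </context>
    <request>
    Translate the context into {language}, keeping the bullet-point structure.
    </request>
    '''
    return get_completion(chained_prompt)

def longer_prompt_chain(prompt):
    original_response = get_completion(prompt)
    summary = summarize(prompt, original_response)
    return translate_summary(summary)  # the summary feeds the next prompt in the chain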
Runtime
Anyway, with our finite chain, let's create the runtime sequence:
# app.py (continued from last snippet)
prompt = "What are some good but economical hotels to book to stay in before a cruise that begins in Auckland, New Zealand?"
print(prompt_chain(prompt))
Then, we run it from the command line:
% python3 app.py
The output would then look something like this:
Economical hotels in Auckland for a cruise:
* Ibis Auckland Ellerslie
* Novotel Auckland Ellerslie
* Quest Auckland Serviced Apartments
...instead of the much longer responses LLMs produce by default!