The OpenAI API is the most common production endpoint for AI features. Setting a system prompt correctly is the difference between a model that does what you want and one that drifts into off-topic, off-format, or off-brand responses. This tutorial walks through the message format, code examples in Python and Node.js, and best practices for cost and reliability.
OpenAI's chat completions API uses a messages array. The system prompt is the first element, with role "system":
{
  "model": "gpt-4o",
  "messages": [
    {"role": "system", "content": "You are a senior support agent for Acme. Always confirm the user's plan before discussing pricing."},
    {"role": "user", "content": "Can I get a refund for last month?"}
  ]
}
The system message can appear in any position in the array, but convention (and best practice) is to put it first. Put any few-shot examples after the system message and before the user's actual message.
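For example, a messages array with one few-shot pair might look like this (the example texts below are illustrative placeholders, not part of the API):

```python
system_prompt = "You are a senior support agent for Acme. Always confirm the user's plan before discussing pricing."

messages = [
    {"role": "system", "content": system_prompt},
    # Few-shot example: a sample user turn paired with the ideal assistant reply
    {"role": "user", "content": "Do you support SSO?"},
    {"role": "assistant", "content": "Happy to check! Which plan are you on?"},
    # The real user message always comes last
    {"role": "user", "content": "Can I get a refund for last month?"},
]
```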
Using the official openai Python SDK:
from openai import OpenAI

client = OpenAI(api_key="sk-...")

system_prompt = """You are a senior customer support agent for Acme SaaS.
Always:
- Confirm the user's plan tier before discussing features or pricing
- Use a warm, helpful tone
- End each response with a yes/no question to keep the conversation moving
Never:
- Promise refunds — escalate to a human
- Mention competitor product names
- Invent feature roadmap items"""

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "I want a refund for last month."}
    ]
)
print(response.choices[0].message.content)
That's the full pattern. Most production code follows this structure, with the system prompt loaded from a constant or file rather than inlined in the function.
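A minimal sketch of the file-loading variant, assuming a hypothetical prompts/support_agent.txt in your project (the path and function name are illustrative):

```python
from pathlib import Path

# Hypothetical location; adjust to your project layout
PROMPT_PATH = Path("prompts/support_agent.txt")

def load_system_prompt(path: Path = PROMPT_PATH) -> str:
    """Read the system prompt from disk once, stripping trailing whitespace."""
    return path.read_text(encoding="utf-8").strip()
```

Keeping the prompt in a file lets you review prompt changes in version control like any other code change.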
Using the official openai Node.js SDK:
import OpenAI from "openai";

const client = new OpenAI({ apiKey: "sk-..." });

const systemPrompt = `You are a senior customer support agent for Acme SaaS.
Always:
- Confirm the user's plan tier before discussing features or pricing
- Use a warm, helpful tone
Never:
- Promise refunds — escalate to a human`;

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: systemPrompt },
    { role: "user", content: "I want a refund for last month." }
  ]
});
console.log(response.choices[0].message.content);
For quick testing without an SDK:
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ]
  }'
For ongoing conversations, you append each new user message and each new assistant response to the messages array. The system prompt stays at index 0 throughout:
messages = [
    {"role": "system", "content": system_prompt},  # always first
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi! How can I help?"},
    {"role": "user", "content": "I need a refund"},
    {"role": "assistant", "content": "I understand. Which plan are you on?"},
    {"role": "user", "content": "Pro plan, billed monthly"},
    # ... new request goes here
]
Every time you call the API, you send the full array, so the system prompt counts toward your input token cost on every single call.
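The append pattern can be wrapped in a small helper. A sketch, where `complete` is an injected callable (for testability) that takes the full messages list and returns the assistant's text, in production a thin wrapper around client.chat.completions.create:

```python
def send_turn(messages: list[dict], user_text: str, complete) -> str:
    """Append the user's message, get a reply via `complete`, and record it.

    The system prompt stays at messages[0]; each turn only appends
    a user message and the assistant's reply to the end.
    """
    messages.append({"role": "user", "content": user_text})
    reply = complete(messages)
    messages.append({"role": "assistant", "content": reply})
    return reply
```

In production you would pass something like `lambda msgs: client.chat.completions.create(model="gpt-4o", messages=msgs).choices[0].message.content` as `complete`.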
OpenAI supports automatic prompt caching for prefixes longer than 1024 tokens. When you reuse the same system prompt across many requests, OpenAI caches it and discounts subsequent reads to 50% of normal input price.
Two implementation tips to maximize cache hits: keep the system prompt byte-for-byte identical across requests (don't interpolate timestamps, user names, or other per-request values into it), and put static content at the start of the prompt with anything variable at the end, since the cache matches on the exact leading prefix of the request.
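One way to keep the cached prefix stable is to route per-request details through the user message instead of the system prompt. A sketch (the plan_tier parameter and bracket convention are illustrative):

```python
SYSTEM_PROMPT = """You are a senior customer support agent for Acme SaaS.
Always confirm the user's plan tier before discussing features or pricing."""

def build_messages(user_text: str, plan_tier: str) -> list[dict]:
    """Build a request whose system prompt never varies between calls,
    so its prefix stays cacheable; per-request context rides along
    in the user message instead."""
    context = f"[Customer plan: {plan_tier}]\n{user_text}"
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": context},
    ]
```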
Before you deploy, count how many tokens your system prompt uses. The free token counter shows the count for any text you paste. For a typical chatbot, expect 200-500 tokens for the system prompt. Coding assistants often run 500-1000. Complex agents can hit 2000-5000.
Before shipping a system prompt to production, run it against a set of representative test conversations: on-topic requests, edge cases, and attempts to pull the model off-policy. If any test fails, refine the prompt and re-run all tests. Iterate until clean.
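Parts of that checking can be automated with crude heuristics. A sketch of a rule checker for the Acme prompt above; the specific keyword checks are illustrative and would need tuning for a real prompt:

```python
def check_response(reply: str) -> list[str]:
    """Return a list of rule violations found in one assistant reply."""
    violations = []
    # Rule: end each response with a yes/no question
    if not reply.rstrip().endswith("?"):
        violations.append("missing closing question")
    # Rule: never promise refunds, escalate instead (keyword heuristic)
    if "refund" in reply.lower() and "escalate" not in reply.lower():
        violations.append("discusses refunds without escalating")
    return violations
```

Keyword checks catch gross regressions cheaply; for subtler policy drift, teams often add an LLM-as-judge pass on top.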
Generate a tested system prompt in 2 minutes.
Open System Prompt Generator →