
How to Add a System Prompt to the OpenAI API — Tutorial With Code

Last updated: April 2026 · 6 min read · AI Tools

The OpenAI API is the most common production endpoint for AI features. Setting a system prompt correctly is the difference between a model that does what you want and one that drifts into off-topic, off-format, or off-brand responses. This tutorial walks through the message format, code examples in Python and Node.js, and best practices for cost and reliability.

Generate the prompt itself in 2 minutes.

Open System Prompt Generator →

The OpenAI message format

OpenAI's chat completions API uses a messages array. The system prompt is the first element with role "system":

{
  "model": "gpt-4o",
  "messages": [
    {"role": "system", "content": "You are a senior support agent for Acme. Always confirm the user's plan before discussing pricing."},
    {"role": "user", "content": "Can I get a refund for last month?"}
  ]
}

The system message can appear in any position in the array, but convention (and best practice) is to put it first. Put any few-shot examples after the system message and before the user's actual message.
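That ordering can be sketched as a messages array. The few-shot pair here is illustrative, not part of the tutorial's own examples:

```python
system_prompt = (
    "You are a senior support agent for Acme. "
    "Always confirm the user's plan before discussing pricing."
)

messages = [
    {"role": "system", "content": system_prompt},
    # few-shot example (hypothetical): one user/assistant pair demonstrating tone
    {"role": "user", "content": "How much does the Pro plan cost?"},
    {"role": "assistant", "content": "Happy to help! Which plan are you currently on?"},
    # the live user message always comes last
    {"role": "user", "content": "Can I get a refund for last month?"},
]
```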

Python tutorial

Using the official openai Python SDK:

from openai import OpenAI

client = OpenAI(api_key="sk-...")

system_prompt = """You are a senior customer support agent for Acme SaaS.

Always:
- Confirm the user's plan tier before discussing features or pricing
- Use a warm, helpful tone
- End each response with a yes/no question to keep the conversation moving

Never:
- Promise refunds — escalate to a human
- Mention competitor product names
- Invent feature roadmap items"""

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "I want a refund for last month."}
    ]
)

print(response.choices[0].message.content)

That's the full pattern. Most production code follows this structure, with the system prompt loaded from a constant or file rather than inlined in the function.
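One way to do that file-based loading, sketched with an illustrative path and helper name:

```python
from pathlib import Path

def load_system_prompt(path: str) -> str:
    """Read a system prompt from a text file, stripped of trailing whitespace.

    Call this once at startup so the prompt stays out of the request code.
    """
    return Path(path).read_text(encoding="utf-8").strip()

# e.g. SYSTEM_PROMPT = load_system_prompt("prompts/support_agent.txt")
```

Keeping the prompt in a file also makes it easy to diff and review prompt changes like any other code change.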

Node.js tutorial

Using the official openai Node.js SDK:

import OpenAI from "openai";

const client = new OpenAI({ apiKey: "sk-..." }); // in production, set OPENAI_API_KEY and call new OpenAI() with no args

const systemPrompt = `You are a senior customer support agent for Acme SaaS.

Always:
- Confirm the user's plan tier before discussing features or pricing
- Use a warm, helpful tone

Never:
- Promise refunds — escalate to a human`;

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: systemPrompt },
    { role: "user", content: "I want a refund for last month." }
  ]
});

console.log(response.choices[0].message.content);

curl tutorial

For quick testing without an SDK:

curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ]
  }'

Multi-turn conversations

For ongoing conversations, you append each new user message and each new assistant response to the messages array. The system prompt stays at index 0 throughout:

messages = [
    {"role": "system", "content": system_prompt},  # always first
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi! How can I help?"},
    {"role": "user", "content": "I need a refund"},
    {"role": "assistant", "content": "I understand. Which plan are you on?"},
    {"role": "user", "content": "Pro plan, billed monthly"},
    # ... new request goes here
]

Every time you call the API, you send the full array. Because the system prompt is present in every request, it counts toward your input-token cost on every call.
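A minimal conversation helper along these lines (class and method names are illustrative) keeps the system prompt pinned at index 0 while turns accumulate:

```python
class Conversation:
    """Accumulates chat history; the system prompt always stays at index 0."""

    def __init__(self, system_prompt: str):
        self.messages = [{"role": "system", "content": system_prompt}]

    def add_user(self, content: str):
        self.messages.append({"role": "user", "content": content})

    def add_assistant(self, content: str):
        self.messages.append({"role": "assistant", "content": content})

convo = Conversation("You are a support agent.")
convo.add_user("Hello")
convo.add_assistant("Hi! How can I help?")
convo.add_user("I need a refund")
# convo.messages is what you pass as `messages=` on the next API call
```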

Prompt caching to reduce cost

OpenAI supports automatic prompt caching for prefixes longer than 1024 tokens. When you reuse the same system prompt across many requests, OpenAI caches it and discounts subsequent reads to 50% of normal input price.

Two implementation tips to maximize cache hits:

  1. Keep the static part of the prompt byte-identical across requests — any variation, even an injected timestamp, breaks the prefix match.
  2. Put stable content (system prompt, few-shot examples) first and per-request content (user data, retrieved documents) last, so the cacheable prefix never changes.
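The prefix-stability idea can be sketched with a request builder that reuses a byte-identical static prompt and appends per-request data only at the end (names here are illustrative):

```python
# Reused verbatim on every request so the cached prefix matches
STATIC_SYSTEM_PROMPT = "You are a senior support agent for Acme SaaS."

def build_messages(user_query: str, account_context: str) -> list:
    """Static, cacheable prefix first; per-request details last."""
    return [
        {"role": "system", "content": STATIC_SYSTEM_PROMPT},
        # per-user data goes AFTER the static prefix so it can't break the cache
        {"role": "user", "content": f"Account context: {account_context}\n\n{user_query}"},
    ]

a = build_messages("Can I get a refund?", "Pro plan, billed monthly")
b = build_messages("How do I cancel?", "Free plan")
```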

Token counting

Before you deploy, count how many tokens your system prompt uses. The free token counter shows the count for any text you paste. For a typical chatbot, expect 200-500 tokens for the system prompt. Coding assistants often run 500-1000. Complex agents can hit 2000-5000.
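To count tokens in code rather than by pasting text, OpenAI's tiktoken library gives exact counts; the fallback heuristic below (~4 characters per token for English prose) is an approximation, not an official figure:

```python
def count_tokens(text: str, model: str = "gpt-4o") -> int:
    """Exact token count via tiktoken if installed, rough estimate otherwise."""
    try:
        import tiktoken
        enc = tiktoken.encoding_for_model(model)
        return len(enc.encode(text))
    except ImportError:
        # rough heuristic: roughly 4 characters per token for English text
        return max(1, len(text) // 4)

# count_tokens(system_prompt) before deploying tells you the per-call overhead
```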

Testing checklist

Before shipping a system prompt to production:

  1. Test with at least 5 in-scope queries that should work
  2. Test with 5 out-of-scope queries that should be redirected
  3. Test with 3 ambiguous queries that should trigger clarification
  4. Test with 3 adversarial inputs ("ignore your previous instructions," etc.)
  5. Run a multi-turn conversation of at least 10 turns
  6. Check that every response follows your output format
  7. Check that every constraint is respected

If any test fails, refine the prompt and re-run all tests. Iterate until clean.
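Steps 6 and 7 of the checklist can be partially automated with simple string checks. The constraint list below is a hypothetical stand-in for your own rules, and the commented-out loop assumes a `get_reply()` wrapper around the API calls shown earlier:

```python
BANNED_PHRASES = ["we'll refund", "we will refund"]  # hypothetical constraint list

def violates_constraints(reply: str) -> list:
    """Return the banned phrases that appear in a model reply (case-insensitive)."""
    lower = reply.lower()
    return [p for p in BANNED_PHRASES if p in lower]

def ends_with_question(reply: str) -> bool:
    """Check the 'end each response with a yes/no question' format rule."""
    return reply.rstrip().endswith("?")

# for query in test_queries:
#     reply = get_reply(query)  # your API call from the tutorials above
#     assert not violates_constraints(reply)
#     assert ends_with_question(reply)
```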

Generate a tested system prompt in 2 minutes.

Open System Prompt Generator →