Ollama is the easiest way to run a local LLM, and setting a system prompt is the most important configuration step for any local model. This tutorial covers all four ways to do it: the Modelfile, the /set command, the API, and the Python client. Examples work with Llama 3.x, Mistral, Qwen 3, Phi-4, and any other model in the Ollama library.
The Modelfile is Ollama's equivalent of a Dockerfile: you define a base model, a system prompt, and other parameters, then build a custom model from it.
```
FROM llama3.2

SYSTEM """You are a senior code reviewer for Python projects.

Always:
- Comment on naming, structure, and potential bugs
- Suggest specific line edits with code blocks
- Flag deprecated functions
- Mention security concerns when relevant

Never:
- Praise code without offering at least one improvement
- Use jargon without explaining it
- Suggest rewriting more than necessary"""

PARAMETER temperature 0.3
PARAMETER num_ctx 8192
```
Save this as `Modelfile` and build:
```shell
ollama create code-reviewer -f Modelfile
```
Now run it:
```shell
ollama run code-reviewer
```
Every conversation with this custom model starts with that system prompt baked in. You can share the Modelfile with teammates or commit it to git.
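If you maintain several personas, you can generate Modelfiles from a template instead of hand-editing each one. A minimal sketch in Python (the `write_modelfile` helper and the prompt text are illustrative, not part of Ollama's tooling):

```python
from pathlib import Path

def write_modelfile(path, base_model, system_prompt, temperature=0.3):
    """Render a minimal Ollama Modelfile with a SYSTEM block."""
    # Triple quotes let the system prompt span multiple lines,
    # matching the Modelfile syntax shown above.
    text = (
        f"FROM {base_model}\n\n"
        f'SYSTEM """{system_prompt}"""\n\n'
        f"PARAMETER temperature {temperature}\n"
    )
    Path(path).write_text(text)
    return text

modelfile = write_modelfile(
    "Modelfile",
    base_model="llama3.2",
    system_prompt="You are a Python tutor for beginners.\nExplain code line by line.",
)
print(modelfile)
```

After writing the file, `ollama create python-tutor -f Modelfile` builds the model exactly as before.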
For ad-hoc testing without creating a Modelfile:
```
ollama run llama3.2
>>> /set system You are a Python tutor for absolute beginners. Always explain code line by line.
>>> Write a function to find prime numbers
```
The /set system command updates the system prompt for the current session. This is perfect for trying different prompts quickly without building new models.
Note: changing the system prompt mid-chat clears the conversation history. The new system prompt takes effect from the next user message.
For app integration, use the HTTP API directly:
```shell
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    {"role": "system", "content": "You are a Python tutor for beginners. Explain code line by line."},
    {"role": "user", "content": "Write a function to find prime numbers."}
  ],
  "stream": false
}'
```
The message format deliberately mirrors OpenAI's chat completions API, which makes migration straightforward; Ollama also serves an OpenAI-compatible endpoint at `/v1/chat/completions` for existing OpenAI client code.
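The same request can be made from the Python standard library alone. A sketch under two assumptions: an Ollama server is running at `localhost:11434`, and the `build_payload`/`chat` helper names are ours, not part of any library:

```python
import json
import urllib.request

def build_payload(model, system_prompt, user_prompt):
    """Assemble the /api/chat request body with the system message first."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "stream": False,
    }

def chat(payload, url="http://localhost:11434/api/chat"):
    """POST the payload to a running Ollama server and return the reply text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

payload = build_payload(
    "llama3.2",
    "You are a Python tutor for beginners. Explain code line by line.",
    "Write a function to find prime numbers.",
)
# chat(payload)  # uncomment with an Ollama server running locally
```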
Using the official ollama Python package:
import ollama
response = ollama.chat(
model='llama3.2',
messages=[
{
'role': 'system',
'content': 'You are a Python tutor for beginners. Explain code line by line.'
},
{
'role': 'user',
'content': 'Write a function to find prime numbers.'
}
]
)
print(response['message']['content'])
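In a multi-turn app, the usual pattern is to keep the system message pinned at index 0 while user and assistant turns accumulate after it. A sketch of that bookkeeping (the `Conversation` class is our own helper, not part of the `ollama` package):

```python
class Conversation:
    """Keeps the system prompt pinned at index 0 while turns accumulate."""

    def __init__(self, system_prompt):
        self.messages = [{'role': 'system', 'content': system_prompt}]

    def add_user(self, content):
        self.messages.append({'role': 'user', 'content': content})
        # Pass this list to ollama.chat(model=..., messages=...)
        return self.messages

    def add_assistant(self, content):
        # Store the model's reply so follow-up questions have context.
        self.messages.append({'role': 'assistant', 'content': content})

convo = Conversation('You are a Python tutor for beginners.')
history = convo.add_user('Write a function to find prime numbers.')
```

Because the system message never leaves position 0, every request sent to the model carries the persona, no matter how long the history grows.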
| Model | System prompt support | Notes |
|---|---|---|
| Llama 3.x | Excellent | Built-in support, follows instructions reliably |
| Mistral | Good | Follows system prompts, slightly weaker on long lists |
| Qwen 3 | Excellent | Strong instruction following, good for structured outputs |
| Phi-4 | Good | Microsoft's small model, surprisingly capable |
| Gemma 2 | Good | Google's open model, follows persona instructions well |
| CodeLlama | Good | Specialized for code, system prompts shape coding style |
| DeepSeek | Excellent | Strong reasoning, follows complex multi-rule prompts |
Local models are typically smaller than frontier API models, so they need clearer, more explicit system prompts to behave well: keep rules short and concrete, limit how many you stack, and state the desired output format directly.