LLM Prompt Formatting: OpenAI, Claude, Gemini, and Llama API Formats Explained

By Rui Barreira · Last updated: 13 June 2026

Each major LLM provider expects prompts in a different structure. OpenAI puts the system prompt inside a messages array. Claude separates it as a top-level parameter. Gemini puts conversation turns in a contents array and the system prompt in a separate system_instruction object. Llama uses special delimiter tokens instead of JSON. Use brevio Prompt Formatter to convert your system and user messages into the correct format for each provider, with copy-paste ready output in one click.

Why Prompt Format Matters

LLM APIs are not interchangeable. Even if your prompt text is identical, an OpenAI-formatted request that includes a system-role message will be rejected by the Claude API with a validation error. Each provider designed its API structure independently, and the differences reflect different choices about how the system prompt relates to the conversation.

Getting the format right is the first hurdle when switching providers or building multi-provider applications. The Prompt Formatter eliminates the need to memorise each schema — enter your text once, switch between tabs.

OpenAI Format

OpenAI's Chat Completions API uses a flat messages array where the system prompt is the first element with role: "system". Subsequent turns alternate between user and assistant roles.

{
  "model": "gpt-4o",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Explain quantum computing." }
  ]
}

By convention the system message comes first, although the API accepts it anywhere in the array. Multiple system messages are allowed but unusual. On o1 and newer reasoning models, OpenAI replaces the system role with a developer role that serves the same purpose.
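As a rough illustration of the flat messages array, the Python sketch below assembles a request body with the system message first and the remaining turns after it. The function name build_openai_payload and the (role, text) tuple shape are hypothetical choices made for this example, not part of any SDK. It only builds the dictionary and does not call the API.

```python
import json


def build_openai_payload(system_prompt, turns, model="gpt-4o"):
    """Return a Chat Completions-style request body as a plain dict.

    `turns` is a list of (role, text) pairs using "user" or "assistant".
    Hypothetical helper for illustration only.
    """
    # Convention: the system message is the first element of the array.
    messages = [{"role": "system", "content": system_prompt}]
    for role, text in turns:
        if role not in ("user", "assistant"):
            raise ValueError(f"unexpected role: {role!r}")
        messages.append({"role": role, "content": text})
    return {"model": model, "messages": messages}


payload = build_openai_payload(
    "You are a helpful assistant.",
    [("user", "Explain quantum computing.")],
)
print(json.dumps(payload, indent=2))
```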

Claude (Anthropic) Format

Anthropic's Messages API separates the system prompt from the conversation. The system field is a top-level parameter, usually a plain string (an array of text content blocks is also accepted), while messages contains only user and assistant turns.

{
  "model": "claude-opus-4-5",
  "max_tokens": 1024,
  "system": "You are a helpful assistant.",
  "messages": [
    { "role": "user", "content": "Explain quantum computing." }
  ]
}

The separation reflects Claude's design: the system prompt is a persistent instruction set, conceptually different from the conversation. This structure also makes multi-turn conversations cleaner — the system prompt does not appear as a message in the conversation history.
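To make the multi-turn point concrete, here is a small Python sketch in which the conversation history grows while the system prompt stays in its own top-level field. The helper names build_claude_payload and append_turn are hypothetical and exist only for this example. The sketch builds request bodies as dictionaries and sends nothing.

```python
import json


def build_claude_payload(system_prompt, history,
                         model="claude-opus-4-5", max_tokens=1024):
    """Return a Messages API-style body; `history` holds user/assistant turns only."""
    for turn in history:
        if turn["role"] not in ("user", "assistant"):
            raise ValueError(f"role not allowed in messages: {turn['role']!r}")
    return {
        "model": model,
        "max_tokens": max_tokens,
        "system": system_prompt,      # lives outside the conversation
        "messages": list(history),    # copied so the caller's list is untouched
    }


def append_turn(payload, role, text):
    """Return a new body with one more turn; the system field is carried over as-is."""
    new_messages = payload["messages"] + [{"role": role, "content": text}]
    return {**payload, "messages": new_messages}


first = build_claude_payload(
    "You are a helpful assistant.",
    [{"role": "user", "content": "Explain quantum computing."}],
)
# Placeholder reply text, then a follow-up question from the user.
second = append_turn(first, "assistant", "(model reply goes here)")
third = append_turn(second, "user", "Now explain it to a ten-year-old.")
print(json.dumps(third, indent=2))
```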

Gemini Format

Google's Gemini API uses system_instruction as a top-level object with a parts array, and wraps conversation turns in a contents array where each turn has a role and parts. The role values are user and model; Gemini does not use the name assistant for model turns.

{
  "system_instruction": {
    "parts": [{ "text": "You are a helpful assistant." }]
  },
  "contents": [
    {
      "role": "user",
      "parts": [{ "text": "Explain quantum computing." }]
    }
  ]
}

The parts array in both system_instruction and contents exists to support multimodal content — a single turn can contain multiple parts (text, image, audio). For text-only prompts, each parts array has a single object.
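The following Python sketch shows one way the parts wrapping could be produced for text-only prompts. The helper names text_parts and build_gemini_payload are hypothetical, and the sketch assumes every part is text. Image and audio parts use other keys that are not covered here.

```python
import json


def text_parts(*texts):
    """Wrap each string in the {"text": ...} object that a parts array expects."""
    return [{"text": t} for t in texts]


def build_gemini_payload(system_prompt, turns):
    """Return a Gemini-style body.

    `turns` is a list of (role, [text, ...]) pairs. A turn may carry several
    text parts; a typical text-only turn carries exactly one.
    """
    return {
        "system_instruction": {"parts": text_parts(system_prompt)},
        "contents": [
            {"role": role, "parts": text_parts(*texts)}
            for role, texts in turns
        ],
    }


payload = build_gemini_payload(
    "You are a helpful assistant.",
    [("user", ["Explain quantum computing."])],
)
print(json.dumps(payload, indent=2))
```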

Llama Chat Template

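Llama models use a token-based chat template rather than a JSON API schema. The template shown here is the one used by the Llama 3 family of instruction-tuned models; Llama 2 used a different, [INST]-based format. The tokens are part of the model's vocabulary and must be reproduced exactly, because they are the structural delimiters the model was trained to recognise.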

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

Explain quantum computing.<|eot_id|><|start_header_id|>assistant<|end_header_id|>

<|begin_of_text|> starts every sequence. <|start_header_id|>role<|end_header_id|> opens a role block followed by two newlines before the content. <|eot_id|> (End of Turn) closes each block. The final assistant header with no closing <|eot_id|> tells the model to continue generating.
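The rules above can be sketched as a short Python function that joins role blocks into a single prompt string. The name render_llama3_prompt and the message dictionary shape are hypothetical choices for this example. In practice a tokenizer's built-in chat template would normally do this job, so treat the code as an illustration of the layout rather than a replacement for it.

```python
BEGIN_OF_TEXT = "<|begin_of_text|>"
END_OF_TURN = "<|eot_id|>"


def role_header(role):
    """Open a role block: header tokens followed by two newlines."""
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n"


def render_llama3_prompt(messages):
    """Join {"role", "content"} dicts into one Llama 3-style prompt string."""
    pieces = [BEGIN_OF_TEXT]
    for message in messages:
        pieces.append(role_header(message["role"]))
        pieces.append(message["content"])
        pieces.append(END_OF_TURN)
    # Trailing assistant header with no <|eot_id|>: the model continues from here.
    pieces.append(role_header("assistant"))
    return "".join(pieces)


prompt = render_llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain quantum computing."},
])
print(prompt)
```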

Provider Format Comparison

Provider | System Prompt Field | Message Structure | Format
OpenAI | messages[0] with role: system | Flat messages array | JSON
Claude | Top-level system string | messages array (user/assistant only) | JSON
Gemini | system_instruction.parts array | contents array with parts | JSON
Llama | Special token block | Token-delimited role blocks | Token string
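The differences in the comparison can also be read as a mapping from one schema to another. The Python sketch below converts an OpenAI-style body into Claude-style and Gemini-style bodies. The function names openai_to_claude and openai_to_gemini are hypothetical, the sketch assumes every content value is a plain string, and joining several system messages with a blank line is a choice made for this example, not a documented rule.

```python
def _split(messages):
    """Separate system (or developer) text from user/assistant turns."""
    system_texts = [m["content"] for m in messages
                    if m["role"] in ("system", "developer")]
    turns = [m for m in messages if m["role"] in ("user", "assistant")]
    return system_texts, turns


def openai_to_claude(payload, model, max_tokens=1024):
    """Move system messages into a top-level `system` string."""
    system_texts, turns = _split(payload["messages"])
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": m["role"], "content": m["content"]} for m in turns],
    }
    if system_texts:
        body["system"] = "\n\n".join(system_texts)
    return body


def openai_to_gemini(payload):
    """Wrap text in parts objects and rename the assistant role to model."""
    system_texts, turns = _split(payload["messages"])
    role_map = {"user": "user", "assistant": "model"}
    body = {
        "contents": [
            {"role": role_map[m["role"]], "parts": [{"text": m["content"]}]}
            for m in turns
        ]
    }
    if system_texts:
        body["system_instruction"] = {
            "parts": [{"text": "\n\n".join(system_texts)}]
        }
    return body


openai_body = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing."},
    ],
}
claude_body = openai_to_claude(openai_body, model="claude-opus-4-5")
gemini_body = openai_to_gemini(openai_body)
```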

DevTools Verification

When using brevio Prompt Formatter, open DevTools and check the Network tab. No POST requests are made when you type or switch providers — all formatting runs in JavaScript in your browser. The tool is safe to use with real system prompt content.

Common Mistakes

  • Sending OpenAI format to Claude. Claude's API expects the system prompt in a top-level system field, not as a system-role message object. The API returns a 400 validation error if the messages array contains a system-role message.
  • Missing max_tokens for Claude. Claude's Messages API requires the max_tokens parameter. OpenAI makes it optional (defaulting to the model's maximum). Claude will reject requests without it.
  • Gemini parts array with plain strings. Gemini expects parts: [{"text": "..."}], not parts: ["..."]. Each part must be an object with a text key.
  • Llama template without final assistant header. If you omit the trailing <|start_header_id|>assistant<|end_header_id|>, the model has no cue that it is the assistant's turn, so it may continue the user's text or produce malformed output instead of answering.
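The mistakes listed above can be caught before a request is sent. The Python sketch below is a minimal pre-flight check written for this guide. The function names check_claude, check_gemini, and check_llama_prompt are hypothetical, and the checks cover only the four mistakes in the list, not full schema validation.

```python
ASSISTANT_HEADER = "<|start_header_id|>assistant<|end_header_id|>"


def check_claude(body):
    """Flag a missing max_tokens and system-role messages in a Claude-style body."""
    problems = []
    if "max_tokens" not in body:
        problems.append("missing max_tokens")
    if any(m.get("role") == "system" for m in body.get("messages", [])):
        problems.append("system-role message found; use the top-level system field")
    return problems


def check_gemini(body):
    """Flag parts that are plain values instead of objects such as {"text": ...}."""
    problems = []
    blocks = list(body.get("contents", []))
    if "system_instruction" in body:
        blocks.append(body["system_instruction"])
    for block in blocks:
        for part in block.get("parts", []):
            if not isinstance(part, dict):
                problems.append(f"part is not an object: {part!r}")
    return problems


def check_llama_prompt(prompt):
    """Flag a prompt that does not end with the open assistant header."""
    if prompt.rstrip().endswith(ASSISTANT_HEADER):
        return []
    return ["prompt does not end with the assistant header"]


# An OpenAI-shaped body sent to Claude trips both Claude checks.
print(check_claude({
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing."},
    ],
}))
```

Each function returns a list of problem descriptions, so an empty list means none of these particular mistakes were found.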

Frequently Asked Questions

Why does OpenAI use a messages array but Claude uses a separate system field?
OpenAI's Chat Completions API puts the system prompt inside the messages array as a system role object. Anthropic's Messages API separates the system parameter from the messages array because the system prompt is treated as a configuration parameter rather than a conversational turn, which simplifies multi-turn conversation handling.
What are Llama chat template tokens?
Llama models use special tokens to delimit conversation roles: <|begin_of_text|> starts the sequence, <|start_header_id|>role<|end_header_id|> opens a role block, and <|eot_id|> closes it. These tokens are baked into the model's training and must be reproduced exactly — misformatted templates cause degraded output quality.
Can I use the same system prompt across all providers?
Yes, the text content of your system prompt can be identical across providers. Only the structure wrapping it changes: JSON for OpenAI, Claude, and Gemini, and a token string for Llama. The Prompt Formatter tool handles the structural differences so you can reuse the same instructions across all four without rewriting.
What is the difference between Gemini's system_instruction and OpenAI's system role?
Gemini's system_instruction is a top-level parameter containing a parts array, similar to Claude's approach. OpenAI's system role is an object inside the messages array. Both achieve the same result — setting persistent instructions for the model — but the JSON paths differ.