docs: clarify use of <|return|> vs <|end|> in conversation history (#59)

2025-08-23 10:17:08 -04:00 · 2025-08-16 02:24:12 +03:30 · 3fb0342894
commit 3fb0342894
parent 52176bfbec
1 changed file with 2 additions and 0 deletions
--- a/docs/format.md
+++ b/docs/format.md
@@ -229,6 +229,8 @@ Once its done generating it will stop with either a `<|return|>` token indicatin

 The `final` channel will contain the answer to your user’s request. Check out the [reasoning section](#reasoning) for more details on the chain-of-thought.

+**Implementation note:** `<|return|>` is a decode-time stop token only. When you add the assistant’s generated reply to conversation history for the next turn, replace the trailing `<|return|>` with `<|end|>` so that stored messages are fully formed as `<|start|>{header}<|message|>{content}<|end|>`. Prior messages in prompts should therefore end with `<|end|>`. For supervised targets/training examples, ending with `<|return|>` is appropriate; for persisted history, normalize to `<|end|>`.
+
 ### System message format

 The system message is used to provide general information to the system. This is different to what might be considered the “system prompt” in other prompt formats. For that, check out the [developer message format](#developer-message-format).
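
The implementation note added in this hunk can be sketched as a small normalization step applied before a completion is stored in history. This is a minimal, text-level illustration only: the helper name `normalize_for_history` and the sample message are hypothetical, and a real client would more likely do the same swap on token IDs or through its rendering library.

```python
# Minimal sketch (assumption: messages are handled as rendered strings).
RETURN_TOKEN = "<|return|>"  # decode-time stop token
END_TOKEN = "<|end|>"        # terminator for stored messages


def normalize_for_history(rendered: str) -> str:
    """Replace a trailing <|return|> with <|end|>; leave other text unchanged.

    Hypothetical helper, not part of any library.
    """
    if rendered.endswith(RETURN_TOKEN):
        return rendered[: -len(RETURN_TOKEN)] + END_TOKEN
    return rendered


# Illustrative completion as it might look at the end of sampling.
completion = "<|start|>assistant<|channel|>final<|message|>Hello!<|return|>"
stored = normalize_for_history(completion)
print(stored)
```

Messages that already end with `<|end|>` pass through unchanged, so the helper is safe to apply to every assistant message before it is persisted. Supervised training targets would skip this step and keep `<|return|>`.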