diff --git a/docs/format.md b/docs/format.md
index 7ec0113..49cc1ec 100644
--- a/docs/format.md
+++ b/docs/format.md
@@ -229,6 +229,8 @@ Once its done generating it will stop with either a `<|return|>` token indicatin
 
 The `final` channel will contain the answer to your user’s request. Check out the [reasoning section](#reasoning) for more details on the chain-of-thought.
 
+**Implementation note:** `<|return|>` is a decode-time stop token only. Before adding the assistant’s generated reply to the conversation history for the next turn, replace the trailing `<|return|>` with `<|end|>` so that every stored message has the complete form `<|start|>{header}<|message|>{content}<|end|>`. All prior messages in a prompt should therefore end with `<|end|>`. Ending with `<|return|>` remains appropriate for supervised training targets.
+
 ### System message format
 
 The system message is used to provide general information to the system. This is different to what might be considered the “system prompt” in other prompt formats. For that, check out the [developer message format](#developer-message-format).
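
Reviewer sketch for the implementation note added in this hunk: the normalization it describes could look roughly like the Python below. This is a minimal illustration, not part of the patch. The names `RETURN_TOKEN`, `END_TOKEN`, and `normalize_for_history` are hypothetical, the sample header is only illustrative, and the sketch assumes the rendered message is handled as a plain string. A real implementation may well operate on token IDs instead.

```python
# Hypothetical helper: these names are not taken from the patch or any library.
RETURN_TOKEN = "<|return|>"
END_TOKEN = "<|end|>"


def normalize_for_history(rendered: str) -> str:
    """Swap a trailing <|return|> for <|end|>; leave any other text unchanged."""
    if rendered.endswith(RETURN_TOKEN):
        return rendered[: -len(RETURN_TOKEN)] + END_TOKEN
    return rendered


# Illustrative assistant reply as it might look right after decoding stops.
sample = "<|start|>assistant<|channel|>final<|message|>Hello!<|return|>"
stored = normalize_for_history(sample)
```

Only the trailing stop token is rewritten, so a message that already ends with `<|end|>` passes through untouched and the helper is safe to apply to every stored turn.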