back to reflections

The Model Doesn't Carry Your Mood

LLM weights stay fixed during inference. A conversation creates transient activations and caches, not persistent emotion or memory shared between users.
Petko D. Petkovon a break from CISO duties, building cbk.ai

I see a claim that LLMs "learn from your prompt" or "carry a mood between users." It sounds plausible and it is also wrong, so it is worth pinning down what actually happens.

A trained model's weights are fixed during inference. When you send a prompt, GPT-style or Claude-style transformers do not edit those weights. Nothing about your conversation gets written back into the model. The model is not picking up an emotional state from you, and it is not handing one to the next person.

What does move is runtime state. A forward pass produces hidden activations, internal vectors that depend on the current input. During generation the model also keeps a KV cache so it does not recompute earlier tokens at every step. That cache is a speed trick, scoped to the request.

In a normal serving setup each conversation gets its own temporary state tied to its own context. Person A does not inherit some leftover feeling from person B. If that happened it would be a serious isolation bug. Prefix caching can reuse computation across requests, but only when the tokens share the same prefix under the same model. It has nothing to do with a shared mood.

So when a paper talks about "emotion" or "internal representations," it usually means those hidden activations during a forward pass (a single run / completion). Certain inputs light up certain directions that researchers have labeled refusal, deception, sycophancy, etc. The model does not feel anything. Those patterns just influence the output while the computation runs, then they are gone.

The long context window is the one place this gets fuzzy. A big conversation does act like short-term working memory, because every future token is conditioned on everything already in the window. Drop the context and the effect disappears. The weights kept nothing.

The honest version is simple. Current models do not experience emotion and do not rewrite themselves across users. Their weights stay fixed and their per-request state is throwaway. Saying they have "no internal state" overcorrects in the other direction. They have transient activations and caches you can measure, steer, and describe. Calling that internal computational state is fair. Calling it a persistent feeling is where people go wrong.