Context poisoning

Learn how inaccurate or irrelevant data can contaminate Bob's context, causing errors and task failures, and how to recover by starting a new session.

Context poisoning occurs when inaccurate or irrelevant data contaminates the language model's active context. This causes Bob to draw incorrect conclusions, provide erroneous information to tools, and progressively deviate from the intended task with each interaction.

Once a chat session's context is compromised, the only reliable solution is to start a new session. Starting fresh with a clean context is crucial for maintaining accuracy and effectiveness.

Symptoms of context poisoning

You can identify context poisoning by observing these behaviors:

  • Degraded output quality: Suggestions become nonsensical, repetitive, or irrelevant.
  • Tool misalignment: Tool calls no longer correspond to your requests.
  • Orchestration failures: Orchestrator mode chains may stall, loop indefinitely, or fail to complete.
  • Temporary fixes: Reapplying a clean prompt or instructions offers only brief relief before issues resurface.
  • Tool usage confusion: Bob struggles to correctly use or recall how to use tools defined in the system prompt.

Common causes

Context poisoning can be triggered by several factors:

  • Model hallucination: Bob generates incorrect information and subsequently treats it as factual context.
  • Code comments: Outdated, incorrect, or ambiguous comments in the codebase can be misinterpreted, leading Bob down the wrong path.
  • Contaminated user input: Copy-pasting logs or text containing hidden or rogue control characters.
  • Context window overflow: As a session grows, older useful information may be pushed out of the context window, allowing poisoned data to have greater relative impact.

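The "contaminated user input" cause above can be mitigated before pasting. As an illustration (not an exhaustive list of problem characters), the following Python sketch strips common hidden control and zero-width characters from copied text:

```python
import re

# Illustrative set of characters that can hide in pasted logs:
# C0/C1 control codes (except tab and newline), zero-width and
# bidirectional-override characters, and the byte-order mark.
HIDDEN = re.compile(
    r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f-\x9f"
    r"\u200b-\u200f\u202a-\u202e\u2060\ufeff]"
)

def sanitize(text: str) -> str:
    """Remove hidden control characters before pasting text into a chat."""
    return HIDDEN.sub("", text)

# A zero-width space hidden inside a path is silently removed:
print(sanitize("rm -rf /tmp\u200b/cache"))  # rm -rf /tmp/cache
```

The character ranges shown are a starting point; extend them to match whatever sources you paste from.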
Once bad data enters the context, it tends to persist. Bob re-evaluates the tainted information in each subsequent reasoning cycle, so the flaw colors every response until the context is completely reset.
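The context-window-overflow mechanism can be pictured with a toy model: treat the context as a fixed-size queue of messages, where appending new messages silently evicts the oldest. Early corrections can fall out of the window while later poisoned content remains:

```python
from collections import deque

# Toy model of a fixed-size context window (maxlen is illustrative).
window = deque(maxlen=4)
for msg in ["system prompt", "correction", "poisoned log", "reply", "follow-up"]:
    window.append(msg)

# The earliest message has been evicted; the poisoned log survives.
print(list(window))  # ['correction', 'poisoned log', 'reply', 'follow-up']
```

This is a simplification of how real context truncation works, but it shows why poisoned data can gain relative influence as a session grows.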

Can a "wake-up prompt" resolve context poisoning?

Short answer: No.

A corrective prompt might temporarily suppress symptoms, but the problematic data remains in the conversational buffer. The model will likely revert to the poisoned state as soon as the interaction deviates from the narrow scope of the corrective prompt.

Detailed explanation:

  • Reinjecting the full set of tool definitions or core directives can sometimes mask the damage for one or more interactions following the initial context poisoning.
  • However, the underlying poisoned context remains. Any query or task outside the immediate patch will likely retrigger the original issue.
  • This approach is unreliable, similar to placing a warning label on a leaking pipe instead of repairing it.

Effective recovery strategies

To reliably recover from context poisoning:

  • Start a new session: The most dependable solution is to start a new chat session. This clears the contaminated context entirely.
  • Be selective with data: When pasting logs or other data, only include the essential information Bob requires.
  • Break down complex tasks: For large or complex tasks, use several smaller, focused chat sessions instead of one long one. This limits how much stale or irrelevant information can accumulate in any single context window.
  • Validate tool output: If a tool returns nonsensical or clearly incorrect data, delete that message from the chat history before Bob can process it and incorporate it into context.
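The "be selective with data" strategy above can be applied mechanically before pasting large logs. A minimal sketch, assuming your logs use conventional severity markers (the keyword list is illustrative):

```python
def essential_lines(log: str, keywords=("ERROR", "WARN", "Traceback")) -> str:
    """Keep only log lines likely to matter, dropping routine noise."""
    kept = [line for line in log.splitlines()
            if any(keyword in line for keyword in keywords)]
    return "\n".join(kept)

raw = "INFO starting up\nERROR connection refused\nDEBUG retrying\nWARN slow response"
print(essential_lines(raw))  # ERROR connection refused
                             # WARN slow response
```

Adjust the keywords to your logging format; the point is to hand Bob the signal, not the noise.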

Addressing a common question: The "magic bullet" prompt

A frequent question from the community is:

"Have you found a prompt that wakes it back up? Maybe a prompt that just has the tools instructions we can push back in manually?"

No single prompt offers a lasting fix. Any immediate improvement is superficial because the corrupted text persists in the session's history, ready to cause further issues. The only robust solution is to discard the compromised session and start a new one with a clean prompt.
