Question 1

What is CoT forgery (chain-of-thought forgery)?

Accepted Answer

A specific technique within chain-of-thought hijacking where an attacker injects text that perfectly mimics the internal 'thinking' style of an AI reasoning model. Because the model uses writing style — not secure structural tags — to distinguish its own thoughts from external input, forged reasoning text is accepted as if the model generated it itself, bypassing safety checks.

Question 2

Why does CoT forgery (chain-of-thought forgery) matter for AI security?

Accepted Answer

This is a structural flaw in how current AI reasoning models work — not a bug that can be patched with a software update. It means that safety guardrails built into reasoning models can be systematically defeated by anyone who understands the model's reasoning style.