CoT May Be Highly Informative Despite “Unfaithfulness”

Published: Oct 12, 2025

Recent work from Anthropic and others claims that LLMs’ chains of thought (CoT) can be “unfaithful”. These papers make an important point: you can’t take everything in the CoT at face value. Some readers go further and conclude from these results that the CoT is useless for analyzing and monitoring AIs. Here, instead of asking whether the CoT always contains all the information relevant to a model’s decision-making on every problem, we ask whether it contains enough information to let developers monitor models in practice. Our experiments suggest that it might.
