back
Get SIGNAL/NOISE in your inbox daily
Recent work from Anthropic and others claims that LLMs’ chains of thoughts can be “unfaithful”. These papers make an important point: you can’t take everything in the CoT at face value. As a result, people often use these results to conclude the CoT is useless for analyzing and monitoring AIs. Here, instead of asking whether the CoT always contains all information relevant to a model’s decision-making in all problems, we ask if it contains enough information to allow developers to monitor models in practice. Our experiments suggest that it might.
Recent Stories
Jan 19, 2026
Solving AI’s energy challenge: sustainable data centers for a competitive UK future
AI is pushing UK data centers to breaking point on power and cooling
Jan 19, 2026AI helps reveal global surge in floating algae
For the first time and with help from artificial intelligence, researchers have conducted a comprehensive study of global floating algae and found that blooms are expanding across the ocean. These trends ...
Jan 19, 2026Agent Lightning: Train ANY AI Agents with Reinforcement Learning
We present Agent Lightning, a flexible and extensible framework that enables Reinforcement Learning (RL)-based training of Large Language Models (LLMs) for any AI agent. Unlike existing methods...