Are you the asshole? Of course not!—quantifying LLMs’ sycophancy problem
Researchers and users of LLMs have long been aware that AI models have a troubling tendency to tell people what they want to hear, even if that means being less accurate. But many reports of this phenomenon amount to mere anecdotes that don’t provide much visibility into how common this sycophantic behavior is across frontier LLMs.

Two recent research papers have come at this problem a bit more rigorously, though, taking different tacks in attempting to quantify exactly how likely an LLM is to listen when a user provides factually incorrect or socially inappropriate information in a prompt. Solve…