Study cracks the code behind why AI behaves as it does

Phase diagram for the example of a 3-dimensional token embedding given a 4-word vocabulary. Credit: arXiv (2025). DOI: 10.48550/arxiv.2504.04600

AI models like ChatGPT have amazed the world with their ability to write poetry, solve equations and even pass medical exams. But they can also churn out harmful content or promote disinformation.

In a new study, George Washington University researchers have used physics to dissect and explain the attention mechanism at the core of AI systems. The research is published on the arXiv preprint server.


Researchers Neil Johnson and Frank Yingjie Huo looked into why AI repeats itself, why it sometimes makes things up and where harmful or biased content comes from, even when the input seems innocent.

The researchers found that the attention mechanism at the heart of these systems behaves like two spinning tops working together to deliver a response. AI’s responses are shaped not just by the input, but by how the input interacts with everything the AI has ever learned.
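The attention mechanism the researchers studied is, at its core, a weighted mixing of token vectors: each output blends the whole context according to learned similarity scores. A minimal sketch of standard scaled dot-product attention is below (illustrative only; the function and toy embedding are assumptions for demonstration, not the paper's spinning-top formalism):

```python
import numpy as np

def attention(Q, K, V):
    # Scaled dot-product attention: each output row is a
    # probability-weighted mix of the value vectors in V.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # query-key similarity
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: rows sum to 1
    return weights @ V, weights

# Toy setup echoing the figure: a 4-word vocabulary
# embedded in 3 dimensions, attending to itself.
rng = np.random.default_rng(0)
E = rng.normal(size=(4, 3))   # one 3-d embedding per token
out, w = attention(E, E, E)   # self-attention over the four tokens
```

Because every output row mixes all four embeddings, a response depends not just on the prompt but on how the prompt interacts with everything encoded in the model's learned vectors.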


This analysis could lead to solutions that would make AI safer, more trustworthy and resistant to manipulation.

More information:
Frank Yingjie Huo et al, Capturing AI’s Attention: Physics of Repetition, Hallucination, Bias and Beyond, arXiv (2025). DOI: 10.48550/arxiv.2504.04600

Journal information:
arXiv


Provided by
George Washington University

