Ask a large language model (LLM) like GPT-4 to smell a rain-soaked campsite, and it'll politely decline. Ask the same system to describe that scent to you, and it'll wax poetic about "an air thick with anticipation" and "a scent that is both fresh and earthy," despite having neither prior experience with rain nor a nose to help it make such observations. One possible explanation for this phenomenon is that the LLM is simply mimicking the text present in its vast training data, rather than working with any real understanding of rain or smell.
But does the lack of eyes mean that language models can't ever "understand" that a lion is "larger" than a house cat? Philosophers and scientists alike have long considered the ability to assign meaning to language a hallmark of human intelligence, and have pondered what essential ingredients enable us to do so.
Peering into this enigma, researchers from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) have uncovered intriguing results suggesting that language models may develop their own understanding of reality as a way to improve their generative abilities.
The team first developed a set of small Karel puzzles, which consisted of coming up with instructions to control a robot in a simulated environment. They then trained an LLM on the solutions, but without demonstrating how the solutions actually worked. Finally, using a machine learning technique called "probing," they looked inside the model's "thought process" as it generated new solutions.
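The paper's exact Karel dialect, puzzle generator, and training pipeline are not reproduced here, but a minimal Python sketch of the kind of grid-world interpreter such puzzles rely on helps make the setup concrete. Every name below (RobotState, execute, run, the instruction strings) is an illustrative assumption rather than the authors' code.

```python
# Minimal sketch of a Karel-style grid world, for illustration only.
# The instruction names and RobotState layout are hypothetical, not the
# paper's actual dialect or training setup.
from dataclasses import dataclass, replace

# Facing directions in clockwise order: north, east, south, west.
HEADINGS = [(0, 1), (1, 0), (0, -1), (-1, 0)]

@dataclass(frozen=True)
class RobotState:
    x: int
    y: int
    heading: int  # index into HEADINGS

def execute(state: RobotState, instruction: str) -> RobotState:
    """Apply a single instruction and return the robot's new state."""
    if instruction == "move":
        dx, dy = HEADINGS[state.heading]
        return replace(state, x=state.x + dx, y=state.y + dy)
    if instruction == "turnLeft":
        return replace(state, heading=(state.heading - 1) % 4)
    if instruction == "turnRight":
        return replace(state, heading=(state.heading + 1) % 4)
    raise ValueError(f"unknown instruction: {instruction}")

def run(program: list[str], start: RobotState) -> list[RobotState]:
    """Trace of intermediate robot states: the 'reality' the LLM never sees.
    The model is trained only on program text, yet these hidden states are
    what the probe later searches for in its activations."""
    states = [start]
    for token in program:
        states.append(execute(states[-1], token))
    return states

if __name__ == "__main__":
    print(run(["move", "turnLeft", "move"], RobotState(0, 0, heading=1)))
```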
After training on over 1 million random puzzles, they found that the model spontaneously developed its own conception of the underlying simulation, despite never being exposed to this reality during training. Such findings call into question our intuitions about what types of information are necessary for learning linguistic meaning, and whether LLMs may someday understand language at a deeper level than they do today.
"At the start of these experiments, the language model generated random instructions that didn't work. By the time we completed training, our language model generated correct instructions at a rate of 92.4 percent," says MIT electrical engineering and computer science (EECS) Ph.D. student and CSAIL affiliate Charles Jin, who is the lead author of a new paper on the work.
"This was a very exciting moment for us, because we thought that if your language model could complete a task with that level of accuracy, we might expect it to understand the meanings within the language as well. This gave us a starting point to explore whether LLMs do in fact understand text, and now we see that they're capable of much more than just blindly stitching words together."
The paper is published on the arXiv preprint server.
Inside the mind of an LLM
The probe helped Jin witness this progress firsthand. Its role was to interpret what the LLM thought the instructions meant, unveiling that the LLM developed its own internal simulation of how the robot moves in response to each instruction. As the model's ability to solve puzzles improved, these conceptions also became more accurate, indicating that the LLM was starting to understand the instructions. Before long, the model was consistently putting the pieces together correctly to form working instructions.
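Probes of this kind are typically small classifiers trained on a model's hidden activations to predict some property of the world the model was never shown. The scikit-learn sketch below illustrates that general recipe with random stand-in data; it is not the authors' probe, and the array shapes, layer choice, and labels are assumptions.

```python
# Illustrative linear probe, not the authors' implementation: given hidden
# activations captured while the LLM generates a program, try to predict
# the robot's facing direction at that point in the trace.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-ins for real data: `hidden_states` would be activations from one
# layer of the trained LLM (n_tokens x hidden_dim); `facing_labels` would
# come from running the ground-truth simulator over the same programs.
hidden_states = rng.normal(size=(2000, 256))
facing_labels = rng.integers(0, 4, size=2000)  # N/E/S/W encoded as 0-3

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, facing_labels, test_size=0.2, random_state=0
)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# With real activations, accuracy well above the 25% chance level would
# suggest the activations encode the simulated world's state.
print(f"probe accuracy: {probe.score(X_test, y_test):.3f}")
```

With the random arrays above, the probe naturally hovers near chance; the finding described in this article is that probes trained on real activations from the model recover the robot's state with increasing accuracy as training progresses.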
Jin notes that the LLM's understanding of language develops in phases, much like how a child learns speech in multiple steps. Starting off, it's like a baby babbling: repetitive and mostly unintelligible. Then, the language model acquires syntax, or the rules of the language. This enables it to generate instructions that might look like genuine solutions, but they still don't work.
The LLM's instructions gradually improve, though. Once the model acquires meaning, it starts to churn out instructions that correctly implement the requested specifications, like a child forming coherent sentences.
Separating the method from the model: A "Bizarro World"
The probe was only intended to "go inside the brain of an LLM," as Jin characterizes it, but there was a remote possibility that it also did some of the thinking for the model. The researchers wanted to ensure that their model understood the instructions independently of the probe, instead of the probe inferring the robot's movements from the LLM's grasp of syntax.
"Imagine you have a pile of data that encodes the LM's thought process," suggests Jin. "The probe is like a forensics analyst: You hand this pile of data to the analyst and say, 'Here's how the robot moves, now try and find the robot's movements in the pile of data.' The analyst later tells you that they know what's going on with the robot in the pile of data. But what if the pile of data actually just encodes the raw instructions, and the analyst has figured out some clever way to extract the instructions and follow them accordingly? Then the language model hasn't really learned what the instructions mean at all."
To disentangle their roles, the researchers flipped the meanings of the instructions for a new probe. In this "Bizarro World," as Jin calls it, directions like "up" now meant "down" within the instructions moving the robot across its grid.
"If the probe is translating instructions to robot positions, it should be able to translate the instructions according to the bizarro meanings equally well," says Jin. "But if the probe is actually finding encodings of the original robot movements in the language model's thought process, then it should struggle to extract the bizarro robot movements from the original thought process."
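Under the assumption of a simple instruction set like the one sketched earlier, the control can be pictured as relabeling the same activations with states from an interpreter whose instruction meanings are inverted, then training a fresh probe on those flipped labels. The mapping and function below are hypothetical illustrations, not the paper's code.

```python
# Illustrative sketch of the "Bizarro World" control, not the authors' code:
# keep the LLM's activations fixed, but generate new ground-truth labels by
# inverting what every instruction means before running the simulator.
FLIP = {
    "move": "moveBack", "moveBack": "move",
    "turnLeft": "turnRight", "turnRight": "turnLeft",
}

def flip_program(program: list[str]) -> list[str]:
    """Rewrite a program so each instruction takes its inverted meaning."""
    return [FLIP.get(token, token) for token in program]

# In the full experiment, the flipped programs are run through the
# ground-truth simulator to produce "bizarro" state labels, and a new probe
# is trained on (original activations, bizarro labels). If that probe does
# markedly worse, the original semantics live in the LLM, not in the probe.
print(flip_program(["move", "turnLeft", "move"]))
# -> ['moveBack', 'turnRight', 'moveBack']
```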
As it turned out, the new probe made translation errors: it struggled to extract the flipped robot movements from a language model whose internal states encoded the original meanings. This suggested that the original semantics were embedded within the language model itself, indicating that the LLM understood the instructions independently of the original probing classifier.
"This research directly targets a central question in modern artificial intelligence: Are the surprising capabilities of large language models due simply to statistical correlations at scale, or do large language models develop a meaningful understanding of the reality that they are asked to work with? This research indicates that the LLM develops an internal model of the simulated reality, even though it was never trained to develop this model," says Martin Rinard, an MIT professor in EECS, CSAIL member, and senior author on the paper.
This experiment further supported the team's hypothesis that language models can develop a deeper understanding of language. Still, Jin acknowledges a few limitations of the paper: the team used a very simple programming language and a relatively small model to glean their insights. In upcoming work, they plan to use a more general setting. While Jin's latest research doesn't outline how to make a language model learn meaning faster, he believes future work can build on these insights to improve how language models are trained.
"An intriguing open question is whether the LLM is actually using its internal model of reality to reason about that reality as it solves the robot navigation problem," says Rinard. "While our results are consistent with the LLM using the model in this way, our experiments are not designed to answer this next question."
"There is a lot of debate these days about whether LLMs are actually 'understanding' language, or rather if their success can be attributed to what is essentially tricks and heuristics that come from slurping up large volumes of text," says Ellie Pavlick, assistant professor of computer science and linguistics at Brown University, who was not involved in the paper.
"These questions lie at the heart of how we build AI and what we expect to be inherent possibilities or limitations of our technology. This is a nice paper that looks at this question in a controlled way: the authors exploit the fact that computer code, like natural language, has both syntax and semantics, but unlike natural language, the semantics can be directly observed and manipulated for experimental purposes. The experimental design is elegant, and their findings are optimistic, suggesting that maybe LLMs can learn something deeper about what language 'means.'"
More information:
Charles Jin et al., Emergent Representations of Program Semantics in Language Models Trained on Programs, arXiv (2023). DOI: 10.48550/arXiv.2305.11169
This story is republished courtesy of MIT News (web.mit.edu/newsoffice/), a popular site that covers news about MIT research, innovation and teaching.