Alibaba’s ZeroSearch method uses simulated search results to slash LLM training costs

Demonstration of PPO and GRPO training without the search engine. Credit: arXiv (2025). DOI: 10.48550/arxiv.2505.04588

A team of AI researchers at the Alibaba Group's Tongyi Lab has debuted a new approach to training LLMs, one that costs much less than the methods currently in use. Their paper is posted on the arXiv preprint server.

As LLMs such as ChatGPT have become mainstream, the resources and associated costs of running them have skyrocketed, forcing AI makers to look for ways to get the same or better results using other techniques. To this end, the team at the Tongyi Lab has found a way to train LLMs that uses far fewer resources.

The idea behind ZeroSearch is to stop relying on API calls to search engines to gather search results for training an LLM. Instead, the method uses simulated, AI-generated documents to mimic the output of traditional search engines such as Google.

The team at Alibaba suggests that this approach not only lowers resource needs but also improves the quality of the training, because the data in simulated documents does not have the unpredictable nature of public search results. They also note that the new technique allows the quality of the generated documents to be degraded gradually, creating progressively more challenging retrieval scenarios.
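
The gradual degradation described above can be pictured with a small sketch. Everything in it is an illustrative assumption rather than the researchers' implementation: the function names `noise_probability` and `simulate_search`, the linear ramp, and the 0.0 to 0.5 range are all hypothetical, and the placeholder dictionaries stand in for documents that a simulation LLM would actually write.

```python
import random


def noise_probability(step, total_steps, start=0.0, end=0.5):
    """Hypothetical schedule: linearly raise the chance of a low-quality
    document from `start` to `end` as training progresses."""
    fraction = min(max(step / total_steps, 0.0), 1.0)
    return start + fraction * (end - start)


def simulate_search(query, step, total_steps, rng, num_docs=5):
    """Return placeholder 'search results' for a query without calling
    any search engine API. A real system would prompt a simulation LLM
    to write each document; here only the quality label is recorded."""
    p_noisy = noise_probability(step, total_steps)
    docs = []
    for rank in range(1, num_docs + 1):
        # Each document is independently marked noisy with probability p_noisy.
        quality = "noisy" if rng.random() < p_noisy else "useful"
        docs.append({"query": query, "rank": rank, "quality": quality})
    return docs


rng = random.Random(0)
early = simulate_search("who wrote Hamlet", step=0, total_steps=100, rng=rng)
late = simulate_search("who wrote Hamlet", step=100, total_steps=100, rng=rng)
print([d["quality"] for d in early])
print([d["quality"] for d in late])
```

At step 0 the noise probability is zero, so every simulated document is labelled useful; by the final step roughly half are expected to be noisy, which is one simple way to make retrieval harder over time.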

When testing their approach in an AI model, the researchers found that training costs associated with ZeroSearch came to $70.80 per 64,000 queries. The same queries, using Google APIs, cost $586.70. They found testing other models using more parameters reduced costs even more. The quality of results produced by the ZeroSearch-based models generally matched or exceeded those received from API-based models.
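
To put the two figures side by side, the short calculation below works out the cost per query and the relative saving. It uses only the numbers quoted above; the helper names `cost_per_query` and `percent_saving` are made up for this illustration.

```python
QUERIES = 64_000
ZEROSEARCH_COST = 70.80   # USD per 64,000 queries, as reported
GOOGLE_API_COST = 586.70  # USD per 64,000 queries, as reported


def cost_per_query(total_cost, queries=QUERIES):
    """Average cost of a single query in USD."""
    return total_cost / queries


def percent_saving(baseline, alternative):
    """Saving of `alternative` relative to `baseline`, in percent."""
    return (baseline - alternative) / baseline * 100


print(f"ZeroSearch: ${cost_per_query(ZEROSEARCH_COST):.5f} per query")
print(f"Google API: ${cost_per_query(GOOGLE_API_COST):.5f} per query")
print(f"Saving: {percent_saving(GOOGLE_API_COST, ZEROSEARCH_COST):.1f}%")
```

On these numbers the reported ZeroSearch cost is roughly 88% lower than the Google API cost, or a little over eight times cheaper.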

The researchers acknowledge that there is a trade-off with their approach. The ZeroSearch method can require up to four A100 GPUs, whereas the Google API method has no GPU requirement. So while ZeroSearch training is more cost-effective, it carries a cost in terms of sustainability and hardware requirements.

More information:
Hao Sun et al, ZeroSearch: Incentivize the Search Capability of LLMs without Searching, arXiv (2025). DOI: 10.48550/arxiv.2505.04588

Journal information:
arXiv


© 2025 Science X Network

Citation:
Alibaba’s ZeroSearch method uses simulated search results to slash LLM training costs (2025, May 16)
retrieved 16 May 2025
from

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.
