Almost everyone in tech is investing heavily in artificial intelligence right now, and Google is among those most committed to an AI future. Project Astra, unveiled at Google I/O 2024, is a big part of that – and it could end up being one of Google’s most important AI tools.
Astra is being billed as “a universal AI agent that is helpful in everyday life”. It’s essentially a blend of Google Assistant and Google Gemini, with added features and supercharged capabilities designed for a natural, conversational experience.
Here, we’re going to explain everything you need to know about Project Astra – how it works, what it can do, when you can get it, and how it might shape the future.
What is Project Astra?
In some ways, Project Astra isn’t all that different from the AI chatbots we’ve already got: you ask a question about what’s in a picture, or about how to do something, or request some creative text, and Astra gets on with it.
What elevates this particular AI project is its multimodal functionality (the way text, images, video, and audio can all be combined), the speed at which the bot works, and how conversational it is. Google’s aim, as we’ve already mentioned, is to create “a universal AI agent” that can do anything and understand everything.
Think about the HAL 9000 computer in Kubrick’s 2001: A Space Odyssey, or the Samantha assistant in the movie Her: talking to them is like talking to a human being, and there isn’t much they can’t do. (Both those AIs eventually got too big for their creators to control, but let’s ignore that for the time being.)
Project Astra has been built to understand context and to take actions, to be able to work in real time, and to remember conversations from the past. From the demos we’ve seen so far, it works on phones and on smart glasses, and is powered by the Google Gemini AI models – so it may eventually be part of the Gemini app, rather than something that’s separate and standalone.
When is Project Astra coming out?
Project Astra is in its early stages: this isn’t something that’s going to be available to the masses for a few months at least. That said, Google says that “some of these agent capabilities will come to Google products like the Gemini app later this year”, so it looks as though elements of Astra will appear gradually in Google’s apps as we go through 2024.
When we were given some hands-on time with Project Astra at I/O 2024, these sessions were limited to four minutes each – so that gives you some idea of how far away this is from being something that anyone, anywhere can make use of. What’s more, the Astra kit didn’t look particularly portable, and the Google reps were careful to refer to it as a prototype.
Taking all that together, we get the impression that some of the Project Astra tricks we’ve seen demoed might appear in the Google Gemini app sooner rather than later. At the same time, the full Astra experience – perhaps involving some dedicated hardware – is probably not going to be rolling out until 2025 at the earliest.
Now that Google has shared what Project Astra is and what it’s capable of, it’s likely that we’re going to hear a whole lot more about it in the months ahead. Bear in mind that ChatGPT and DALL-E developer OpenAI is busy pushing out major upgrades of its own, and Google isn’t going to want to be left behind.
What can I do with Project Astra?
One of Google’s demos shows Astra running on a phone, using its camera input and talking naturally to a user: it’s asked to flag up something in view that can play sounds, and correctly identifies a speaker. When an arrow is drawn on screen, Astra then recognizes and talks about the speaker component highlighted by the arrow.
In another demo, we see Astra correctly identifying world landmarks from drawings in a sketchbook. It’s also able to remember the order of objects in a list, identify a neighborhood from an image, understand the purpose of sections of code that are shown to it, and solve math problems that are written out.
There’s a lot of emphasis on recognizing objects, drawings, text, and more through a camera system – while at the same time understanding human speech and generating appropriate responses. This is the multimodal part of Project Astra in action, and it makes the project a step up from what we already have – with improvements in caching, recording, and processing key to its real-time responsiveness.
In our hands-on time with Project Astra, we were able to get it to tell a story based on objects that we showed to the camera – and adapt the story as we went on. Further down the line, it’s not difficult to imagine Astra applying these smarts to help you explore a city on vacation, solve a physics problem scribbled on a whiteboard, or get detailed information about what’s being shown in a sports game.
Which devices will include Project Astra?
In the demonstrations of Project Astra that Google has shown off so far, the AI is running on an unidentified smartphone and an unidentified pair of smart glasses – suggesting that we might not have heard the last of Google Glass yet.
Google has also hinted that Project Astra is going to come to devices with other form factors. We’ve already mentioned the movie Her, and it’s well within the realms of possibility that we might eventually see the Astra bot built into wireless earbuds (assuming they have a strong enough Wi-Fi connection).
In the hands-on area that was set up at Google I/O 2024, Astra was powered through a large camera, and could only work with a specific set of objects as props. Clearly, any device that runs Astra’s impressive features is going to need a lot of on-board processing power, or a very quick connection to the cloud, in order to keep up the real-time conversation that’s core to the AI.
As time goes on and technology improves, though, these limitations should slowly begin to be overcome. The next time we hear something major about Project Astra could be around the time of the launch of the Google Pixel 9 in the last few months of 2024; Google will no doubt want to make this the most AI-capable smartphone yet.