All that was announced at Google I/O 2024: Trillium chip, Imagen 3, Veo and more

Among Google’s latest salvos were an addition to its family of Gemini 1.5 AI models known as Flash, an upgrade to its text-to-image offering Imagen, and an AI model that creates high-definition videos

by Gulf Business with Reuters

May 15, 2024

Google parent Alphabet on Tuesday showed how it is building on artificial intelligence across its businesses, including a beefed-up Gemini chatbot and improvements to its prized search engine as it races to compete with AI rivals.

The flurry of announcements underscores Google’s efforts to refresh its products since Microsoft-backed OpenAI’s 2022 launch of ChatGPT dazzled the public, threatening the incumbent’s long reign over online search and AI.

Among Google’s latest salvos was an addition to its family of Gemini 1.5 AI models known as Flash that is faster and cheaper to run; a prototype called Project Astra, which can talk to users about anything captured on their smartphone camera in real time; and search results categorised under AI-generated headlines.

“This is a moment of growth and opportunity,” Alphabet CEO Sundar Pichai told reporters, when asked if the AI updates could risk Google’s profitable business.

The product presentation at Google’s annual I/O developer event in Mountain View, California, followed a shorter showcase by rival OpenAI on Monday.

OpenAI demonstrated how ChatGPT could voice answers with human-like intonation to any written or visual prompt. The startup’s CEO, Sam Altman, wrote that OpenAI had delivered software that “feels like AI from the movies.”

Google’s news at times covered similar ground, underscoring the fierce competition between the two AI developers.

Project Astra

For instance, Alphabet’s AI unit, Google DeepMind, has worked to build technology that can carry out day-to-day tasks for consumers. Early results have manifested in Project Astra, a tool that can use a smartphone camera and draw conclusions about the world around it.

In a demo video shown during Google I/O, a user deployed it to identify a speaker and locate glasses they had left in another part of the room. The company also teased how it could pair Project Astra with what it calls Gemini Live, a potentially more natural-sounding voice and text aide than its Google Assistant of the past.

Demis Hassabis, CEO of Google DeepMind, said of the work behind Project Astra: “We wanted to build a universal AI agent that can be truly helpful in everyday life.”

Google teases Veo

Another area in which Google showed how it is facing off against competitors was video generation.

The company teased Veo, an AI model that can spin up 1080p-resolution videos lasting longer than a minute. It is available on a preview basis to approved creators, filmmaker Donald Glover among them.

Introducing Veo: our most capable generative video model. 🎥

It can create high-quality, 1080p clips that can go beyond 60 seconds.

From photorealism to surrealism and animation, it can tackle a range of cinematic styles. 🧵 #GoogleIO pic.twitter.com/6zEuYRAHpH

— Google DeepMind (@GoogleDeepMind) May 14, 2024

OpenAI has promoted film-conjuring software of its own among Hollywood executives, enthralling and worrying the creative industry.

Google also announced improvements to its Gemini 1.5 Pro model, which can make sense of a massive amount of data in a single prompt. On Tuesday, it said it was doubling that capacity to 2 million tokens, meaning the AI could potentially answer questions when given thousands of pages of text or more than an hour of video to ingest.

The Pro model – starting with prompt sizes of up to 1 million tokens, or pieces of data – will also be available to subscribers to Google’s Gemini Advanced service.
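To give a sense of scale for those token counts, here is a rough back-of-the-envelope sketch in Python. The two conversion ratios are illustrative assumptions (a common rule of thumb for English text and a typical page density), not figures from Google, and the real numbers depend on the model's tokeniser and the material being read.

```python
# Rough page estimate for a token budget. The two ratios below are
# illustrative assumptions, not figures from Google or this article.
WORDS_PER_TOKEN = 0.75   # assumed rule of thumb for English text
WORDS_PER_PAGE = 500     # assumed density of a typical printed page


def estimate_pages(tokens: int) -> int:
    """Convert a token budget into an approximate page count."""
    words = tokens * WORDS_PER_TOKEN
    return round(words / WORDS_PER_PAGE)


for budget in (1_000_000, 2_000_000):
    print(f"{budget:,} tokens is about {estimate_pages(budget):,} pages")
```

Under these assumptions, a 2 million token prompt works out to roughly 3,000 pages, which is in line with the "thousands of pages" described above.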

Alphabet shares were up 1 per cent at $172.59 on Tuesday afternoon.

Imagen 3

Demis Hassabis, head of Google DeepMind, unveiled Imagen 3, Google’s latest AI text-to-image generation model.

The model can be used via ImageFX on Google’s AI Test Kitchen page. Developers and enterprise users will get it “soon” through Vertex AI.

We’re introducing Imagen 3: our highest quality text-to-image generation model yet. 🎨

It produces visuals with incredible detail, realistic lighting and fewer distracting artifacts.

From quick sketches to very high-res imagery, here’s a look at what it can create. 👀 #GoogleIO pic.twitter.com/XMrQYGeSiO

— Google DeepMind (@GoogleDeepMind) May 14, 2024

New chip

Alphabet unveiled a product called Trillium in its artificial intelligence data center chip family that it says is nearly five times as fast as its prior version.

“Industry demand for (machine learning) compute has grown by a factor of 1 million in the last six years, roughly increasing 10-fold every year,” Alphabet CEO Sundar Pichai said in a briefing call with reporters.

“I think Google was built for this moment, we’ve been pioneering (AI chips) for more than a decade.”
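The two growth figures in Pichai's remarks are consistent with each other: a 10-fold increase compounded annually over six years gives a factor of 1 million. A minimal Python check of that arithmetic, using only the numbers from the quote:

```python
# Check the quoted growth figure: 10x per year compounded over six years.
ANNUAL_GROWTH_FACTOR = 10   # "10-fold every year", from the quote
YEARS = 6                   # "the last six years", from the quote


def compound_growth(factor: int, years: int) -> int:
    """Total growth multiple after compounding `factor` once per year."""
    return factor ** years


total = compound_growth(ANNUAL_GROWTH_FACTOR, YEARS)
print(f"{total:,}x over {YEARS} years")  # prints "1,000,000x over 6 years"
```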

Alphabet’s effort to build custom chips for AI data centers represents one of the few viable alternatives to Nvidia’s top-of-the-line processors that dominate the market.

Together with the software that is closely tied to Google’s tensor processing units (TPUs), the chips have allowed the company to take a significant share of the market.

A new way to search

Meanwhile, for US users of Google Search browsing the Web in English, the company said it will soon use AI to help organise search results for queries on dining, recipes, and eventually movies, books and other content.

Also for Google Search, the company is rolling out AI Overviews to all users in the US this week, after a long period of public testing since last year’s I/O event. The feature uses generative AI to synthesise information and answer more complex queries for which there is no simple answer on the Web.

Analyst Jacob Bourne of eMarketer said, “The AI Overviews launch reception this week will be an indicator of how well Google can adapt its Search product to meet the demands of the generative AI era.”

He added, “To maintain its competitive edge and satisfy investors, Google will need to focus on translating its AI innovations into profitable products and services at scale.”

Ads will remain in slots throughout a given Google Web page, the company said, and AI Overviews will roll out to more than a billion people by year end.

Alphabet posted revenue of $307.4bn in 2023, the majority of which came from ads on Google Search and other properties.

The company also showed off an experiment that will let users ask questions of videos they upload to Google Search, as they can do with images today. The company demonstrated how this could help diagnose what is wrong with a broken record player.