Google Gemini: Bard gets its biggest upgrade as generative AI market heats up

Model can combine text, code, audio, image and video, Alphabet-owned company says

Next year, Google aims to introduce Bard Advanced, which will give users access to its most advanced models and capabilities. Reuters

Google’s generative artificial intelligence tool Bard got its biggest upgrade on Wednesday as the company launched Gemini, its largest and most capable AI model yet.

Coming with multi-modal reasoning capabilities, the Gemini 1.0 model will be rolled out in three different sizes – nano, pro and ultra – so it can run on everything from resource-intensive data centres to small mobile devices, Google told a media roundtable.

Initially, Bard will use a specifically tuned version of Gemini pro in English for advanced reasoning, planning, coding, summarising, understanding and detailed interpretations. It is available from Wednesday.

Early next year, the company aims to introduce Bard Advanced, which will give users access to its most advanced models and capabilities – starting with Gemini ultra.

The Alphabet-owned company, which is competing with Microsoft’s Bing and ChatGPT, made by Microsoft-backed OpenAI, for a greater share of the generative AI market, originally launched Bard in February.

What is Google Gemini?

Gemini can generalise and understand, operate across and combine different types of information including text, code, audio, image and video. Before bringing the new model to the public, Google said it ran Gemini pro through several industry-standard benchmarks.

“In six out of eight benchmarks, Gemini pro outperformed GPT 3.5, including in MMLU [massive multitask language understanding], one of the key leading standards for measuring large AI models,” said Sissie Hsiao, vice president and general manager for Assistant and Bard at Google.

“We are seeing great results … in blind evaluations with our third-party raters, Bard is now the most preferred free chatbot compared to leading alternatives.”

Users can try Bard with Gemini pro for text-based prompts, with support for other modalities coming soon. It will initially be available in English in more than 170 countries and territories and in more languages and places in the coming months, the company said.

“Developers are using our models and infrastructure to build new generative AI applications, and start-ups and enterprises around the world are growing with our AI tools … this is incredible momentum, and yet, we are only beginning to scratch the surface of what’s possible,” Sundar Pichai, Alphabet’s chief executive, said.

What are Gemini ultra and Gemini nano?

Gemini ultra is the California-based company’s largest and most capable model, designed for highly complex tasks and built to quickly understand and act on different types of information – including text, images, audio, video and code. It will be available next year.

“One of the first ways you will be able to try Gemini ultra is through Bard Advanced, a new, cutting-edge AI experience in Bard that gives you access to our best models and capabilities,” Ms Hsiao said.

Google said it is completing “extensive safety checks” and will launch a trusted tester programme before opening Bard Advanced to more people next year.

However, it introduced Gemini nano on its Pixel 8 Pro smartphones on Wednesday.

In the coming months, Gemini will be available in more of Google’s products and services like Search, Ads, Chrome and Duet AI.

Google said it is already experimenting with Gemini in Search, where it is making search faster for users, with a 40 per cent reduction in latency in English in the US, alongside improvements in quality.

From December 13, developers and enterprise customers can access Gemini pro to customise the technology and use it in their applications and inventions.

How has Google tested Gemini?

Google said it is testing Gemini models and evaluating their performance on a range of tasks, from natural image, audio and video understanding to mathematical reasoning.

Gemini ultra’s performance exceeds current state-of-the-art results on 30 of the 32 academic benchmarks widely used in large language model research and development. With a score of 90 per cent, it is the first model to outperform human experts on MMLU, which draws on 57 subjects such as maths, physics, history, law, medicine and ethics to test both world knowledge and problem-solving abilities.

How safe is Gemini?

Google is developing Gemini “boldly and responsibly”, Mr Pichai said.

“That means being ambitious in our research and pursuing the capabilities that will bring enormous benefits to people and society while building in safeguards and working collaboratively with governments and experts to address risks as AI becomes more capable.”

The company said the strategy aligns with the approach it has taken since the launch of Bard early this year. For example, Bard is built in line with Google’s AI Principles and includes contextual help, such as the “Google it” button, which lets users double-check its answers.

Updated: December 08, 2023, 6:00 AM