Gemini: How Google's answer to ChatGPT aims to disrupt the AI market

Gemini comes in three sizes – Nano, Pro and Ultra – so that it can run on everything from data centres to mobile devices


Alphabet-owned Google has launched Gemini, its most capable generative artificial intelligence tool yet.

With the launch, the California-based company is aiming to grab a huge chunk of the generative AI market that was first disrupted by Microsoft-backed OpenAI last year with the launch of ChatGPT.

Here, The National looks at various features of Gemini and other options available in the market.

What is Gemini?

According to Google, Gemini is the first AI model to outperform human experts on MMLU (Massive Multitask Language Understanding), one of the most widely used benchmarks for testing the knowledge and problem-solving abilities of AI.

It can comprehend diverse tasks and generate code based on different inputs – an innovation poised to revolutionise problem-solving capabilities. It can independently navigate and merge diverse information types such as text, code, audio, images and video, thereby operating across varied data formats.

Its ability to extract insights from hundreds of thousands of documents through reading, filtering and understanding information will help deliver new breakthroughs at digital speeds in many fields from science to finance, Google said.

Who developed Gemini?

Google DeepMind, whose origins date back to the founding of DeepMind in 2010, played a crucial role in developing Gemini.

So far, it has brought together new ideas in machine learning, neuroscience, engineering, maths, simulation and computing infrastructure, along with new ways of organising scientific endeavours.

Gemini is the result of DeepMind’s efforts to produce AI that “feels less like a smart piece of software and more like something useful and intuitive”, said Demis Hassabis, chief executive and co-founder of Google DeepMind.

Three variants

The first version, Gemini 1.0, comes in three sizes – Nano, Pro and Ultra – so that it can run on everything from resource-intensive data centres to small mobile devices.

  • Gemini Pro: The Pro version has been built into Bard, Google’s generative AI tool, which was originally launched in February. It improves Bard’s advanced reasoning, planning, coding, summarising, understanding and detailed interpretation. It is available from Wednesday.
  • Gemini Nano: It is available on the company’s Pixel 8 Pro smartphones. Because it runs on the device, it will help prevent sensitive data from leaving the phone and allow various features to be used without a network connection.
  • Gemini Ultra: The largest and most capable model, built for highly complex tasks, Ultra will be available early next year. Google said it will launch Bard Advanced, a version that gives users access to the best models and capabilities offered by Ultra.

Is Gemini Google’s most capable AI model yet?

Google said Gemini is the most capable and general AI model it has ever built.

Thus far, the standard approach to creating multimodal models involved training separate components for different modalities and then stitching them together. These models can sometimes be good at performing certain tasks, like describing images, but struggle with more conceptual and complex reasoning.

Gemini, by contrast, is pre-trained from the start on different modalities rather than assembled from separately trained parts.

“We [have] fine-tuned it with additional multimodal data to further refine its effectiveness. This helps Gemini seamlessly understand and reason about all kinds of inputs from the ground up, far better than existing multimodal models … its capabilities are state of the art in nearly every domain,” Mr Hassabis said.

How users can gain access to Gemini

Gemini’s Pro and Nano versions are currently accessible through Google's Bard chatbot and Pixel 8 Pro smartphones, respectively.

In the coming months, Gemini will be available in more of Google’s products and services like Search, Ads, Chrome and Duet AI.

Google said it is already experimenting with Gemini in Search, where it is making results faster for users, with a 40 per cent reduction in latency in English in the US, alongside improvements in quality.

From December 13, developers and enterprise customers can gain access to Gemini Pro to customise the technology and use it in their own applications and inventions.

How Gemini differs from other AI chatbots

Before releasing Gemini to the public, Google said, it ran the model through a number of industry-standard benchmarks. Gemini Pro outperformed GPT-3.5 in six out of eight of them.

It surpassed GPT-3.5 on MMLU and on GSM8K, a benchmark that measures grade-school maths reasoning.

What other options are on the market

Microsoft-backed OpenAI launched ChatGPT in November last year and it was an instant success. It is a programme that produces humanlike responses to prompts in seconds, based on information publicly available on the internet. However, it has also raised concerns about its accuracy and about what it is being used for.

In September, the Abu Dhabi government-supported research centre the Technology Innovation Institute launched Falcon 180B – an advanced version of its flagship language model – to boost generative AI in the region.

Last month, cloud company Amazon Web Services launched a generative AI tool specifically for businesses.

How big is the market?

Investors put more than $4.2 billion into generative AI start-ups across 215 deals in 2021 and 2022, after interest surged in 2019, recent data from CB Insights showed.

Globally, AI investments are projected to hit $200 billion by 2025 and could possibly have a bigger impact on gross domestic product, Goldman Sachs Economic Research said in a report in August.

Meanwhile, in Saudi Arabia, the Arab world's biggest economy, the generative AI market is expected to surpass $1 billion by 2030, up from nearly $220 million in 2023, growing at a compound annual rate of more than 25 per cent, data from Statista shows.
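
For readers who want to check how a compound annual growth rate turns a starting figure into a later one, the arithmetic can be sketched in a few lines of Python. This is an illustrative calculation only, not Statista's methodology: the function names are hypothetical, and the inputs (a $220 million base in 2023, a flat 25 per cent annual rate and seven years of growth to 2030) are rounded assumptions based on the figures cited above.

```python
def project_value(start: float, annual_rate: float, years: int) -> float:
    """Compound a starting value at a fixed annual growth rate."""
    return start * (1 + annual_rate) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Return the constant annual rate that grows `start` into `end`."""
    return (end / start) ** (1 / years) - 1


# Assumed, rounded inputs (not exact Statista data).
start_2023 = 220e6      # roughly $220 million in 2023
rate = 0.25             # "more than a quarter", taken here as exactly 25%
years = 2030 - 2023     # seven years of compounding

projected_2030 = project_value(start_2023, rate, years)
print(f"Projected 2030 value: ${projected_2030 / 1e9:.2f} billion")  # about $1.05 billion
```

At exactly 25 per cent a year, $220 million grows to a little over $1 billion in seven years; any higher rate pushes the 2030 figure further above that mark. The `implied_cagr` helper runs the same calculation in reverse, recovering the growth rate from a start value, an end value and a number of years.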

Updated: December 08, 2023, 6:00 AM