OpenAI Rival Google DeepMind’s Multimodal AI Model Gemini

Gemini from Google DeepMind

In testing, Gemini Ultra surpassed the current leading results on 30 of 32 key academic benchmarks used in large language model (LLM) research.

Google has announced Gemini, its largest and most sophisticated AI model to date. Designed to be multimodal, Gemini can understand and combine a diverse range of information types, including text, images, audio, video, and code. The model is distinguished by its multimodal reasoning and advanced coding abilities. It is available in three sizes: Ultra, Pro, and Nano.

Gemini is natively multimodal: it is pre-trained from the start on different modalities, rather than built by training separate components for each modality and then stitching them together. It is then fine-tuned with additional multimodal data to refine its capabilities. This design is intended to improve the kind of complex reasoning across modalities that stitched-together multimodal models struggle with.

A GitHub repository offering an open-source implementation of Gemini states, “The open source implementation of Gemini, the model that will ‘eclipse ChatGPT’, seems to work by directly taking in all modalities without an encoder [of] some kind which means that the encoding is built into the [model].”
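To make the "natively multimodal" idea more concrete, here is a minimal toy sketch in Python. It is not Gemini's implementation, and Google has not published one in this form. Every name here (`Token`, `tokenize_text`, `tokenize_image`, `interleave`) and the toy token-id scheme are hypothetical, chosen only to illustrate the general idea of feeding one model a single sequence that mixes modalities, as opposed to routing each modality through a separately trained component.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Token:
    modality: str  # e.g. "text" or "image"
    value: int     # hypothetical discrete token id (toy scheme, not a real vocabulary)


def tokenize_text(text: str) -> List[Token]:
    # Toy tokenizer: one token per whitespace-separated word.
    return [Token("text", sum(map(ord, word)) % 1000) for word in text.split()]


def tokenize_image(pixels: List[int], patch: int = 2) -> List[Token]:
    # Toy "patch" tokenizer: one token per group of `patch` pixel values.
    return [
        Token("image", sum(pixels[i:i + patch]) % 1000)
        for i in range(0, len(pixels), patch)
    ]


def interleave(*segments: List[Token]) -> List[Token]:
    # Concatenate segments in input order into one flat sequence.
    # In a natively multimodal setup, a single model would consume
    # this whole sequence rather than one modality at a time.
    sequence: List[Token] = []
    for segment in segments:
        sequence.extend(segment)
    return sequence


sequence = interleave(
    tokenize_text("describe this picture"),
    tokenize_image([12, 40, 7, 99]),
)
modalities = [token.modality for token in sequence]
print(modalities)
```

The point of the sketch is only the shape of the input: text and image tokens end up in the same sequence. How a real model embeds and attends over such a sequence is outside what the article describes.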

In testing, Gemini Ultra surpassed the current leading results on 30 of 32 key academic benchmarks commonly used in large language model (LLM) research and development, covering tasks that range from natural image, audio, and video understanding to mathematical reasoning.

Gemini Ultra achieved a 90.0% score on the MMLU (massive multitask language understanding) benchmark, which covers 57 subjects ranging from mathematics to ethics. It also scored 59.4% on the MMMU benchmark, which consists of multimodal tasks spanning different domains that require deliberate reasoning.

Gemini has already been integrated into several key Google products. Bard is using an enhanced version of Gemini Pro to improve its reasoning, planning, and comprehension. The Pixel 8 Pro is the first smartphone to run Gemini Nano, which enhances features such as ‘Summarize’ in Recorder and Smart Reply in Gboard. Google is also experimenting with Gemini in Google Search to speed up the Search Generative Experience (SGE).

Google plans to bring Gemini Ultra to a new Bard Advanced experience early next year. Gemini will also be integrated into additional Google products and services, such as Ads, Chrome, and Duet AI, in the coming months.
