In news that shook the tech world, Google DeepMind, Google Research, and other Google teams introduced Gemini, a new generation of multimodal artificial intelligence models. These models, designed to process images, audio, video, and text, have already shown impressive results.
The most capable of them is Gemini Ultra, which set new records on 30 of 32 benchmarks. It was the top performer in text and reasoning tasks, as well as in image and video understanding and speech recognition. Gemini Ultra demonstrated its power by reaching expert level on MMLU, a benchmark spanning 57 subject areas, with a score above 90%. The model also set a new record on MMMU with 62.4%, surpassing previous models by more than 5 percentage points.
Gemini is not just a technological achievement but also a versatile tool, with applications ranging from education to many other fields. The model can read barely legible handwriting, convert problems into mathematical equations, detect errors, and suggest solutions. Gemini has already been integrated into several Google products, including Bard, and will be available via APIs in Google AI Studio and Google Cloud Vertex AI in the near future.
It is equally important to note that, according to Google's own tests, Gemini outperforms even OpenAI's GPT-4. This step forward opens up new prospects for the development of artificial intelligence, making it more accessible and efficient across industries. We look forward to seeing how this innovation will change the way we think about AI capabilities!