ChatGPT vs. Gemini: who is winning the artificial intelligence race?

During a live event on May 13, OpenAI presented GPT-4o, a new free version of its popular chatbot that is faster and more capable. The next day, at its I/O 2024 developer conference, Google announced a series of updates to its Gemini model as well as new AI tools.

Below, we share a summary of the latest innovations in artificial intelligence presented by both companies, highlighting the most relevant advances.

This is ChatGPT’s new multimodal model

Before OpenAI's announcement, all GPT-4 models were available only to subscribers who paid a monthly fee. The good news is that GPT-4o is available to all users, including those on the free version. Even so, subscribers will be able to make more queries.

In addition to processing information in text format, this new AI is now capable of processing and generating information from images, video and audio.

“ChatGPT can now see, hear and speak,” the company says on its blog.

According to the company, GPT-4o can respond to audio requests, such as user questions, in an average of 320 milliseconds, which is comparable to human response time. Additionally, the AI understands when the user interrupts it, making the interaction more natural.

Not only does the model respond quickly, but it can also generate its responses with different emotive tones of voice, such as sarcasm, and can even laugh, sing, and make jokes.

GPT-4o can also analyze and understand images in real time as they appear on the camera. In addition, it can recognize emotions in facial expressions and tell whether you are sad or happy. This feature was developed in partnership with the Be My Eyes app from Denmark, with the purpose of providing assistance to people with visual impairments.

Real-time translations are another highlight of the new ChatGPT. This model can play the role of a translator during a conversation between two people who speak different languages. For example, during the live presentation, Mira Murati, CTO of OpenAI, had a conversation with a company engineer. Even though she was speaking in Italian and he was speaking in English, ChatGPT was able to translate the conversation instantly and naturally, making communication between them easier.

How to access GPT-4o

Not all the new features presented at the event are available immediately. For example, audio and video interaction will be available to paying users in a few weeks. What can be tested right now is text and image interaction with GPT-4o. All you have to do is access the chatbot from the web or from your mobile phone, whether iOS or Android.

The new Gemini from Google

The Mountain View company has announced an improved version of Gemini 1.5 Pro, which features a one million token context window. Additionally, this window is expected to expand to 2 million tokens for some developers through a waiting list.

According to the company, with one million tokens, the model can understand multiple large documents, up to 1,500 pages in total, or summarize 100 emails in seconds. And to take advantage of such information capacity, Google adds the option to upload files directly from Google Drive.

Gemini 1.5 Pro also receives improvements in image understanding, allowing you to make various requests from a single image, such as obtaining recipes from photos of dishes or receiving step-by-step instructions to solve mathematical problems.

This version is available in the Gemini Advanced subscription, which is priced at $19.99 per month and is available in more than 35 languages, including Spanish, in 150 countries.

At the same time, the company introduced Gemini 1.5 Flash, a new version of its artificial intelligence designed to be “fast and efficient.” It is a lighter version of Gemini 1.5 Pro and is available for testing on Google AI Studio and Vertex AI with a context window of one million tokens.

Another announcement was Project Astra, an artificial intelligence agent developed by Google to help with everyday tasks through quick and adaptive responses. A key part of the project is the development of hardware, such as glasses with integrated cameras and a microphone, that allows users to interact with the AI in a practical way.

Google has also revealed its new image generation technology, called Imagen 3, which improves both the text and the visual effects in generated images. Additionally, it introduced Veo, a video creation system with advanced editing features and the ability to generate moving images from text commands.

Google has introduced new features to its search engine results with the launch of ‘AI Overviews’ for users in the US. This feature presents answers generated by artificial intelligence along with links to websites at the top of the search results.

Powered by Gemini AI technology, ‘AI Overviews’ provides quick and useful information without the need to click on multiple links.

El Comercio spoke with César Beltrán, coordinator of the Artificial Intelligence Research Group at the Pontificia Universidad Católica del Perú (PUCP), about which of the two companies is ahead in the AI race.

“It is a little difficult to compare them, but if I want to compare the level of understanding, of refinement that the models have, I think that OpenAI is ahead. Yes, it is definitely ahead,” he answered. According to the expert, OpenAI took Google by surprise with its sudden announcement. “What they released was a small taste of their voice assistant. And also their image recognizer (…),” he noted.

Meanwhile, Google has attempted to incorporate its AI models into its existing apps, resulting in a more dispersed offering.

“That’s the problem with being a large company: they are integrating (AI) into all their applications, they cover too much. OpenAI is something quite simple,” he added. Despite this, the Mountain View company remains a relevant force in the field of AI thanks to its wide variety of tools and applications.

“But if we focus solely on the capabilities of the language model, as we have seen, I think they are tied,” he said. Finally, OpenAI is expected to present news about GPT-5 soon, which could change the situation and make a greater difference in the competition.

By Editor
