Meta launches Llama 3, its language model to take AI to another level

Meta has introduced Llama 3, the next generation of its open source large language model (LLM), with the release of two models with 8 billion (8B) and 70 billion (70B) parameters. The models are capable of supporting “a wide range of use cases” with improved reasoning, which, according to the company, makes them “best-in-class open source models”, and they have been integrated into the Meta AI assistant.

The company led by Mark Zuckerberg has shared its intention to continue driving “the next wave of innovation in AI across the board,” in applications as well as in developer tools and inference optimizations.

To this end, and although its arrival was expected in May, Meta has launched the first two models of the next generation of its artificial intelligence (AI) technology, Llama 3. These models are text-based and have been trained and fine-tuned in two sizes: 8 billion parameters (8B) and 70 billion parameters (70B).

As the company explained in a statement on its blog, with Llama 3 it has built “the best models currently existing” when compared with other leading models at the same parameter scale.

Along these lines, this generation of Llama delivers “state-of-the-art performance” on a wide range of industry benchmarks, while offering new capabilities. In fact, these two new models represent “a big leap” compared to the previous generation, Llama 2.

Specifically, Llama 3 brings improvements in reasoning, code generation and instruction following. Alignment has also been improved and the diversity of responses has increased.

According to data the company has shared, Llama 3 outperforms comparable models such as Google’s Gemini and Anthropic’s Claude on the MMLU benchmark, which measures a model’s general knowledge. Specifically, Llama 3 8B surpassed the Gemma 7B and Mistral 7B models, while Llama 3 70B surpassed Gemini Pro 1.5 and Claude 3 Sonnet.

The model has also been evaluated by people, who tested the new capabilities of Llama 3 against other models. The evaluation covers twelve key use cases, including asking for advice, brainstorming, classification, answering closed and open questions, coding, creative writing, reasoning, rewriting and summarizing, among others. According to these tests, Llama 3 70B managed to outperform OpenAI’s GPT-3.5.

LLAMA 3 TRAINING

Llama 3 has been trained on more than 15 trillion (15T) tokens collected from “publicly available” sources. This training data set is “seven times larger” than the one used for Llama 2 and includes “four times more code.”

These data were filtered through several systems, including heuristic filters, semantic deduplication approaches and text classifiers that predict data quality.
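As a rough illustration of how filtering stages of this kind can be chained, here is a minimal Python sketch. It is not Meta's pipeline: every function name and threshold is hypothetical, word-shingle Jaccard similarity stands in for semantic deduplication, and a toy lexical-diversity score stands in for a trained quality classifier.

```python
import re
from typing import Callable, Iterable, List, Set


def passes_heuristics(text: str, min_words: int = 5,
                      max_symbol_ratio: float = 0.3) -> bool:
    """Rule-based filter: drop very short or symbol-heavy documents."""
    if len(text.split()) < min_words:
        return False
    symbols = sum(1 for ch in text if not (ch.isalnum() or ch.isspace()))
    return symbols / len(text) <= max_symbol_ratio


def shingles(text: str, n: int = 3) -> Set[str]:
    """Lowercased word n-grams used to compare two documents."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < n:
        return {" ".join(words)}
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two shingle sets, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def toy_quality_score(text: str) -> float:
    """Placeholder for a trained classifier: share of distinct words."""
    words = re.findall(r"\w+", text.lower())
    return len(set(words)) / len(words) if words else 0.0


def filter_corpus(docs: Iterable[str],
                  quality_fn: Callable[[str], float] = toy_quality_score,
                  dup_threshold: float = 0.8,
                  min_quality: float = 0.5) -> List[str]:
    """Apply the three stages in order and return the surviving documents."""
    kept: List[str] = []
    kept_shingles: List[Set[str]] = []
    for doc in docs:
        if not passes_heuristics(doc):
            continue  # stage 1: heuristic filter
        sh = shingles(doc)
        if any(jaccard(sh, prev) >= dup_threshold for prev in kept_shingles):
            continue  # stage 2: near-duplicate of a document already kept
        if quality_fn(doc) < min_quality:
            continue  # stage 3: predicted quality too low
        kept.append(doc)
        kept_shingles.append(sh)
    return kept


docs = [
    "Llama 3 was trained on more than fifteen trillion tokens of public data.",
    "Llama 3 was trained on more than fifteen trillion tokens of public data!",
    "$$$ ### !!!",
    "buy buy buy buy buy buy buy buy",
    "Meta also released safety tools alongside the new models.",
]
clean = filter_corpus(docs)
```

In this toy run only the first and last documents survive: the second is a near-duplicate of the first, the third fails the heuristic check, and the fourth scores too low on the placeholder quality measure. The pairwise comparison is quadratic in the number of kept documents, so a corpus of real size would need an approximate method instead.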

Meta has also explained that, in order to prepare for upcoming multilingual use cases, more than 5% of the Llama 3 pre-training data set consists of data in languages other than English, covering more than 30 languages in total.

The technology company has also highlighted its commitment to developing Llama 3 in a “responsible” way. To that end, it has made several resources available that are designed to encourage safe use of the model.

Specifically, these resources are Llama Guard 2, which filters prompts and responses for safety; Code Shield, which filters out insecure code that the AI may generate; and CyberSecEval 2, a set of cybersecurity evaluations that measures, among other things, how susceptible a model is to abuse of its code interpreter and to prompt injection attacks.

LLAMA 3 INTEGRATED IN META AI

For now, the company has integrated its latest Llama 3 models into the Meta AI assistant. The assistant can be used on Facebook, Instagram, WhatsApp and Messenger, as well as on the web, to help users carry out activities, learn, create and connect “with the things that matter to them.”

Users can also download the Llama 3 models now, and the models will soon be available on Amazon Web Services, Databricks, Vertex AI from Google Cloud, Hugging Face, Kaggle, IBM WatsonX, Microsoft Azure, NVIDIA and Snowflake. In addition, they will be supported by hardware platforms offered by AMD, AWS, Dell, Intel, NVIDIA and Qualcomm.

Meta has also announced that, in the coming months, it will introduce new capabilities, longer context windows, additional model sizes, such as a 400B-parameter model, and improved performance for Llama 3. The company has indicated that it will also share its research work on this model.

By Editor
