Microsoft introduces Phi-3 mini, a small language model that runs natively on a smartphone

Microsoft has presented Phi-3 mini, a new small language model designed to run on a modern smartphone while offering performance similar to OpenAI’s GPT-3.5.

The new iteration of Microsoft’s lightweight language model was trained on 3.3 trillion tokens drawn from “larger and more advanced” data sets than those used for its predecessor, Phi-2, which was trained on 1.4 trillion tokens.

Phi-3 mini has 3.8 billion parameters, small enough for use on a modern smartphone: it can be quantized to 4 bits, at which point it occupies around 1.8GB of memory, as stated in the paper published on arXiv.org.
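
The memory figure can be sanity-checked with simple arithmetic. The sketch below is a rough estimate, not the researchers' method: the helper name `weight_memory_bytes` is hypothetical, the 16-bit baseline is an assumption added for comparison, and the calculation counts only the stored weights, ignoring activations, the attention cache, and any per-block quantization metadata.

```python
def weight_memory_bytes(num_params: int, bits_per_weight: int) -> float:
    """Approximate bytes needed to store the model weights alone."""
    return num_params * bits_per_weight / 8


# 3.8 billion parameters, the figure reported for Phi-3 mini.
PHI3_MINI_PARAMS = 3_800_000_000

# 16 bits is an assumed unquantized baseline; 4 bits is the quantized case.
for bits in (16, 4):
    size = weight_memory_bytes(PHI3_MINI_PARAMS, bits)
    print(f"{bits}-bit weights: {size / 1e9:.2f} GB ({size / 2**30:.2f} GiB)")
```

At 4 bits per weight this gives about 1.9 GB, or roughly 1.77 GiB, which is in line with the figure of around 1.8GB cited in the paper.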

The researchers tested the model on an iPhone 14 with an A16 Bionic chip, on which, they say, “it runs natively and completely offline, achieving more than 12 tokens per second.” The overall performance of this model “rivals” that of larger ones, such as Mixtral 8x7B and GPT-3.5.

The technology company used a transformer decoder architecture with a context length of 4K tokens. Because the model is built on a block structure similar to that of Meta’s Llama 2, it not only “benefits” the open source community but is also compatible with the packages developed for Llama 2.

The model supports a conversational chat format and is aligned with Microsoft’s robustness and safety values, as highlighted in the research paper.

Along with Phi-3 mini, Microsoft has also trained two additional models from the same family: Phi-3 small, with 7 billion parameters, and Phi-3 medium, with 14 billion parameters, both trained on 4.8 trillion tokens.

By Editor
