Microsoft's new Phi-3 is one of the smallest AI models available, and it punches well above its weight against larger rivals.

Microsoft has unveiled an impressive new artificial intelligence model, Phi-3, which is small compared to GPT-4, Gemini, and Llama 3 but packs a punch disproportionate to its size.

Known as small language models, these lightweight AIs allow certain tasks to be performed more cheaply and easily without using the heavy computational power of larger models.

Despite its small size, Phi-3 mini already performs as well as Llama 2 on some benchmarks, and Microsoft says it is as responsive as models 10 times its size.

What is not clear is whether this will be part of a future Copilot update or remain a stand-alone project as Microsoft seeks to bring Copilot functionality to more devices.

According to Eric Boyd, corporate vice president of Microsoft Azure AI Platform, Phi-3 was trained with a "curriculum." In an interview with The Verge, Boyd said the team took a list of more than 3,000 words and asked a larger language model to write children's books from them in order to teach Phi. Phi-3 was then trained on those books.

Phi-3 comes in three sizes: mini with only 3.8 billion parameters, small with 7 billion parameters, and medium with 14 billion parameters. In contrast, GPT-4 has over a trillion parameters, and the smallest Llama 3 model has 8 billion.

Phi-3 builds on the lessons of Phi-1, which focused on coding, and Phi-2, which learned reasoning. Thanks to that improved reasoning, Phi-3 can rival GPT-3.5 in response quality despite being trained on a much smaller dataset.

This trend toward smaller models performing as well as, or even better than, the big players is gaining momentum: Meta's Llama 3 70B has nearly reached GPT-4 levels on several benchmarks, and smaller models seem to be finding their own niche.

Phi-3's performance is "comparable to models such as the Mixtral 8x7B and GPT-3.5," Microsoft researchers explained in a paper on the new model. This was achieved "despite being small enough to fit in a cell phone."

According to the research team, this achievement comes down entirely to the dataset, which consisted of heavily filtered web data and synthetic children's-book data produced by other AIs.

Microsoft Phi-3 is designed to run on a wider range of devices, and much faster than larger models allow. It joins a group of models that run locally on laptops and cell phones without an internet connection, alongside Stability AI's Zephyr, Google's Gemini Nano, and Anthropic's Claude 3 Haiku.

In the future, these models could be bundled with smartphones, built into chips in smart speakers, or integrated into refrigerators to give dietary advice.

Cloud-based models such as Google's Gemini Ultra, Claude 3 Opus, and GPT-4 Turbo will continue to outperform smaller models across the board, but they come with trade-offs.

These smaller models let users chat with a virtual assistant without an internet connection, have AI summarize content without sending data off the device, and power Internet of Things devices without the user even being aware that AI is running.

Apple is said to be relying almost entirely on these local models to power the next generation of generative AI features in iOS 18, while Google is deploying Gemini Nano on more Android devices.
