Microsoft launches Phi-3, its smallest AI model yet

Microsoft launched the next version of its lightweight AI model, Phi-3 Mini, the first of three small models the company plans to release.

Phi-3 Mini measures 3.8 billion parameters and is trained on a data set that is smaller relative to large language models like GPT-4. It is now available on Azure, Hugging Face, and Ollama. Microsoft plans to release Phi-3 Small (7B parameters) and Phi-3 Medium (14B parameters). Parameters refer to how many complex instructions a model can understand.

The company released Phi-2 in December, which performed just as well as bigger models like Llama 2. Microsoft says Phi-3 performs better than the previous version and can provide responses close to those of a model 10 times its size.

Eric Boyd, corporate vice president of Microsoft Azure AI Platform, tells The Verge that Phi-3 Mini is as capable as LLMs like GPT-3.5, "just in a smaller form factor."

Compared to their larger counterparts, small AI models are often cheaper to run and perform better on personal devices like phones and laptops. The Information reported earlier this year that Microsoft was building a team focused specifically on lighter-weight AI models. Along with Phi, the company has also built Orca-Math, a model focused on solving math problems.

Microsoft's competitors have their own small AI models as well, most of which target simpler tasks like document summarization or coding assistance. Google's Gemma 2B and 7B are suited to simple chatbots and language-related work. Anthropic's Claude 3 Haiku can read dense research papers with graphs and summarize them quickly, while the recently released Llama 3 8B from Meta may be used for some chatbots and for coding assistance.

Boyd says developers trained Phi-3 with a "curriculum." They were inspired by how children learn from bedtime stories: books with simpler words and sentence structures that talk about larger topics.

"There aren't enough children's books out there, so we took a list of more than 3,000 words and asked an LLM to make 'children's books' to teach Phi," Boyd says.

He added that Phi-3 simply built on what previous iterations learned. While Phi-1 focused on coding and Phi-2 began to learn to reason, Phi-3 is better at both coding and reasoning. While the Phi-3 family of models knows some general knowledge, it cannot beat GPT-4 or another LLM in breadth: there's a big difference between the kind of answers you can get from an LLM trained on the entirety of the internet and those from a smaller model like Phi-3.

Boyd says that companies often find that smaller models like Phi-3 work better for their custom applications, since many companies' internal data sets are going to be on the smaller side anyway. And because these models use less computing power, they are often far more affordable.
