AI: Microsoft bolsters 'Big AI' options. RTZ #349

Michael Parekh AI: Reset to Zero

1 year ago

It’s been an acute reality of the AI Tech Wave in these early days that the top companies up and down the AI tech stack are at once each other’s best partners, customers, and long-term competitors. They may all be rowing in the same direction, but one needs to keep in mind that they are in separate boats. Coopetition is in full bloom across the AI tech stack.

Today The Information reports that the iconic partnership between LLM AI king OpenAI and Microsoft may see its ‘Coopetition Inflection’ extend to the core large language models as well. In “Meet MAI-1: Microsoft Readies New AI Model to Compete With Google, OpenAI”, they summarize the context to date:

“For the first time since it invested more than $10 billion into OpenAI in exchange for the rights to reuse the startup’s AI models, Microsoft is training a new, in-house AI model large enough to compete with state-of-the-art models from Google, Anthropic and OpenAI itself.”

“The new model, internally referred to as MAI-1, is being overseen by Mustafa Suleyman, the ex-Google AI leader who most recently served as CEO of the AI startup Inflection before Microsoft hired the majority of the startup’s staff and paid $650 million for the rights to its intellectual property in March. But this is a Microsoft model, not one carried over from Inflection, although it may build on training data and other tech from the startup. It is separate from the models that Inflection previously released, according to two Microsoft employees with knowledge of the effort.”

This is also notable because it diverges from Microsoft’s ‘Small AI’ open-source SLM releases like Phi-3, which I’ve talked about before:

“MAI-1 will be far larger than any of the smaller, open source models that Microsoft has previously trained, meaning it will require more computing power and training data and will therefore be more expensive, according to the people. MAI-1 will have roughly 500 billion parameters, or settings that can be adjusted to determine what models learn during training. By comparison, OpenAI’s GPT-4 has more than 1 trillion parameters, while smaller open source models released by firms like Meta Platforms and Mistral have 70 billion parameters.”
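
To put those parameter counts in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the figures reported in the quote above, which are approximate and unconfirmed, and it assumes 16-bit weights (2 bytes per parameter), which is my assumption and not something the report states. The function names are hypothetical, and the result covers only the memory needed to hold the weights, not the cost of training or serving the model.

```python
# Parameter counts as reported in the quoted article (approximate, unconfirmed).
PARAMS = {
    "Meta/Mistral open models": 70e9,
    "MAI-1 (reported)": 500e9,
    "GPT-4 (reported lower bound)": 1e12,
}

# Assumption: weights stored at 16-bit precision, i.e. 2 bytes per parameter.
BYTES_PER_PARAM = 2


def weight_memory_gb(num_params, bytes_per_param=BYTES_PER_PARAM):
    """Memory needed to hold the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


def relative_size(num_params, baseline):
    """How many times larger a model is than a baseline parameter count."""
    return num_params / baseline


if __name__ == "__main__":
    baseline = PARAMS["Meta/Mistral open models"]
    for name, n in PARAMS.items():
        gb = weight_memory_gb(n)
        ratio = relative_size(n, baseline)
        print(f"{name}: ~{gb:,.0f} GB of weights, {ratio:.1f}x the 70B baseline")
```

Under these assumptions, a 500-billion-parameter model is about seven times the size of a 70-billion-parameter one and needs roughly a terabyte just to store its weights, which is one way to see why it would run on a large GPU cluster instead of a phone.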

Bringing it all together:

“That means Microsoft is now pursuing a dual trajectory of sorts in AI, aiming to develop both “small language models” that are inexpensive to build into apps and that could run on mobile devices, alongside larger, state-of-the-art AI models. It also shows Microsoft’s willingness to chart a new path in AI separate from the technology developed by OpenAI, which currently underlies all of the AI “Copilot” chatbots in Microsoft’s products that can automatically spin up emails or quickly summarize documents. But the exact purpose that the new model will serve hasn’t been determined yet, and will depend on how well it performs, one of the people said.”

“To train the new model, Microsoft has been setting aside a large cluster of servers equipped with Nvidia-made graphics processing units, and has been compiling a corpus of training data to improve the model. Some of that includes data drawn from various datasets that it previously used to train smaller models, including text created by OpenAI’s GPT-4, as well as from other sources such as public data across the internet, one of the people said.”

“Microsoft could preview the new model as soon as its Build developer conference later this month, depending on how well development goes in the coming weeks.”