MosaicML introduces Open-Source MPT-30B LLMs, Trained on H100s for Enhanced Generative AI Applications

Published on: June 28, 2023

On June 28, 2023, MosaicML announced the release of its latest models, MPT-30B Base, Instruct, and Chat, as part of its open-source MPT series. These models, trained with an 8k token context window, surpass the quality of the original GPT-3. They can be used directly for inference or serve as a foundation for developing proprietary models. Training was performed on NVIDIA's H100 accelerators, which are now available to MosaicML customers. By leveraging MPT-30B, businesses can harness the potential of generative AI while upholding security and data privacy.

The MosaicML MPT family of models has already gained significant popularity as one of the most robust and widely-used open-source language models for commercial applications. Since their launch on May 5, 2023, the MPT-7B models (Base, Instruct, Chat, StoryWriter) have been downloaded over 3.3 million times. With the introduction of the MPT-30B models, which are larger and of higher quality, MosaicML expands the possibilities for various applications. As always, MosaicML emphasizes efficient training and inference optimization in their MPT models, catering to the needs of developers and researchers alike.

On the third anniversary of GPT-3, it is worth highlighting that MPT-30B was specifically designed to surpass the quality of that iconic model. On standard academic benchmarks, MPT-30B outperforms the originally published GPT-3. Enterprises are increasingly deploying MPT models for use cases such as code completion and dialogue generation, and the models can be fine-tuned on proprietary data, enabling organizations to tailor them to their specific needs. The versatility and quality of MPT-30B make it a valuable tool for businesses seeking advanced language generation capabilities.

About MosaicML:

MosaicML is a leading generative AI platform that empowers enterprises to build their own AI solutions. With a team of over 60 research and engineering professionals across offices in San Francisco, New York, Palo Alto, and San Diego, MosaicML applies cutting-edge scientific research to products that make training popular machine learning models fast, cost-effective, and simple. The platform prioritizes model ownership and data privacy, ensuring that developers maintain full control over their AI models. With more than 25 speed-up techniques, MosaicML optimizes training and inference, allowing businesses to deploy their AI models efficiently and derive valuable insights while upholding data privacy and control.
