Cerebras, an AI company in Silicon Valley, has released seven open-source GPT models as an alternative to the controlled and proprietary systems currently available. These royalty-free models, including their weights and training recipe, are provided under the Apache 2.0 license. Cerebras, known for its AI infrastructure, aims to empower researchers and businesses by offering the capability to train custom language models using their Andromeda AI supercomputer.
In a blog post, Cerebras explains that all of the GPT models were trained on Andromeda, a cluster of Cerebras CS-2 wafer-scale systems. The cluster let experiments finish quickly without the complex distributed-systems engineering and model-parallel tuning that GPU clusters typically require, freeing researchers to focus on ML design. To broaden access to AI technology, Cerebras has also made the Wafer-Scale Cluster available in the cloud through the Cerebras AI Model Studio.
Cerebras GPT Models and Transparent Collaboration
Cerebras justifies the release of its seven open-source GPT models by pointing to the concentration of AI technology in the hands of a few companies. Unlike OpenAI, Meta, and DeepMind, which tightly control their systems and keep much of the detail private, Cerebras favors open, reproducible, royalty-free models. Using up-to-date techniques and open datasets, it has trained a family of transformer models called Cerebras-GPT, released under the Apache 2.0 license. Cerebras hopes that publishing the models on platforms like Hugging Face and GitHub will encourage further research and collaboration.
Training the Cerebras-GPT models on the Andromeda AI supercomputer took only a few weeks. In contrast, other efforts like OpenAI's GPT-4, DeepMind's models, and Meta's OPT reveal less about how they were trained. Cerebras-GPT shares the entire process openly and transparently: the model architecture, training data, weights, checkpoints, and compute-optimal training details are all published under the Apache 2.0 license.
The seven models range in size from 111M to 13B parameters. Cerebras researchers completed the work in a few weeks thanks to the speed and efficiency of the CS-2 systems that make up the Andromeda supercomputer and its weight-streaming architecture. The models demonstrate how well Cerebras' systems handle large, complex AI training workloads.
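The "compute-optimal" framing refers to the Chinchilla-style rule of thumb of roughly 20 training tokens per model parameter, which Cerebras reports following for this family. The sketch below illustrates that ratio for the seven published model sizes; the Hugging Face model identifiers reflect the public repo naming and should be verified against the hub, and the token figures are back-of-the-envelope estimates, not official numbers:

```python
# Approximate compute-optimal training-token budgets for the Cerebras-GPT
# family, assuming the Chinchilla rule of ~20 tokens per parameter.
# Parameter counts are the published model sizes; the derived token
# figures are illustrative estimates, not official Cerebras numbers.

CHINCHILLA_TOKENS_PER_PARAM = 20

model_params = {
    "cerebras/Cerebras-GPT-111M": 111e6,
    "cerebras/Cerebras-GPT-256M": 256e6,
    "cerebras/Cerebras-GPT-590M": 590e6,
    "cerebras/Cerebras-GPT-1.3B": 1.3e9,
    "cerebras/Cerebras-GPT-2.7B": 2.7e9,
    "cerebras/Cerebras-GPT-6.7B": 6.7e9,
    "cerebras/Cerebras-GPT-13B": 13e9,
}

def optimal_tokens(params: float) -> float:
    """Chinchilla-style compute-optimal token budget for a model size."""
    return CHINCHILLA_TOKENS_PER_PARAM * params

for name, n_params in model_params.items():
    print(f"{name}: ~{optimal_tokens(n_params) / 1e9:.1f}B tokens")
```

Under this rule, the 13B-parameter model would call for on the order of 260B training tokens, which is why training at these scales is dominated by data throughput as much as by model size.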
Advancements in Open Source AI
The open-source AI movement is still young but gaining momentum. Mozilla's Mozilla.ai project is building trustworthy, privacy-respecting open-source GPT and recommender systems. Databricks has released Dolly, an open-source GPT clone that aims to democratize "the magic of ChatGPT." And alongside Cerebras, Nomic AI has developed GPT4All, an open-source GPT that can run on a laptop. This growing movement could transform industries and keep the technology from concentrating in the hands of a few corporations.
If this pace of open-source advancement continues, we may witness a shift in AI innovation that promotes inclusivity and collaboration.
For more information, read the official announcement: "Cerebras Systems Releases Seven New GPT Models Trained on CS-2 Wafer-Scale Systems."