Stability AI, Hugging Face, and Canva Establish New Non-Profit Organization for AI Research

In Brief

Developing cutting-edge AI systems like ChatGPT requires massive technical resources, in part because they are costly to build and run.

However, open-source efforts have struggled to reverse-engineer the closed-source systems created by commercial labs like DeepMind and OpenAI.

AI systems like ChatGPT require a lot of technical resources to develop, in part because of their high building and operating costs. As commercial labs like Alphabet’s DeepMind and OpenAI produce proprietary, closed-source systems, open-source efforts to reverse-engineer them have frequently run into roadblocks, mostly because of a lack of funds and expertise.

To avoid that fate, one community research group, EleutherAI, is establishing a non-profit foundation. The non-profit research institute, the EleutherAI Institute, will be funded through donations and grants from backers including AI startups Hugging Face and Stability AI, former GitHub CEO Nat Friedman, Lambda Labs, and Canva.

“As an organization, we can build a full-time staff and participate in longer and more involved projects than we would as a volunteer group,” Stella Biderman, an AI researcher at Booz Allen Hamilton who will lead the EleutherAI Institute, told TechCrunch. “With regard to nonprofit work specifically, it must be a no-brainer given our focus on research and open source.”

Connor Leahy, Leo Gao, and Sid Black founded EleutherAI several years ago as a grassroots collective of developers working to open-source AI research. They wrote the code and gathered the data to create a machine learning model similar to GPT-3, OpenAI’s text-generating model, which was gaining a lot of attention at the time.

Using The Pile dataset, the group trained GPT-3-like models to complete text, code, and more. Several models were released under the Apache 2.0 license, including the GPT-J and GPT-NeoX language models, which fuelled a new wave of startups.

For compute, EleutherAI relied mostly on the TPU Research Cloud, a Google Cloud program that supports projects in the hope that their outcomes will be made public. CoreWeave, a U.S.-based specialized cloud provider built for large-scale GPU-accelerated workloads, also provided compute resources to EleutherAI in exchange for AI models that CoreWeave can offer to its own customers as cloud services. CoreWeave has worked extensively with EleutherAI; for more detail, see “CoreWeave Unlocks the Power of EleutherAI’s GPT-NeoX-20B.”

Over 20 of the community’s regular contributors are now working full-time, focusing mostly on research. Over the past 18 months, EleutherAI members have co-authored 28 academic papers, trained dozens of models, and released ten codebases. However, EleutherAI cancelled its planned GPT-3-sized model because of its cloud providers’ fickleness.

In late 2022, EleutherAI became well acquainted with Stability AI, the now well-financed startup behind the image-generating AI system Stable Diffusion. Along with other collaborators, EleutherAI helped to create the initial version of Stable Diffusion. Since then, Stability AI has donated a portion of its AWS cluster to EleutherAI’s ongoing language model research.

Biderman says that the non-profit discussions started after Hugging Face approached EleutherAI. Many EleutherAI staff were involved with BigScience, an effort that sought to train and open-source a GPT-3-like model over the course of a year.

Biderman said that EleutherAI has focused on ChatGPT-like large language models in the past and will most likely continue to do so. Beyond training large language models, she said, the group is excited to devote more resources to ethics, interpretability, and alignment work.

There might be one problem, however: EleutherAI’s research may be influenced by commercially motivated backers like Stability AI and Hugging Face, both of which are funded by significant venture capital. At least one study shows how corporations influence policy through nonprofit donations.

Biderman says the EleutherAI Foundation will remain independent and that there have been no issues with the donor pool so far. “We are not developed by commercial companies. Instead, I think we benefit from being financed by a range of organizations. In theory, that makes our independence greater,” she said.

The EleutherAI Foundation will have to overcome another challenge: ensuring its coffers do not run dry. In 2015, OpenAI was established as a nonprofit but later converted to a “capped profit” structure in order to fund its ongoing research.
Broadly speaking, nonprofit initiatives that fund AI research have been a mixed bag.

The Allen Institute for AI (AI2) is among the notable successes in the field. There is also the Alan Turing Institute, the UK’s government-funded national institute for data science and machine learning. Newer ventures such as Cohere For AI (despite its corporate affiliation) and Timnit Gebru’s Distributed AI Research, a globally distributed research organization, also seem promising.

The EleutherAI Foundation will no doubt change and expand its mission over time, hopefully for the better.

