How Scale became the go-to company for AI training
The company works with giants like OpenAI and Meta—and has paid out hundreds of millions of dollars to freelance trainers in just the past year.
For the large language models (LLMs) that power apps like ChatGPT, Anthropic’s Claude, and Google’s Gemini to be good conversational partners and assistants, they need to be trained by humans with plenty of examples of appropriate answers.
AI companies often use techniques like reinforcement learning from human feedback (RLHF), in which humans provide examples of good answers or evaluate and score the responses an AI produces. Training AI to the level of today’s chatbots and agents can take enormous human effort, but major AI labs like OpenAI and Anthropic typically employ relatively few people.
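As a rough illustration, the heart of RLHF is a “reward model” trained on human preference judgments. The toy sketch below is hypothetical, not anything Scale or its clients have published: it fits a tiny linear reward model to made-up preference pairs using the Bradley-Terry loss that underpins most reward-model training, standing in for the large neural networks real labs use.

```python
# Toy sketch of RLHF's preference-modeling step. Everything here is
# hypothetical: real reward models are large neural networks scoring
# token sequences, not 3-dimensional feature vectors.
import numpy as np

# Each (chosen, rejected) pair stands in for a human rater picking the
# better of two model responses.
pairs = [
    (np.array([0.9, 0.8, 0.1]), np.array([0.2, 0.1, 0.7])),
    (np.array([0.7, 0.9, 0.2]), np.array([0.3, 0.2, 0.9])),
    (np.array([0.8, 0.6, 0.0]), np.array([0.1, 0.4, 0.8])),
]

w = np.zeros(3)   # weights of a linear "reward model"
lr = 0.5          # learning rate

for _ in range(200):
    for chosen, rejected in pairs:
        # Bradley-Terry objective: push reward(chosen) above reward(rejected).
        margin = w @ chosen - w @ rejected
        p = 1.0 / (1.0 + np.exp(-margin))   # P(rater prefers "chosen")
        w -= lr * (p - 1.0) * (chosen - rejected)

print("reward weights:", w.round(3))
print("score of a 'good' response:", round(w @ np.array([0.85, 0.7, 0.1]), 3))
print("score of a 'bad' response: ", round(w @ np.array([0.2, 0.3, 0.8]), 3))
```

In a production pipeline, the chatbot itself would then be updated, typically with a reinforcement learning algorithm such as PPO, to produce responses that score highly under the trained reward model.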
To provide the labor needed to train generative AI, including finding specialized experts who can help models learn esoteric skills like solving advanced math and science problems, AI companies and other businesses looking to develop and refine AI models increasingly turn to specialized firms, chief among them Scale AI.
Scale got its start about seven years ago, at the time focused on pre-generative AI work like building pipelines of labeled image data to help self-driving cars learn to recognize pedestrians, road signs, and other sights they were likely to encounter in their travels.
“We became known for some key techniques that we developed during that time, data pipelines that we built that ended up actually powering a lot of what’s now the gen AI revolution,” says Vijay Karunamurthy, Scale’s field CTO.
More than a temp agency
And about three years ago, Scale began working with OpenAI on RLHF techniques to refine systems like ChatGPT. Today, Scale operates a sprawling AI training platform called Outlier, which Outlier general manager Xiaote Zhu says paid out hundreds of millions of dollars to tens of thousands of freelance contributors around the world over roughly the past year. A second, smaller Scale work platform called Remotasks also operates mostly with freelancers outside the U.S. and is still focused primarily on computer vision and autonomous cars.
Scale now counts OpenAI, Microsoft, Meta, Nvidia, and Character.ai as clients, among numerous other businesses and government agencies, and the company’s rapid rise has reportedly made CEO and founder Alexandr Wang a billionaire by age 27. And Outlier regularly advertises for hundreds of roles helping AI with languages from Norwegian to Farsi and with a liberal arts curriculum’s worth of skills, including coding, music, nuclear physics, philosophy, “self help,” and law.
“The domain experts are incredibly important for this sort of work,” says Karunamurthy. “Having expert feedback, but also culturally aware, language-specific feedback, all of that’s really important things to consider when you’re fine-tuning those models.”
But while a big and necessary part of Scale’s operations is managing those workers—and finding ones with the experience to teach AI systems about even the most esoteric fields clients expect them to handle—the company is much more than a specialized temp agency. Scale works with AI companies to continually test the latest versions of their models, which are often being trained and tweaked around the clock on billion-dollar arrays of powerful GPUs, providing detailed, expert-driven feedback on what’s actually changed.
“Every time we get feedback that the model’s changed its thinking on a given topic, we go back and we interrogate that model even further, and we see what’s really changed under the hood,” says Karunamurthy. “Is this like a real robust, lasting change that’s been made to the model, or is it something a little bit more shallow?”
Scale’s experts can provide the AI with detailed guidance on how to solve particular problems, or help ensure that models can explain their own work, which is necessary in applications where AI may be subject to auditing. The company has also developed a set of proprietary benchmarks that measure performance in a variety of fields, with the general methodology made public but the details kept secret so AI developers (or the AIs themselves) can’t simply study for the test. Some of the tests verify that models perform well across multiple languages, critical for models deployed in fields like healthcare, where they might be asked questions in any number of languages, and that they behave properly even when users try to manipulate them into breaking their own rules, Karunamurthy says.
Scale works with businesses that use AI, as well as the companies that build systems for public use. It can help clients evaluate which model, data setup, and other parameters suit them best, and is sometimes brought in by big-name AI labs to assist their enterprise customers. Scale can also help businesses fine-tune open source models for use on proprietary information—think insurance claims data or financial trades—within their own data centers. In July, for example, the company announced a partnership with Meta to help companies “customize, evaluate, and deploy” a version of Meta’s Llama open source AI model, and on November 19, it announced a similar deal to help enterprises build around Microsoft’s Azure AI systems.
“We fine-tune those models against their sets of data, but we keep those model weights secure so that even their own employees don’t leak the information that’s gone into training those models,” says Karunamurthy. “And that turns out to be a really powerful paradigm.”
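As a hypothetical sketch of what that paradigm can look like, the snippet below fine-tunes an open model on a stand-in “claims” record using the open source Hugging Face transformers, peft, and datasets libraries. Scale’s actual pipeline isn’t public; the model name, dataset, and hyperparameters here are all illustrative placeholders.

```python
# Hypothetical sketch: LoRA fine-tuning of an open model on private data.
# LoRA trains small adapter matrices instead of the full base weights, so
# the sensitive data only shapes a small set of parameters that can stay
# inside the customer's own data center.
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "meta-llama/Llama-3.1-8B"   # placeholder open model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16,
                                         task_type="CAUSAL_LM"))

# Stand-in for proprietary records (e.g., insurance claims).
records = Dataset.from_dict(
    {"text": ["claim #123: windshield damage, approved, $840 ..."]})

def tokenize(batch):
    out = tokenizer(batch["text"], truncation=True, max_length=512)
    out["labels"] = out["input_ids"].copy()   # standard causal-LM objective
    return out

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-out", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=records.map(tokenize, batched=True,
                              remove_columns=["text"]),
)
trainer.train()
model.save_pretrained("ft-out/adapter")   # adapters stay on premises
```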
When Scale works with its outside contractors to train AI, it naturally takes steps to maintain secrecy as well. The company has reportedly used code names to refer to big tech clients, so freelancers often don’t know whose AI they’re training. Scale has also drawn complaints at times that assignments and rates can be unpredictable: while workers can pick their own hours, they may not know in advance how much work will be available. Some workers have also complained that support staff can be difficult to reach, even when there are issues with payments. Other companies in the rapidly growing field have faced similar complaints.
‘They get to leverage the expertise’
On November 1, upon officially assuming her role as Outlier general manager, Zhu published a blog post declaring “a new era of Outlier,” announcing that the company was taking steps to improve the worker experience and adding new features to better connect clients with potential contributors.
“It’s a very common thing for any rapidly growing platform, where as you continue to scale up, obviously, there’s more challenge on the platform, and you have to keep investing and improving that,” Zhu tells Fast Company.
The company has also taken steps to prevent fraud, such as people lying about their identities or using bots like ChatGPT to write responses that are supposed to come from humans, without locking out legitimate users. And in the blog post, Zhu described steps the company is taking to improve worker support and ensure greater pay transparency.
“This includes faster resolution of issues, like account management questions, and enhanced pay transparency through a detailed earnings tab and visible pay rates during tasks,” Zhu wrote in the blog post. “To reduce payment delays, we’ve added helpful tooltips, and our revamped support system now resolves 90% of pay-related inquiries within three days, ensuring a smoother, more reliable experience.”
As Scale comes to rely more on specialized contributors, at times even recruiting experts like scholars with doctorates or winners of international math competitions, providing prompt payments and a good work environment may be particularly critical to recruit and retain reliable workers with the knowledge AI systems need.
“The way I explain it usually is that it’s me using every inch and every little bit of knowledge that I have taken a whole lifetime to learn and using it in different and very creative ways,” says Gabriela Sanders, an Outlier contributor who previously worked as an elementary school interventionist.
Sanders now trains AI more or less full-time, working about 40 hours a week and enjoying the flexibility of a job that lets her work around her family’s schedule. She compares her experience working with AI models to working with students and finding ways to help them understand the subject matter.
“The model needs to have various things given to it very specifically in a very tailored way for it to learn and to build on what its knowledge is,” she says.
And for some Outlier contributors, Zhu says, the learning goes both ways, with workers getting more comfortable with the technology as they see its limitations and the details of how it gets trained.
“They get to leverage the expertise, the passion, the skills they have, and then also, in many cases, we’ve seen that it changed their outlook on AI,” Zhu says. “For some people, before joining Outlier, AI might be one of those kind of scary things, because they don’t have much understanding of how it works. And after being part of the platform, they understand how the model works—they feel like they understand more about the limitations and the use cases for it, so it changes their outlook.”