Scale AI’s path to becoming a $7.3 billion company was paved with real image, text, voice and video data. Now, it’s using that foundation to jump into the synthetic data game, one of the hottest emerging categories in AI.
The company announced on Wednesday an early access program for Synthetic Scale, a product that machine learning engineers can use to enhance their existing real-world datasets. Scale has hired two executives to build this new division of its business. Joel Kronander, who previously headed machine learning at Nines and is a former Apple computer vision engineer who worked on 3D mapping, joins as its new head of synthetic data. The company also hired Vivek Raju Muppalla as director of synthetic services. Muppalla was previously director of engineering for AI and simulation at Unity Technologies.
Synthetic data is what it sounds like: fake data created by machine learning algorithms rather than collected from the real world. It can be a powerful tool for generating data – such as medical images – when privacy is a top concern. Developers can also use synthetic data to add more complexity to their training models and to help remove biases that are often found in datasets collected from the real world.
Founder and CEO Alexandr Wang described the new offering as a hybrid approach to data, similar to lab-grown meat.
“We started with real data, just like lab-grown meat starts from real animal cells, and then we grow and iterate and build the product from there,” he told Ploonge. By using real-world data as a foundation to create synthetic data, the company is able to deliver a truly unique and powerful offering to customers, Wang said, adding that this was a gap Scale saw in the market.
Scale’s customers have also seen this gap. The company’s foray into synthetic data was a response to customer demand, Wang told Ploonge, adding that Scale started developing the product less than a year ago. Autonomous vehicle technology developer Kodiak Robotics, Tractable AI and the US Department of Defense have used Scale’s new synthetic data product, Wang said.
Scale, which today employs around 450 people, sees synthetic data as a top priority in 2022 and an area it will continue to invest in as it develops its product line. But that doesn’t mean synthetic data will take over its real data business. Wang sees synthetic data as a complementary tool that will help developers “get more bang for their buck from their algorithms and other AIs and particularly with edge cases.”
For example, autonomous vehicle companies typically use simulation to recreate real-world scenarios and play them back to see how the autonomous system handles them. But real-world data might not provide the scenario they’re looking for.
“You don’t come across real-world scenarios very often where there could be, say, 100 cyclists crossing at the same time,” Wang explained. “We can start with real-world data and synthetically add all the cyclists or all the people, and that way you can train the algorithm properly.”