Meta’s new AI tool could capture real-world objects for the metaverse and its ad business

The Segment Anything model can ‘clip’ objects from images it’s never seen before.

[Source photo: Suzy Hazelwood/Pexels]

Meta has developed a new AI model called Segment Anything that can cut any object out of any digital image or video, even if it’s never seen the object or the image before. The research could have big implications for Meta’s metaverse (if that ever materializes) as well as its core ads business.

The technology looks similar to the feature in the iPhone’s Photos app that lifts people or objects out of an image’s background. But Meta’s model is probably more powerful and certainly more versatile. The image set used to train Segment Anything is said to be 400 times larger than the next-largest of its kind. (Meta has made the training dataset, along with the model itself, available as open source.)

Using a massive image dataset, Meta researchers taught the model to identify the pixels that make up an object, rather than to recognize specific objects. As a result, the model can segment any object, from cancer cells to undersea creatures, regardless of context. That means third parties (and Meta itself) can put the foundation model to use without first bearing the expense of fine-tuning it on specific, labeled images.
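
Because Meta open-sourced the model’s code and weights, that zero-shot behavior can be tried directly. Here is a minimal sketch using the segment_anything Python package’s automatic mask generator; the image path is illustrative, and the checkpoint filename refers to Meta’s released ViT-H weights:

```python
# A minimal sketch of zero-shot segmentation with Meta's open-source
# segment_anything package (pip install segment-anything). The image
# path is illustrative; the checkpoint is Meta's released ViT-H weights.
import cv2
from segment_anything import SamAutomaticMaskGenerator, sam_model_registry

# Load the pretrained SAM backbone from a downloaded checkpoint file.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")

# The automatic generator proposes masks for everything it finds,
# with no prompts and no fine-tuning on labeled examples.
mask_generator = SamAutomaticMaskGenerator(sam)

# SAM expects an RGB array; OpenCV loads BGR, so convert.
image = cv2.cvtColor(cv2.imread("photo.jpg"), cv2.COLOR_BGR2RGB)

masks = mask_generator.generate(image)
# Each entry is a dict holding a boolean 'segmentation' mask plus
# metadata such as 'area', 'bbox', and a predicted quality score.
print(f"found {len(masks)} candidate objects")
```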

The company released a research paper on the model and a web demo that lets people try the AI on their own images.

GENERATING A METAVERSE?

The company has bet billions that the metaverse will be a popular place where people socialize, work, and play in the future. The Segment Anything AI could find its most interesting applications in the virtual and augmented reality glasses that Meta’s Reality Labs group is developing as a primary access point to the virtual 3D world it calls the metaverse. Meta has already built eye-tracking sensors into hardware such as its Quest Pro VR headset, which can detect where a user is looking. The Segment Anything model could then be used to isolate the object a user is looking at from its environment, identify it, and convert it into 3D digital content.

“In the AR/VR domain, SAM (Segment Anything Model) could enable selecting an object based on a user’s gaze and then ‘lifting’ it into 3D,” Meta says in a blog post.
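
Meta hasn’t published code for that gaze-driven flow, but the idea maps naturally onto SAM’s point-prompt interface. A rough sketch, assuming a hypothetical headset API that reports the gaze fixation in image coordinates (only the segment_anything calls below are real; the function name and gaze inputs are illustrative):

```python
# Hypothetical gaze-to-segmentation glue. The gaze coordinates would
# come from the headset's eye tracker (no real eye-tracking API is
# shown); the predictor calls are from Meta's open-source
# segment_anything package, here with the smaller ViT-B checkpoint.
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

def mask_at_gaze(frame_rgb: np.ndarray, gaze_x: int, gaze_y: int) -> np.ndarray:
    """Segment whatever object the user's gaze is resting on."""
    predictor.set_image(frame_rgb)
    masks, scores, _ = predictor.predict(
        point_coords=np.array([[gaze_x, gaze_y]]),  # fixation as a point prompt
        point_labels=np.array([1]),                 # 1 marks a foreground point
        multimask_output=True,                      # let SAM propose alternatives
    )
    # Keep SAM's highest-confidence mask; "lifting" it into a 3D asset
    # would be a separate reconstruction step not shown here.
    return masks[scores.argmax()]
```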

The technology could be especially valuable because, as Meta CTO Andrew Bosworth told Nikkei, generating the immersive content that surrounds the user and reflects their interests and tastes could be a very expensive proposition. “In the future you might be able to just describe the world you want to create and have the large language model generate that world for you,” Bosworth said. “And so it makes things like content creation much more accessible to more people.”

ADS, ADS, ADS

And, of course, there’s a high probability that Meta would use the technology for advertising. If a user’s eyes rest on a product in a store window or on the street, the company that makes it might pay Meta to present additional information and buying options around the object in the user’s view through the AR lenses. Or Meta might simply note the real-world items a user’s gaze rests on, then serve up ads for those products the next time the user opens the Facebook app.

Meta announced in February that it had created a dedicated generative AI group. That group, Bosworth says, has been busy. He told Nikkei that he, along with CEO Mark Zuckerberg, has been spending a lot of time with the new group. Meta’s investors would be happy to see the company bringing AI to bear in amping up ad sales.

But at what cost to users? Researchers already fear that generative AI could be used to create endless permutations of misinformation posts or phishing emails, systematically zeroing in on the content best suited to fool a particular type of person. Image-generation AI models (like Midjourney, Stable Diffusion, and DALL-E) could be used the same way to create endless permutations of ads, systematically testing to find the most irresistible of the batch.

And it appears that Meta may be heading in that direction. Bosworth told Nikkei that his company has been developing generative AI models for ad creation, and that the generative models could go into production this year.

ABOUT THE AUTHOR

Mark Sullivan is a senior writer at Fast Company, covering emerging tech, AI, and tech policy. Before coming to Fast Company in January 2016, Sullivan wrote for VentureBeat, Light Reading, CNET, Wired, and PCWorld.
