
Now anyone can have their own Disney studio, thanks to generative AI

Stable Animation is like having a stable of talented animators at your fingertips.

[Source photo: Curious Refuge, Stable Animation]

Today’s generative AI roundup was going to be titled, “The whimsical fellowship of the ring versus the menagerie of imperial stormtroopers: How Wes Anderson keeps remaking iconic films without even knowing it.” But if generative AI has taught us anything, it’s that it’s impossible to plan ahead, given the breakneck announcements happening every other minute of the day.

So, now I have to trash that headline and a 200-word intro essay on pedicured hobbits and waxed Ewoks to talk about Stable Animation and some other things, like how Meta’s new ImageBind can create coherent scenes merging images, video, sound, music, and depth. Or a new platform called HumanRF that can capture human motion and replay it from any angle you want. And, of course, how someone got a fake Owen Wilson to play Sauron and Darth Vader.

STABLE ANIMATION: EVERYONE IS DISNEY NOW

I don’t know how much longer Stability.ai can maintain its onslaught of new releases. The organization has been shipping new generative AI products every week since it announced the hyperrealistic image generator Stable Diffusion XL. There was ChatGPT rival StableLM, the text-rendering image generator DeepFloyd IF (the first in its class), and now Stable Animation.

Think of Stable Animation as the video generator Runway Gen-2, but exclusively for animation. It can produce pretty much any illustration style imaginable, in crisp, well-defined resolution (at least judging by the demo trailer below). In other words, it’s like having your very own animation studio loaded with talented animators working to materialize your vision.

It’s not an app you can access on the web; rather, it’s a free, open-source software development kit for artists and developers, who can pair it with the Stable Diffusion 2.0 or Stable Diffusion XL generative AI engines to create very complex and sophisticated animations.

The engine upscales and interpolates to create video at any resolution with an unlimited number of frames. It can also do 3D warping and rendering, control color, extend images with outpainting, and even interpolate between prompts to get wild transitions between scenes. Like Gen-2, the engine supports text prompts to create animations from scratch, and it can combine video or images with text to guide the animations.
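For the developers in the audience, driving it comes down to a few lines of Python. Here’s a minimal sketch modeled on Stability’s example notebooks; the exact class and parameter names (Animator, AnimationArgs, the gRPC host) are assumptions that may have shifted since release, so check the SDK docs before copying:

```python
# Sketch of driving Stable Animation via Stability's Python SDK.
# Names follow Stability's example notebooks and may have changed;
# treat them as assumptions, not gospel.
# pip install "stability-sdk[anim]"
from stability_sdk import api
from stability_sdk.animation import AnimationArgs, Animator

context = api.Context("grpc.stability.ai:443", "YOUR_API_KEY")

args = AnimationArgs()
args.max_frames = 48
args.interpolate_prompts = True  # blend between prompts over time
args.locked_seed = True          # keep frames visually coherent

# Keyframed prompts: frame number -> prompt text
animation_prompts = {
    0: "a watercolor fox running through a forest",
    24: "a watercolor fox leaping over a river",
}

animator = Animator(
    api_context=context,
    animation_prompts=animation_prompts,
    negative_prompt="",
    args=args,
)

for i, frame in enumerate(animator.render()):
    frame.save(f"frame_{i:05d}.png")  # stitch into a video with ffmpeg
```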

IMAGEBIND: AN AI WITH AN IMAGINATION

Meta’s new AI, ImageBind, is impressive too (and open source, so expect to see apps using it soon). According to the company’s press release, this generative AI brings machines “one step closer to humans’ ability to learn simultaneously, holistically, and directly from many different forms of information—without the need for explicit supervision (the process of organizing and labeling raw data).” It can understand how different information is part of the same whole, seamlessly integrating text, image, video, audio, depth, inertial data, and thermal data.

This word salad means you can feed an image, a video, or a bit of text to ImageBind, and this thing will guess how it sounds, its 3D shape, how warm or cold it actually is, and how it moves. Meta describes this practical application: “Imagine that someone could take a video recording of an ocean sunset and instantly add the perfect audio clip to enhance it, while an image of a brindle Shih Tzu could yield essays or depth models of similar dogs. Or when a model like Make-A-Video produces a video of a carnival, ImageBind can suggest background noise to accompany it, creating an immersive experience.”
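Technically, ImageBind does this by projecting every modality into one shared embedding space, where things that belong together land close together. Here’s a minimal sketch of scoring how well audio clips match images, mirroring the launch README of the open-source repo (github.com/facebookresearch/ImageBind); import paths and the hypothetical file names are assumptions, so verify them against the current code:

```python
# Sketch: compare images against audio clips in ImageBind's shared
# embedding space. Mirrors the project's launch README; names may
# have changed since.
import torch
from imagebind import data
from imagebind.models import imagebind_model
from imagebind.models.imagebind_model import ModalityType

device = "cuda:0" if torch.cuda.is_available() else "cpu"

model = imagebind_model.imagebind_huge(pretrained=True)
model.eval()
model.to(device)

inputs = {
    ModalityType.VISION: data.load_and_transform_vision_data(
        ["sunset.jpg", "carnival.jpg"], device),   # hypothetical files
    ModalityType.AUDIO: data.load_and_transform_audio_data(
        ["waves.wav", "crowd.wav"], device),       # hypothetical files
}

with torch.no_grad():
    emb = model(inputs)

# Each row: how strongly one image "sounds like" each audio clip.
scores = torch.softmax(
    emb[ModalityType.VISION] @ emb[ModalityType.AUDIO].T, dim=-1)
print(scores)
```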

But of course it will be more than that once others start to use these models. This takes machines one step closer to having an imagination. I don’t know if I want to purr in awe or scream in terror.

HUMANRF: MOTION CAPTURE FROM ANY ANGLE

Here’s another project that will make video and game production a lot easier: HumanRF (which stands for High-Fidelity Neural Radiance Fields for Humans in Motion) lets you capture NeRFs of humans in motion. NeRFs are neural networks that learn a 3D scene from ordinary 2D images so you can navigate it from any viewpoint, but until now they were mostly limited to static scenes or very slight motion.
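If you’re wondering what a radiance field actually is, at its core it’s just a small neural network that maps a 3D point plus a viewing direction to a color and a density, which a renderer then integrates along camera rays to form each pixel. Here’s a toy sketch in PyTorch of that generic NeRF idea (this is not HumanRF’s actual code, which adds a temporal decomposition to handle motion):

```python
import torch
import torch.nn as nn

class TinyRadianceField(nn.Module):
    """Toy radiance field: (3D point, view direction) -> (color, density)."""
    def __init__(self, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(6, hidden), nn.ReLU(),   # 3 coords + 3 direction dims
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),              # RGB (3) + volume density (1)
        )

    def forward(self, xyz, direction):
        out = self.mlp(torch.cat([xyz, direction], dim=-1))
        rgb = torch.sigmoid(out[..., :3])      # color in [0, 1]
        sigma = torch.relu(out[..., 3:])       # non-negative density
        return rgb, sigma

# Query the field at sample points along one camera ray; a renderer
# would alpha-composite these samples into a single pixel color.
field = TinyRadianceField()
points = torch.rand(64, 3)                     # 64 samples along the ray
dirs = torch.zeros(64, 3); dirs[:, 2] = 1.0    # all looking down +z
rgb, sigma = field(points, dirs)
```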

Developed by Synthesia, HumanRF can capture humans in full motion using multiple cameras and turn the footage into human NeRFs that let you watch the subject from any vantage point imaginable. Remember those silly motion-capture pajamas with the colored balls that production companies use to turn Josh Brolin into Thanos? Well, no more of that.

A MENAGERIE OF WES ANDERSON TRAILERS

The TikTok fad of turning your life into a Wes Anderson short just got one-upped by the nice folks at Curious Refuge, who are now releasing trailers for iconic films remade by the eccentric director. Whether you love or hate Anderson, stop reading and watch these videos:

Curious Refuge claims it used AI to assist with everything: idea generation, scriptwriting, scene creation, editing, animation, voice-over, and even distribution. I would guess the editing is probably just Premiere, Final Cut, or DaVinci Resolve, but I’m seeing some Midjourney or Stable Diffusion XL, maybe ElevenLabs for the narration, conceivably D-ID for the subtle facial animations, perhaps ControlNet depth maps for the smooth camera effects, and most probably ChatGPT for the script. I have no idea how AI could be involved in distribution, but whatever.

The good news: They’re planning to make more. You can send them your suggestions on their page or on Instagram. My vote: The Shining.


ABOUT THE AUTHOR

Jesus Diaz founded the new Sploid for Gawker Media after seven years working at Gizmodo, where he helmed the lost-in-a-bar iPhone 4 story. He's a creative director, screenwriter, and producer at The Magic Sauce and a contributing writer at Fast Company.
