AI: A flood of slurpable 'AI Slop' ahead. RTZ #731

Michael Parekh AI: Reset to Zero

8 months ago

We’ve been on the threshold of an abundance of AI generated content, created at scale by both humans and machines. As discussed last month, it’s now called ‘AI Slop’ with increasing frequency in the media. It’s something we’ve been ‘on the verge’ of for some time now, with concerns ranging from AI generated ‘deepfakes’, to existential questions on ‘What is a Photo’, to coming waves of ‘Synthetic Data’ now generating torrents of ‘Synthetic Content’.

OpenAI recently signing up over a million users in a day, with users generating ‘Ghibli’ style images, was a vivid case in point.

Not to mention increasing use of ‘Digital Twins’ for both enterprise and consumer use. And the increasing generation of ‘machine to machine’ (m2m) content for ultimate mainstream consumption. Turbo charged by AI Agents soon, no less.

This issue has come to the forefront again with Google unveiling Veo 3, its answer to OpenAI’s Sora text to video AI models, at last week’s Google I/O conference. The event had over two hundred mentions of ‘Gemini AI’, both as one phrase and as two separate words. Its new trick is the ability to add realistic audio and dialogue to AI generated video. So far, text to video had been like the ‘silent films’ of yore. All this comes with rising concerns about the implications for Hollywood.

And even though Veo 3 in a usable form is behind a hefty paywall of a $250/month subscription, the internet has been flooded with examples of what creative folks can do with it, both good and bad. Fortunately, guardrails have blocked off the really bad stuff, but the waves of concern are rising in the media.

The Verge lays it out in “Google’s Veo 3 AI video generator is a slop monger’s dream”:

“Even at first glance, there’s something off about the body on the street. The white sheet it’s under is a little too clean, and the officers’ movements are totally devoid of purpose. “We need to clear the street,” one of them says with a firm hand gesture, though her lips don’t move. It’s AI, alright. But here’s the kicker: my prompt didn’t include any dialogue.”

“Veo 3, Google’s new AI video generation model, added that line all on its own. Over the past 24 hours I’ve created a dozen clips depicting news reports, disasters, and goofy cartoon cats with convincing audio — some of which the model invented all on its own. It’s more than a little creepy and way more sophisticated than I had imagined. And while I don’t think it’s going to propel us to a misinformation doomsday just yet, Veo 3 strikes me as an absolute AI slop machine.”

Journalists and online creators are having a blast through the Memorial Day weekend creating all kinds of examples, both cool and clever, AND crass and cringeworthy. Mirroring human nature as it were. Just with a new technology toy. A toy for now, a useful tool for millions soon enough.

“Google introduced Veo 3 at I/O this week, highlighting its most important new capability: generating sound to go with your AI video. “We’re entering a new era of creation,” Google’s VP of Gemini, Josh Woodward, explained in the keynote, calling it “incredibly realistic.” I wasn’t completely sold, but then, a few days later, I had Veo 3 generate a video of a news anchor announcing a fire at the Space Needle. All it took was a basic text prompt, a few minutes, and an expensive subscription to Google’s AI Ultra plan. And you know what? Woodward wasn’t exaggerating. It’s realistic as hell.”

“I tried the news anchor prompt after seeing what Alejandra Caraballo, a clinical instructor at Harvard Law School’s Cyberlaw Clinic, was able to produce. One of her clips features a news anchor announcing the death of US Secretary of Defense Pete Hegseth. He is not dead, but the clip is incredibly convincing. A post including a string of videos with AI-generated characters protesting the prompts used to create them has 50,000 upvotes on Reddit. The scenes include disasters, a woman in a hospital bed using a breathing tube, and a character being threatened at gunpoint — all with spoken dialogue and realistic background sounds. Real lighthearted stuff!”

Fearful stuff indeed in some of the above iterations.

The Verge has plenty of video clips illustrating the examples above and below.

“Maybe I’m being naive, but after playing around with Veo 3 I’m not quite as concerned as I was at first. For starters, the obvious guardrails are in place. You can’t prompt it to create a video of Biden tripping and falling. You can’t have a news anchor announce the assassination of the president, or even generate a video of a T-shirt-and-chain-wearing tech company CEO laughing while dollar bills rain down around him. That’s a start.”

“That said, you can generate some troubling shit. Without any clever workarounds I prompted Veo 3 to create a video of the Space Needle on fire. Starting with my own photo of Mount Rainier, I generated a video of it erupting with smoke and lava. Coupled with a clip of a news anchor announcing said disaster, I can see how you could seed some mischief real easily with this tool.”

And there are the usual questions of what Google’s guardrails enable and curtail:

“Here’s the better news: it doesn’t seem like a ready-made deepfake machine. I gave it a couple of photos of myself and asked it to generate a video with specific dialogue and it wouldn’t comply. I also asked it to bring a pair of giant boots in a photo to life and have them walk out of the scene; it managed one boot stomping across the sidewalk with some comical crunching noises in the background.”

And then there is the question of how this will impact younger minds, whose generations are always eager to embrace cool, uncouth new ways to use shiny new tech:

“I had an easier time generating videos when my prompts were less specific, which is how I confirmed something my colleague Andrew Marino pointed out: Veo 3 is excellent at creating the kind of lowest-common-denominator YouTube content aimed at kids.”

“If you’ve never been subjected to the endless pit of garbage on YouTube Kids, let me enlighten you. Imagine watching the worst 3D rendering of a monster truck driving down a ramp, landing in a vat of colored paint. Next to it, another monster truck drives down another ramp into another vat of paint — this time, a different color. Now watch that again. And again. And again. There are hours of this stuff on YouTube designed to mesmerize toddlers. These videos are usually harmless, just empty calories designed to rack up views that make Cocomelon look like Citizen Kane. In about 10 minutes with Veo 3, I threw together a clip following the same basic formula — complete with jaunty background music. But the clip that’s even more troubling to me is the two cartoon cats on a pier.”

“I thought it would be funny to have the cats complain to each other that the fish aren’t biting. In just a couple of minutes, I had a clip complete with two cats and some AI-generated dialogue that I never wrote. If it’s this easy to make a 10-second clip, stretching it out to a seven-minute YouTube video would be trivial. In its current form, clips revert to Veo 2 when you try to extend them into longer scenes, which removes the audio. But the way that Google has been pushing these tools forward relentlessly, I can’t imagine it’ll be long before you can edit a full feature-length video with Veo 3.”

“Honestly, I wonder if this sort of use for AI-generated video is a feature and not a bug. Google showed us some fancy AI-generated video from real filmmakers, including Eliza McNitt, who is working with Darren Aronofsky on a new film with some AI-generated elements. And sure, AI video could be an interesting tool in the right hands. But I think what we’re most likely to see is a proliferation of the kind of bland imagery that AI is so good at generating — this time, in stereo.”

All this is to point out new ways AI is going to create the opportunity for both mayhem and mass entertainment with this iteration of AI tech. All par for the course in the early days of this AI Tech Wave.

And as usual, we’ve barely begun to scratch the surface of what’s possible.

But the important thing is not to overly fear what’s coming, and instead to trust the inherent net goodness of people to do the right things over time. Both for fun and profit. Stay tuned.

(NOTE: The discussions here are for information purposes only, and not meant as investment advice at any time. Thanks for joining us here)
