Andrew Reid (Administrators) Posted December 9

It's scary good
KnightsFan Posted December 10

Really impressive. I think what's fascinating right now is the nice user-friendly buttons and sliders they have, like "remix strength". Obviously OpenAI has designed this to be usable by the general public, with very broad controls. I haven't looked into any API for Sora, but other platforms might be able to expose more technical or complex controls for more advanced users. The same applies to the filters they have added. If I recall, ChatGPT refuses to talk about certain subjects, whereas the API has no such limitation.
Andrew Reid (Administrators, Author) Posted December 10

It's like the very start of a director's career... A little bit of control over some minor productions, and then the skillset / toolset gets greater and greater, until you're painting a masterpiece.

It's all about that UI. The model will evolve naturally, but the UI can be fucked up!

I can see a 3 or 4-way battle of AI software happening in the future, like the NLE wars between Adobe, Apple and Avid.
KnightsFan Posted December 10

It will be especially interesting to see whether the "competitor" software relies on the same underlying model, or whether anyone will make their own. These models often work in layers, where lower abstraction levels are still using just a few models that may or may not be updated or improved upon.

My prediction is that we end up with complex AI tools that look like this classic xkcd, where the critical pillar will be a single image recognition model that a couple of grad students trained in 2008 with 1,000 images they took around campus. https://xkcd.com/2347/
IronFilm Posted December 12

I thought this was a very disappointing announcement from OpenAI. There was a lot of hype about Sora, but instead it was just yet another example making it increasingly clear that "There is No Moat", and that the gap between OpenAI and the rest has been closing. In fact, not only has the gap been closed, we can't even regard them as #1. For instance, in the case of Sora, just look at Hunyuan, Hailuo or Kling. Or for GPT4 or o1, compare against Claude.

On 12/11/2024 at 9:17 AM, KnightsFan said: It will be especially interesting to see whether the "competitor" software relies on the same underlying model, or whether anyone will make their own.

No, they're completely new models we're discussing here. Sure, OpenAI does take their GPT4 model, for instance, and constantly releases refinements of it. Or you or I could take some open sourced weights and train something on top of that. But something like Sora is "made from scratch".

Anyway, I thought this was an interesting tweet thread: https://x.com/deedydas/status/1866509455896260813

I reckon it's a great analogy: "Gymnastics is The Turing Test of Generative Video AI"
Andrew Reid (Administrators, Author) Posted December 12

Still very early days though. Video is a proxy for AI's understanding of reality... in terms of a world simulator. Physics, perspective, light, and a lot more. In the future it won't just be generating two dimensional eye-candy but immersive simulations which are interactive like video games. And fine grained control over the generation of this sort of content is going to be a very powerful thing, where the director's vision can be put out into this world in a very precise way, fully under human supervision, leaving a lot of room for direct human intervention and editing.

It's fascinating that the AI revolution has started in the same way as the earliest mainstream computer software.

First with text commands only
Then with a graphical UI like Premiere, as we see with Sora
Giving rise to rich multimedia, video streaming and 3D worlds

It really shows something fundamental to mathematics: the innate thing about it is that it evolves language into worlds. Makes me think that universes and reality are simply an expression of mathematical language. And that the universe itself - our world - is a kind of computer simulation.

(My Sora wishlist btw... https://fullframe.ai/2024/02/21/the-ai-directors-wishlist-features-filmmakers-need-from-openai-and-sora)
IronFilm Posted December 13

13 hours ago, Andrew Reid said: Still very early days though.

Indeed! To compare it with other breathtaking, revolutionary technology changes, maybe it's like the difference between the "mobile phones" (well, field phones) I played with as a kid vs the mini supercomputer phones we hold in the palm of our hands today in 2024! Almost completely unimaginable, the massive leap forward from one to the other!

Then again, perhaps we'll hit a wall and have no more progress at all for the next half century? After all, since AlphaGo way back in 2016 (which was a shockingly breathtaking new development in AI! It achieved what all mainstream AI researchers thought was still decades away) and the very famous "Attention Is All You Need" paper (published in 2017), we haven't seen anything truly groundbreaking and paradigm shifting announced in AI.

Personally, just as an obsessed AI nerd observing this over the decades, I feel everything else since then has just been building upon and refining those earlier groundbreaking insights, which laid the foundations for what we see now. Everything else since then has been:

1) further refinements and developments building upon those earlier breakthrough foundations, such as going from GPT3 to GPT4
2) or a better UI wrapped around existing core AI tech
3) or these new insights being applied to new unexplored fields, such as video generation, but still with the same underlying idea at work

That's why I think it's possible we might not see a decades-forward leap happen again in AI like the one we saw 8yrs ago, as that came as a surprise to everyone (well, to everyone outside Google's DeepMind!). But that doesn't matter: there are enough low hanging refinements to do (GPT5 when?) and enough unexplored new ground (just recently people have been doing AI generated video games! Mind blowing) to keep people busy for many years yet to come.
For instance, even if current core AI tech doesn't advance another inch, it's still good enough right now to replace 80% of workers in a Call Center. It's just the implementation, doing it effectively and converting current Call Centers to being 80% AI based, that will take a few years to get right. That's just one example of an industry that will be turned upside down, even if our core AI tech doesn't improve any more.

Just because they're incremental improvements upon existing ideas doesn't mean they can't still be high impact improvements.

13 hours ago, Andrew Reid said: Makes me think that universes and reality are simply an expression of mathematical language.

As a maths graduate myself, I 100% agree with this. Maths is the language of the universe.

13 hours ago, Andrew Reid said: (My Sora wishlist btw... https://fullframe.ai/2024/02/21/the-ai-directors-wishlist-features-filmmakers-need-from-openai-and-sora)

Ohhh... you have a new website! ♥️ Looking good.
Django Posted December 13

22 hours ago, Andrew Reid said: (My Sora wishlist btw... https://fullframe.ai/2024/02/21/the-ai-directors-wishlist-features-filmmakers-need-from-openai-and-sora)

Excellent write-up! Been experimenting with Runway a lot lately since the production company has the unlimited license. But past the initial impressiveness, these prompt based AI generation tools do show their limits quite quickly imo.

Never been a fan of stock footage, and to me this is really still just customisable stock footage. Great potential if indeed they added log, sensor size, lens choice etc. Generating from stills can help get you that precise look, but it's still way too quirky and uncanny valley for pro use. Being prompt based also leads to too much free interpretation, with odd quirks.

At least with Sora and the remix function you can potentially get more accurate results via subsequent prompts, a bit like how ChatGPT works. Love the storyboard feature with 4 variations. Very eager to try it, but I can already see the limits of its actual use. No 4K being a big one, as you point out. For storyboards and pre-production it's fantastic though.
Evgeniy85 Posted December 16

On 12/13/2024 at 7:37 AM, Django said: Excellent write-up! Been experimenting with Runway a lot lately since the production company has the unlimited license. But past the initial impressiveness, these prompt based AI generation tools do show their limits quite quickly imo. Never been a fan of stock footage, and to me this is really still just customisable stock footage. Great potential if indeed they added log, sensor size, lens choice etc. Generating from stills can help get you that precise look, but it's still way too quirky and uncanny valley for pro use. Being prompt based also leads to too much free interpretation, with odd quirks. At least with Sora and the remix function you can potentially get more accurate results via subsequent prompts, a bit like how ChatGPT works. Love the storyboard feature with 4 variations. Very eager to try it, but I can already see the limits of its actual use. No 4K being a big one, as you point out. For storyboards and pre-production it's fantastic though.

Exactly this. It feels like a random footage generator rather than a tool. I've seen some impressive stuff in Runway promos, but they generate 1000s of videos in order to pick one.
majoraxis Posted Monday at 08:49 PM

Google Veo 2 looks to have excellent physics... here's a YouTube video comparing AI video models, starting at 15:39. Kling just released 1.6 on December 19th, the day this video was released, so this is most likely Kling 1.5... 1.6 is of course supposed to be better.

One thing to note: film makers need consistent characters, and there are ways to train the image generator for generating a starting video image frame. Kling allows you to train it to generate consistent characters.

I think 2025 will be when AI video comes into its own.
KnightsFan Posted Monday at 09:14 PM

Google's version looks good, too. I do think we're still a couple of years off from good, reliable video generators for serious videos. 2025 will see a flood of "content creators" using it for sure, but gluing together multiple layers of neural nets and traditional programming into a cohesive unit will take some time.

Image generators typically work on multiple abstraction layers, so the model has a concept of what a "cat" looks like inside a prompt like "cat holding a beer." To solve physics and object permanence, I believe that video will need to have a concept of 3D space and of objects in that space, with specific characters conceptualized as a character that can be reused, etc. So I think significantly more work will need to be done on each layer of the network to get beyond making portraits with minor movement.

Then of course, going off my earlier comment, I think that a significant constraint for something like Sora is that it's designed to be used by absolute amateurs. Professional software, like an integration into Adobe CC or DaVinci Resolve, can expose more controls or even basic scripting (e.g. in Fusion) and expect users to reference a manual to learn it all. The user base for that is so much smaller that it will take more time to get there.
IronFilm Posted Wednesday at 12:50 AM

An interesting example of what's possible with AI now in the hands of a talented film maker.

Every shot of this film was done via text-to-video with Google Veo 2. It took thousands of generations to get the final film, but no VFX, no clean up, no color correction has been added: everything is straight out of Veo 2. All sound design, editing, music and prompting by Jason Zada.

But remember, people shouldn't be looking at where we are currently, but rather at the trajectory. What could AI video generation do 25yrs ago vs 10yrs ago vs 5yrs ago vs 3yrs ago vs 18 months ago vs 1yr ago vs 6 months ago vs today? It's very rapidly improving! We're doing today what was utterly unimaginable a few short years ago. What does that mean for tomorrow, or six months, or three years or six years from now?
Davide DB Posted 21 hours ago

On 12/25/2024 at 1:50 AM, IronFilm said: It took thousands of generations to get the final film, but no VFX, no clean up, no color correction has been added: everything is straight out of Veo 2.

And there was no way to have the same clothes on every take. Impressive nonetheless. At this rate, another year or two and... we will become extinct 😄
IronFilm Posted 5 hours ago

15 hours ago, Davide DB said: And there was no way to have the same clothes on every take.

I feel that's a fairly fixable issue within the next few short years. You'll use a reference image that the generated video needs to adhere to.