Full Transcript
Jonny Burger
Episode 37 · 138 segments
Today I'm speaking with Jonny Burger, the creator and founder of Remotion, which is a framework for making videos programmatically with React. In other words, making videos with code. I actually reached out to Jonny a few weeks ago to book this interview, and then I think literally two days later, Remotion went totally viral in the Claude ecosystem, when they had just released agent skills for Claude. And so almost overnight, Remotion suddenly had thousands of non-coders and developers prompting promo videos into existence, all powered by Remotion under the hood. So they kind of went from this fairly unknown, quite niche developer framework for people building video tools to suddenly having, I think, three quarters of Remotion users now being non-technical people trying to one-shot videos with AI. And for me, someone who's deeply interested in this frontier that we're in right now, of agentic video coming onto the horizon, it's one of the clearest real-world examples of early agentic video becoming really, really practical for people. So it's very timely, and I'm very excited to be sharing this interview with you guys. One caveat I would say on watching the interview back is that I am still learning so much about this space, and coming from the video and motion background, I am still learning how to ask good questions about this topic. So honestly, this interview kind of warms up after about 25 minutes, as I get my head around things. So if you're as interested in agentic video as I am, and you're up for sticking with me as I learn, then this will be a very interesting interview for you. So I hope you enjoy this interview with Jonny Burger.
So Jonny, it's a really interesting time for you and Remotion right now. You started in 2021, which is a long time ago. And I remember seeing that and thinking, wow, that's really interesting. That feels like the future of video. I'd love to play around with that at Glide, but we just haven't got the resources to do it. But the idea of a framework for making video programmatic was really interesting to me back then. And then fast forward to now, with everything that's happening with agentic stuff and Claude Code and the way that's affecting Remotion, it sort of feels like maybe you had it planned, but I know that's not true. So maybe you could, for history's sake, take us back to the beginning. What was the original intention with Remotion? What were you seeing in the first year or two of its usage? What were the use cases?
Yeah, indeed. It's a super exciting time, and I really enjoy how fast we can build stuff we could not have imagined we could build in 2026. So I'm having a lot of fun and getting a lot of adrenaline with coding nowadays. And yeah, indeed, it was not always like this. I guess what I was always interested in was, I would see other videos, motion graphics, that professionals did, and I would just wonder, I'd try to decompose those videos and figure out how those things are done, in terms of timings, shapes, effects. Even the very complex videos you can break down into many small effects. And I was also an After Effects user and would try my luck at animating small product videos. But I was a much better developer and being...
Mm-hmm. Because you studied computer science, right? Was that right or no? Or dropped out? Yeah.
Yeah, indeed, and I would say since I was 16 or so I was always building my own apps, and that was my main strength, even though I was very interested in motion and animation. So I would try to do a bit of both. And you know, if you use an animation program like After Effects, let's say you want to duplicate something, you would just copy-paste the layers. I mean, I think there are better, more sophisticated ways, but most people would just copy-paste the layers. And then, you know, if you need to go back and edit one layer, only one of them would update. And you did not have sophisticated version control. So this got me interested in trying to make motion graphics with code, because I felt like I was so much stronger as a developer than as an animator that I might actually be able to create something better with code.
I share that, while not being a traditional developer by background, though it's veering more and more towards that nowadays. I've come from video and music and things like that. But yeah, it's always felt to me, especially working with developers so much, like video needs to be really reinvented in lots of different ways. The UX of video production and motion design has always felt really mouse-driven, clicky. So that's why it appealed to me. When you were first starting with it, what was your first take on it? What was the first iteration or version of Remotion?
Very basic. I think I tried to rotate a React logo, with these three ellipses that would form like an atom, and then I would try to animate small points in the atom, like the electrons. And back then this was pretty amazing. And also I was trying...
Back then I was a mobile app developer, a React Native developer. And I would try to just put some screen recordings that I made into an iPhone frame and maybe animate that a bit. So, very basic animations.
And maybe at this point, because there will be a mixture of people listening, people who know what Remotion is and maybe how to use it, but also traditional video producers, maybe for the second group you can give a 101 of what Remotion is, how it works, and this whole paradigm of rendering videos in code or on the web. Yeah, the stack basically, for people who don't know about Chromium or FFmpeg and things like that.
Yeah, definitely. So, Remotion is an engine that allows you to write code that expresses how a video should look, in React, and turns that into an actual encoded MP4 video. So we have many primitives, like a video tag, an audio tag, and you can use the rest of web technology: SVG, CSS. You can use Flexbox to render multiple items in a row and align them perfectly without having to use your mouse. Things that we web developers can take for granted, but not the traditional users. And yeah, you write this in React. You essentially write a website. But Remotion also gives you a hook called useCurrentFrame that is essentially just the current time, and then it becomes your responsibility, for that time, to return a component. And that is the whole idea. Based on the current time, you create an image, and then to render that video, to make it an actual MP4, we spawn up a headless browser and put the whole thing together using FFmpeg. In simple terms. There's a whole audio mixing pipeline that's also involved, and we try to make this run on various different...
Mm-hmm.
environments: Node.js, AWS Lambda. Today we just launched a rendering solution for Vercel. We are trying to make client-side rendering happen. So I guess this is also a big part of my job, to make a whole ecosystem of components for anybody who wants to render videos.
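For readers who want to see the shape of this in code, here is a minimal sketch of the core idea described above, that a video is just a pure function of the current frame number. The names here (fadeInTitle, the returned shape) are illustrative, not Remotion's actual API:

```typescript
// Sketch of the core idea (not Remotion's real API): a video is a
// pure function of the current frame number.
type Frame = number;

// A hypothetical "component": given a frame, describe what to draw.
const fadeInTitle = (frame: Frame): {text: string; opacity: number} => {
  const fadeDurationInFrames = 30; // 1 second at 30 fps (assumed)
  const opacity = Math.min(1, frame / fadeDurationInFrames);
  return {text: 'Hello Remotion', opacity};
};

// Rendering then means evaluating the function at every frame and
// encoding the resulting images (Remotion does this with a headless
// browser plus FFmpeg).
const durationInFrames = 60;
const frames = Array.from({length: durationInFrames}, (_, f) => fadeInTitle(f));

console.log(frames[0].opacity);  // 0 at the first frame
console.log(frames[30].opacity); // 1: fully faded in after 30 frames
```

Because every frame is computed independently from the time value, the same function works for a live preview in the browser and for server-side rendering.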
Nice. I really want to get into that later, but just going back, because I don't want to leave the non-technical people behind. Right now it kind of makes sense, because agents are able to use code really, really well, so a framework or a paradigm like Remotion to make videos makes total sense. But back then, who were the kind of people in 2021, 2022 doing this? Was it just developers? And what were the kind of reasons people would use Remotion back then?
Our initial users were really only React experts, people who loved React so much that they wanted to do everything with it. I think back then, the concept of React was so exciting, like maybe AI is nowadays, that people would think about the implications and what kind of different things they could do with React. And the concept of making videos with React, if that sounded exciting to you back then, which for 99% of people it did not, it sounded very complicated, those were the people, those were our early adopters. And I think it was not just that it's built on React, but that we are pretty low-level, that we have this concept of only giving you the current time and you have to come up with the animations yourself. Having no presets, no transitions, no effects, nothing. That was also a big barrier in the beginning. Nonetheless, I think by just putting out some example videos, some demos, showing people what they can do, focusing on parametrizability, data-driven videos, we got some initial users, because for any use case where you maybe needed to turn a dataset of 1,000 rows into 1,000 videos, this was already from the beginning a pretty decent solution.
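The data-driven use case mentioned here, turning 1,000 rows into 1,000 videos, boils down to mapping each row to a parametrized render job. A hedged sketch, where the RenderJob shape and all names are made up for illustration and are not Remotion's render API:

```typescript
// Hedged sketch: turning a dataset into one render job per row.
// RenderJob and its field names are illustrative, not Remotion's API.
type Row = {name: string; score: number};

type RenderJob = {compositionId: string; inputProps: Row; outputFile: string};

const toRenderJobs = (rows: Row[]): RenderJob[] =>
  rows.map((row, i) => ({
    compositionId: 'Scorecard',  // which template/composition to render
    inputProps: row,             // each video is parametrized by its row
    outputFile: `out/scorecard-${i}.mp4`,
  }));

const dataset: Row[] = [
  {name: 'Ada', score: 98},
  {name: 'Grace', score: 95},
];

console.log(toRenderJobs(dataset).length); // 2: one job per row
```

The same composition code renders every video; only the input props change per row, which is why a 1,000-row dataset scales to 1,000 videos with no extra editing work.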
Mm-hmm. Again, just a tiny bit more context for the non-developers. Like what was so exciting about React back then and how does it relate to rendering video compared to other technologies in the web world?
So with React you had composability. You could take more difficult problems and encapsulate them into components and reuse them. I think this concept was very exciting for web applications in general. You had fast refresh, so that if you would make a change in your code and save it, it would immediately update. That was also pretty new back then, and that was exciting. And in general, having a declarative way of writing code. I think now this would go deep into programming patterns, but the way you write React code was also kind of a discovery by itself that was appealing to a lot of people.
And have people approached this kind of video rendering with the web in other ways, with other frameworks, with other techniques? Maybe you could give an overview of that, if you know about those. Beyond the traditional ways, specifically in the web, is it approached in a different way?
⁓ You mean like you're asking about other tools that were available?
Yeah. Yeah. Competitors or other frameworks or ways of doing this. Has this been like tackled before? I think it'd be interesting to hear your perspective on that because I don't have context there.
So I guess back then, if you wanted to create videos programmatically (I would say Remotion is a tool to create videos programmatically), you had several options, like for example MoviePy, where you would write your video in Python. I guess one of the big drawbacks of that was that you had no live preview: when you change the Python code, you would have to rerun a script and rerender the video to see the output. There were other engines, for example Manim, the engine that 3Blue1Brown uses for his math explanation videos. That is still popular today, but was maybe going much into a specific niche of creating math videos. So I think Remotion got popular because I was not really aware of the previous solutions, but came up with this React way to do it, and it was also very generic. You can create motion graphics with it, but you can also edit real-life footage with it. You can create a TikTok with it, you can add subtitles to footage. So we were not going into a specific type of video.
Mm-hmm. Yeah. Maybe you can just frame that up as well. You mentioned you guys being low-level, and that allowing people to define a lot of things themselves. What can you do with Remotion and what can't you do with Remotion? I think it'd be nice to look at the way that motion design is done traditionally today, in tools like Cavalry or After Effects or whatever, and the range of different things you can do there, and then what is possible with Remotion. Maybe you could just give context on that for everyone.
For sure. So, Remotion has strengths and weaknesses, and I'll be honest about the weaknesses as well, but maybe first I start with the strengths. Any type of content that can be represented on the web. Let's maybe think about maps. It's pretty easy to embed a map on a website. You can just use Mapbox, for example.
That's something Remotion is very strong at. Whenever you have complex logic, say you want to fetch data, for example from a REST API: you get back JSON and you need to make the video content based on that JSON. Remotion is really strong at that, because you can use the fetch API, and in general most of the APIs that are available on the web. So that is one big strength of Remotion: you can use actual programming, a real programming language, to define the content of your video. Whereas with other video editing programs, you in the best case have a scripting language, but it's not a type-checked, fast-refreshing, fully featured programming language or framework. Now, about the weaknesses: Remotion can only render graphics that you can create with web technologies. So we are talking about HTML, CSS, SVG. Whatever is easy to create with these is also easy to create with Remotion. But I would say many video editing programs have effects which are based on shaders, or that allow you to place shaders on top of text and image elements, and that part is a bit harder in Remotion. Like, first composing something with images and text and then applying a shader to it: not so easy.
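One concrete flavor of that "real programming language" strength: once data arrives as JSON, ordinary code can compute the timeline from it. This sketch uses assumed shapes, not a Remotion API, and lays out scenes sequentially by accumulating frame durations:

```typescript
// Sketch (assumed shapes, not a real API): computing a sequential
// timeline from data that could have been fetched as JSON.
type SceneData = {title: string; durationInFrames: number};
type TimedScene = SceneData & {from: number};

// Each scene starts where the previous one ended.
const layoutScenes = (sceneList: SceneData[]): TimedScene[] => {
  let from = 0;
  return sceneList.map((scene) => {
    const timed = {...scene, from};
    from += scene.durationInFrames;
    return timed;
  });
};

const scenes = layoutScenes([
  {title: 'Intro', durationInFrames: 30},
  {title: 'Feature', durationInFrames: 60},
  {title: 'Outro', durationInFrames: 30},
]);

console.log(scenes[1].from); // 30: second scene starts after the intro
console.log(scenes[2].from); // 90
```

If the API returns a different number of scenes tomorrow, the timeline recomputes itself; there is no manual shuffling of clips on a timeline.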
So maybe it'd be nice to hear about how it's grown over the last five years, both in terms of your contributions to it and also the community's. And maybe you can explain what aspect of it is open source, or how contributions come in from other people. How does it grow as a product, basically?
Yeah, so from the beginning, we made Remotion source available. Maybe I quickly explain the difference between source available and open source. Our idea was that Remotion lives as a GitHub repo and that it runs like any open source project. But I kind of had a sense that this project could become a bit bigger. I was working at the same time and had apps that I was working on in my free time, and I was kind of fearing that if it would blow up, it would only lead to more stress, and my reward would be GitHub issues that I would have to answer. So I put a clause in the license (still today, it has not changed) that requires companies to obtain a commercial license. And because of this clause, I cannot call it open source, because open source does not just mean that the source is openly available; it also has commercial implications. And I'm very glad that I did this, because this set us up for sustainability. Today we are sustainable. My co-founder Igor and I can work on this and sustain ourselves from Remotion. And that would not have been possible without it. But we try to still run it as similarly as possible to an open source project, where we work in the open, we create PRs, we accept contributions, and there's no paywalled code. It's mostly just that we hope that the companies which do need to get a license will actually do it.
And so how has it been extended by you, or by contributions, over the last five years? What was the core of it? I guess I don't know the terms, like the API, the core functionality, and what has it grown to now? I've seen videos of you posting about an editor or a timeline editor, which, as video producers and motion designers, is much more understandable to us. Yeah, how has it grown and who's contributed? What are some of the coolest things that you've seen over the last five years?
Yeah, so in the beginning it was very bare bones, where it was left to the user to create all of the effects. I think the first expansion that we did was that we added a player that you could embed in your website. That looked like a video player, but was actually just animating React components. And we built out the server-side rendering infrastructure. Now that we had an embeddable frameless player and a way to render videos on the server, this was the first step towards allowing people to build their own web-based video editors. At first, simple ones. And we had a few notable contributors. I remember Shankar Deep is one of our earliest contributors, who I also met last year for the first time. He contributed this player component. And we later built out a whole Remotion ecosystem with more effects, a transitions package. We now have 15 free templates and also two paid templates. One of them is like a video editor in a box, where you clone it and start off with a video editor, and you either engineer or vibe-code yourself towards your own product, your own SaaS, whatever vision you have that is based on a video editor. And we also built out a couple of AI integrations. So now we have a template for people who want to build their own prompt-to-video application. We have an MCP, we have skills. I think this was the most recent area that we built out, also with the help of a lot of volunteer contributors. And if we see that something is lacking, we often also hire freelancers to just work on something and send us PRs.
Mm-hmm. Yeah. I've heard you mention in another interview that you have no ambition for Remotion to become a B2C kind of thing, like a product or a Lovable of video. You're way more interested in being the backbone and, as you said, a sustainable business. What are you seeing or expecting with this whole thing that's opening up now, with people starting to build agentic video tools? Maybe you can just give a sense of the kind of businesses that are already building on it and where you see this going. Again, as a video producer, I'm interested in the difference between what Remotion gives you under the hood and then what people are doing on top of that. I'm not seeing that, whereas you are, so I'd be interested to hear your experience.
So I see video creation as a puzzle with 1,000 pieces. Maybe a million pieces. You need to have a user interface to control a lot of things. You have timings, you have animations, you have effects. There are lots of considerations about...
Just 1,000. Yeah, yeah.
performance and rendering, audio. We're never running out of things to build. And there are different types of videos. There are motion graphics, there are TikToks, there are captions. If you are talking about agentic video editing, then it really is about putting all of these pieces together in a useful way. And while the AI is good at putting the pieces together, if it were tasked with taking care of all of the 1 million pieces, it would fail. So I think it's worth just building lots of small pieces that can hopefully be put together in creative or interesting ways by others. So we create lots of small utilities for exactly that: timings, effects, and also primitives for rendering fast in different environments.
So again, sorry, this is probably a weird interview for you, being interviewed by a video producer; maybe you more often go on dev podcasts. I lack a lot of context to ask the right questions here. So maybe another vague question would be: if someone was to try and rebuild After Effects, in its full capabilities, today, in the cloud, on top of Remotion, what stuff would they need to build that doesn't come from Remotion? Does that make sense? Like, what are you focused on? What would you not do, what would you not try and bake into the Remotion offering and instead leave to other people? That might be a helpful way to draw the boundary here.
I see now, okay. Yeah, good question. I would say, let's assume we want to build an After Effects. I think we would need an engine that stores and displays items. The engine would store which items are currently visible at which position, which effects are layered on top of them. And then we need a UI that allows the users to manipulate this state. This could be a timeline or an inspector, or just controls like the blue outlines that are put on top of the canvas. And then you also need a way to turn this into a video. That is the rendering part. This is where FFmpeg and codecs and the question of how we capture this content come into play. And it's also not like we have an extremely streamlined answer to this, but where Remotion is today is that we have a bit of the engine, none of the UI (you have to build the UI yourself), but we do have a rendering engine. This is what we provide. And so if you had to build an After Effects, you would mostly have to build out the UI and the animation logic. I guess a good way to tweak the effects, or a timing curve control panel, stuff like this is what you would have to
Mm-hmm.
build on top of it.
Did you ever check out the tool Fable that was around for a while?
Fable? No, I don't know about it.
No? I can't remember who was behind it, but it was a startup for two to three years. I was crazy excited about it. It was a fully browser-based After Effects sort of replacement that was doing amazing things for being in the browser. And yeah, it had to close, unfortunately.
no, I was just wondering why you're talking about it in the past tense.
Exactly. Yeah. They were on my list to interview at some point, and then it ended. Maybe I still will interview someone who was involved there. But again, I come from the non-development side while being super passionate about this stuff, which is why I'm so excited about Remotion and other tools like it. Because yeah, it does feel so important to take it there. Like, I work in a team of designers who are always in Figma. And I always feel this immense slowdown where they're working in the web, in Figma, in modern kinds of ways of working, and I sort of pull everything down into After Effects and move things around. So yeah, it would be interesting if you ever find anything out about what happened there.
Yeah. You know, I don't know about Fable or what they were doing exactly, but what I feel is: taking on the entire scope of a video editor, bringing the entire feature set of Adobe to the web, then you really have to do everything, and you have to do so much just to catch up. I feel like it's going to be very hard to achieve. And then there's even the whole ecosystem that you have to build up, of people who build presets and make tutorials. I feel like it's almost not possible that one entity can achieve this. And I don't know if this is what happened to Fable, but maybe they also collapsed under the scope, which
Do you think Adobe are doing this?
really cannot be underestimated.
Or just getting users as well, because Adobe has so much traction. I would really love to know how much Adobe are trying to push into this space and how they're approaching it. They haven't approached you or anything, have they?
Adobe, so they have a product, Adobe Podcast, for which they are now also using Remotion. So yeah, we were working together. And, you know, here's how I also think about it: how do we tackle this problem of too much scope? It's impossible to build up a new ecosystem, and
Mm-hmm. Mm-hmm. Okay, cool. wow, I didn't know that, that's cool.
How I see this is that we try to make Remotion compose well with other tools. So if we want to show a map in a Remotion video, it's not our responsibility. There are other people who make great React map components. So we can profit from the React ecosystem. And then also maybe we can combine the tools. You can export.
Mm-hmm.
a transparent overlay from Remotion and put it into After Effects. I do this. I still use After Effects. I love it. It's great software, and I use them together. And I think that's a good way to exist: by being composable with other things out there. Components, AI, traditional programs, without having to build out massive scope.
Yeah. So I have loads of questions about this. I keep on thinking of this spectrum, right? You've got traditional development here, and Remotion or programmatic video here. And putting it really simply, the question is: how do we teach language models or machines to do better motion? Because the motion is pretty simple, as far as I've seen, in the examples from Remotion and the people who build on top of Remotion. And if you look at the world of traditional motion, world-class motion is, from my perspective, just light years beyond what we can do there. And I don't think we're going to be teaching machines or training machines on our mouse clicks inside of NLEs, After Effects and stuff like that. How do we start bridging that gap, where machines can really start training on good motion, if motion is just baked into pixels right now? Does that make any sense?
Yeah, I don't know, to be honest, if we will ever get there. Like, you know, to see if something looks good, you really have to look at it and then say, this is good. And maybe you also have to be creative. I think the biggest problem with these Remotion videos right now is that
Sure, if you knew, you'd be doing something else, but...
They are very generic because AI cannot create a unique video for everyone. And it's not like a problem that I necessarily ⁓ want or need to solve by myself. When it comes to the question, ⁓ how do we create better animations?
Mm-hmm. Of course.
My thought about this is that it breaks down to math, physics, timing. I think we have pretty low-level methods: we have interpolation methods, we have spring-based primitives, and these can be added together, just like math. So I guess there is a bit of hope there that the AI is just able to figure out at some point, like, okay, you know what, it would look great if this used this exact cubic Bézier curve, and figure out that for this use case we need a different type of timing. And so we can kind of teach the AI, with a skill, that for this type, this looks good. But I think this is more the intelligence layer; hopefully a future iteration of AIs is more aware of how to set nice keyframes.
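The low-level timing primitives mentioned here, interpolation plus easing curves, can be sketched as follows. This is a simplified stand-in, not Remotion's actual interpolate() signature, and the easing used is a smoothstep polynomial rather than a true cubic Bézier:

```typescript
// Simplified timing primitives (a stand-in, not Remotion's actual
// interpolate() signature).
const interpolate = (
  value: number,
  inputRange: [number, number],
  outputRange: [number, number],
  easing: (t: number) => number = (t) => t, // linear by default
): number => {
  const [inStart, inEnd] = inputRange;
  const [outStart, outEnd] = outputRange;
  const raw = (value - inStart) / (inEnd - inStart);
  const progress = Math.min(1, Math.max(0, raw)); // clamp to [0, 1]
  return outStart + (outEnd - outStart) * easing(progress);
};

// A smoothstep ease-in-out (used here in place of a cubic Bézier).
const easeInOut = (t: number): number => t * t * (3 - 2 * t);

console.log(interpolate(0, [0, 30], [0, 100]));             // 0
console.log(interpolate(30, [0, 30], [0, 100]));            // 100
console.log(interpolate(15, [0, 30], [0, 100], easeInOut)); // 50
```

The point of keeping the primitives this small is exactly what the interview describes: an AI (or a human) composes them, swapping in a different easing function when a different feel is needed, rather than learning a monolithic animation system.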
Mm-hmm. Yeah. Another way of maybe bridging the gap is how the chat interface, if you're not a developer, interfaces with some kind of control UI. Like you can say, can I have this? No, not quite, let me just... and then do other things. That'll be a really interesting time when that happens. Obviously we have that in software development right now with every vibe coding tool. But that would be really interesting. I don't suppose you've seen any tools that are doing that right now, either with Remotion or anything else?
So, how we try to enable this: we actually already had a bit of this before AI. Let's say you had a value that would determine the X position of an item, and it was in the code. To then align it pixel-perfectly, you would have to go into the code, change the value, save, go back to the preview, and see if it's now aligned. So we already built something called visual controls. You would wrap this value in the code into a function, then we would make a slider appear in our interface where you can just slide and find the perfect value, and then click save and it would actually write it back to the code. We have not pushed this so hard yet, but I think we should invest more in making this better, and also integrate it with skills, so that people have ways to mix AI with their own visual observation.
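A rough sketch of the visual-controls pattern as described: wrap a literal value in a function so that a UI can discover it, render a slider, and write the adjusted value back. All names here are hypothetical, not Remotion's actual API, and a real implementation would persist the new literal into the source file rather than an in-memory registry:

```typescript
// Hypothetical sketch of the "visual controls" idea (all names are
// made up, not Remotion's actual API): wrap a literal value so a UI
// can discover it, show a slider, and write a new value back.
type Control = {id: string; value: number; min: number; max: number};

const registry = new Map<string, Control>();

// In component code you would write:
//   const x = visualControl('title-x', 120, 0, 1920);
const visualControl = (id: string, value: number, min: number, max: number): number => {
  if (!registry.has(id)) {
    registry.set(id, {id, value, min, max}); // register on first use
  }
  return registry.get(id)!.value;
};

// The editor UI calls this when the slider moves; a real implementation
// would also write the new literal back into the source file.
const updateControl = (id: string, value: number): void => {
  const control = registry.get(id);
  if (control) control.value = value;
};

const before = visualControl('title-x', 120, 0, 1920);
updateControl('title-x', 300);
const after = visualControl('title-x', 120, 0, 1920);
console.log(before, after); // 120 300
```

The design point is that the code stays the source of truth: the slider is just a friendlier way to edit a number that already lives in the code, which is also what makes it a natural meeting point between AI edits and human visual judgment.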
You mentioned skills there, and I forgot to ask you about that in the beginning. On that interview I listened to the other day, you said that when you went viral recently, you found it funny that the video that went viral didn't even use the things that you put in there, like the skills. Maybe you could explain what was left out, why that was slightly annoying to you, and what skills you're working on. How are you
Yeah.
capitalizing on this Claude Code wave and making it better and easier for people to make video? What's needed for Claude to get more context on all of this?
Yeah, totally. So maybe I tell the story of how this viral tweet came to be, which I guess is part of the reason why I'm talking to you now...
Please, yeah, that would be great. I have to say just a quick aside note, like I think I emailed you to book this podcast like two days before that happened. And then I saw that that happened and I was like, wow, that was good timing. Cause I'd had you in mind for ages and I was like, yeah, this is going to be a really interesting conversation. And then that happened. I was like, wow, cool. Great. It's really happening now.
Yeah, that's right, that's right. You're totally right. You messaged me two days before, so you're essentially an OG. And yeah, just at that time, I was essentially just seeing that other companies were, like, pushing skills. And I was kind of trying to figure out the merit of this. What are skills? Okay,
I got in there.
to bring domain-specific knowledge into LLMs. Surely people are gonna ask about this next, like they did for MCP and all of the other stuff. But also I thought it was sensible to do this. So I wrote down some skills and then tried to navigate myself through Remotion with Claude Code. But I was doing the same sequence of steps as I would have done if I had manually coded a video. It was just a bit faster. So that's what I thought was cool: now it's another step towards being a bit faster in developing Remotion videos. But I did not expect what most of our users are doing nowadays, which is to give one very sparse prompt, like, okay, make me a promo video for my website, and then delegate it to Claude Code to figure out all of the rest. I never tried this, and it took me a few days to first try it out myself. And then I realized, wow, okay, with AI and Remotion, this is not good, but way better than I would have expected. And we can work on this and make it better. So it was kind of an accidental discovery, not even by me.
So, will you just?
And you asked about how we plan to capitalize on it. ⁓ the answer is, yes, we're trying to embrace it and make it better, ⁓ make this a smooth workflow.
Yeah. And I suppose I was also asking: what does Claude not know about Remotion right now? Or does it know everything, because it can index and understand the docs and stuff like that? What are those skills that weren't in that first video? I was just trying to understand that.
Yeah, so I think in the video that I posted, none of the skills were picked up, and Claude just had, because we've existed for five years, a bit of knowledge about how Remotion works. It got this idea that it has access to the current time and that it has to render something, and then with its own intelligence it just worked with that. But I can tell you specifically what the skills would contribute to Claude. They would tell it how to get the duration of a video, how to do a transition, which timing options are available, how to layer things on top of one another, how to calculate the duration of the timeline. Now we have also added things like specific effects, like light leaks and maps. Today I've added a new skill for how it should approach it if the user asks for a voiceover.
I mean, with those skills, are you basically saying, here's how to do it at a very low level, or are you saying it in a simple prompt-engineering kind of way? I guess what I'm trying to get at is: I'm interested in how people like myself, or any motion designer, can encode motion knowledge and then give that to either their own personal Claude-plus-Remotion setup or to the wider community. It feels to me there's obviously a huge amount of latent motion knowledge and skill in people's ability to use After Effects and tools like that. Is there a way we can start getting that into these systems?
So I believe there should be as few templates or examples as possible, and that we should only teach it the fundamentals, like how to animate something from A to B, and these are the possible options for easings, or this is how you can generate a voiceover with ElevenLabs. And then we try to make the intelligence that is in these models work, so that they connect some basic rules with the user's request, and let their creativity or intelligence do the rest. We've had a bunch of people try to contribute new skills now. We also had a template before that was like a precursor of the skills, and it worked a lot with examples. Those previous skills would tell the AI, here's an example of an animated bar chart in Remotion. And then you could type in, "Make me an animated bar chart in Remotion," and wow, it would work. But the big problem with AI models is the context size. You cannot feed in an unlimited amount of knowledge and not expect the results to degrade. And video is such a vast, broad area that if we added all of our knowledge, Claude Code would collapse and fail.
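The "animate from A to B with an easing" fundamental he mentions can be sketched in a few lines. To be clear, this is not Remotion's actual API (Remotion exposes this idea as its `interpolate()` helper); it is a standalone illustration of the underlying math that such a skill file teaches.

```typescript
// A frame-based animation is just a mapping from the current frame
// to a value. An easing reshapes the 0..1 progress before mapping.
type Easing = (t: number) => number;

const linear: Easing = (t) => t;
const easeInOut: Easing = (t) => t * t * (3 - 2 * t); // smoothstep

// Map `frame` within [startFrame, endFrame] to a value in [from, to].
function animate(
  frame: number,
  [startFrame, endFrame]: [number, number],
  [from, to]: [number, number],
  easing: Easing = linear
): number {
  const raw = (frame - startFrame) / (endFrame - startFrame);
  const t = Math.min(1, Math.max(0, raw)); // clamp outside the range
  return from + (to - from) * easing(t);
}

// Fade in over the first 30 frames:
// const opacity = animate(frame, [0, 30], [0, 1], easeInOut);
```

Because the clamp holds the value at `to` after the range ends, the same call works for every frame of the composition, which is exactly what makes the rule easy to state in a skill file.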
But, I'm really just dreaming like crazy here. Is there a future where you could have a huge open-source, I don't know, this might seem really naive, but I'm just trying to ask a broad question here, a huge defined library of different ways of doing things in motion, from bar charts to carousels to anything? And in a way the agent could be like, okay, I need to go into this category, I need to look in that subcategory, that one, that one, there's the example, there are three, let me draw on those. Do you see what I mean?
Absolutely, yeah. That's what we're trying to do, and skills are a good way to do this. So I think now we have like 30 skill files, and we have one table of contents listing the different skill files. Initially we only load the table of contents, and then we let Claude figure out which items in the table of contents it wants to load into the context. We can barely control it, to be honest. Sometimes it does not load any skills, sometimes it loads many skills. But I also think it's not really our problem; the AI companies have to figure out how to make that more intelligent.
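The table-of-contents pattern he describes, loading only an index up front and pulling individual skill files into context on demand, can be modeled in a few lines. This is a toy illustration, not Remotion's actual skill format, and the keyword match below is a crude stand-in for the model's own judgement about which entries are relevant.

```typescript
// Each index entry names a skill file and the topics it covers.
interface SkillEntry {
  file: string;
  topics: string[];
}

// Given a user request and the table of contents, decide which skill
// files are worth loading into context (and which to leave out).
function skillsToLoad(request: string, toc: SkillEntry[]): string[] {
  const words = new Set(request.toLowerCase().split(/\W+/));
  return toc
    .filter((entry) => entry.topics.some((topic) => words.has(topic)))
    .map((entry) => entry.file);
}

// Example index (hypothetical file names):
const toc: SkillEntry[] = [
  { file: "voiceover.md", topics: ["voiceover", "audio", "narration"] },
  { file: "transitions.md", topics: ["transition", "crossfade"] },
  { file: "charts.md", topics: ["chart", "bar", "graph"] },
];

// skillsToLoad("add a voiceover with a crossfade", toc)
//   → ["voiceover.md", "transitions.md"]
```

The point of the pattern is the filtering step: only the matching files' contents ever enter the context window, which is what keeps 30-plus skill files from overwhelming the model.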
Sure. No, I know it's not your problem. I'm just interested in how this works and how we push it forward. Because I would love to contribute to this space, not necessarily to what you guys are doing, but in future. And I'm just trying to get a picture, as a non-developer, of how this is currently working, because this is very frontier; we're all just working out how this works. But yeah, if, for example, you guys opened up that skill contribution thing, what would that look like for a motion designer? What would they have to write in that? Or is this very low-level, and you're essentially teaching Claude about the low-level parts of Remotion?
So the skills are very technical. They're small code snippets. And then, yeah, this enables non-technical people to express what they want in natural language, and those skills are the translation layer to the technical part. But I would say even non-developers figure out interesting workflows. So for example, we had one contribution today. It is open for contribution, absolutely. So today I merged the voiceover skill, and this is really just a markdown file which says: okay, if you want to add a voiceover to your video, then it should go through ElevenLabs, because that's the easiest and industry-standard way of doing it at the moment. So it would tell the user to go to ElevenLabs, make an account, and paste their key in there. And then it would describe how to actually turn a text into a voiceover, how to load it into Remotion, and how to calculate the duration of the voiceover so that you can match the video length to the voiceover. So it's a rather short file with just some code snippets and what they achieve. This is how the skills work. And ideally, this voiceover file...
Mm-hmm. Mm-hmm.
This voiceover skill does not get loaded if you do not want a voiceover. So what we're trying to do now is create as many useful skills as possible and hope that they compose together, and that models get more intelligent over time.
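The duration-matching step of that voiceover skill boils down to one calculation: convert the generated audio's length in seconds into the composition's length in frames. A minimal sketch; in a real Remotion project the seconds would come from something like `getAudioDurationInSeconds()` in `@remotion/media-utils`, and the tail-padding value here is an arbitrary choice, not part of the actual skill.

```typescript
// Compute how many frames the composition needs so that the video
// is at least as long as the voiceover, with a short tail of silence.
function framesForVoiceover(
  audioSeconds: number,
  fps: number,
  tailSeconds = 0.5 // arbitrary breathing room after the audio ends
): number {
  // Round up so the audio is never cut off mid-word.
  return Math.ceil((audioSeconds + tailSeconds) * fps);
}

// A 12.5 s voiceover at 30 fps → Math.ceil(13 * 30) = 390 frames.
```

That single number is what the skill feeds into the composition's `durationInFrames`, which is how a prompted video ends exactly when the narration does.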
Yeah, interesting. Okay. One last question, I think, on this sort of motion-and-control theme, and the difference between the two different paradigms. You have a Lottie integration, or skill. Maybe you can talk a little bit about that, like what the pipeline is there. Am I right that that's a way for someone to do something in a traditional way, turn it into something that's understandable by code, and then put that in a Remotion project? Maybe you can give us an overview of how people use that and what's possible there.
Absolutely. So for those who don't know what Lottie is: it's a translation layer between After Effects and the web. The workflow would look like this: somebody makes an animation in After Effects, and they can export it as a file and import it in a website or a specific app. For example, the Airbnb website and app use a lot of Lottie animations. And we also have a very simple integration to play these Lottie files in Remotion, and that works pretty well. You can export an After Effects animation and import it into Remotion. One thing to be aware of is that during the translation, the exporting and then importing again into another program, you lose a lot of information. This Lottie file is just a plain text file containing some data about how the animation should look. It's not actual programming logic. So you can import this into Remotion, but it's not like you have fully unlocked the power of programming: conditionals, if-else statements.
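His point that a Lottie file is data rather than logic is visible in its format: the export is plain JSON whose top-level fields (per the Lottie schema, `fr` is frame rate and `ip`/`op` are the in and out points in frames) describe the animation but contain no conditionals. A sketch of reading just enough of it to size a composition:

```typescript
// The header fields of a Lottie JSON export (a subset of the schema).
interface LottieHeader {
  fr: number; // frame rate of the animation
  ip: number; // in point, in frames
  op: number; // out point, in frames
}

// Everything about the animation is declarative data like this, which
// is why importing it cannot carry over programming logic; any if/else
// behavior has to live in your own code around the player.
function lottieDurationInSeconds(anim: LottieHeader): number {
  return (anim.op - anim.ip) / anim.fr;
}

// { fr: 60, ip: 0, op: 180 } → 3 seconds of animation.
```

This duration calculation is roughly what an integration needs in order to give the imported animation the right number of frames inside a composition.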
Mmm. This was going to be my next question, yeah, because that was my assumption about it. But my hope was that there's something like Lottie on the horizon, or that Lottie gets better, that makes that translation layer better, so that essentially you could eventually train on the output of motion. I don't know. Do you know of anything that's an alternative to Lottie in that respect?
Yeah, indeed. There's something called Rive and...
yeah, of course, yeah.
We also have an integration for that, and I think they have much better performance and are more considerate about this workflow. So everybody who has used the Duolingo app or Spotify, this was made with Rive. And I would say they've pushed it to another level. And what you can do in Remotion is design it with Rive, and then do some basic parameterization in Remotion, in case you need to just tweak one variable and you want to export 1,000 unique videos with it.
Yeah. Where does Rive sit for you guys relative to Remotion? Are they a competitor in any way? How do you sit in the industry next to each other?
No, I don't see them as competitors. We recently talked via email and we wanted to do a collaborative blog post. We have not done it, I guess because we're all super busy, but we would love to do a tutorial together, where you design something in Rive, you have the...
Not in a bad way, I was just trying to understand like how they relate, yeah.
graphical user interface editor and effects and all of the nice things that they have, and with Remotion you can do some programmatic parameterization and turn it into an actual video, and also let it run on a server so that the rendering happens autonomously, so you can create an automation with that. I think that's how you would combine the two.
So we've talked about a bunch of different topics. I guess a really interesting way to close out talking about Remotion, where it sits in the industry and where it's going, would be to force you to answer these really awkward questions. Obviously a founder is nearly always trying to look a few steps ahead and not think too far, because you don't want to make promises or imagine things that are just never going to happen. But if I was to ask you: what does Remotion look like in five years? Does anything come to mind? Again, obviously you don't know, but I'm just interested in whether there's anything behind that that could be interesting to hear about, like what you're seeing, what you're thinking about, what you're dreaming up.
Yeah, totally. So I guess we've shocked ourselves with the fact that people now use Remotion with AI, and seemingly three quarters of our users are now non-technical, just vibe coders. So right now we are focused on keeping a calm head and not getting too far ahead of ourselves, because the whole thing can still be much, much optimized. But I think we want to continue doing what we have done so far, which is creating many small building blocks which can hopefully be combined in a useful way. Right now you can build a basic, decent video editor on the web. But I think we are still not tapping into the full potential, and we are not a serious competitor to DaVinci Resolve or Adobe. So I think we want to double down on that and further decompose how the existing solutions work, and solve one small problem at a time. So there's not one big North Star that I could mention, because I think the reason we are not there yet is death by a thousand cuts. But we want to look at the potential that the web has. There are many interesting technologies that we do not yet fully leverage. Think about WebCodecs, WebGPU, the web file system, which are important for bringing the performance that you have in desktop apps to the web. And we want to figure out how we can tap into that. And also I think we just have to pave the way for the whole video space in general, and also expect the other tools that Remotion is used in conjunction with to improve. We hope that the AI models improve, that they understand better what makes a great video.
Mm-hmm.
Because honestly, we could write many rules, and if the models are just too dumb to get it, then it won't really help. And I think that's gonna eventually be the solution. And also we want to empower our community to compose these little blocks in useful ways and make discoveries, find out what works, maybe for their own profit, or maybe they contribute it back to Remotion. We really see ourselves as the people working in the background and doing whatever people need, or solving their problems.
I've heard people say that before, people who are building tools: there's an aspect of what you're doing where you're kind of betting on and waiting for the models to improve enough to then take you up. It's like a rising tide, almost. I remember speaking with the guy from tldraw, Steve Ruiz. And he was saying that's how they think too, like:
Steve Ruiz, yes.
Build things now that are really interesting, betting on the idea that the models will be able to do it well later, even if they can't use them now. And just for my own interest, I've been doing my own local workflow with Remotion and other things, and a whole load of different open-source stuff, to try and make my own totally private pipeline that's totally bespoke to me. Some parts have worked really well, other parts have not. But there are some parts that are not working where I'm just like, I'm not going to throw this away, because I think it's just about waiting a little bit. The main one, which everyone talks about now, is the kind of editorial aspect of AI. It's really just not very good yet when you say: here's a whole bunch of context, here's a full video, do a really high-quality edit of that video. It gets a seven out of ten, and then it's hard to push it further. I don't know whether you have any reflections on that, on the way that you're betting on models improving in future.
So I think we can categorize this failure into two reasons. One is mistakes: it uses an improper technique or way of editing the video, and that we can just solve with iteration. We do try it ourselves and we get lots of feedback about things that went wrong, and if you look at our GitHub, every day we make three or four iterations and make a new release, trying to fix every possible mistake that comes up until at some point it's good. So it's not a strategy shift that is needed, I think, just a bunch of iteration over years. The other big problem, I would say, is creativity. A good video edit is something very subjective, and maybe the AI model has an idea of what a good video edit is, but it's all trained on the same corpus, so it's gonna...
Mm-hmm.
come out very generic. And I also don't think it's going to be possible that everybody can prompt with no effort and get a unique video for themselves. That, I think, we're never going to solve. And it's good, because that's the creative part that we cannot cross out of the equation. And the other...
Mmm.
I said two categories, but now that I think about it, it's three categories. The third is just that we are overwhelming the AI with too complex a problem, with too many things to do at once. It's just how they work: we know it's better if we give them a small task that they can execute well, rather than overwhelm...
Yeah, yeah.
them with big tasks where they will fall on their face. And there, maybe the AI models will get better. And if not, then we are stuck with this workflow of having to do it step by step.
Yeah, it's so interesting. This word that you used a couple of times, composability, and breaking this thing down. It's going to take a long time; it might be tractable one day, but it's not going to be there end-to-end anytime soon. Maybe a bit like driverless cars, I guess. I know we haven't got loads of time left, and I don't want to move too far away from your hopes for Remotion, more specifics. You said you had your eyes on different web tech and stuff like that. Again, as a non-engineer: what are some of the things you see coming down the pipeline, frontier web technology that could start coming into video?
So there's one technology which I'm most excited about, and which I'm one of the biggest advocates for, called WebCodecs. WebCodecs is something pretty new: it allows the browser to decode and encode video, and to do that leveraging the GPU. Any desktop video editing software of course uses the GPU to quickly show you a preview of the video, and that capability has now arrived in web browsers. To be specific, only in the fall of 2025 was WebCodecs added to Safari, making it now available in all browsers. So it's relatively recent and it's still buggy; it still has to be improved upon. Me and our friends have filed lots of issues with the browsers. But I think once it is there, only then are we even ready to have real video editing software on the web, where you can drag things into it, you have performant timelines, et cetera. Previously, you could only tap into the CPU and you would have a very sluggish preview, or use the built-in video tag in the browser, but this also had its weaknesses.
Hmm.
I think once we properly figure out how to harness this WebCodecs technology, then we can build better video editing experiences on the web. And even WebCodecs on its own does not solve the equation entirely, because there also needs to be a muxing and demuxing layer, which understands how MP4, WebM, and MOV files work; those need to be read and written. For this, I tried to make my own library for abstracting it, so that the browser can understand and work with video files on a low level.
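The "muxing and demuxing layer" he mentions is the code that understands container structure, separate from the codec work WebCodecs does. For MP4/MOV (the ISO BMFF format) that structure is a tree of boxes, each prefixed by a 4-byte big-endian size and a 4-byte type tag, and a demuxer's first job is simply walking those. A tiny sketch of parsing one box header (real files also use size values 0 and 1 as escapes, noted in the comments, which a full demuxer must handle):

```typescript
// An ISO BMFF (MP4/MOV) file is a sequence of boxes:
// [4-byte big-endian size][4-byte ASCII type][payload...]
// size === 0 means "extends to end of file"; size === 1 means a
// 64-bit extended size follows. This sketch ignores those escapes.
interface BoxHeader {
  size: number; // total box size in bytes, including this 8-byte header
  type: string; // e.g. "ftyp", "moov", "mdat"
}

function readBoxHeader(bytes: Uint8Array, offset: number): BoxHeader {
  const size =
    ((bytes[offset] << 24) |
      (bytes[offset + 1] << 16) |
      (bytes[offset + 2] << 8) |
      bytes[offset + 3]) >>> 0; // force unsigned 32-bit
  const type = String.fromCharCode(
    bytes[offset + 4],
    bytes[offset + 5],
    bytes[offset + 6],
    bytes[offset + 7]
  );
  return { size, type };
}

// A demuxer walks the file by jumping `size` bytes from box to box,
// descending into containers like "moov" to find track and sample data.
```

This is the layer that libraries like Mediabunny implement in full; the compressed frames it extracts are what get handed to a WebCodecs `VideoDecoder`.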
Mm-hmm.
And I got okay results, but then I found that another library came up, called Mediabunny, which does the same thing, but much better and much more performantly. So we dropped our own effort and just started sponsoring that library. And so it's...
Mm-hmm.
It's also very new, less than one year old. But once we, and the AI, figure out how to harness the power of that multimedia library, we can build much, much better video applications on the web. Things that would not have been possible previously. Let me give you an example.
Mm.
Let's say, I think we wrote about LinkedIn, right? So let's say you upload a video on LinkedIn. You drag the video into the editor, you upload it, then you wait one or two minutes until it's processed on the server. And you have no way, before you post the video, to crop the video, trim the video, mute it, do basic edits. And this is because there's no good way to...
Yeah, yeah, yeah.
there was no good way to do it client-side. And if you had to upload it to their server first, it would not have been reactive. So this WebCodecs technology now makes these real-time edits possible, and does it all client-side. Sorry, I went on a tangent there.
Hmm Mm. This is perfect.
But
No, no
I'm very excited about it, and it's an open-source library that we sponsor, by a developer called Vanilagy, and I think it deserves more attention.
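Because Safari only recently gained WebCodecs, feature detection is still the first thing a web video editor has to do before committing to this pipeline. A minimal sketch; the `scope` parameter is only there so the check can be exercised outside a browser, and in a real page you would additionally verify the specific codec with `VideoEncoder.isConfigSupported()`:

```typescript
// Check whether the WebCodecs encode/decode entry points exist.
// `scope` defaults to globalThis, so in a browser this inspects `window`.
function supportsWebCodecs(
  scope: Record<string, unknown> = globalThis as unknown as Record<string, unknown>
): boolean {
  return (
    typeof scope["VideoEncoder"] === "function" &&
    typeof scope["VideoDecoder"] === "function"
  );
}

// In a supporting browser, the next step would be something like:
// const { supported } = await VideoEncoder.isConfigSupported({
//   codec: "avc1.42001f", width: 1280, height: 720,
// });
// and falling back to a server-side render path when it's false.
```

The fallback branch matters: as described above, the pre-WebCodecs options (CPU-only processing or the built-in video tag) are exactly what made earlier web editors sluggish.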
Yeah. Again, I could ask so many more questions; I'm trying to work out the last ones. Maybe a good one to ask you about would be local-first versus the web. Obviously you're talking about the web there, but I've heard you say elsewhere that the reason Claude is so good at doing things on your computer is that there are lots of things it needs to wrangle, understand, and have immediate access to in the OS. If we look at the local environment as a really great place for an agent to put lots of stuff together, do you feel like with video, the web is going to be a lot slower than that? Does that question make sense? I guess it's: where do you see Remotion sitting? Do you see it being used a lot in local stuff, or is your focus nearly all on the web?
Yeah, we have kind of a mix of different environments here. We have the web browser, which is kind of like an operating system of its own and has certain capabilities. We have local runtimes like Node.js or Bun, which also have great capabilities, like creating servers and calling into FFmpeg. And you have server-side infrastructure, like data centers, which is good if you need high computing power available to anybody. If you need to do an expensive re-encode, or you need to call an LLM, you cannot do it locally, neither with Node.js nor in the browser. So how it is right now is: everybody just ties those strings together in a way that somewhat works for them. These parts need to communicate, and every part does what it does best. And I do think each one of these parts gets better over time. Let's say the browser now has better multimedia capabilities, or maybe, this is also something Google Chrome is planning, the browser gets a built-in LLM. It's mostly the browser that is advancing. So I think if things shift, it's most likely going to be towards the browser. But you are kind of in a situation where you have a lot of nice tech, but it cannot run everywhere, and you need to tie it together somehow in a non-elegant way.
So, last question: how is it running the company at the moment? It's a really busy time, there's only three of you, and things are changing. You must have way more communications going on right now that you have to choose to engage in or not. How are you guys changing as a company, in the way that you work? Are you using AI at all, not necessarily for development, but for your own processes and workflows? I'd love to know about that, because I know you're quite keen on not scaling up, not taking loads of VC money, being in control, enjoying the process rather than going for some crazy exit. And I deeply identify with that; I want to do something myself one day in a similar vein, not building a tool like Remotion, but building my own kind of tool. So I'd love advice, really: how are you finding it at the moment, staying grounded? How are you using technology to do that? A few questions in there for you.
Yeah. And there are definitely so many things to consider; I cannot go too deep into everything. But I'm thinking a lot about what the work is that I need to do every day. Which is: replying to messages, which I still do by hand; solving...
Of course.
technical issues, for which I use a lot of AI, like Claude Code and other tools; and doing operational stuff, maybe a customer has a billing issue, or we need to pull some data from our database. For this we also tried to make an MCP for ourselves, so that we can prompt the AI and it can hopefully do the task. It has helped us a little bit. So these are things I do every day, and it's not like AI is always the answer. One common scenario is that every person asks the same question; then the solution is, we write a documentation page for it. We have a lot of documentation, over 700 pages. So I guess one of the things we do to cut down the work is, for every action that somebody wants to do, or every question that somebody has, we just send them the right link. And I think that helps a lot. The other simple tip I have for managing your time is to just do less. Don't explode the scope of what you're trying to do. There are many things where we just said, no, we don't want to do that. We don't want to create a consumer app, even though a lot of people are asking for it. And before...
Hmm.
AI, people would ask: when is the Remotion cloud coming? When will you render the videos for us? It's just something where we realized, okay, it would make things too complicated for us. So everybody has to run Remotion on their own computer or on their own server. It's not convenient, but it's a way that we can steer ahead without getting distracted from the core mission. And then I have one final point on this question: embrace composability. Make it interoperable with Claude Code, After Effects, Figma, Rive, Lottie. We embrace it, we love all of this. And if you enable interoperability and any of these other parts of the equation make progress, then we also win, and our users win, and we can move forward together, not against each other.
One last question. What are you most enjoying at the moment, and what are you most excited about over the next year?
Yeah. So I enjoy talking to other people and seeing them have good experiences, seeing them post nice videos they have made, whether it's somebody who does not know how to program and has managed to get some use out of it, or established entrepreneurs and big companies using it and making a business out of it. All of this, hearing it and knowing that we played a part in it, I enjoy a lot. And just running this company, our team doing cool stuff, enjoying it, making technical progress ourselves. At the end of the day, there's a new feature, or we have created a video ourselves and it's hopefully shiny and delightful, and that brings me a lot of joy. Often on the weekend I don't work on the project, but I try to make a video myself. And recently I've had a lot of joy just prompting it, and if something cool came out of it, which is now a lot faster, I'd practically jump out of my chair and be like, I love it.
And over this next year, is there anything on the horizon that you're particularly excited about, or is it just generally keeping on doing what you're doing?
I feel like "just keep doing what we are doing" is the wrong answer. But I don't know, I guess people really expect us to...
Yeah, yeah, I thought so. No, no, you can answer that. I think it's the right answer if it is.
make a bold move now. I have no idea. We're still in the phase of keeping our heads straight and trying to assess where we are and what makes sense to do. I lean towards just continuing to iterate on the current thing for a while, to then hopefully at some point have all of the pieces there for it to unlock something new, like what happened now, which we did not really plan or work towards.
Yeah, you did not necessarily plan this moment, but, you know, by enjoying the process you laid the foundation for it to happen. So just keep doing that, I guess.
Yeah, yeah, it's hard to say. We really have no idea. And I also think we're bad at strategy and stuff, but I think that's what we're going to do.
No, I don't think so. Yeah, I vibe with what you're saying a lot, so that's what I expected. That's great. Jonny, thank you so much for doing this and answering the questions from a slightly non-technical perspective. It was really interesting. What I try to cover on this podcast is the intersection of video, software, and design, and obviously you're a perfect person to speak to there. So thanks for bringing us along into this new phase and explaining some things. Appreciate it.
Yeah, it was great to now talk more with people outside of the strict React-expert bubble. A really, really great discussion nonetheless. And I didn't even feel that I was talking to somebody who's strictly a non-developer until you explicitly mentioned it. So it was great talking to you, and it's also a great sign that people like you are finding interest in this.
See you soon.