The Future of Effortlessly Cinematic AI-powered Video Storytelling with Aquifer

December 1, 2021 Kat Lopez

LDV Capital invests in people building businesses powered by visual technologies. We thrive on collaborating with deep, technical teams leveraging computer vision, machine learning, and artificial intelligence to analyze visual data. We are the only venture capital firm with this thesis.

Our Women Leading Visual Tech series is here to showcase the leading women whose work in visual tech is reshaping business and society.

Thrilled to introduce our next guest – Chen Zhang, co-founder & CEO at Aquifer – a platform for teams and creators to make stunning animated videos and live-streams with their brand's IP in minutes. She previously led teams in designing and developing new products and XR experiences for Fortune 500 companies at Part Time Evil, an immersive studio she co-founded. Before that, Chen led product strategy and brand marketing at frog design, Under Armour, and VRBO and consulted for Gensler.

We spoke with Chen about her route to being an entrepreneur, how she is overcoming a lot of the challenges associated with being an early-stage business and how Aquifer is going to impact every brand-consumer interaction out there in the long haul. (Note: After five years with LDV Capital, Abby decided to leave LDV to take a corporate role with fewer responsibilities that will allow her to have more time to focus on her young kids during these crazy times.)

The following is the shortened text version of the interview.

Abby: How would you describe what you do to a two-year-old?

Chen: I run a company that has built a technology that allows companies and teams to create animated videos very quickly in minutes, instead of taking days, weeks, or months, which is how long it would usually take if they were going out and finding experts to do that work.

Abby: My daughter would be mind-boggled if she knew that Masha and the Bear took so long to create, and it'd be very meaningful to her to say that you could create a zillion new episodes of Masha and the Bear as quickly as possible.

Chen: You could create a new episode of Cocomelon in minutes versus I believe right now it takes six weeks.

Abby: Six weeks per episode of Cocomelon?

Chen: They have teams that run concurrently so that they can put out weekly episodes, but it takes six weeks to make one episode.

Abby: For anybody out there who doesn't have a child, you should Google it right now. They are masterful in the world of YouTube cartoons. What got you interested in this area? Why did you decide to build Aquifer?

Chen: I stumbled into the world of 3D through my work. Earlier in my career, I was overseeing digital experiences, primarily for marketing and brand, for what was at the time homeaway.com, VRBO. I very much understood the world of creative asset production from real-time production to digital asset production and knew what it was like to work with vendors and studios to get that work done.

Where it sort of flip-flopped for me was when my co-founder and I were running an AR/VR and 3D content studio. I saw firsthand how powerful these experiences were for viewers and users of AR and VR and how immersive it could be and how educational and entertaining it could be beyond ‘normal’ content that was shot, but also knew firsthand how expensive it was to produce that content. We were getting frustrated that we couldn't work faster as a studio. My co-founder, Matt Udvari, created the first version of Aquifer so that we as a studio could just work faster. With this app, folks like me or other artists on the team who are not professional CG animators could make content and be hands-on involved.

That's when we realized that there's a bigger potential for this than just a way for our studio to work faster. What if we could put this directly into the hands of people that were our customers at the time? Marketers, folks running learning and education teams at large corporations, what could they do with a platform like Aquifer?

Chen Zhang and Matt Udvari, co-founders of Aquifer

Abby: Before you and Matt started working together at the studio and building all of this, is this something that you've been doing in your spare time? Are you a huge consumer of immersive content? A lot of times there's this stigma that immersive content is only VR and it's a bunch of dudes playing games on a headset, but that's not what it is at all, right?

Chen: In my career, I've always just been involved with emerging technologies as it came along with how consumers wanted to. So I remember a project I worked on at the time, at Samsung's agency of record, was an NFC project. I don't know if you remember NFC, but it was this way that Samsung phones could pass data and content just by putting them close to each other. And so we created this touchscreen experience that was embedded into street bus stations and airports where Samsung phone customers could tap their Samsung phones against this touch screen and get books and videos and songs downloaded for free. That's not AR/VR, but this was back in 2010, when it was, "Okay, what can you deliver to a device and engage with someone outside of the typical way of just reading or watching? How else can you engage them?"

I've always been interested in that and that led me to a three-year role, working in design consulting for the Department of Defense. I was helping them figure out the next generation of cyber defense software and how they can use gamification, AI and 3D data visualization. This software is used by cyber operators who are fresh high school grads with no experience in cyber defense. Our goal was to cut down decision-making timeframes through different ways of visualizing what decisions they had to make.

Abby: So not Ender's Game at all.

Chen: Exactly. Not at all.

Abby: What made you and Matt jump to becoming entrepreneurs?

Chen: I couldn't imagine having this in our hands, seeing the potential and letting it be an internal tool. It felt like a crime to not try to share this with the world.

Abby: It sounds like that was wholly you, right? Would Matt have been happy using it for the studio and improving your processes? And you're, "No, the world needs this."

Chen: He's a creator. He creates experiences, films and products.

I'm always looking out and seeing, "Can this solve big problems for other people? And how can I take something that he and the team have built, give it value, put it into the hands of people and give them new talents, new expertise and skillsets that they wouldn't have had ordinarily." That's how I'm wired and that's how I think.

Abby: Where did that come from?

Chen: I'm one of those high-level granular thinkers where I jump back and forth all day every day. I have this constant, "Okay, there's this micro thing I'm thinking about and doing. At a macro level, how can it have an impact? How does it have an impact?" And then coming back and using that to figure out, "Is this micro thing I'm doing worth doing? Is there something bigger here?" I've always just been that way.

Abby: Did you envision the challenges of entrepreneurship before you began down this road?

Chen: Lots of entrepreneurs have those stories where they're like, "In my high school, I sold scrunchies, and then I made $16,000 on scrunchies." That was never me.

I'm a first-generation immigrant. My parents were first-generation immigrants and they had me on a straight path to go to school, get a great degree, find a nice job, over-perform at that job, make everyone's lives easier around you at your job and be good, have stability and stay there.

I didn't necessarily have aspirations of being an entrepreneur. My mother didn't have a US college education and didn't speak English when she came here but she had been running a successful art business on her own for decades. Even though she told me, "Don't be an entrepreneur," what she modeled was being an awesome entrepreneur and building something out of nothing. In my heart and soul, that was always there. There were a couple of tries when I was younger and earlier in my career. I made a connected foam roller because I was into health and fitness, still am, and tried to figure out where I could take it and met up with some people, get their feedback on the prototype. Then it turned out manufacturing is this whole crazy thing that makes a startup challenging. That didn't work out.

Since then, I had been always thinking about how I can make an impact. What is a product or an experience that could change lives or change people's work out in the world? And with Aquifer that all came together.

Abby: With a foam roller, you're like, "Oh, wait – Manufacturing – don't want to get into that whatsoever." Is there something in the process of building a software platform that you weren't anticipating at all?

Chen: You have to have self-belief and motivation to bring a viable business into the world. A lot of people think, and I certainly did when I was younger, that it's all about the product. It's all about the idea. Once you build the thing, they [customers] will come. That is not true. This is a lesson a lot of entrepreneurs learn at the beginning. It's not as easy as “build it and they will come”. There's a ton of work that goes into finding the market.

You can have a great product but if there's no market for it, it will never succeed.

For Aquifer, we had an instinct about what that market might be but it's taken exploration. It's taken a lot of hard work to go out there and get the numbers and get the data to support who exactly we should be selling into and why they would pay to use a product like Aquifer.

Chen Zhang, co-founder & CEO at Aquifer

Abby: That's been part of our conversation since the onset. As we were talking about LDV investing you ended up leading your seed round. We talked a lot about who could use this and it seems like the possibilities are endless. Everybody is going to need this. It's going to be a part of everybody's future strategy when it comes to brand and IP management. Right now, you're trying to focus on who for this instant has already identified this as a giant pain point that you can go be the morphine for. And so who are you honing in on at the moment? Who do you think is that “sweet spot” where you want to start and want to target your first customers?

Chen: We've honed in on media and entertainment, particularly organizations and brands that are targeting a younger generation, so specifically Gen Z or younger. The reason we're doing that is that Gen Z grew up with the concept of animation as being a part of their everyday experience, whether it was through Pixar movies or through video games that are now the #1 form of entertainment preferred by younger people.

For older generations, there's an inclination towards live-action, and animation is nice. But for younger folks, whether it's animated or NFTs, AR/VR experiences, games, films, even the idea of virtual influencers, none of that takes explanation if they've grown up with it. It's a part of their day-to-day. The idea of interacting with avatars or characters is not new or weird, it's part of life. We're seeing that sales conversation and that value conversation are shorter and faster if we're approaching brands and companies that are targeting younger audiences.

Abby: It makes a ton of sense. That's amazing to me that gaming is our number one choice of activity. For all of us who are not Gen Z-ers, what do you mean by “virtual influencer”? What does that incorporate?

Chen: The virtual influencer is any persona that's not a live-action shot human person, shot on film or digital. The reason why we're talking about the concept of virtual influencers and virtual talent more now than ever is that the technology to create characters that look compelling and real, and that are emotive has evolved over the 2000s to be more accessible. Think Geico gecko, virtual Barbie, think the sausage guy on TikTok that does every single new TikTok dance and now has 5+ million subscribers. The technology to make that kind of content is more accessible to those people that have technical skills and can build. However, there is a whole world of people who don't have those types of technical skills.

You have this mismatch between the consumption of animated content and the number of people who have the technical skills to create animated content that is of high quality. That gap is where we see Aquifer delivering the most value, because we know this kind of content is going to grow.

We know that virtual influencers, virtual talent and virtual characters are going to become more and more part of our lives with the emergence of the metaverse, the concept of having a virtual identity, and the need for more entertainment and education content out there in the world. We want to be the platform that enables anyone to create high-quality animation.

Abby: It makes a lot of sense, especially when you're thinking about it in the context that we interact so much as ourselves online right now, with this perfect representation of who we are via our photos or our videos because that's the easiest one to do. That's what's in our hands, but would we represent ourselves this way if we had the technical ability to represent ourselves differently? Would we be ourselves on this call right now if we instead could be something else? Putting that into the hands of people is, as you mentioned, the next step in the metaverse. There's so much hype around the metaverse right now. To you, what does metaverse mean?

Chen: The metaverse is any digital/online community or a form of communication that is not just two people in a room or in a physical space talking to each other. It's inclusive of social media networks. It's inclusive of games. It's inclusive of virtual immersive social experiences like VRChat. I don't think it's a singular metaverse.

There are going to be tons of these little networks and these universes where we get to show ourselves to that world in different ways, depending on context. I show myself differently on LinkedIn than I do on Instagram. I don't post my vacation photos on LinkedIn, but I do on Instagram. It's going to be a similar situation in the metaverse where we may choose to adopt slightly different versions of our virtual identities to engage with other people in different contexts.

We see the same thing happening with brands. Burger King on Twitter has a very specific personality – they're snarky and sarcastic.

We see brands now taking on this personified way of engaging because, frankly, that's how we as humans want to connect with other identities. The next step for brands is to further personify and have virtual, physical representations of that brand. They would engage differently in different platforms as well.

This mass humanization or personification, if you want to call it, is about how we engage in the future.

Abby: That's such a great point because one of the big findings everybody has is that a brand on Twitter doesn't do nearly as well if you're interacting as a company as it does if you're interacting as the CEO of that company. That's hard to control because your CEO changes on average every 10 years, versus if you have a brand character or characters, a whole suite of them, who are consistently managed by the brand itself but are those personas that people get to interact with. If you want the snarky king, you can follow the snarky king and interact with him. If, instead, you want the nice hamburger clown, you can interact with a nice hamburger clown.

Chen: Exactly.

Abby: As the world changes over the next 5 to 10 years, what do you see as Aquifer's role in that? Is it helping the brands personify themselves? Is it going to eventually be people? Are everyday people going to be able to record a virtual video or an animated video as easily as we do with TikTok today? What's Aquifer's big vision?

Chen: Our big vision is to make the creation of CG and animated content as easy as whatever is the most common form of creation out there in the world then. We believe that making a 3D animated scene should be as easy as making a TikTok video with you and your friends. We have these incredibly powerful devices in our hands that have amazing cameras and microphones, and you can shoot movies with the latest iPhone in terms of quality. There's no reason why you shouldn't be able to do that with Aquifer. Our immediate plan and what we've done today is to make an animation analog to a ‘point and shoot’ that feels like point and shoot. It feels like recording a selfie video, but what you're doing is grabbing that data and manipulating scenes and characters just as easily. The next step is to continue to use AI to make creating even easier.

Sometimes if you're thinking about, "Oh my gosh, what should I make for my next TikTok video?" AI can help you make a lot of those decisions.

What's the best camera angle? How quickly should you have these cuts move depending on music? The TikTok platform does a great job of making some of those decisions for you. We believe in using AI to do the same for animation, choosing your camera angles, having your characters animated based on just audio or based on just text and then having that scene feel effortlessly cinematic. That is our vision for the future.

Abby: “Effortlessly cinematic” – I love that. It sounds like you've got machine learning, AI and a lot of camera techniques embedded in your platform. What other stuff are you working on?

Chen: Within Aquifer today, we have facial motion capture. We can choose and create virtual cameras. Virtual cameras are cameras that you use to shoot a full 3D scene but we're making that accessible on mobile.

Abby: Wait, do you mean that you can have all different versions of cameras that you would use on an actual movie set to shoot a person, but you've got the virtual versions of it within your platform?

Chen: Exactly. In the mobile app, you can choose to create dolly cameras, which are cameras that move from one stationary point to another at a predesignated speed. You can choose a handheld view where you can shoot a video and it would get light tremble to get that handheld feel.

We believe in enabling all sorts of creative possibilities in animation that you have in live-action and making it simple.

We're also close to finishing our body motion capture technology, which would allow you to do skeletal tracking with mobile and then be able to translate that in real-time to a virtual character.

You can perform the latest TikTok dance without knowing anything special, maybe except for how to dance, and then be able to animate that in a few seconds. We'll continue to push the AI and the predictive capabilities so that we can further shortcut that creation time to even being able to designate the format, the video that you're looking to make, the length of the video, and then composing the cameras and the movements for you based on a couple of quick choices.

Abby: That's fascinating because I can imagine that that will be an impediment in the future too. Right now you're targeting people who know what videos they want to make, but as you start to get closer and closer to nonprofessional users or brands and creatives who don't necessarily have the technical shoot knowledge, being able to automatically do that for them is tremendous.

Chen: Video-based storytelling follows a lot of similar formats. For longer episodic, there's a beginning, middle, and end. Depending on how you break that up, whether it's a 5-minute episode, a 12-minute episode, or a 30-minute episode, there are suggestions we can make in the platform that would shortcut that decision-making. Something we often hear is, "Can I make a 15-second social video and a longer episodic video with Aquifer?" And the answer is yes. We want to be able to say, "Okay, make a choice." Today, I want to make four 15-second videos and one longer episodic and the platform can predictably set those up for you with the best-in-class formats. Then you can go in there, tweak it, make it your own and add a little flair here and there. The platform should take care of a lot of edits for you.

Abby: As we were doing our insights report over the summer, we were looking at virtual production and video production as a whole, and one person that we talked to said that within the next 15 years we will see somebody make it to the stage of the Oscars who has made a video with one person, no film crew, nothing like that. Do you think that's a 15-year thing for animation or is it even shorter than that?

Chen: I think it's much shorter than that. We have aspirations of being able to support creators who are creating Netflix & Amazon Prime caliber content within the next five years. With the democratization of this technology, what we'll see is people won't be competing with each other based on knowledge of technical expertise. They'll be competing on the merit of storytelling and creativity.

Ultimately, what all of these enabling technologies do, is take away that artificial barrier of having access to very expensive hardware and software to put together a beautiful film.

Abby: We've also seen a lot of companies that are thinking about the way that they analyze feeds as they exist today, figure out what are the features that make them viral hits and how to automatically integrate that or recommend how to create the next generation of content for a brand or a person. Is that something that you guys are doing as well? It sounds like it's an embedded part of this AI storytelling feature.

Chen: Yes. Those formats are being dictated by a lot of the publishing platforms today. You think of what is so addictive about TikTok videos is that they often have a punchline. There's a setup and then there's a couple of seconds of suspense. Then there's a punchline at the end that resolves for the viewer and makes them feel satisfied and gets them to go to that next video. That is an age-old format of joke-telling – you have setups, suspense, punchline. We want to build that capability into the platform. Instead of spending a long time thinking about, "Well, what should the setup be? How long should that suspense be? And what should my punchline be?" It should just be there for you as a first pass, and then you can adjust it later on.

Abby: There are so many different features that you can fill. There must be so many demands across all the potential customers and customers that you're working with. How do you prioritize which things to work on? And when? What's your trick to figuring out where you should spend your time?

Chen: Our number one way to prioritize has always been value delivery. Value delivery in the form of what would enable our customers to create content that grows their brands and transparently makes them money. Because our customers are brands, they're actively engaging in social to grow the fan base, but there are huge opportunities to drive revenue now in the metaverse and across social media, whether it's through the ability to give gifts on TikTok or even live streams on Twitch deals. Digital merch, NFTs, physical merch – they're all legitimate monetization channels.

We are prioritizing the features that we build first on what we can put into the hands of our customers that enables creative flexibility, that gives them additional revenue opportunities.

The next big feature we're going to be rolling out in addition to body tracking is livestreaming. Livestreaming for brands and creators is a huge revenue driver. We want to enable a CodeMiko-type experience without the massive team that she has at her disposal to be able to stream as a virtual character, virtual persona and tell stories that are much more varied than just seeing a character talk on camera neck up. Characters should be able to move around the room, interact with props, tell you about their lives and engage with their audiences in a way that a real person would.

There will be 100X more visual content online by 2027. But the good news is – creators won’t have to be technical, just creative, thanks to automation tools like Aquifer.

This interview goes perfectly in line with our 2021 Insights Report, “Content & the Metaverse are Powered by Visual Tech”.

Abby: Today, we think of being able to either have an animated or a fake background behind us or have a fake person [synthetic media], but being able to have the two that interact with and move with each other, is something that we haven't seen come to fruition yet. I guess it raises the other question. You work across all platforms. As easily as you export a video from Aquifer and put it on TikTok, you could livestream it onto Twitch, Facebook, or anywhere else.

Chen: Exactly. YouTube Live and Twitch are the two big players for our customers. We want to enable the ability on desktop to be able to quickly OBS directly into those channels or multiple channels at the same time, depending on what their business model is. Then simultaneously someone could be livestreaming with Aquifer and then someone else on the team could be creating TikTok/Instagram videos or creating a mixed reality video for YouTube because we have those top platforms in mind and want to make it as easy and flexible as we can for our customers.

Abby: I look at what’s happening in livestreaming and it seems like right now it's all in the gaming space. We haven't seen any companies be able to monetize the actual characters of the game to play, to livestream themselves, or to get in on this whatsoever. Has the biggest limitation been that there weren't any of the capabilities to take the brand's IP, like a specific character from a game, and bring them into a livestream or have them own that livestream?

Chen: There have been a couple of factors. Before, you needed a huge team to pull that off because you had to have a physical setup with body motion tracking, facial emotion tracking, green screen into some virtual background. You had to make all those assets. There's a ton of setup and it's tens of thousands of dollars to get that set up for character livestreaming from modeling the character, to modeling the background, to getting the suit set up, to rigging the character so it all looks good when they move in real-time, and then having technical experts who can then manage that stream out of the game engine, which is where all the motions are tracked and the scene is put together and rendered. It's typically a 5+ person team to pull that off. It takes a lot of time to set up. If anyone listening has ever tried to set up a body motion capture suit, you know there's at least 30 minutes to an hour of calibrating to make sure it's tracking your body properly. That all takes a ton of time and you can imagine as an investment for the brand, that's a massive investment to set up an entire team just to do this. Especially with that team having to be in-person to set this up in one physical space together, COVID limitations put a stop to that as well. It's not something that one creator can do with a webcam.

Abby: On top of that, that would be for one character. Then you want to multiply that across multiple characters and you want to have everybody have different presences in different ways. It makes a lot of sense that it isn't feasible today to make this happen and generate significant value versus the value proposition that you guys bring to the table. Okay, let’s do some rapid-fire:

What was your favorite cartoon growing up? Cinderella.
Brand mascot that's most embedded in your brain? Tony the Tiger.
If you could live in a video game, which video game would you pick? Zelda. It's the prettiest.
Do you have a favorite virtual influencer or YouTuber? I like CodeMiko. I think she's pushing the boundaries.
What brand have you not connected with yet, that they should be using Aquifer? The NBA.
Who is an entrepreneur that you look up to? Melanie Perkins of Canva.
If computers didn't exist, what would your career choice have been? I would have been an architect because it's what I got into undergrad for.

Abby: Amazing! Chen, thank you so much for joining us on Women Leading Visual Tech!
