Multimodal Avatar Studio Platform

Multimodal Avatar Studio (MAS) is an AI platform from Intuit’s Foresight team that enables teams across TurboTax, QuickBooks, Mailchimp, and Credit Karma to build avatar-powered experiences. The platform provides avatar generation, voice synthesis, and video rendering to support conversational interfaces.

The Foresight team also used MAS to build standalone products that help small businesses generate marketing videos with AI avatars.

Role

Design lead on Foresight, focused on multimodal avatar experiences

Team

Designers, engineers, product managers, legal, responsible AI

Advances in generative AI made it possible to create realistic avatars capable of speaking, responding, and delivering information dynamically. This created an opportunity for software interfaces to move beyond text and static UI toward conversational, human-like interactions.

Research showed that personifying AI with avatars and voice significantly improves engagement with Intuit products.

Opportunity

Natural conversations increase engagement

10/10 participants found avatar and voice-based experiences more helpful and felt more confident interpreting financial reports.


Multimodal interactions increase AI adoption

55% of surveyed users were interested in an AI-guided onboarding experience. Introducing avatar and voice interactions increased that to 76%, demonstrating the impact of multimodal interactions on AI adoption.


Avatar interactions build trust

Interactive avatar experiences improved trust by 3% among prospects and 7% among customers, measured by willingness to share sensitive financial data.

On Intuit’s Foresight team, the challenge went beyond building features: we needed to design a platform capable of powering avatar-driven experiences across multiple products, interaction models, and use cases.

Platform Solution

MAS was designed as a shared platform supporting avatar-powered products across Intuit. Core capabilities include:

• avatar generation and customization
• voice synthesis and narration
• video generation and rendering
• integration with multiple Intuit products

Centralizing these capabilities allowed teams to build new avatar-driven products without recreating the underlying AI infrastructure.


Products Built on MAS by Foresight

The MAS platform enabled several products built on the same underlying AI capabilities. Each product targets a different workflow and problem space while leveraging the same platform infrastructure.

On the Foresight team, we used the MAS platform we had built to design and ship AI-avatar-powered video products for our small business customers and for internal users at Intuit.

An AI video creation product that enables small and medium-sized businesses to generate marketing videos using avatars, voice synthesis, and generative media.

Avatar Studio

Problem

Small businesses increasingly rely on video marketing but lack the time, equipment, and production skills required to create it. Producing video typically requires recording, editing, and creative resources that many small business owners do not have.

Solution

Avatar Studio enables businesses to generate marketing and customer videos using AI avatars. A user creates a “digital twin” of themselves from either photos or a live video recording, plus a sample of their voice. They then write a script and produce a finished video without cameras or editing tools. The system combines script generation, avatar animation, voice synthesis, and video rendering to automate the production process.

Assumptions and Challenges

We had two leap-of-faith assumptions. If either proved wrong, the product would not succeed:

  1. The technology is good enough to accurately recreate someone’s likeness.

  2. Customers will want to create a digital twin or likeness of themselves, or use a “brand ambassador.”


To test the first assumption, we ran user interviews with Intuit customers and manually created each participant’s digital twin using video to understand what worked and what did not.

To test the second assumption, I ran quick Instagram ads to understand what resonated most: an exact digital replica, a likeness, or a brand ambassador.

Much to our surprise, the exact digital replica resonated the most, although the response depended on the industry the viewer worked in.

Learnings Reflected in Product

One key learning was that a business user’s comfort with a digital twin varies significantly by industry.

Participants also needed to feel their digital twin was a true extension of themselves, reflecting nuances in speech patterns and facial micro-expressions.

This insight led us to design an extended voice-capture experience when users create their digital twin to better capture the nuances of their speech.

An internal video creation tool built on the MAS platform that enables internal teams at Intuit to quickly produce avatar-powered video content without relying on design or external agency resources.

MAS Video

Problem

Teams across Intuit frequently needed to create video content for marketing, onboarding, and product education. Producing it often required support from design or production teams, or hiring a third-party agency, which created bottlenecks and slowed content creation.

Concierge Prototyping to Validate Demand

Before investing in building MAS Video, we manually produced avatar videos for a few teams across TurboTax, QuickBooks, and Enterprise Suites.

Representatives from each team created storyboards with scripts and assets for their videos. I then used brand-approved Intuit avatars and third-party AI tools to produce the final videos.

Other teams across TurboTax and QuickBooks soon began requesting similar videos, which validated the need for an internal MAS-powered video creation tool with approved brand assets and avatars. We used these learnings to align leadership on investing in MAS Video.

Below is a video created for one of the partner teams during our concierge test.

Solution

MAS Video is an internal tool that enables teams to generate videos quickly without relying on production resources. It uses templates, approved brand assets, and avatar generation to simplify video creation, allowing teams to turn scripts and messaging into video content in minutes.

An AI agent that converts existing email campaigns into videos using avatars, voice, and generative media for ads on social media platforms such as Instagram and TikTok.

Video Producer Agent

Problem

Small businesses often create email marketing campaigns but struggle to adapt that content for social media. Turning an email campaign into video content for platforms like Instagram or TikTok typically requires additional production work, time, and creative resources.

Solution

Video Producer Agent automatically generates videos from existing email campaigns. The system analyzes the campaign’s text, images, and business context to create a relevant video script, generate avatar-delivered narration, and produce a ready-to-use video suitable for social media ads.