How AI Is Transforming Localization for Live Content
AI/ML made swift subtitling de rigueur on YouTube and for videoconferencing several years ago. But now LLMs are changing the game for real-time translation, localization, and language-mapped reanimation of news and other live content, as PADEM Media Group’s Allan McLennan, Alchemy Creations’ Andy Beach, and Dubformer’s Anton Dvorkovich discuss in this clip from Streaming Media Connect 2025.
Improvements in Cultural Localization
Allan McLennan, founder and president of PADEM Media Group, asks Andy Beach, founder and principal of Alchemy Creations, to discuss the evolution of the streaming industry’s approach to connecting cultures via subtitling. “Are we starting to look at something new and different when it comes to connecting live and news into new forms of localization, new types of interaction? Are you sensing some of that?”
“Absolutely,” Beach says. “I think the tech has continued to mature and the latency has gotten to a point where even AI can help assist on the live feeds so that if you’re in a scale position or a scale role where you’re managing a lot of different channels, you can actually leverage it.” He references an earlier conversation in which they discussed how much “better and crisper” cultural localization has gotten, to the point that it can offer regional variations within languages.
Localization can now be tailored to particular generations and age groups. “I had mentioned earlier the notion that my 82-year-old grandmother-in-law and my 15-year-old daughter are not going to want to watch the same version of something because one of them’s going to be lost in the language, but they can technically get their own versions at this point that are a little more specific to them,” Beach notes. “And we’ve even got AI that will help do better lip sync along the way as part of it.”
How AI Aids Live Captioning
McLennan wonders where AI learns how to do this interpretation. Beach answers that there’s always a generalized large language model attached, but “the best models or the best direction that I’ve seen for, particularly, live captioning, is when you can front-load a lot of the additional source. So if it’s sports that you’re talking about, make sure you’ve got all the names in there—the team names, maybe the oddball pronunciations of an athlete’s name that might not be common. Preload all of that into some sort of augmented retrieval so that you’ve got that in there.” He adds that the same advice applies when localizing for cities: “The exact same spelling’s going to be said three different ways across the country at some point, and you want to make sure that you’re saying it right for your area that you’re in regardless of it.”
These “pieces don’t have to be part of the LLM—they can be a secondary data source that you go retrieve from and pull from,” Beach notes, adding that this option is faster, “so that you’re not waiting on a big model to be updated in order to take advantage of it.”
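Beach’s front-loading approach is essentially retrieval augmentation with a domain glossary kept outside the model. As a rough illustration only—the glossary entries, the correct_caption function, and the 0.7 similarity cutoff below are invented for this sketch, not any vendor’s API—a minimal Python post-correction pass over each live caption segment might look like this:

# Minimal sketch: domain terms live outside the model and are used to
# post-correct each caption segment. A production system would more likely
# query a vector store or pass the glossary to the model as context.
import difflib

# Names front-loaded before the broadcast (terms a generic ASR/LLM is
# likely to mangle), mapped to their correct on-screen spelling.
GLOSSARY = {
    "jokic": "Jokić",
    "antetokounmpo": "Antetokounmpo",
    "timberwolves": "Timberwolves",
}

def correct_caption(segment: str, cutoff: float = 0.7) -> str:
    """Replace near-miss tokens in an ASR caption with glossary spellings."""
    fixed = []
    for word in segment.split():
        # Fuzzy-match each token against the glossary keys; the cutoff
        # trades recall (catching "yokich") against false positives on
        # common words. Punctuation handling is omitted for brevity.
        match = difflib.get_close_matches(word.lower(), GLOSSARY, n=1, cutoff=cutoff)
        fixed.append(GLOSSARY[match[0]] if match else word)
    return " ".join(fixed)

print(correct_caption("great pass from yokich late in the fourth"))
# -> "great pass from Jokić late in the fourth"

Because the glossary is a plain data source consulted at caption time, swapping in a new roster or a city-specific pronunciation list is a data update rather than a model update—which is exactly the speed advantage Beach describes.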
Building Emotional Engagement Through Dubbing and Voiceovers
“I’m just going to roll right over to Anton [Dvorkovich] because you’ve been doing this worldwide in multiple regions, geographics, areas, languages,” McLennan says. “Is that what you’ve been interpreting?”
Dvorkovich, CEO and founder of Dubformer, replies that his company focuses on dubbing and voiceovers, which he sees as growing in demand. “Our view and our mission is about creating content that can really connect with audience[s] on [an] emotional level,” he states. “We’ve been seeing several experiments and nicely set up tests where, for example, voiceover was compared to subtitling in terms of audience engagement, and we see that it does create better engagement. So this is where we’re focusing our attention: trying to build something emotional.”
Join us February 24–26, 2026 for more thought leadership, actionable insights, and lively debate at Streaming Media Connect 2026! Registration is open!