YouTube Leads the Way on Video Captioning

Captioning online video is a tedious process, which is why few sites have offered it so far. YouTube, however, has decided to lead the way in providing access to hearing-impaired viewers by using technology from parent company Google.

YouTube began offering auto-captioning, which uses some of Google Voice's speech-to-text algorithms, to a select group of partners in November 2009. The technology automatically generates video captions when requested by a viewer. The video's owner is able to download those captions and fix any mistakes.

YouTube must have worked some of the bugs out of the software, because it's opened the feature to everyone. The move puts a simple CC button on the player, which viewers can click to turn captions on or off. The feature seems to be on by default, so hopefully hearing viewers will be able to figure out how to stop it when not needed.

Clicking the CC button also lets viewers change the caption font or place a black background behind the text for visibility. The site doesn't remember changes between viewings, so you'll need to enter preferences each time.

At the moment, captions are available on only a small percentage of YouTube's videos, since the auto-captioning process takes time. Video owners can request to have a video captioned, which should speed the process. Captions are currently available only in English, but that will grow to more than 50 languages in time.

To see the captions in action, watch this video of President Obama.

One person who knows about the challenges of creating a speech-to-text system is Tom Wilde, CEO of Ramp (formerly EveryZing). The company's Ramp platform automatically optimizes video for online use, which includes making a transcript and creating tags.

Ramp's service was built with $100 million in Department of Defense research and development money, and Wilde notes that few organizations have the means to create an accurate system. "Google clearly have massive resources to do so," he says.

Getting it right is expensive because developers need to build up the necessary dictionary data, algorithms, and acoustic models. YouTube first spoke to EveryZing about the idea in 2006, Wilde says, when it was researching build-versus-buy options. YouTube evidently decided to create a system itself, and Wilde thinks it has been working on it all this time.

Challenges involve building up the system's dictionary of new terms and names, and teaching the system to compensate for accents.

While Wilde finds that YouTube's captions have the drawbacks that many speech recognition applications have, he believes they'll get more accurate over time. Google probably has a team of people working on it, he guesses, updating the dictionaries and acoustic models.

"It's very ambitious of YouTube to try to tackle it," Wilde says. "There shouldn't be an expectation of perfection." Better results will come as the system matures.
