How to Caption Live Online Video

Have you tried to caption live video online? It’s the Wild West! It’s far from plug and play. The solutions are out there, but they require a fair amount of effort to implement. Beyond being a major win for accessibility and inclusion, captions can attract more viewers, result in higher engagement, and ultimately increase the impact of your live content online.

The lack of solutions and standards is not a sign that no one’s thinking about captioning, but rather evidence that the industry is moving quickly. As viewers migrate away from over-the-air broadcast to online and OTT video delivery, services like closed captioning could see a major boost in quality as speech-to-text improves. Captioning will inevitably be enhanced by innovations that are sure to come with the digital territory.

Realistically, we’re at least a couple of years away from realizing that captioning future. Given the dizzying amount of live video hours that go online every minute, do we stand a chance of captioning anything in the meantime? If you’re ready to roll up your sleeves and dig into solutions that are far from plug and play, the answer is a resounding yes.

The Business Case for Live Captioning

Let’s start with the business case: Why caption your live video? You may have a host of good reasons already. Here are some of the benefits to consider:

Captions increase inclusion. Captions make your content more accessible to more people. While it may not be mandated today by the FCC unless you are a broadcaster (more on this later in this article), sending a message of inclusion is likely to have a positive impact on the way your audience views your brand or program. Reading also improves comprehension for some, especially second-language viewers, which means your message will come across more clearly.
Captioning means more viewers and more engagement. Videos with captions are more consumable by everyone. Facebook leadership has predicted that the platform will be all video in 5 years (for.tn/1ZTbrab), and company reps shared that internal tests indicate a 12 percent increase in view time on captioned video ads (bit.ly/1Pr3AN5). Now is the time to get ahead of your captioning challenges so you will be ready for a video-driven future online.
Captioning is venue-agnostic. Whether your video is playing on a mobile device or on a monitor in a noisy airport terminal, the message is still conveyed. Nowadays, many public lobbies and spaces feature displays. Captioning your video makes it relevant and consumable regardless of the environment.
Captioning enhances visibility to search engines. Captioning gives you the highly valuable benefit of your content surfacing higher and faster in searches, because search engines can index the text in your video. Live captions go a step further, letting your PR team quickly pull quotes for the press and your marketing team efficiently publish companion ebooks and blog posts. You can also flag inaccurate or inappropriate content for swift removal if needed.
Captions improve content analysis. Having the full transcript of your program immediately available opens up a wide range of data mining possibilities. For example, you can easily determine term frequency to see which words come up most often in your programs.
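To make the term frequency idea concrete, here is a minimal sketch in Python. It assumes the transcript is already available as plain text; the function name, the short stop-word list, and the sample transcript are all hypothetical and only for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "for", "our", "we", "this", "on"}


def term_frequency(transcript, top_n=5):
    """Return the top_n most common words in a transcript, ignoring stop words."""
    # Lowercase the text and split it into words (letters and apostrophes only).
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    return counts.most_common(top_n)


# Made-up sample transcript, not taken from a real program.
sample = (
    "Welcome to the launch. Captions make video accessible. "
    "Captions also make video searchable, and video is our future."
)

for word, count in term_frequency(sample, top_n=3):
    print(word, count)
```

The same function could be run on the corrected transcript a captioning service delivers after the event, which is likely to give cleaner counts than the raw live text.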

The final consideration in the case for live captioning is cost. Live captioning services cost about $150–$250 per hour. That fee can include delivery of a corrected transcript and caption file, which you can use to enhance the video-on-demand (VOD) version of your live event. Investments in hardware and software will vary widely depending on your workflow and requirements.
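As a rough budgeting aid, the hourly range above can be turned into a quick estimate. This is only a sketch: the function name is hypothetical, the default rates are the approximate figures quoted above, and hardware and software costs are left out because they vary too widely.

```python
def captioning_cost_range(hours, low_rate=150, high_rate=250):
    """Return the (low, high) estimated service cost in dollars for a live event."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return hours * low_rate, hours * high_rate


# Example: a three-hour live event.
low, high = captioning_cost_range(3)
print(f"3-hour event: ${low:,.0f} to ${high:,.0f}")
```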

Captioning Basics

While there are numerous benefits that come with captioning, how to go about doing it is not always obvious. Before we dive into solutions and implementation, let’s cover some captioning basics.

Captions vs. subtitles: While both captions and subtitles involve displaying text on screen, there are fundamental differences. Captions are a text rendering of all the audio information in a program: dialogue, music, sound effects, and cues. Subtitles, by comparison, are for comprehension, most often when the language being spoken differs from the primary language of the program or when a viewer chooses to have a foreign language translation.
Live vs. on-demand: Captioning live video is very different from adding captions to VOD. The latter is pretty well supported across online platforms with multiple file format options available and thorough documentation on how to do it just a search away. Live captioning presents all sorts of challenges, from accuracy and timing, to the technology and equipment required to make it all work. We came up rather empty when we searched for “captioning live video online,” which is why we wrote this article.
Closed vs. open: Closed captions are encoded in the video signal and then decoded by the player or device, with the ubiquitous toggle we are all accustomed to. Open captions are burned into the video signal, visible regardless of the player, and cannot be turned off by the viewer. Open captions degrade at lower resolutions while closed captions do not.
Scrolling vs. pop-on: Scrolling and pop-on are the two main styles of displaying captions. Scrolling/paint-on captions tend to provide a better experience for live content. Pop-on captions are carefully timed with the action on screen. While there are ways to improve the timing of live captions, that kind of precision is not possible in real time, and so pop-ons often drop from the screen too quickly in a live scenario.
608 vs. 708: Do captioning standards still matter? Though it may not be relevant for much longer, the CEA-708 standard is the bridge from broadcast television standards to captioning live video online. The EIA-608 standard was developed many years ago; 708 was added when television went digital. For more on 608/708, see the sidebar, “Captioning Standards: Where We Are and How We Got Here.”
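To illustrate why scrolling captions suit live content, here is a minimal sketch of a scrolling display in Python. Words are shown as soon as they arrive, and the oldest row rolls off when a new one starts, so no advance timing is needed. The class and method names are hypothetical, and the defaults of 3 rows and 32 characters per row are assumptions based on common 608-style caption layouts rather than anything specified in this article.

```python
from collections import deque


class RollUpDisplay:
    """Simulates a scrolling caption window that shows the most recent rows."""

    def __init__(self, rows=3, columns=32):
        self.columns = columns
        # Completed rows; when the deque is full, the oldest row scrolls off.
        self.completed = deque(maxlen=rows - 1)
        self.current = ""  # the bottom row, still being filled

    def add_word(self, word):
        """Append a word as it arrives, starting a new row when the current one is full."""
        candidate = f"{self.current} {word}".strip()
        if len(candidate) > self.columns and self.current:
            self.completed.append(self.current)
            self.current = word
        else:
            self.current = candidate

    def visible(self):
        """Return the rows currently on screen, oldest first."""
        return list(self.completed) + [self.current]


display = RollUpDisplay(rows=2, columns=10)
for word in "live captions scroll up as words arrive".split():
    display.add_word(word)
    print(display.visible())
```

A pop-on display, by contrast, would need the full caption and its on-screen duration before showing anything, which is the precision a live captioner cannot provide.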

In the U.S., most live captioning is done by typing, using special software that converts the information into captions that can be added to the video signal. Voice-writing, also known as respeaking, is a method popular in other parts of the world that uses speech-to-text to generate the captions.

We took a U.S.-centric approach with this article. Broadcast standards differ around the globe, and multi-language and character set support for captions is commonplace elsewhere.

Implementing Captions: Key Questions to Ask

Now that you understand how captioning your live video makes business sense, and you’ve got some of the basic concepts down, where should you begin with implementation? We recommend getting to know your use case(s) really well to help guide decisions you will need to make along the way. Here are a few questions to get you started:

What platform(s) will you use to live stream the video? What live captioning solutions do they support?
If you have a production workflow in place, what support does it include for closed captions? Will the encoders you use pass closed captioning as part of the signal? Does the codec support it?
Is it imperative that the user be able to toggle captions on or off? Or is it acceptable or preferred to have them “open” or burned into the video?
Are you captioning a one-time-only live event, or are you looking for a solution to an ongoing need? Does it need to scale?

How Does Captioning Work?

If you plan to do captioning on the production side, things are a little more straightforward (albeit more costly). Most commonly, it involves the use of a hardware caption encoder in conjunction with a broadcast encoder. A caption encoder, such as the EEG HD492, is a pass-through device that receives the video signal (usually over SDI) and sends an audio feed over the internet to a closed-captions provider.
