How to Deploy Closed Captions
Deploying closed captions involves three steps: creating the closed captions, converting them into the proper format, and integrating them with the video for display. Creating and formatting captions are relatively discrete, largely non-technical activities that can be accomplished by most computer-literate people.
Integrating them with the audio/video stream is usually a programming professional’s job, and there are hundreds, if not thousands, of permutations and use cases, particularly in live captioning. This makes the integration step hard to usefully address in anything short of a tome the length of War and Peace. This article focuses on the first two activities, while touching on a few examples of the actual deployment stage for demonstration purposes. In addition, I’ll focus exclusively on captioning in the U.S., since incorporating other regions would be impossible in the space provided.
The Lay of the Land
There are multiple caption formats used by various distribution and playback technologies, illustrated in Figure 1, which shows the requirements for adding captions to videos displayed via the JW Player (note that JW Player can also display captions already embedded in HLS and Flash streams). So the first step you should take in all caption-related exercises is to identify the caption format or formats that you’ll need to supply for your chosen distribution method or methods.
Figure 1. If you’re using the JW Player, you have to supply your captions in one of these three formats.
Let’s take a quick look at caption formats before we get started on the creation process. On the origin side, CEA-608 captions are the NTSC standard for analog TV in the U.S. and Canada, also called Line 21 captions. CEA-708 captions are the ATSC standard for digital TV in the U.S. and Canada. If you’re working with a live television stream or a captioned video file from a TV station or other broadcaster, chances are the captions are in one of these formats.
If you’re working with previously captioned video that isn’t of broadcast origin, captions might be provided in a number of text-based formats. One of the most popular is the Scenarist Closed Caption format (.scc), while other common source formats include SubRip (.srt), SubViewer (.sbv or .sub), MPsub (.mpsub), and LRC (.lrc).
For distribution, you can either embed the captions in the streaming file itself -- a technique primarily used for files bound for iOS devices and Safari, which means you’ll need an encoding tool that can embed the captions into the streams -- or distribute the captions in a separate file called a sidecar file. The sidecar approach is more popular because it adds flexibility, particularly when you’re supporting multiple output formats.
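To make the sidecar concept concrete, here is a minimal sketch that builds a WebVTT sidecar document from a list of cues. The function name, cue text, and timestamps are illustrative, not drawn from any particular tool; a real workflow would typically get these cues from a captioning application.

```python
def make_webvtt(cues):
    """Build a minimal WebVTT sidecar document from (start, end, text)
    tuples. Timestamps are strings in HH:MM:SS.mmm form; the cue text
    here is purely illustrative."""
    parts = ["WEBVTT", ""]              # required WebVTT file header
    for start, end, text in cues:
        parts.append(f"{start} --> {end}")  # cue timing line
        parts.append(text)                  # cue payload
        parts.append("")                    # blank line ends each cue
    return "\n".join(parts)

doc = make_webvtt([
    ("00:00:01.000", "00:00:04.000", "Welcome to the webcast."),
    ("00:00:04.500", "00:00:07.000", "Today we'll cover captioning."),
])
```

The resulting text would be saved as a .vtt file next to the video and referenced from the player configuration, rather than muxed into the stream itself.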
Here are the most common sidecar formats, as described on the Zencoder blog:
- TTML (Timed Text Markup Language) is the format recommended by the W3C.
- DFXP (Distribution Format Exchange Profile) is a profile of TTML defined by the W3C; the term is often used synonymously with TTML.
- SMPTE-TT (Society of Motion Picture and Television Engineers – Timed Text) is an extension of the DFXP profile recommended by SMPTE.
- SAMI (Synchronized Accessible Media Interchange) is based on HTML and was developed by Microsoft.
- WebVTT is a text format proposed as the standard by the Web Hypertext Application Technology Working Group (WHATWG).
Note that many distribution formats can use different sidecar formats. For example, HTML5 captions can be either TTML or WebVTT formats. Ditto for player technologies, such as the JW Player as shown in Figure 1. Again, once you know your target format, you can get to work. Here the tasks vary depending upon whether the captions exist already or if you’re creating them from scratch.
Working with Existing Caption Files
In a live scenario, the easiest case is when you’re receiving a real-time stream that contains integrated captions, which usually will be CEA-608 or CEA-708 captions. In these instances, if your encoding or transcoding tool can input these captions and convert them as necessary for the various outputs, you’re good to go. Most high-end encoders, such as those from Cisco, Digital Rapids, Elemental, Envivio, and Harmonic, have these capabilities, as do transcoding tools such as Wowza Transcoder, or the Aventus platform from iStreamPlanet shown in Figure 2.
Figure 2. Input and output capabilities of iStreamPlanet’s Aventus platform
I’ve highlighted the caption-related features in the figure. As you can see, Aventus can input an MPEG-2 Transport Stream with CEA-708 captions, and convert the captions to the formats required for Apple HTTP Live Streaming, Adobe HTTP Dynamic Streaming, Microsoft Smooth Streaming and Adobe RTMP-based Dynamic streaming. Essentially, this is a transmux of the original CEA-708 captions into a variety of formats, which is a function all the aforementioned systems can perform.
If you don’t have a streaming encoder capable of ingesting and transmuxing the captions, you can use a hardware device such as the EEG DE285 HD Caption Decoder to separately capture the captions and send them to a streaming service. There are also hybrid hardware/software live captioning workflows based upon Telestream’s MacCaption, or the Windows equivalent, CaptionMaker, which start at $1,095. Check with Telestream or EEG Enterprises for more information.
If you’re not receiving a feed with the captions embedded, and you don’t have a live encoder that can convert embedded captions or another tool that can extract them, you’ll have to create the captions from scratch, which I’ll cover in a moment.
As with live video, video on demand (VOD) files may already have captions, either embedded in the video file or as a sidecar file. For example, you might get handed a broadcast file with embedded CEA-708 captions. In these cases, many high-end encoding tools, such as those mentioned above, can ingest the captions from these files and convert them as needed for the required formats. If you don’t have such an encoder, a software tool such as Telestream’s MacCaption or CaptionMaker, both of which provide a range of caption-related functions, may be able to extract the captions for you. There are also a number of cheaper or freeware task-specific tools, such as CCExtractor, that can perform these functions.
If you’re asked to convert a DVD or Blu-ray disc, you may be able to extract subtitles from these sources using CCExtractor or SubRip (for DVDs), or other tools. If you Google “extract subtitles from Blu-ray Disc” you’ll see a range of options. Service providers such as CaptionMax and Caption Colorado can also help.
If you receive a video file with a transcription, you can convert the transcription to captions with MacCaption/CaptionMaker using a feature called automatic time stamping, which synchronizes the transcript with the video. Using internal algorithms, the software automatically detects any problem areas and highlights them in red, so you can review and correct if necessary. Otherwise, you can use any of the caption creation tools discussed below to create the captions from the transcript, though obviously this will take a lot longer.
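As a rough illustration of what time stamping involves, the naive sketch below splits a transcript into caption cues and spreads them evenly across the video’s duration in proportion to word count. Real tools such as MacCaption align cues to the actual audio; this simplified version, with its hypothetical function and parameters, only shows the shape of the problem.

```python
def naive_auto_timestamp(transcript, duration_s, max_words=8):
    """Split a transcript into cues of at most max_words words and
    distribute timestamps evenly across duration_s seconds.
    A crude stand-in for true audio-aligned time stamping."""
    words = transcript.split()
    per_word = duration_s / max(len(words), 1)  # seconds per word
    cues, t = [], 0.0
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start, end = t, t + per_word * len(chunk)
        cues.append((round(start, 2), round(end, 2), " ".join(chunk)))
        t = end
    return cues
```

Because it ignores the audio entirely, this approach drifts whenever speech speeds up or pauses, which is exactly the problem that audio-aligned time stamping (and the red-flagged problem areas in MacCaption) exists to solve.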
If you start with a video file and caption file, but need a different caption format for delivery, you can usually import these into a program such as MacCaption/CaptionMaker for export into the required delivery format. Alternatively, free tools and services, such as the Caption Format Converter from 3PlayMedia, can input SRT or SBV captions (the latter a format used by YouTube) and export your captions in a number of delivery formats.
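Many of these conversions are mechanical. For instance, turning SubRip captions into WebVTT mostly means adding the header, dropping the numeric cue identifiers, and switching the comma in timestamps to a dot. A minimal sketch (not a substitute for the tools above, which handle styling, positioning, and edge cases):

```python
import re

def srt_to_vtt(srt_text):
    """Convert SubRip (.srt) caption text to WebVTT: add the WEBVTT
    header, drop numeric cue numbers, and replace the comma in
    timestamps (00:00:01,000) with a dot (00:00:01.000)."""
    out = ["WEBVTT", ""]
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        # SRT cues begin with a sequence number; WebVTT doesn't need it
        if lines and lines[0].strip().isdigit():
            lines = lines[1:]
        lines = [re.sub(r"(\d{2}:\d{2}:\d{2}),(\d{3})", r"\1.\2", ln)
                 for ln in lines]
        out.extend(lines + [""])
    return "\n".join(out)
```

The reverse direction, and conversions into XML-based formats such as TTML, follow the same pattern but with more bookkeeping, which is why dedicated converters are usually worth using.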
If you want to hire out the captioning work, there are a number of companies that can create or convert the captions for you, including the aforementioned CaptionMax and Caption Colorado, as well as Dotsub, Closed Captioning Services, and 3PlayMedia.
Creating Your Own Captions
To create your own captions, you have to know the purpose the captions are supposed to serve and some basic creation and formatting rules. Let’s start by exploring the differences between closed captions and subtitles. Closed captions are captions that viewers can enable themselves. In contrast, open captions are always displayed. Captions are intended for deaf and hard-of-hearing audiences, and for viewers in loud environments such as bars or health clubs, who may not be able to hear the audio. In contrast, subtitles are used to translate the audio to a different language, and generally assume that the audio can be heard. This is a subtle but critical difference.