Solutions Abound for High-Volume Live Video Cloud Transcoding
Live transcoding is the ideal operation to perform in the cloud since it reduces outbound bandwidth requirements and CapEx. As resolutions increase and codecs get more complex, however, it becomes harder to encode your complete encoding ladder in a single cloud instance, and splitting the task between two or more machines adds cost and complexity. Fortunately, there are multiple solutions for efficient high-volume cloud transcoding, including several showcased at Streaming Media West in November 2018.
Let’s start at the bottom, from an efficiency perspective, which is software-only encoding with x264 or x265 via FFmpeg installed in a cloud instance. So long as you’re producing a four- to six-rung H.264 encoding ladder, you should be able to encode the entire ladder on a single machine.
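As a rough illustration of that single-machine approach, the Python sketch below assembles one FFmpeg command that encodes every rung of a ladder with libx264 in a single process. The rung resolutions and bitrates, the preset, the input address, and the RTMP destination are all hypothetical placeholders rather than recommendations, so substitute values from your own ladder.

```python
import shlex

# Hypothetical four-rung ladder: (output height, video bitrate in kbps).
# These numbers are illustrative assumptions, not tuned recommendations.
LADDER = [(1080, 4500), (720, 2500), (480, 1200), (360, 700)]


def build_ladder_command(source, base_url, ladder=LADDER, preset="veryfast"):
    """Return an FFmpeg argument list that encodes every rung in one process."""
    cmd = ["ffmpeg", "-i", source]
    for height, kbps in ladder:
        cmd += [
            # Take the first video and first audio stream from the input.
            "-map", "0:v:0", "-map", "0:a:0",
            # Scale to the rung height; -2 keeps the aspect ratio with an even width.
            "-vf", f"scale=-2:{height}",
            "-c:v", "libx264", "-preset", preset,
            "-b:v", f"{kbps}k", "-maxrate", f"{kbps}k", "-bufsize", f"{2 * kbps}k",
            "-c:a", "aac", "-b:a", "128k",
            # One output per rung; the destination naming is an assumption.
            "-f", "flv", f"{base_url}/stream_{height}p",
        ]
    return cmd


if __name__ == "__main__":
    # Placeholder source and destination addresses.
    command = build_ladder_command(
        "udp://127.0.0.1:5000", "rtmp://example.invalid/live"
    )
    print(shlex.join(command))
```

The script only prints the command, so you can inspect it before running it. Whether a given instance can sustain all rungs in real time depends on the source resolution, the preset, and the CPU, which is the limit described above.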
Next up are software-only encoders that are simply more efficient than FFmpeg. During his talk at Streaming Media West, IDT’s Lowell Winger explained how the company’s software- and hardware-accelerated codecs deploy multiple techniques to deliver up to 40% more throughput than FFmpeg for H.264 encoding, which translates to significant savings over a plain-Jane FFmpeg solution.
Of course, beyond the software-only x264 and x265 codecs, FFmpeg offers several hardware-accelerated codecs for more efficient encoding. For example, depending on your CPU, FFmpeg supports H.264, H.265, and VP9 encoding and decoding via Intel Quick Sync, which should be available on most cloud instances. On instances with NVIDIA GPUs, you can access NVIDIA-accelerated encoding of H.264 and HEVC. A third option is third-party, GPU-accelerated encoding like IDT’s HEVC encoder, which can transcode 4Kp60 10-bit HDR HEVC on a consumer off-the-shelf server platform.
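Before committing to an instance type, it can help to confirm which of these hardware encoders your FFmpeg build actually exposes. The sketch below, offered as one possible check rather than a definitive one, parses the output of `ffmpeg -encoders` for the Quick Sync and NVENC encoder names FFmpeg uses (`h264_qsv`, `hevc_qsv`, `vp9_qsv`, `h264_nvenc`, `hevc_nvenc`). The helper names are hypothetical, and the parser assumes the usual layout in which the encoder name is the second whitespace-separated field on each line.

```python
import subprocess

# FFmpeg's names for the Quick Sync and NVENC encoders discussed above.
HW_ENCODERS = ["h264_qsv", "hevc_qsv", "vp9_qsv", "h264_nvenc", "hevc_nvenc"]


def parse_available(encoders_text, wanted=HW_ENCODERS):
    """Return the wanted encoder names that appear in `ffmpeg -encoders` text."""
    found = set()
    for line in encoders_text.splitlines():
        tokens = line.split()
        # Assumed layout: "<flags> <name> <description...>".
        if len(tokens) >= 2 and tokens[1] in wanted:
            found.add(tokens[1])
    # Preserve the order of the wanted list for stable output.
    return [name for name in wanted if name in found]


def list_hw_encoders(ffmpeg="ffmpeg"):
    """Run the local FFmpeg binary and report which hardware encoders it lists."""
    result = subprocess.run(
        [ffmpeg, "-hide_banner", "-encoders"],
        capture_output=True, text=True, check=True,
    )
    return parse_available(result.stdout)


if __name__ == "__main__":
    print(list_hw_encoders())
```

An encoder appearing in this list only means it was compiled into the build. The matching hardware and drivers still have to be present on the instance, so a short test encode remains the real confirmation.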
The next level of performance is provided by field-programmable gate arrays (FPGAs), which are general-purpose hardware devices that can be programmed to deliver close to the performance of application-specific chipsets. Cloud services like AWS are now deploying FPGA-based instances that different software developers can program to provide a diverse array of functions. During his talk, “Live Streaming with VP9 at Twitch TV,” Twitch’s Tarek Amara described how the service was deploying live VP9 encoding using Xilinx FPGAs driven by software from NGCodec, a tremendous technology endorsement for VP9, FPGAs, and NGCodec. Note that you can directly provision FPGA-driven HEVC and VP9 encoding from NGCodec in the AWS Marketplace.
Of course, the most efficient transcoding will always be performed by application-specific encoding hardware. At Streaming Media West, NETINT Technologies’ Ray Adensamer described how his company’s system-on-chip (SoC) encoder, in the Codensity T400, could enable 80 1080p30 H.265 sessions in a single 1RU server with 10 T400s installed, or eight sessions per module. Rather than selling the T400 as a standalone encoding appliance, however, NETINT designed the module for installation in NVM Express-based storage servers. (Briefly, NVM Express, or NVMe, is an interface specification for connecting SSD-based storage to servers via the PCI Express bus. NVMe is used in cloud facilities and increasingly in the enterprise. Presumably, leveraging NVMe will simplify on-prem deployment for large-scale encoding shops and perhaps even convince a cloud service to install an application-specific device within a standards-based platform.)
Obviously, encoding platforms that require a GPU, FPGA, or NVMe-based solution will sport a higher per-hour cost, though it should be relatively simple to compute a per-stream cost to compare to software-only encoding. Less simple is the quality analysis. The rap against Intel- and NVIDIA-based encoding has traditionally been quality, and while I’ve read that the quality gap is closing, I haven’t confirmed this through testing. I have tested enough encoders to know that if you run your own tests, you shouldn’t simply compare the high-level peak signal-to-noise ratio (PSNR), video multimethod assessment fusion (VMAF), or SSIMPLUS scores; also examine your test files for transient quality issues. For example, though faster x264 presets like Ultrafast and Veryfast deliver close to the same average PSNR/VMAF scores as Medium, they often contain multiple five- to 10-frame regions of really awful quality that would seriously degrade QoE.
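Both checks described above are easy to script. The Python sketch below shows one way to do it, under stated assumptions: `cost_per_stream_hour` divides an instance's hourly price by the number of streams it sustains, and `find_low_quality_runs` scans per-frame scores (from VMAF or PSNR, for instance) for runs of consecutive frames that fall below a threshold. The function names, prices, stream counts, threshold, and scores are all hypothetical illustrations, not measurements.

```python
def cost_per_stream_hour(instance_cost_per_hour, concurrent_streams):
    """Hourly instance price divided by the streams it can encode at once."""
    if concurrent_streams <= 0:
        raise ValueError("concurrent_streams must be positive")
    return instance_cost_per_hour / concurrent_streams


def find_low_quality_runs(scores, threshold, min_run=5):
    """Return (start, end) frame ranges, inclusive, where every score is below
    threshold for at least min_run consecutive frames."""
    runs = []
    start = None
    for index, score in enumerate(scores):
        if score < threshold:
            if start is None:
                start = index  # a low-quality run begins here
        elif start is not None:
            if index - start >= min_run:
                runs.append((start, index - 1))
            start = None
    # Close a run that extends to the final frame.
    if start is not None and len(scores) - start >= min_run:
        runs.append((start, len(scores) - 1))
    return runs


if __name__ == "__main__":
    # Hypothetical prices and capacities, for illustration only.
    print("software:", cost_per_stream_hour(1.00, 1))
    print("accelerated:", cost_per_stream_hour(4.00, 8))

    # Synthetic per-frame scores: a healthy average hides a six-frame dip.
    scores = [95.0] * 20 + [40.0] * 6 + [95.0] * 20
    print("average:", sum(scores) / len(scores))
    print("low-quality runs:", find_low_quality_runs(scores, threshold=60.0))
```

In the synthetic data the average score stays high even though six consecutive frames are poor, which is why the run detection is worth adding to an average-only comparison.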
And if all this talk of provisioning sounds too complex, you can always hire someone to do it for you. In this regard, there were multiple providers at Streaming Media West with live transcoding capabilities, including Bitmovin, Brightcove, Elemental, and Wowza.
[This article appears in the January/February 2019 issue of Streaming Media Magazine as "Live Transcoding Options."]