
A Buyer’s Guide to Multiformat Media Servers


With the plethora of options available for cloud-based delivery and asset management, some streaming practitioners may ask why they should even consider buying a multiformat media server.

That’s a valid question. For those who have only a minimal number of on-demand videos, the use of an online video platform (OVP) such as YouTube could easily suffice. For those who want to reach live audiences on a large scale, a content delivery network (CDN) might fit the bill.

For those, however, who need to stream content both internally and externally, as is the case in education, enterprise, and even the worship market, the use of a multiformat media server could be just the key.

In this year’s Buyers’ Guide, we’ll take a look at a few key features you’ll need to know about to make an informed decision.

HD and 4K

While the broadcast world and consumer electronics (CE) companies are pushing 4K, we’ll likely be waiting a bit longer for 4K broadcasts than for 4K streaming. In fact, 4K streams were successfully delivered for several sporting events during 2013, so expect media server companies to begin advertising their 4K chops in early 2014.

The key difference between high-definition (HD) and Ultra HD (UHD) streaming is that HD workflows use adaptive bitrate (ABR) delivery, while the current approach to 4K involves single-bitrate streaming via proprietary means. For HD, the standards are defined for 1080p and 720p in terms of both resolution and audio formats, but we’re not quite there yet in the UHD streaming world. Caveat emptor when it comes to choosing a future-proof 4K streaming solution.
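To make the ABR side of that comparison concrete, here is a minimal sketch of how a player might choose among HD renditions. The ladder values, the `Rendition` type, and the 0.8 safety factor are illustrative assumptions, not figures from any specification or product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    height: int
    bitrate_kbps: int


# Hypothetical HD ladder; real ladders depend on content and audience.
LADDER = [
    Rendition("1080p", 1080, 6000),
    Rendition("720p", 720, 3000),
    Rendition("480p", 480, 1200),
    Rendition("240p", 240, 400),
]


def pick_rendition(ladder, measured_kbps, safety=0.8):
    """Return the highest-bitrate rendition that fits within a safety
    margin of the measured throughput, or the lowest rung if none fit."""
    budget = measured_kbps * safety
    fitting = [r for r in ladder if r.bitrate_kbps <= budget]
    if fitting:
        return max(fitting, key=lambda r: r.bitrate_kbps)
    return min(ladder, key=lambda r: r.bitrate_kbps)
```

A single-bitrate 4K stream has no such ladder to fall back on, which is one reason to ask vendors how their 4K offering behaves when a viewer’s bandwidth drops.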


Speaking of 1080p, there’s good news for Dynamic Adaptive Streaming over HTTP (DASH), at least in the DASH-AVC/264 world.

Version 2.0 of the DASH-AVC/264 guidelines, with support for 1080p video and multichannel audio, is now publicly available on the DASH Industry Forum (IF) website (go2sm.com/dash2). The resulting baseline moves upward from 1280x720 progressive (720p) to 1920x1080 progressive (1080p). In addition, the frame rate in the United States is now set at 30 frames per second (fps), targeted at the H.264 (AVC) Progressive High Profile Level 4.0 decoder.

In Europe and the rest of the world, 25 fps is part of the version 2.0 specification as well, meaning that both broadcast and motion picture conversions can be delivered via the DASH-AVC/264 standard as it’s interpreted in the HbbTV broadcast specifications.

Inquiring minds might be asking about the need for a media server if delivery is provided via HTTP. The answer has three aspects.

First, many legacy devices and players are incapable of playing DASH content. Media servers can convert an initial stream into many formats, covering both legacy formats such as RTMP and newer formats such as DASH.

Second, a number of devices, such as iOS handsets and tablets, won’t support DASH natively in the foreseeable future, so media servers are needed to convert between formats.

Third, to get to HTTP delivery, content must be segmented into one or more of the segmented HTTP formats. A good media server will hold on-demand files in a mezzanine format, one that can be segmented as needed into Adobe’s HTTP Dynamic Streaming (HDS), Apple’s HTTP Live Streaming (HLS), Microsoft Smooth Streaming, and a few other formats.

The idea is to keep the mezzanine file intact, which is now possible in all of these HTTP formats (HLS now allows byte-range segmentation), so that the server doesn’t need to store a vast sea of tiny segment files. On the live front, several media servers use the DASH Live Profile, allowing content to be streamed live to DASH and HLS simultaneously through trans-multiplexing (transmuxing).
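As a rough illustration of the intact-file approach for HLS, the sketch below builds a media playlist in which every segment is a byte range of one file, using the `EXT-X-BYTERANGE` tag. The function name, the file name, and the segment sizes are hypothetical; in practice the durations and byte lengths would come from an indexer that cuts on keyframe boundaries.

```python
import math


def byterange_playlist(uri, segments):
    """Build an HLS media playlist whose segments are byte ranges of one file.

    `segments` is a list of (duration_seconds, length_bytes) tuples in
    playback order; offsets are accumulated from the start of the file.
    """
    target = max(math.ceil(d) for d, _ in segments)
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:4",  # EXT-X-BYTERANGE requires protocol version 4 or later
        f"#EXT-X-TARGETDURATION:{target}",
        "#EXT-X-MEDIA-SEQUENCE:0",
        "#EXT-X-PLAYLIST-TYPE:VOD",
    ]
    offset = 0
    for duration, length in segments:
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(f"#EXT-X-BYTERANGE:{length}@{offset}")
        lines.append(uri)  # every entry points at the same intact file
        offset += length
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"
```

Calling `byterange_playlist("movie.ts", [(10.0, 1000), (8.5, 800)])` yields a two-entry playlist in which both entries reference `movie.ts`, so the server stores one file rather than one file per segment.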

Closed Captioning

Media servers offer closed-captioning solutions, including timed-text formats such as SAMI, SMIL, and Timed Text Markup Language (TTML) and its less-robust TTML Lite version, as well as traditional CEA-608/CEA-708 compliance. Web broadcasters and CE manufacturers alike face a challenge standardizing on common timed-text standards, which means that media servers will play a significant role in converting between the various types of timed text for the foreseeable future. Some media servers also act as broadcast playout systems, meaning that closed-caption insertion is a key feature to look for if you plan to broadcast and stream via the same media server.
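To give a sense of what timed-text conversion involves, here is a minimal sketch that serializes a list of caption cues as a bare-bones TTML document using Python’s standard library. The cue tuples and the function names are assumptions for illustration; a production converter would also carry styling, positioning, and the parsing of the source format.

```python
import xml.etree.ElementTree as ET

TTML_NS = "http://www.w3.org/ns/ttml"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def to_clock(seconds):
    """Format seconds as an HH:MM:SS.mmm clock-time string."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"


def cues_to_ttml(cues, lang="en"):
    """Serialize (start_s, end_s, text) cues as a minimal TTML document."""
    ET.register_namespace("", TTML_NS)  # emit TTML as the default namespace
    tt = ET.Element(f"{{{TTML_NS}}}tt")
    tt.set(f"{{{XML_NS}}}lang", lang)
    body = ET.SubElement(tt, f"{{{TTML_NS}}}body")
    div = ET.SubElement(body, f"{{{TTML_NS}}}div")
    for start, end, text in cues:
        p = ET.SubElement(div, f"{{{TTML_NS}}}p")
        p.set("begin", to_clock(start))
        p.set("end", to_clock(end))
        p.text = text
    return ET.tostring(tt, encoding="unicode")
```

The same in-memory cue list could feed writers for other timed-text formats, which is essentially the role a media server plays when it converts between caption types.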

DRM and Encryption

Several media servers now integrate with encoders that handle digital rights management (DRM). As DRM moves “down the stack” to reside at the encoder, both DRM and encryption can be applied at the point of ingest, before the source ever leaves the encoder to be sent to the media server. In this way, content can remain encrypted through the ingest and delivery portions of the workflow, being decrypted and decoded only at the client player.

In addition, as we move closer to a common delivery protocol (HTTP) and a Common File Format (CFF, based on the ISO Base Media File Format), there is also a move toward a common encryption scheme, standardized as Common Encryption (CENC). CENC will provide the ability to use one of five common DRM schemes, and various media servers support a subset of these five DRM schemes.

Cloud Versus On-Premise

The year 2013 saw an increase in cloud-based media server options. All the major media server companies offer cloud-centric, transaction-based transcoding and media delivery. In 2014, cloud will become more prominent, but expect significant growth as well in hybrid solutions that use both on-premise and cloud-based workflows to handle intranet and internet delivery, respectively. The advent of software-defined networking (SDN) will continue to blur the lines between what you buy and what you lease on a temporary basis.

Advertising Insertion and Custom Playlists

Several media servers offer “late binding,” the ability to keep the audio and video “tracks” of an “online DVD” or stream separate from one another until the last possible moment. In some instances, this binding occurs at the client player, with the media server keeping track of which audio track (e.g., language) will be paired with a particular video stream. The Common Streaming Format (CSF), based on the ISOBMFF, will make late binding a common feature in media servers in 2014.
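The pairing step in late binding can be sketched in a few lines. The `Track` type, the URIs, and the fallback-language rule below are hypothetical and stand in for whatever bookkeeping a given media server or player actually performs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    kind: str            # "video" or "audio"
    uri: str
    language: str = ""   # only meaningful for audio tracks


def bind_tracks(tracks, preferred_language, fallback_language="en"):
    """Pair the video track with one audio track at request time."""
    video = next(t for t in tracks if t.kind == "video")
    audio_by_lang = {t.language: t for t in tracks if t.kind == "audio"}
    audio = audio_by_lang.get(preferred_language) or audio_by_lang.get(fallback_language)
    if audio is None:
        raise LookupError("no audio track for requested or fallback language")
    return video, audio
```

Because the video and audio tracks stay separate until this call, adding a new language means adding one audio track rather than re-encoding and storing another full copy of the video.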


Media server options are not only maturing, they are expanding. New companies with novel ideas on how to serve content to television-scale audiences are entering the market. Some of the key features to watch for are listed in this article. Given the choices, the only constant in the buying process for media servers is the need for a well-defined workflow against which to compare different media server product and service offerings.

This article appears in the 2014 Streaming Media Sourcebook.
