How to Jump-Start Your Multi-CDN Strategy and Deliver Every Time
The advantages of pursuing a multi-CDN distribution strategy are obvious—not risking your content delivery on a single point of failure being foremost among them. But what’s the best approach to maintaining the highest-quality delivery, building redundancy and resiliency into your streams, and addressing origin-load capacity? In this article, we’ll explore multi-CDN options, highlight key providers, and identify the right questions to ask and issues to consider as you continue your multi-CDN journey.
Problems That Multi-CDN Delivery Addresses
The big driver behind the migration to multi-CDN is that the single-CDN approach limits you to a single point of failure. Regionally, some CDNs might not perform as well as others, especially when you’re delivering worldwide. Sometimes one CDN has issues that another doesn’t; at other times the issues are systemic. Of course, not every delivery issue is CDN-driven; sometimes issues can be isolated to the last mile, and it’s important to consider that possibility too.
The number-one reason that our customers move to multi-CDN is their desire to maintain performance and availability and to deliver the best experience possible no matter what happens during a broadcast or what they’re trying to deliver. And they find that going multi-CDN helps. It doesn’t solve everything, but it adds robustness that makes their streams more successful.
Cost considerations factor in too. Some CDNs are more cost-effective than others, and shifting some users from one CDN to another might not only enable you to maintain your level of service, but reduce costs as well. The playing field has begun to level out over time, but CDN pricing always depends on the volume and how much you’re using and committing to up front.
Another driver for multi-CDN is security: the ability to move users off one CDN and onto another if its stream-level access security gets compromised. Availability is a big driver, as are capacity and performance. But cost may well prove to be the determining factor if all parties are delivering at an acceptable level. Figure 1 gives a rough estimate of the issues that typically drive a move to multi-CDN, as well as their relative importance.
The complexity of live-streaming delivery is another reason content providers are moving to multi-CDN. Live is significantly more complex than video on demand (VOD), and throwing in advertisements and DRM adds even more complexity.
Content synchronization for multiple-origin streams is critical for redundancy and failover, as well as maintaining resiliency within your streams. Does your live stream have a single origin or multiple origins? How will you synchronize content across those origins? We’ve seen a lot of issues with content that has a problem getting to one origin and then to the other. For all its advantages, multi-CDN can compound those problems, depending on which origins you have serving to which CDNs.
Origin load-capacity is also something you need to consider, especially when you move to multi-CDN. If you’ve been storing your content on your own origin server, the load on that server is going to increase with, say, five different CDNs accessing it instead of just one, unless you’re using a load-balancing service like Fastly as an origin for your other CDNs. Capacity planning is an essential element of a multi-CDN strategy.
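As a rough illustration of why capacity planning matters, the sketch below estimates origin request rates for one live stream. It is a back-of-the-envelope model under loud assumptions that are not from the article: each CDN collapses its edge traffic into one fetch per object, and every rendition produces one new segment and one refreshed manifest per segment interval. The function name and parameters are hypothetical.

```python
def origin_requests_per_second(cdn_count, segment_duration_s, renditions, shielded=False):
    """Estimate origin requests/sec for one live stream (simplified model).

    Assumption: each upstream fetcher pulls one segment and one manifest
    per rendition per segment interval. With a shield in front of the
    origin, only the shield fetches, no matter how many CDNs sit behind it.
    """
    if segment_duration_s <= 0:
        raise ValueError("segment_duration_s must be positive")
    upstream_fetchers = 1 if shielded else cdn_count
    objects_per_interval = renditions * 2  # one segment + one manifest each
    return upstream_fetchers * objects_per_interval / segment_duration_s


single = origin_requests_per_second(cdn_count=1, segment_duration_s=4, renditions=6)
five = origin_requests_per_second(cdn_count=5, segment_duration_s=4, renditions=6)
print(single, five)
```

Real CDNs differ widely in how many points of presence reach back to the origin, so treat the output as a lower bound on the multiplier rather than a forecast.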
At a nuts-and-bolts level, you need to figure out how you will route traffic—when to adjust and when not to adjust. Will adjustments happen up front, midstream, or only when there’s a problem?
Data and Decision Making
To make these decisions wisely, you’ll need data. Whenever possible, you should be pulling QoS and QoE data. You should be getting a big-picture view of what’s performing well and what isn’t, collecting information about the CDN performance from the CDN, as well as from other sources.
One thing data can help you do is differentiate last-mile decisions from decisions that you should be making on your users’ behalf. Should you allow users to manage their own CDN switching individually, based on the big-picture data you’ve collected? Or does it make more sense to group your users by region and switch CDNs for them based on your best guess as to what’s likely to happen during a stream? There are benefits in both approaches. For our clients, we’ve done a lot more CDN switching on an individual-user level than on a macro level. For macro-level adjustments, there are services specifically designed to help you make them intelligently. We’ll look at those later in this article.
With any content delivered over CDNs, we generally have some form of token access control. Even with DRM, we still have customers who insist on some form of token access control. If you have three CDNs and each has a token authentication system, do you have to align them so that they use the same tokens on each?
Some services try to push you to do that, which is unfortunate, because it can potentially reduce security overall. However, if you don’t align the authentication systems, it increases the complexity that you have to manage. When you’re switching CDNs in a way that’s seamless to the user, you have to make sure that you are authenticated on all appropriate CDNs to do so. Otherwise, when you try to switch, your stream will fail.
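One way to keep separate token systems while still switching seamlessly is to issue a token for every CDN before playback starts. The sketch below shows the idea with a generic HMAC scheme. The secrets, the `path|expiry` message layout, and all function names are hypothetical; each real CDN defines its own token format, so this is only a stand-in for those vendor-specific signers.

```python
import hashlib
import hmac
import time

# Hypothetical per-CDN shared secrets; keeping them distinct means a
# leaked secret on one CDN does not compromise the others.
CDN_SECRETS = {"cdn1": b"secret-one", "cdn2": b"secret-two"}


def sign_path(cdn, path, expires_at):
    """Return an HMAC-SHA256 token binding a path and expiry to one CDN's secret."""
    message = f"{path}|{expires_at}".encode()
    return hmac.new(CDN_SECRETS[cdn], message, hashlib.sha256).hexdigest()


def tokens_for_all_cdns(path, ttl_s=300, now=None):
    """Issue one token per CDN up front so a midstream switch is already authenticated."""
    issued_at = int(time.time() if now is None else now)
    expires_at = issued_at + ttl_s
    return {
        cdn: {"expires": expires_at, "token": sign_path(cdn, path, expires_at)}
        for cdn in CDN_SECRETS
    }


def verify(cdn, path, expires_at, token, now=None):
    """Check a token the way a CDN edge might: not expired and signature matches."""
    current = time.time() if now is None else now
    expected = sign_path(cdn, path, expires_at)
    return current < expires_at and hmac.compare_digest(expected, token)
```

The player holds the whole dictionary and attaches the matching token whenever it changes CDN, which trades a little extra bookkeeping for not sharing one secret across providers.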
The Three Key Components of a Multi-CDN Solution
Those are some of the key challenges you’re likely to face when pursuing a multi-CDN strategy. Now we’ll look at some of the available solutions and how they work. The solutions need to address three key areas: decisioning, routing, and data.
The decision engine can take a Domain Name System (DNS)/server-based or client-side approach. The server-based approach deals with all routing at a service layer and then just feeds down to users. This can be done through DNS so that all requests go to one place and you can then route those requests to whichever CDN you want. Alternatively, the client can make those calls based on available data and execute there. Decisions can be fed through an application programming interface (API) service as well. Either way, it’s going to be an action that takes place on the client’s request or on requests that are going upward.
We can route traffic up front, so the simplest multi-CDN routing goes like this: Before the stream begins, a decision is made. Let’s say, “User A, you’re going to CDN 1; User B, you’re going to CDN 1; User C, you’re going to CDN 2.” We can base that decision on region data or on other factors, but either way, we make a call on how we want to distribute our users up front, before they start playback. Once they’re in playback, we maintain them on that CDN in a basic approach. That doesn’t mean we have to stop there. We can go further.
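The up-front decision described above can be sketched as a small assignment function. This is a minimal illustration, assuming a region-to-weights table and a stable hash of the user ID so that a user stays on the same CDN for the whole session. The table contents, CDN names, and function name are all hypothetical.

```python
import hashlib

# Hypothetical split of traffic per region, as integer percentages.
REGION_WEIGHTS = {
    "eu": {"cdn1": 70, "cdn2": 30},
    "us": {"cdn1": 50, "cdn2": 50},
}
DEFAULT_WEIGHTS = {"cdn1": 100}  # used for regions with no explicit rule


def assign_cdn(user_id, region):
    """Pick a CDN before playback starts, based on region weights.

    Hashing the user ID gives a stable bucket, so repeated calls for the
    same user return the same CDN without storing any session state.
    """
    weights = REGION_WEIGHTS.get(region, DEFAULT_WEIGHTS)
    total = sum(weights.values())
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % total
    for cdn, weight in sorted(weights.items()):
        if bucket < weight:
            return cdn
        bucket -= weight
    raise ValueError("weights must be positive integers")


print(assign_cdn("user-a", "eu"), assign_cdn("user-b", "apac"))
```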
We can also route that traffic dynamically. We can hard-switch users and reload the manifest midstream, so if we’re watching content, and it’s having issues or not performing as well in a specific region or user set, we can say, “We want you to switch over to alternative content. Load on a different CDN.” This does not provide a good user experience, because it means, “Stop, re-buffer, and begin playback again.” We do this only in an emergency, when we know hellfire is about to rain down and puppies are crying all over the world.
We prefer to do a seamless switch dynamically at the manifest or the segment level. This means either routing the request from the client if it’s going to a DNS-based solution or actually applying the change at the client level so that we’re telling the client, “Don’t load from CDN 1. Load from CDN 2 on this next request.” When that happens at the segment level, it’s seamless; the user never even knows that they’ve switched CDNs.
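At the client level, a segment-level switch can be as simple as swapping the host on the next request. The sketch below assumes every CDN serves the same paths from the same origin; the hostnames and the `SegmentRouter` class are hypothetical, and a real player would also swap in the token for the new CDN.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical CDN hostnames serving identical paths.
CDN_HOSTS = {"cdn1": "cdn1.example.com", "cdn2": "cdn2.example.com"}


def segment_url_for(cdn, segment_url):
    """Return the same segment URL, pointed at a different CDN host."""
    parts = urlsplit(segment_url)
    return urlunsplit((parts.scheme, CDN_HOSTS[cdn], parts.path, parts.query, parts.fragment))


class SegmentRouter:
    """Tracks the active CDN and rewrites each outgoing segment request."""

    def __init__(self, active):
        self.active = active

    def switch_to(self, cdn):
        if cdn not in CDN_HOSTS:
            raise KeyError(f"unknown CDN: {cdn}")
        self.active = cdn  # takes effect on the next segment request

    def next_url(self, segment_url):
        return segment_url_for(self.active, segment_url)


router = SegmentRouter("cdn1")
print(router.next_url("https://cdn1.example.com/live/seg_41.ts"))
router.switch_to("cdn2")
print(router.next_url("https://cdn1.example.com/live/seg_42.ts"))
```

Because only the host changes between consecutive segments, the buffer keeps filling and playback does not restart.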
You can perform the same switch at the manifest level. If you’re doing dynamic manifest manipulations upstream, you can then rewrite the manifest to point it to an alternative path on the next request. The client will load the rewritten manifest and then make its subsequent segment requests against the new path.
Seamless, dynamic switching at the segment or manifest level works for both live and VOD. For manifest-level switching, it works anytime the manifest is refreshed. For live, that is every segment, but for VOD, that is just up front or on a bitrate/rendition switch.
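A manifest-level switch can be sketched as a rewrite of the segment URIs in an HLS media playlist. This is a simplified illustration, assuming the playlist uses relative segment URIs; it does not touch URIs embedded in tags such as `EXT-X-KEY` or `EXT-X-MAP`, which a production rewriter would also need to handle. The function name and base URL are hypothetical.

```python
from urllib.parse import urljoin


def rewrite_manifest(manifest_text, new_base_url):
    """Point every relative segment URI in an HLS playlist at a new CDN base URL.

    Lines starting with '#' are tags or comments and pass through unchanged;
    every other non-empty line is treated as a URI.
    """
    rewritten = []
    for line in manifest_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            rewritten.append(urljoin(new_base_url, stripped))
        else:
            rewritten.append(line)
    return "\n".join(rewritten) + "\n"


playlist = (
    "#EXTM3U\n"
    "#EXT-X-TARGETDURATION:4\n"
    "#EXTINF:4.0,\n"
    "seg_1.ts\n"
    "#EXTINF:4.0,\n"
    "seg_2.ts\n"
)
print(rewrite_manifest(playlist, "https://cdn2.example.com/live/"))
```

For live, this runs on each manifest refresh, so the switch takes effect within one refresh interval; for VOD it only applies when the player next loads a manifest.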
Automated Multi-CDN Management
Developing a fully integrated multi-CDN solution takes time and resources. If you want to minimize the work required to put a multi-CDN solution in place, automated multi-CDN management is available as well.
One way to simplify the process is to integrate a system that works with what you have today, such as a DNS routing and load-balancing solution based on an existing origin. With this type of system, you just put it upstream and, at most, identify where you’re loading all your content from. In theory, the system then determines how to handle the rest.
You can adopt such solutions today. Their providers tout them as largely hands-off: “You don’t have to do anything to turn this on. You can just pay us and configure it, and you’re set.” It’s not always quite this easy, but it can definitely reduce the effort you have to put in, especially on the client side.
Taking the DNS/server-based load-balancing approach means working with a service such as Cedexis, which was purchased by Citrix in 2018. Essentially, you sign an agreement and turn over the keys, and they set up a DNS so that when your users say, “Get me this content from my domain,” it’s going to route through them. Then they’re going to make a DNS selection and return content from whichever CDN they think is the right one for your users. This approach can get more feature-rich and complicated from there.
The idea with DNS solutions is that you can have automated rules determine which CDN traffic switches to. Distribution can be static, failover-based, basic round-robin, weighted round-robin, geo-based, or performance-based, which means it’s driven by QoS and QoE data. It can even have third-party API integration at some points, as well as having some form of client-side metric reporting. In a perfect world, you’re getting data on both the client and the individual user, plus aggregated data and server-based or network-based monitoring and calculations.
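Three of the distribution modes listed above can be sketched in a few lines each. These are illustrative policies only, not any vendor's implementation; the function names, CDN names, and the use of throughput as the performance signal are assumptions.

```python
import itertools


def weighted_round_robin(weights):
    """Return an endless iterator of CDN names, repeated in proportion to integer weights."""
    cycle = [cdn for cdn, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(cycle)


def failover(priority, healthy):
    """Return the first CDN in priority order that is currently healthy, else None."""
    for cdn in priority:
        if cdn in healthy:
            return cdn
    return None


def performance_based(throughput_kbps, healthy):
    """Return the healthy CDN with the highest measured throughput, else None."""
    candidates = {cdn: kbps for cdn, kbps in throughput_kbps.items() if cdn in healthy}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


picker = weighted_round_robin({"cdn1": 2, "cdn2": 1})
print([next(picker) for _ in range(6)])
print(failover(["cdn1", "cdn2"], healthy={"cdn2"}))
print(performance_based({"cdn1": 3000, "cdn2": 4500}, healthy={"cdn1", "cdn2"}))
```

A DNS-based service would apply rules like these when answering each lookup; a client-side integration would apply them when choosing where to send the next request.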
To get performance data from the clients, these services often use some sort of system that tracks the content you load or, in many cases, loads specific test data chunks (often referred to as node checks) at certain intervals to estimate the performance and throughput of each CDN. Unfortunately, some CDNs can try to optimize/prioritize the delivery of these “tester” assets and provide false results in their favor. The best solution is always a measurement of the actual streaming segments, not arbitrary test assets.
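Measuring the actual streaming segments can be sketched as follows: the player records the size and download time of each real segment and keeps a smoothed throughput estimate per CDN. The class name, the choice of an exponentially weighted moving average, and the default smoothing factor are assumptions made for illustration.

```python
class ThroughputTracker:
    """Keeps a smoothed throughput estimate per CDN from real segment downloads."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha  # weight given to the newest sample
        self.estimates_kbps = {}

    def record_segment(self, cdn, size_bytes, download_s):
        """Fold one segment download into the CDN's estimate and return it."""
        if download_s <= 0:
            raise ValueError("download_s must be positive")
        sample_kbps = (size_bytes * 8 / 1000) / download_s
        previous = self.estimates_kbps.get(cdn)
        if previous is None:
            estimate = sample_kbps
        else:
            estimate = self.alpha * sample_kbps + (1 - self.alpha) * previous
        self.estimates_kbps[cdn] = estimate
        return estimate


tracker = ThroughputTracker(alpha=0.5)
tracker.record_segment("cdn1", size_bytes=1_000_000, download_s=2.0)
print(tracker.record_segment("cdn1", size_bytes=500_000, download_s=2.0))
```

Because the samples come from the segments viewers actually watch, a CDN cannot improve its score by fast-tracking a separate test asset.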
The Dynamic Manifest/Segment Approach
Solutions such as Streamroot are also available to help with the dynamic manifest/segment approach described earlier. This type of solution can run either on the server or be server/client-based. It can be either a centralized or distributed system with automated rules. It can also be API-driven from an operational or analytics side.
Some of the customers we’ve worked with on this type of solution want to control it themselves. They want to have their operational staff examine the data—via Conviva or whatever analytics they have—look at their own user experiences, and then make the call. They say, “I want to be able to push this button or design this system with specific input datapoints and drive users to whatever we want.”
When you look at these types of integration, the question becomes, is that going to work with the provider, or do I have to build something? As with DNS-based multi-CDN, distribution can be static, round-robin, weighted round-robin, geo-based, or performance-based. But the idea with the client-side approach is that distribution decisioning happens inside a service that’s modifying the actual content the client’s going to be receiving, versus just routing what it’s going to receive.
In terms of metrics on the client side, we typically look at bandwidth/throughput, watch for fluctuations, and check for alternatives, whether through side loading or polling, tag-based checking, or failover. Failover is one of the easiest and most direct routes for integrating multi-CDN into your solutions: If you have a problem, you can try a different CDN. The goal is to do so seamlessly without affecting the user experience. That works only if the problem is not on your origin. Otherwise, you’re failing over to everything else that can also have that same problem.
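Client-side failover can be sketched as a loop over CDN base URLs that stops at the first success. The `fetch` callable is injected so the example runs without a network; in practice it might wrap something like `urllib.request.urlopen(url, timeout=...).read()`. The function name and URLs are hypothetical.

```python
def fetch_with_failover(path, cdn_base_urls, fetch):
    """Try each CDN in order and return (base_url, body) from the first success.

    `fetch` takes a full URL and returns the body, raising OSError
    (which covers connection and URL errors) when a CDN fails.
    """
    errors = []
    for base in cdn_base_urls:
        try:
            return base, fetch(base + path)
        except OSError as exc:
            errors.append((base, repr(exc)))
    raise RuntimeError(f"all CDNs failed for {path}: {errors}")


def fake_fetch(url):
    # Stand-in for a real HTTP call: the first CDN is simulated as down.
    if url.startswith("https://cdn1.example.com"):
        raise ConnectionError("simulated outage")
    return b"segment-bytes"


print(fetch_with_failover("/live/seg_42.ts", ["https://cdn1.example.com", "https://cdn2.example.com"], fake_fetch))
```

If every CDN fails on the same path, the error list is a strong hint that the problem sits at the origin rather than at any one CDN.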
On the service side, most of the time, you’ll work with a provider such as Conviva or Datazoom. Such a provider can pull and aggregate your data to assess the health of the content you’re distributing at that time.
You can also use node-check system aggregation. A node system is a distributed system that checks individual files or a set of files to give you some data that you can then aggregate. The Citrix/Cedexis system relies heavily on having different node systems all over the world. When you implement this system, you put it into your player so that it can do node checks and report back information on how the stream is performing for users in a specific region and so forth.
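Aggregating node-check reports can be sketched as grouping by region and CDN, then summarizing each group. The report fields, the use of the median as the summary, and the function names are assumptions for illustration and do not describe how the Citrix/Cedexis system works internally.

```python
from collections import defaultdict
from statistics import median


def aggregate_node_checks(reports):
    """Group player reports by (region, cdn) and return the median throughput of each group."""
    grouped = defaultdict(list)
    for report in reports:
        grouped[(report["region"], report["cdn"])].append(report["throughput_kbps"])
    return {key: median(values) for key, values in grouped.items()}


def best_cdn_per_region(aggregated):
    """Return, for each region, the CDN with the highest aggregated throughput."""
    best = {}
    for (region, cdn), value in aggregated.items():
        if region not in best or value > best[region][1]:
            best[region] = (cdn, value)
    return {region: cdn for region, (cdn, _) in best.items()}


reports = [
    {"region": "eu", "cdn": "cdn1", "throughput_kbps": 4000},
    {"region": "eu", "cdn": "cdn1", "throughput_kbps": 3000},
    {"region": "eu", "cdn": "cdn2", "throughput_kbps": 5000},
    {"region": "us", "cdn": "cdn1", "throughput_kbps": 6000},
]
print(best_cdn_per_region(aggregate_node_checks(reports)))
```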