Video: How Video AI Improves Content Delivery Efficiency
Limelight VP of Architecture Jason Hofmann discusses how AI impacts content delivery optimization in this clip from Streaming Media East 2018.

Learn more about AI and content delivery at Streaming Media's next event.

Watch the complete video of this panel, AI101: From Content Creation to Delivery: How AI is Impacting Modern Media, in the Streaming Media Conference Video Portal.

Read the complete transcript of this clip:

Jason Hofmann: At Limelight, our use case is more about the delivery of the content because, generally speaking, if we changed our customers' content, they wouldn't be very happy with that. We stay out of that business and optimize the delivery of that content for various things--for quality, and also to ensure efficiency and scale. To keep up with the price compression in the CDN industry, you need to get 20-30% more efficient every year just to keep your head above water. It's an engineering challenge. If you think about it, every piece of content has a slightly different cost profile to deliver. That's why, when you look at CDNs that let you sign up for services self-service, there's always a per-GB fee and a per-request fee: those things have fixed costs, and a tiny object costs a lot more to deliver than a bigger object. If you charged per GB for everything, you'd lose money.
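
To make that economics concrete, here is a minimal sketch of a per-GB-plus-per-request cost model. All of the prices and the per-request overhead below are invented for illustration; they are not Limelight's actual figures.

```python
# Hypothetical illustration of why CDNs bill per GB *and* per request:
# the fixed per-request overhead dominates for tiny objects. All prices
# and overhead figures below are made-up assumptions.

PRICE_PER_GB = 0.04        # assumed delivery price, $/GB
COST_PER_GB = 0.02         # assumed bandwidth-dominated cost, $/GB
COST_PER_REQUEST = 2e-6    # assumed fixed CPU/connection cost, $/request

def margin_per_gb(object_size_bytes: int) -> float:
    """Profit per GB delivered if we only charged the per-GB fee."""
    requests_per_gb = 1e9 / object_size_bytes
    cost = COST_PER_GB + requests_per_gb * COST_PER_REQUEST
    return PRICE_PER_GB - cost

for size in (2_000, 32_000, 1_000_000):   # 2 KB, 32 KB, 1 MB objects
    print(f"{size:>9} B objects: margin {margin_per_gb(size):+.4f} $/GB")
# 2 KB objects need 500,000 requests per GB, so the per-request overhead
# alone ($1.00/GB under these assumptions) swamps the per-GB fee --
# hence the separate request fee.
```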

We needed to model that, and we spent a number of years modeling it by hand, saying, "Well, based on our experiments on a typical server in the lab, a request takes about the same amount of CPU to process as delivering 32 KB worth of data. We'll just use that as a fudge-factor estimate." It turns out that in the real world, every server is different. You could have a server with hyperthreading turned off because it's a legacy CPU that had issues with it. You could have a server in a highly regulated data center in France where you get a noise complaint, so you have to slow the fan speed down, which means turning off turbo on the processors. You could have a very powerful server that, thanks to the software engineering gains you've made over the years, is doing a lot more than it did when you bought it--but it only has dual 10-gig NICs, and it's capable of pushing more than 18 Gbps. Now the NIC is the bottleneck.
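
As a rough illustration of that hand-tuned model, the sketch below folds the 32 KB-per-request fudge factor into a single "effective bytes" load figure. The traffic numbers are hypothetical.

```python
# A sketch of the hand-tuned model Hofmann describes: treat each request
# as costing the same CPU as delivering 32 KB, so workloads can be
# compared in a single "effective bytes" unit. Traffic numbers invented.

REQUEST_CPU_EQUIV_BYTES = 32 * 1024   # the 32 KB fudge factor from the talk

def effective_bytes(bytes_served: int, requests: int) -> int:
    """Approximate CPU cost of a workload in byte-equivalents."""
    return bytes_served + requests * REQUEST_CPU_EQUIV_BYTES

# Two workloads that push the same 10 GB but with very different object sizes:
large_objects = effective_bytes(10 * 1024**3, requests=10_000)      # ~1 MB objects
tiny_objects  = effective_bytes(10 * 1024**3, requests=5_000_000)   # ~2 KB objects

print(f"large-object mix: {large_objects / 1024**3:.1f} GB-equivalents of CPU")
print(f"tiny-object mix:  {tiny_objects  / 1024**3:.1f} GB-equivalents of CPU")
# The tiny-object mix costs ~16x the CPU for the same bytes delivered --
# and, as the talk notes, the real-world constant varies per server.
```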

You can see that the problem space expands. There is no such thing as a typical server anymore. In a company with tens or hundreds of thousands of servers, every server might have a different breaking point and a different sweet spot. If you're trying to target the best possible quality, you don't want to push a server to its breaking point. To figure out every server's sweet spot, what's the old way of doing it? Get every permutation of every kind of server, put it in a perf lab, and load-test it. But then you're missing another element of the equation: at peak hours in London we might be serving mostly Sky Sports and home.bt.com website images, while at peak hours in New York we might be serving a lot more Prime Video than anything else. That's a different traffic mix, and the server is going to behave very differently. One server might hit its breaking point or its peak operating point at 38 Gbps, and in another market it might hit it at 22 Gbps because it's dealing with smaller objects, non-cacheable objects, and more complex rules at the edge.
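
A toy calculation shows why the same hardware can peak at very different throughputs under different traffic mixes. The CPU budget, NIC cap, and cache-miss penalty below are assumed values, not measurements from any real CDN.

```python
# Rough sketch of why "peak Gbps" is not a single number per server:
# with a fixed CPU budget, the request-processing overhead of a
# small-object mix eats capacity before the NIC does. All constants
# here are hypothetical.

NIC_LIMIT_GBPS = 20.0            # e.g., dual 10 GbE
CPU_BYTES_PER_SEC = 5e9          # assumed CPU budget in byte-equivalents/s
REQUEST_CPU_EQUIV_BYTES = 32 * 1024

def peak_gbps(avg_object_bytes: float, cacheable_fraction: float = 1.0) -> float:
    """Throughput at which either CPU or the NIC saturates."""
    # Per byte delivered, CPU also pays the per-request overhead:
    cpu_per_byte = 1 + REQUEST_CPU_EQUIV_BYTES / avg_object_bytes
    # Crude assumption: a fully uncacheable mix costs 2x CPU per byte.
    cpu_per_byte *= 2 - cacheable_fraction
    cpu_limited_gbps = CPU_BYTES_PER_SEC / cpu_per_byte * 8 / 1e9
    return min(cpu_limited_gbps, NIC_LIMIT_GBPS)

print(f"video mix (4 MB segments): {peak_gbps(4e6):.1f} Gbps")    # NIC-bound
print(f"image mix (20 KB objects, 80% cacheable): "
      f"{peak_gbps(20e3, 0.8):.1f} Gbps")                          # CPU-bound
```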

We applied machine learning to figure out every server's sweet spot, not just in the moment but two weeks in advance, because if you make decisions in the moment and reroute traffic to another server, all of a sudden every request to that server is a cache miss and you're disrupted. The idea is to figure things out in advance, because traditional load-balancing techniques--when a server hits a high watermark, drain traffic and put it on another--don't work very well in a caching context. Any time you drain traffic away from server A to server B, you're exposing server B to new content it's never seen before, and you're putting load back on the customer's origin. Ideally, you never let server A go past its sweet spot, and you try to make that decision days or weeks in advance. That's how and when we started using machine learning.
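
The sketch below illustrates the forecast-then-cap idea with a deliberately simple stand-in (a linear trend fit over synthetic per-server telemetry). Limelight's actual machine learning models are not described in the talk.

```python
# Minimal sketch of "decide weeks in advance": fit a model on a server's
# historical telemetry to forecast its sweet spot, then cap the traffic
# planner at that forecast instead of reacting to live watermarks.
# A toy linear-trend fit on synthetic data, not Limelight's actual model.

import numpy as np

# Hypothetical daily telemetry: observed sweet spot (Gbps) drifting as
# the traffic mix shifts toward smaller objects in this market.
days = np.arange(28)
observed_sweet_spot = 38 - 0.3 * days + np.random.default_rng(0).normal(0, 0.8, 28)

# Fit a trend and forecast 14 days ahead.
slope, intercept = np.polyfit(days, observed_sweet_spot, 1)
forecast = slope * (days[-1] + 14) + intercept

# Plan against the forecast with a safety margin, so server A is never
# pushed past its sweet spot and traffic isn't drained reactively (which
# would hand server B a flood of cache misses).
planned_cap_gbps = 0.9 * forecast
print(f"forecast sweet spot in 14 days: {forecast:.1f} Gbps")
print(f"planned traffic cap: {planned_cap_gbps:.1f} Gbps")
```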

Related Articles
Citrix's Josh Gray provides tips on AI model development, and Reality Software's Nadine Krefetz and IBM's David Clevinger speculate on the possibilities of metadata-as-a-service in this clip from Streaming Media East 2018.
Google's Leonidas Kontothanassis explores the vast range of applications for machine learning in the media workflow and supply chain in this clip from his Content Delivery Summit keynote.
Citrix Principal Architect Josh Gray explains how video enables higher-acuity metrics analysis in this clip from Streaming Media East 2018.