Video: How Reinforcement Learning Enables Personalized Viewing Experiences

Learn more about personalization at Streaming Media's next event.

Read the complete transcript of this clip:

Rafah Hosn: Here's another solution to consider. This is a new paradigm in machine learning called reinforcement learning. And here we use real, end-user feedback to train our model. So, how does that work? Let's take our news example again.

We have a user. They come to their favorite news website. There's an online learner in the background. Whenever a user comes in, that user has context. That means they have some feature set: the geolocation they are in, or the type of device they're using. We call this the context X. Then, inside that red box, there is a policy, a model, that's choosing the best list of news stories for this user based on that context X. We call this an action.

Now the key in reinforcement learning is that we don't stop there. The learner proposes an action, and waits for the user feedback. That's why it's called reinforcement learning, because every time the user clicks, that's a positive reinforcement feedback for the online learner. It's exactly like teaching a puppy a trick. Every time the puppy does something right, you give him a treat. That's positive feedback. Every time the puppy does something wrong, you say, "Bad puppy." It doesn't learn from that. It's exactly the same principle. That's kind of the paradigm that we use for personalization.
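The loop described above (context in, action out, click feedback back into the learner) can be sketched in a few lines of Python. This is only an illustrative sketch, not the system discussed in the talk: the `OnlineLearner` class, its method names, and the choice to estimate a simple click rate per (context, action) pair are all assumptions made for the example. The policy here is purely greedy and does no exploration.

```python
from collections import defaultdict


class OnlineLearner:
    """Hypothetical learner: tracks click rate per (context, action) pair."""

    def __init__(self, actions):
        self.actions = list(actions)
        self.shows = defaultdict(int)   # times an action was shown in a context
        self.clicks = defaultdict(int)  # times it was clicked in that context

    def estimate(self, context, action):
        # Estimated click rate; 0.0 if this pair has never been shown.
        shown = self.shows[(context, action)]
        return self.clicks[(context, action)] / shown if shown else 0.0

    def choose(self, context):
        # The "policy": pick the action with the highest estimated click rate.
        # Ties go to the action listed first.
        return max(self.actions, key=lambda a: self.estimate(context, a))

    def learn(self, context, action, clicked):
        # The reinforcement step: a click counts as positive feedback.
        self.shows[(context, action)] += 1
        if clicked:
            self.clicks[(context, action)] += 1


# Context X as a hashable tuple, e.g. (geolocation, device type).
learner = OnlineLearner(["space", "sports", "politics"])
context = ("US", "mobile")
action = learner.choose(context)
learner.learn(context, action, clicked=True)
```

A real deployment would use a richer model than a lookup table, but the propose, wait, and update cycle has the same shape.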

Now, the key in reinforcement learning is a concept called exploration. For the gentleman that asked about the cats and dogs, at least in the type of reinforcement learning we do, suppose that you love space. So most of the time, 80% of the time, we're going to choose space articles for you because that's what we've learned that you like. But at some random .2%, .1%, 2%, we're going to choose a different article. So, we'll show you cats and at some configurable number, we say, "You know what? We're going to explore this space and see if this gentleman actually likes a little bit of dogs." So, we'll show the dog. Then we'll observe your reaction to the dog. If you give us a positive feedback, we say, "Oh, well. Okay. Maybe it's not just cats. Maybe he does like dogs a little bit."

This exploration is very, very powerful. This is not completely random exploration. It's exploration over the set of your feasible actions. So in the context of news, your editorial comes with 12 lists of stories, trending stories that should be shown on the page. So, 80% of the time we will show the news that we think you like, and 20% of the time we will randomize over this list of 12 articles and propose a different type of article and see if the user reacts positively to it. That's our positive reinforcement feedback for this algorithm.
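The 80/20 split described here resembles what is commonly called an epsilon-greedy strategy. The sketch below assumes that interpretation; the function name `epsilon_greedy`, its parameters, and the placeholder story names are hypothetical. Note that in this simple version the random draw can also land on the best-known action, so the favorite is shown slightly more than 80% of the time.

```python
import random


def epsilon_greedy(best_action, feasible_actions, epsilon=0.2, rng=random):
    """Exploit the best-known action most of the time; otherwise explore.

    With probability `epsilon`, return a uniformly random action drawn only
    from the feasible set (e.g. the editorially approved stories).
    """
    if rng.random() < epsilon:
        return rng.choice(feasible_actions)  # explore
    return best_action                       # exploit


# Twelve editorially approved stories (placeholder names), favorite is "space".
stories = ["space"] + [f"story_{i}" for i in range(1, 12)]
shown = epsilon_greedy("space", stories, epsilon=0.2)
```

Keeping `epsilon` as a parameter matches the "configurable number" mentioned in the talk: raising it explores more, lowering it sticks closer to what the model already believes.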

It turns out that exploration is so powerful because it allows you to now label your data set automatically. You don't need to go and spend money labeling your data set because every positive reinforcement is a label. And any time you're exploring, you're actually increasing your data set. So, this gives you a very rich data set that you can learn from.
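To make the "every reinforcement is a label" idea concrete, here is a minimal sketch of how logged interactions could be turned into labeled training examples. The `LoggedInteraction` record and its fields are assumptions for illustration; the talk does not describe the actual log format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoggedInteraction:
    """Hypothetical log record for one page view."""
    context: tuple   # e.g. (geolocation, device type)
    action: str      # the story that was shown
    explored: bool   # True if the story was chosen by random exploration
    clicked: bool    # the user's feedback


def to_labeled_examples(log):
    """Turn each interaction into (features, label); label is 1 for a click."""
    return [((*entry.context, entry.action), int(entry.clicked)) for entry in log]


log = [
    LoggedInteraction(("US", "mobile"), "space", explored=False, clicked=True),
    LoggedInteraction(("US", "mobile"), "dogs", explored=True, clicked=False),
]
examples = to_labeled_examples(log)
```

No manual labeling step appears anywhere: the user's click or non-click supplies the label, and explored actions add examples the greedy choice alone would never have produced.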

