Netflix Goes to the Other Side of the Mirror With Bandersnatch
“Bandersnatch” is the first interactive title marketed to adults, and it also culminates a year’s worth of iterative improvements building on the lessons of previous branching titles. The show’s success now justifies additional engineering work to produce a truly seamless experience with new features that include a new UI, decoupled download and playback on the client, and arbitrary keyframe support in our encoding stack.
One of the most important choices for Stefan is whether to accept or refuse the opportunity to work at a new video game startup called Tuckersoft. On the left in Figure 8, the choice point is represented as a graph similar to that in Branch Manager. On the right is the Netflix UI. The choice point appears at the end of an interactive segment that we’ve labeled “Job_Offer” for the purposes of this example, and the segment defines two possible subsequent segments: “Accept” and “Refuse.”
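To make the graph view concrete, here is a minimal sketch of the “Job_Offer” choice point as a data structure. The `Segment` class, the `GRAPH` dictionary, and `next_segment` are hypothetical names chosen for illustration, not Netflix’s actual format; only the three segment names, and the fact that “Accept” is the default branch, come from the example itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Segment:
    """One interactive segment; `choices` lists the segments that may follow."""
    name: str
    choices: List[str] = field(default_factory=list)
    default: Optional[str] = None  # branch taken if the viewer makes no selection


# Hypothetical graph for the job-offer example; the structure is an assumption.
GRAPH: Dict[str, Segment] = {
    "Job_Offer": Segment("Job_Offer", choices=["Accept", "Refuse"], default="Accept"),
    "Accept": Segment("Accept"),
    "Refuse": Segment("Refuse"),
}


def next_segment(current: str, selection: Optional[str] = None) -> Optional[str]:
    """Return the segment that follows `current`, falling back to the default."""
    segment = GRAPH[current]
    if not segment.choices:
        return None  # a leaf in this toy graph
    if selection is None:
        return segment.default
    if selection not in segment.choices:
        raise ValueError(f"{selection!r} is not a choice of {current!r}")
    return selection
```

Calling `next_segment("Job_Offer")` with no selection follows the default branch, while `next_segment("Job_Offer", "Refuse")` follows the viewer’s pick.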
A series of UI states defines the member experience at each choice point. The phases of a choice point are Initialization, Choice Selection, Timeout, and UI Hide (Figure 9). Choice Selection is when you’re actually able to interact with the UI and can see that your countdown has started. After your countdown is over, your choice is locked in and the UI fades away.
Thanks to the tight collaboration between engineering and production, the producers were able to plan all elements around the Netflix UI and to make sure that each shot was set up in such a way that it would integrate seamlessly. We can map these UI states onto the playback timeline itself. In the top right corner of Figure 10, you can see a snippet of the code defining the beginning and the end of each phase. These values are unique to a choice point, so creatives have ultimate control over the pacing of each choice point. One choice may feel rather leisurely, like in the tutorial, while I can say that others in “Bandersnatch” are far more frantic.
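As a rough illustration of mapping UI phases onto the playback timeline, the sketch below looks up which phase is active at a given playback position. The phase names come from the list of UI states; the millisecond values, the `LEISURELY` and `FRANTIC` tables, and the `phase_at` helper are all invented for this example and do not reflect the real per-title data.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# (start time in ms, phase name), in playback order. The numbers are made up;
# each choice point would carry its own values to control pacing.
LEISURELY: List[Tuple[int, str]] = [
    (60_000, "Initialization"),
    (61_000, "Choice Selection"),
    (71_000, "Timeout"),
    (72_000, "UI Hide"),
]
FRANTIC: List[Tuple[int, str]] = [
    (60_000, "Initialization"),
    (60_500, "Choice Selection"),
    (63_500, "Timeout"),
    (64_000, "UI Hide"),
]


def phase_at(phases: List[Tuple[int, str]], position_ms: int) -> Optional[str]:
    """Return the phase active at `position_ms`, or None before the choice point."""
    starts = [start for start, _ in phases]
    index = bisect_right(starts, position_ms) - 1
    return phases[index][1] if index >= 0 else None
```

With these made-up numbers, a viewer five seconds past the start of the choice point is still choosing in the leisurely case, while the frantic choice point has already locked and hidden its UI.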
Zooming out, we can start to see what the layout of our media files really looks like (Figure 11). The default path is also used to construct the layout of the media files themselves. It’s only when users deviate from the default choice that the player seamlessly jumps to another point in the timeline. Otherwise, the player can just keep playing, downloading, and appending bytes like normal. Let’s say that you, like most, choose the most empathetic option for Stefan and he accepts the job. Because the “Accept” branch is also the default, the player simply rolls into the next segment, no problem. But there’s a 50% chance you’ll select the “Refuse” branch.
In order to seamlessly transition to one of two equally probable outcomes, we simply cache both, effectively doubling the video rate and the memory used on the device. Once the choice is locked, media samples from the appropriate segment are appended to the media source buffer and the cache can be purged. This is how the player teams decouple downloading logic from playback logic and content timestamps from playback timestamps to provide a seamless transition from one segment to another.
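The cache-both-then-append idea can be sketched as a toy model. Everything here is an assumption made for illustration: `BranchPrefetcher`, its method names, and the plain Python list standing in for the media source buffer are hypothetical, and a real player would also handle timestamps, memory limits, and network errors.

```python
from typing import Dict, List, Optional


class BranchPrefetcher:
    """Toy model: cache samples for every candidate branch until the choice locks."""

    def __init__(self, candidates: List[str], default: str) -> None:
        if default not in candidates:
            raise ValueError("default must be one of the candidate branches")
        self.default = default
        self.cache: Dict[str, List[bytes]] = {name: [] for name in candidates}
        self.source_buffer: List[bytes] = []  # stand-in for the media source buffer

    def on_download(self, branch: str, sample: bytes) -> None:
        """Downloading only fills the cache; it never touches playback state."""
        self.cache[branch].append(sample)

    def lock_choice(self, selection: Optional[str] = None) -> str:
        """Append the chosen branch's samples to the buffer, then purge the cache."""
        chosen = selection if selection is not None else self.default
        self.source_buffer.extend(self.cache[chosen])
        self.cache.clear()
        return chosen
```

Keeping `on_download` and `lock_choice` separate mirrors the decoupling of downloading from playback: samples for both branches arrive in the cache, and only the locked branch ever reaches the buffer.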
Where IMF Comes In
Before “Bandersnatch,” there was Minecraft: Story Mode. Very early in that series, you’re given a hypothetical option: Would you rather face 100 chicken-sized zombies or 10 zombie-sized chickens? I chose the chicken-sized zombies just because it seems more scalable.
Until “Bandersnatch,” our contract with the adaptive streaming engine deployed onto clients was based on 2-second segments. This essentially defined our encoding GOPs and all of our video encoding recipes for almost the entire 10 years prior. It was difficult to justify a large amount of engineering work before we’d even demonstrated any kind of success around interactive, so our initial source delivery model for Puss, Minecraft, and others pushed the complexity of our encoding pipeline upstream to our fulfillment partners, who would receive all of the media for each interactive segment from the production and prepare it for delivery to Netflix according to a custom spec.
Our initial spec required each interactive segment to begin on a 2-second boundary, ensuring that the first frame of each interactive segment was a keyframe after running it through our encoding pipeline. The keyframe is required for the player to randomly access any particular segment.
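To see what the 2-second rule implies for timeline layout, here is a small sketch that places segments back to back while rounding each one up to the next 2-second boundary. The 24 fps frame rate, the `aligned_starts` helper, and the idea of padding each segment to fill out its last 2-second block are assumptions for illustration; the original spec’s exact padding rules are not described here.

```python
from typing import List


def aligned_starts(durations_frames: List[int], fps: int = 24,
                   gop_seconds: int = 2) -> List[int]:
    """Return each segment's start frame so that every start is GOP-aligned."""
    gop = fps * gop_seconds  # frames per 2-second block (48 at 24 fps)
    starts: List[int] = []
    cursor = 0
    for duration in durations_frames:
        starts.append(cursor)
        # Round the segment's footprint up to a whole number of GOPs.
        cursor += -(-duration // gop) * gop
    return starts
```

For segments of 100, 48, and 10 frames at 24 fps, the starts come out as frames 0, 144, and 192, each a multiple of 48 and therefore a keyframe under a fixed 2-second GOP.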
Since clients need to know where each segment begins and ends, which choice is the default, etc., we also defined a simple XML schema, and the resulting choice map was produced and delivered by those same fulfillment partners. The choice map XML annotates the timeline for easy traversal.
There were multiple disadvantages to this model. Even minor changes to the source media required a full re-delivery almost every time, including re-syncing the XML to match all the new media timelines. The epitome of the “giant chicken” model was episode four of Minecraft: Story Mode, “A Rock and a Hard Place.” For this one episode, we received aligned audio, video, and text assets, each with a duration of nearly 20 hours. This beat our previous non-branching record holder, Slow TV: The Telemark Canal, which is 11 hours of riveting entertainment. Luckily, our chunked encoding system doesn’t get too hung up on large files, though there were some challenges. Ultimately, we succeeded in proving that our users think this interactive thing is pretty cool, so we decided we needed a more scalable model.
Since componentized media delivery is a primary advantage for IMF users like Netflix, it seems a natural fit for this type of application. Prior to “Bandersnatch,” and following the adoption of some of our new encoding features, we updated our source spec to use the IMF marker track to annotate the timeline in-band within the Composition Playlist (CPL) and eliminate the need for a separate choice map altogether. By using the bundled marker track and a big engineering investment to add the ability to set arbitrary keyframes within our encoding stack, we were able to remove the overhead and a lot of complexity from our mastering and delivery specs. This was how we ingested “Bandersnatch,” and it remains our current delivery model.
Scaling the Experience
Our next step has been scaling this approach, focusing on building a framework for developing new storytelling techniques and a platform for delivering a high-quality seamless experience. Many of the processes we employed will need to be streamlined and potentially standardized to make these workflows ubiquitous.
Personally, I’d like to continue to lean on IMF primitives to enable componentized delivery in whatever way makes the most sense for production and postproduction teams, allowing us to further decouple upstream and downstream systems. We’re currently evaluating options around segment-based delivery, for example, potentially enabling additional optimizations and flexibility into the future.
[This article appears in the September 2019 issue of Streaming Media Magazine as "Other Side of the Mirror."]
Images used by permission from Netflix.