Video: How IBM is Using Video AI

IBM Watson Media's David Clevinger discusses how media entities are currently using video AI in this clip from Streaming Media East 2018.

Watch the complete video of this panel, AI101: From Content Creation to Delivery: How AI is Impacting Modern Media, in the Streaming Media Conference Video Portal.

Read the complete transcript of this clip:

David Clevinger: The typical use case that we've been seeing is media entities that have large back catalogs of content that was originally created when they didn't have complex metadata tool sets, didn't necessarily have the right people applying metadata, didn't think of all the use cases on the output side. Maybe it's historical content.

A very concrete example is work that we've done for the US Open. We actually took hundreds of thousands of video clips and photos and news articles and vocabulary terms and proper names and fed it to Watson and helped Watson to understand what tennis was about so that it could do things like, when you heard the word Ashe, it was capital-A-s-h-e, Arthur Ashe, as opposed to lowercase-a-s-h. So there was a lot of training around that. The output then became our ability to create clips based on what was happening with an event, but also to describe historical video as well.
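Editor's illustration: the "Ashe" versus "ash" point can be shown with a toy domain-vocabulary pass over speech-to-text output. This is not how Watson works; the transcript describes training on large amounts of tennis content, while the sketch below is a simple rule-based stand-in. The lexicon, the context cue words, and the function name are all hypothetical.

```python
import re

# Hypothetical domain lexicon: lowercase transcript token -> proper name.
TENNIS_LEXICON = {"ash": "Ashe"}

# Hypothetical cue words suggesting the surrounding text is about tennis.
CONTEXT_CUES = {"arthur", "stadium", "tennis", "open", "court"}


def apply_domain_vocabulary(text, lexicon=TENNIS_LEXICON, cues=CONTEXT_CUES, window=3):
    """Replace lexicon terms with proper names when a cue word is nearby."""
    tokens = text.split()
    result = []
    for i, token in enumerate(tokens):
        core = token.strip(".,!?").lower()
        # Words within `window` positions on either side of the token.
        nearby = {
            t.strip(".,!?").lower()
            for t in tokens[max(0, i - window): i + window + 1]
        }
        if core in lexicon and nearby & cues:
            # Swap only the word itself, keeping any trailing punctuation.
            token = re.sub(re.escape(core), lexicon[core], token, flags=re.IGNORECASE)
        result.append(token)
    return " ".join(result)


print(apply_domain_vocabulary("the match moves to arthur ash stadium tonight"))
print(apply_domain_vocabulary("sweep the ash from the fireplace"))
```

In the first sentence a cue word sits next to "ash", so it is rewritten as the proper name; in the second there is no tennis context, so the text is left alone.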

That's critical for companies with large media back catalogs who then optimize that moving forward. You can apply it to live, of course, but that's a typical use case that we see.

It's a recursive learning system. So we took a cross-section of a set of video assets, described it to Watson, said, “This is what's going on. This is who this player is, this is what is being said.” We were able to turn it loose really on other unstructured assets, have it say what it thought it was finding, and then we were able to correct it.

So we were basically able to train it up to understand tennis specifically. And then the output was, we could then turn it loose on a bunch of different kinds of outputs for the client. That was the idea.

Nadine Krefetz: And the outputs are?

David Clevinger: Closed captioning, video clips, excitement scoring. I know you've got a video here that talks about the Masters and some work we've done there, but we were able to do things like listen for crowd noise and say, "This must be really exciting, because the crowd is making a lot of noise at this moment." So we were able to turn that into an excitement score. But we wouldn't be able to do that if we didn't help the algorithm understand what it was looking at and how it should be thinking about that body of work.
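Editor's illustration: the crowd-noise idea can be sketched as a loudness score per audio window. This is a minimal sketch under stated assumptions, not IBM's method: it assumes the audio is already available as a list of samples between -1.0 and 1.0, uses root-mean-square loudness as a proxy for crowd noise, and scales each window against the loudest one in the clip. The function names and the sample data are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square loudness of a list of audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def excitement_scores(samples, window_size):
    """Return one score in [0, 1] per window, relative to the loudest window."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    levels = [
        rms(samples[i:i + window_size])
        for i in range(0, len(samples), window_size)
    ]
    peak = max(levels, default=0.0)
    if peak == 0.0:
        # Silent clip: nothing to rank.
        return [0.0 for _ in levels]
    return [level / peak for level in levels]


# Made-up audio: a quiet stretch, a loud crowd reaction, then quiet again.
quiet = [0.01, -0.01] * 4
loud = [0.8, -0.8] * 4
scores = excitement_scores(quiet + loud + quiet, window_size=8)
print(scores)
```

The middle window scores highest because it is the loudest. As the speaker notes, a raw score like this only becomes useful once the system also understands what it is looking at.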

Nadine Krefetz: So you were training the algorithm to get smarter and smarter, and then you let it go?

David Clevinger: Exactly. And then we just turned it loose and let it go. And that's the idea: to get it to the point where you can just turn it loose and let it run and then move on to the next one.

