Streaming Analytics Demo: Datazoom
Watch the complete presentation from Streaming Media East, VES203: Accessing Real-Time Metrics for Live Streams at Scale, in the Streaming Media Conference Video Portal.
Read the complete transcript of this clip:
Robert Reinhardt: Another offering that I was having a lot of fun playing with yesterday was Datazoom. The CEO was walking around yesterday but I think she had to leave to get to another engagement. Let me just make sure I've got the right tab. Here it is, I reconnected it. So, I'm already in my Dashboard view.
Again, they didn't make me any special account because I'm with Streaming Media or anything. I just followed their signup with automated email activation of this account, and I'm gonna just, again, what's interesting is these terms and concepts are pretty consistent from one analytic system to the next.
I think Datazoom does a great job at visualizing it for you, but they have this concept of Collectors, which, like I already talked about, is the way that you're gonna gather and collect your data. You have Connectors so that we can take the data out of the Collector and pass it somewhere useful. The Connector, though, this is really just defining your data store and the Data Pipe defines how you take your Collector and match it to a Connector, because you could have one Datazoom account manage several different entities. And I'm just gonna go right now into what you would do first, which is set up your Collectors.
Again, they have a really great visual interface here. I could just click Plus to add a Collector here. And again, just sort of just to give you, what I find, again, is super useful with these services is you can start to see what you would have to be looking into when you're building your own service.
So, they've got all the major vendors for players. We've got Fire TV for over-the-top with Amazon's Fire TV. We got THEOplayer, Brightcove, Kaltura. This list is, you know, you've seen all these vendor names here at Streaming Media. If they've been in the space for a while, then they're gonna be here. And so, I've already defined two, the JW config, and there's nothing to do in these configs.
The only thing you do after telling it what player API, notice it's looking for JW Player 8. I don't know if it's version-specific with some of the API calls, but I didn't see a JW Player 7, for example. But this FluxData, that's specific to Datazoom, and gives you some insight into how they're collecting data. They're using a polling interval, essentially. And so, I can poll data, and the default here is super, I think the default was one second.
The way that Datazoom makes its money is by the data transfer. So if you're collecting a lot, like, if I collect data every second from JW Player and it's a two-hour event, that's a lot of data going to Datazoom informing it, and so it's a polling mechanism. I'm fairly certain, I haven't made it through all the API documents by any stretch on Datazoom, but you could customize event handlers so that it's not strictly a polling interval, but I think this is so they can cover their bases well from one Collector to the next.
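To put rough numbers on the polling cost described here, the following is a minimal sketch. It only counts samples per viewer for a given interval; the function name is illustrative, and nothing in it reflects how Datazoom actually meters or bills data transfer.

```python
def samples_per_viewer(event_seconds: int, interval_seconds: int) -> int:
    """Number of polling samples a single player sends over one event."""
    return event_seconds // interval_seconds


TWO_HOURS = 2 * 60 * 60  # a two-hour event is 7200 seconds

one_second = samples_per_viewer(TWO_HOURS, 1)
ten_seconds = samples_per_viewer(TWO_HOURS, 10)
print(one_second, ten_seconds)  # 7200 720
```

Moving from a one-second to a ten-second interval cuts the sample count per viewer by a factor of ten, and the total scales with the number of concurrent viewers.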
You can specify this polling interval. And so, rather than detecting a bitrate change, it's just polling nonstop, so it will see if there's a bitrate change. So, if I go from one megabit to 500 kilobits in an HLS rendition, I would know based on the polling. And so, I set it to be a little less aggressive than one second. I think I had that at 10 seconds. I'm just gonna cancel it. And the same thing for other configurations.
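The polling approach can be sketched roughly as follows. This is not Datazoom's code: it assumes each poll yields a (time, bitrate) pair and simply compares each reading to the previous one. The function name and the sample values are hypothetical.

```python
from typing import Iterable, List, Tuple


def detect_bitrate_changes(
    samples: Iterable[Tuple[int, int]],
) -> List[Tuple[int, int, int]]:
    """Compare each polled bitrate to the previous poll.

    samples: (time_seconds, bitrate_kbps) pairs, one per poll.
    Returns (time_seconds, old_kbps, new_kbps) for every poll whose
    bitrate differs from the poll before it.
    """
    changes = []
    previous = None
    for time_seconds, kbps in samples:
        if previous is not None and kbps != previous:
            changes.append((time_seconds, previous, kbps))
        previous = kbps
    return changes


# Polling every 10 seconds; the rendition drops from 1 Mbps to 500 kbps.
polls = [(0, 1000), (10, 1000), (20, 500), (30, 500)]
print(detect_bitrate_changes(polls))  # [(20, 1000, 500)]
```

The trade-off of a longer interval is visible here: the change is only noticed at the next poll, and a switch that reverts between two polls would not be seen at all.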
It's all about this FluxData Interval, as they call it. And again, that's one way of approaching collecting data is just to use a timed interval to see what's going on with the player. And notice, you get a specific configuration key for this Collector. So, that plays in, again, the beauty, I think, of Datazoom is how easy they've made it for you to use their service.
And again, you get a lot of data free. Like, if you just wanna try this with some smaller live event streams that you're doing in your org, you could probably get some useful stuff going right away. Let me go ahead and pull up the Datazoom-specific example.
But at the end of the day, you probably just wanna know, was the video playing or was it stopping? So anyway, there's the beacon script and that's as simple as a Collector is for Datazoom. And I'm gonna cancel out of this and then show you a Connector. So, a Connector is that data storage that we were talking about. I set it up with just a generic Amazon S3 bucket and I tried out Splunk yesterday. I'll talk about Splunk in just a minute. It's a pretty popular tool. I've heard it talked about with a few colleagues. They do have--again, I don't wanna go off on a tangent on Splunk just yet, but let's take a look at Amazon S3. So, S3 of course is just cloud storage. Simple Storage Service is what S3 stands for. So if I bring up my little rubber ducky here for Cyberduck, I'm gonna go and hit Refresh here.
So, this is that bucket that I configured there. I just need my Secret Access Key and I can see the data that it was storing from yesterday. It hasn't yet made a bucket for today, or actually, did it? It did, it made a bucket for today. Sorry, 508. So, this is what I get out of that. So, what Datazoom is doing is dumping JSON files for the collection intervals that I've set up. So, I could potentially see a ton of JSON documents show up in here, and I think I really have one loaded in TextWrangler.
Let me blow that up a little bit so you can get a sense, and again, like, what I find interesting is you can, if you're trying to, where's my view? View, let's, I can't even remember how I can increase fonts. Here, right here, Show Fonts. Let me just select all and make this big. There we go. Okay. So, this is the JSON data structure that Datazoom is using. Everything that they're tracking is in here. I think, again, they have an API offering, so I could probably go into my account and change this on an API level. They're not exposing these controls yet in their user interface, but a lot of information is gathered.
And so the idea with Datazoom, and I don't know if there's competitors on it, is they get pretty detailed. Like, it tracks where, your ISP that they're connecting from. I could tell this was from the airport. I mean, the airport from the hotel 'cause they're using Wayport as the ISP here at the Hilton.
So, this was a test I did yesterday. They're giving me the lat, the long. They're giving me the page reference. JW Player, notice they're giving me the version string right there, so I didn't have to specify that. All of the relevant metrics are there. That's all I really want you to know when you're looking at that. Again, I encourage you to look at it yourself 'cause it's so easy to use.
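For anyone who wants to inspect those JSON dumps without reading them one by one, here is a minimal sketch. It assumes the files have already been downloaded from the bucket into a local folder, and it makes no assumption about Datazoom's field names: it only tallies whatever top-level keys it finds. The function name and folder path are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_dumps(folder: str) -> Counter:
    """Count how often each top-level key appears across the JSON files in folder."""
    keys: Counter = Counter()
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            payload = json.load(handle)
        # A file may hold one record or a list of records; handle both.
        records = payload if isinstance(payload, list) else [payload]
        for record in records:
            if isinstance(record, dict):
                keys.update(record.keys())
    return keys


# Example usage (hypothetical local folder):
# print(summarize_dumps("datazoom-dumps").most_common())
```

Running this over a day's worth of files gives a quick inventory of which fields are actually populated before building anything on top of them.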
The last step in setting this up though, so that you can start gathering events is a Data Pipe. So, again, what I really like about Datazoom is it serves as a great graphical representation of what you would have to design on your own if you were gonna build it, and I love that they've got this little tool like this so I can add as many Collectors as I want to go to multiple Connectors. And I can make different pipes; I'm not restricted to one. Like, if I had one hls.js player that I just want it to go dump stuff on Amazon and I was building out my own kind of backend to break down this, I could just have that as its own pipeline.
So, you can have as many Data Pipes as you want. Again, in this case, I just want all players to distribute all their data to both of these services as Collectors. So, pretty neat. Again, they're really just serving as a bridge. And you might be like, "Well, what's the benefit of this?" What the benefit is is that you don't have to do, excuse me, what I was showing you earlier in Dreamweaver with my multi example. The example that I had you load up had all three vendors actively gathering analytics, and this is something that I have seen done before, not with stuff that I've set up, but that were, "Oh, let's get Google Analytics." "Oh, now, we're using this other analytics provider."
Those are all calls that have to come out of this page, right? So now I have three beacons of sort. I have got beacons going out to Google Analytics, I've got a beacon going out from Mux because I've got the Mux library, and I've got the beacon for Datazoom on this page. So, that's three calls coming out for every player event faster based on the interval here, and that's cumbersome.
And so, Datazoom's whole offering, I think, came up because they acknowledge that, "Hey, you know what? If you need multiple analytic services, it's best if you just collect that data once and send it to all those services because then you have a single point of entry on the player, and then you can send it out," and that is very nice. And again, you could build a similar system on your own but Datazoom's got it all ready to go for you. So, that's pretty cool.
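The collect-once, send-to-many idea behind a Data Pipe can be sketched in a few lines. This is a rough illustration of the pattern under stated assumptions, not Datazoom's implementation: the `DataPipe` class, the collector names, and the in-memory lists standing in for a storage bucket and a search tool are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Event = Dict[str, object]
Connector = Callable[[Event], None]


@dataclass
class DataPipe:
    """Routes events from named collectors to every attached connector."""

    collectors: List[str]
    connectors: List[Connector] = field(default_factory=list)

    def handle(self, collector: str, event: Event) -> int:
        """Deliver the event to all connectors; return the delivery count."""
        if collector not in self.collectors:
            return 0  # this pipe does not serve that collector
        for send in self.connectors:
            send(event)
        return len(self.connectors)


# Two in-memory lists stand in for the two destinations.
storage_sink: List[Event] = []
search_sink: List[Event] = []

pipe = DataPipe(
    collectors=["jw-player", "hls-js"],
    connectors=[storage_sink.append, search_sink.append],
)
delivered = pipe.handle("jw-player", {"type": "bitrate_change", "kbps": 500})
print(delivered, len(storage_sink), len(search_sink))  # 2 1 1
```

The player makes one call into the pipe, and the fan-out to each destination happens behind that single entry point, which is the point being made about avoiding a separate beacon per vendor on the page.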