Streaming Media

 

Microsoft Debuts AI Cloud Service for Video at Build Conference
AI provides a simple and fast way to harvest a variety of metadata from video libraries. With this addition to Microsoft's Cognitive Services, developers can try it out for free.

Microsoft is making artificial intelligence (AI) available for free to streaming video developers. Now it wants to see what they'll do with it.

At its Build conference in Seattle, Washington, Microsoft today announced Video Indexer, a cloud service now part of its Cognitive Services lineup. To give a little background, the company's Artificial Intelligence and Research Group was formed in September 2016 as a way to democratize AI, making it available to all developers. It creates tools and services that can be integrated into other code via APIs or SDKs to add AI functionality.

The group's Cognitive Services toolkit debuted a year-and-a-half ago with 14 machine learning services. That number grew to 29 last year. Today, Microsoft introduced four new services, one of which speeds video and audio metadata creation.

Video Indexer is available as a preview download for free testing. Microsoft wants developers to give it a try so it can learn from their experiences and refine the service.

With Video Indexer, developers can harvest a variety of useful metadata from files with no human interaction needed. The service can identify faces, transcribe spoken audio, detect objects within a video, and detect emotions. With that information, publishers can improve discoverability or improve monetization by serving targeted ads that better match a video's contents.
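To make the idea concrete, here is a minimal sketch of how a publisher might collapse the kinds of insights the article describes (faces, transcript, emotions) into flat metadata for search or ad targeting. The JSON shape and field names below are hypothetical illustrations, not Video Indexer's actual response schema.

```python
import json

# Hypothetical insights payload -- an assumption for illustration,
# NOT the actual Video Indexer response format.
SAMPLE_INSIGHTS = json.loads("""
{
  "faces": [{"name": "Speaker 1", "appearances": 3}],
  "transcript": [
    {"start": "0:00:01", "text": "Welcome to the keynote."},
    {"start": "0:00:05", "text": "Today we announce a new service."}
  ],
  "emotions": [{"type": "joy", "confidence": 0.72}]
}
""")

def summarize_insights(insights):
    """Collapse a (hypothetical) insights payload into flat metadata
    a publisher could feed into search indexing or ad targeting."""
    return {
        "people": [f["name"] for f in insights.get("faces", [])],
        "full_text": " ".join(s["text"] for s in insights.get("transcript", [])),
        "dominant_emotion": max(
            insights.get("emotions", []),
            key=lambda e: e["confidence"],
            default={"type": "unknown"},
        )["type"],
    }

summary = summarize_insights(SAMPLE_INSIGHTS)
print(summary["people"])            # -> ['Speaker 1']
print(summary["dominant_emotion"])  # -> joy
```

A real integration would fetch this payload from the cloud service after upload; the post-processing step shown here is where discoverability and ad-matching metadata get derived.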

During the preview phase, Video Indexer is free, but developers are limited to uploading 10 hours of video per day and 40 hours total. They can upload a maximum of 20 files, with each one no larger than 4GB.
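The preview quotas above can be checked client-side before attempting an upload. This helper is purely illustrative; the constants restate the article's limits, but the function itself is hypothetical and not part of any Microsoft SDK.

```python
# Preview-phase quotas as described in the article.
MAX_HOURS_PER_DAY = 10   # hours of video per day
MAX_HOURS_TOTAL = 40     # hours of video overall
MAX_FILES = 20           # total file count
MAX_FILE_GB = 4          # per-file size cap

def can_upload(file_sizes_gb, hours_today, hours_total, files_uploaded):
    """Return (ok, reason) for a prospective batch of uploads.

    hours_today / hours_total are the running totals *after* the
    prospective batch; files_uploaded is the count *before* it.
    """
    if files_uploaded + len(file_sizes_gb) > MAX_FILES:
        return False, "file count would exceed 20"
    if any(size > MAX_FILE_GB for size in file_sizes_gb):
        return False, "a file exceeds 4 GB"
    if hours_today > MAX_HOURS_PER_DAY:
        return False, "daily 10-hour quota exceeded"
    if hours_total > MAX_HOURS_TOTAL:
        return False, "40-hour total quota exceeded"
    return True, "ok"

print(can_upload([3.5], hours_today=2, hours_total=12, files_uploaded=5))
# -> (True, 'ok')
```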

Video Indexer is fast, processing a 45-minute video in about 5 minutes. It achieves that by breaking videos into sections and using AI to pull data out of each one. It can identify which speaker is talking at any time, and index on-screen text. It can translate text (it currently supports 9 languages) and monitor for explicit audio or visual content. It's also able to detect scene changes and extract key frames. The service is only for saved video, not live.

As this is still a work in progress, some tasks have higher success ratings than others. Face detection is highly reliable, while emotion detection has roughly a 60 percent success rate. The process is designed to be fully automated, but even if companies add spot checking by a human, getting the results will take far less time than if all the work was done by hand.

Roughly 8,000 people work in Microsoft's AI and research group, with 5,000 of those working strictly on AI. Around 150 of that group work on Cognitive Services. This group of engineers and researchers turns AI research into products. For video, they've reached the point where they can share their work with a larger audience.

Expect the preview period to last between six months and one year, says Irving Kwong, group program manager for Artificial Intelligence and Research Marketing at Microsoft. The company will work closely with customers to monitor performance. Microsoft published a blog post with more information and a link to assets.

An illustration of how Video Indexer extracts metadata
