IBM is applying Watson cognitive capabilities to its various Cloud Video products, creating a measure of automation in some of the more complex areas of digital video production and broadcasting such as audience analytics of both on-demand and live events, video scene detection, and content recommendation.
To do so, IBM is integrating its new Media Insights Platform with IBM Cloud Video’s existing Catalog and Subscriber Manager and Logistics Manager products. The platform, part of IBM's Media and Entertainment solution, gives IP video customers deeper insight into audience viewing habits, such as what content they’re watching, which devices they’re watching on, and other behaviors and preferences.
Video scene detection could help address a number of metadata challenges. Watson combines speech-to-text with image-recognition technology to detect semantic and visual patterns, identifying when a movie, TV show, video lecture or other type of content changes topics or scenes. It can then automatically segment the video and categorize each segment, removing the need for a human to handle this process.
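To make the segmentation idea concrete, here is a deliberately simplified sketch: it splits a transcript into "scenes" by starting a new segment whenever word overlap (Jaccard similarity) between consecutive sentences drops below a threshold. This is a toy illustration of topic-change detection on transcribed speech, not IBM Watson's actual pipeline; the function names and threshold are hypothetical.

```python
# Toy sketch: segment a transcript into topical "scenes" by comparing
# word overlap between consecutive sentences. Illustrative only -- not
# Watson's actual scene-detection method.

def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two sentences."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def segment_transcript(sentences, threshold=0.1):
    """Start a new segment when similarity to the previous sentence
    drops below the threshold -- a crude proxy for a topic change."""
    segments = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if jaccard(prev, cur) < threshold:
            segments.append([cur])   # topic shifted: open a new scene
        else:
            segments[-1].append(cur)  # still on the same topic
    return segments

commentary = [
    "the match point was saved by the server",
    "the server held serve to win the match",
    "our sponsor sells running shoes online",
]
# The third sentence shares no words with the second, so it opens a
# new segment: two scenes in total.
print(segment_transcript(commentary))
```

A production system would of course fuse visual cues (shot boundaries, on-screen patterns) with the transcript rather than rely on word overlap alone, which is the multimodal approach the article describes.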
“We tend to think of Watson slightly differently from analytics,” said Gregor McElvogue, Director of Video Offerings, IBM Cloud. “Analytics is about taking the data … and (finding) actionable data … measuring interactions with the content, how streams are being delivered, etc. That’s analytics in its pure sense. Watson is sort of a leap forward from just pure analytics.”
McElvogue said that IBM applies three terms to Watson using the acronym “URL” – the system understands the context in which data is being produced; is able to “reason” by seeing patterns in that data; and then “learns” or infers certain conclusions from that data. “Watson is a self-learning system. Not only can it context(ualize) the data, not only can it start reasoning based on the data it’s getting, it can then learn from the data. Effectively what we as human beings do.”
IBM Cloud Video has been working on this detection technology for several months now. In August, the company put out a “cognitive movie trailer” for Morgan, an upcoming 20th Century Fox movie. “Our team was faced with the challenge of not only teaching a system to understand, ‘what is scary,’ but then to create a trailer that would be considered ‘frightening and suspenseful’ by a majority of viewers,” said John R. Smith, IBM Fellow and manager of multimedia and vision for the tech giant, in a blog post.
At IBC 2016, McElvogue described IBM’s work with the US Open to convert tennis commentary to text, with Watson recognizing and contextualizing specific tennis terms, player names and other topics around the event.
“We had to teach Watson to understand tennis. Not only the language about it, but the language around the (sport). For example the term ‘love’ (in tennis means something entirely different),” he said. Using speech to text technology, “when it’s taking in the speech from the commentators … it’s expecting to hear something” that it understands in the context of a tennis event.
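The "love" example above can be sketched as a small domain-aware lookup: the same transcribed word resolves to different meanings depending on the active context, loosely analogous to priming a speech-to-text system with what to "expect" in a tennis broadcast. The glossary, function name, and domain labels here are hypothetical illustrations, not part of any Watson API.

```python
# Toy sketch of domain-aware term resolution. Hypothetical names --
# this is not Watson's implementation, just the idea from the quote:
# a term like "love" means something different inside a tennis context.

TENNIS_GLOSSARY = {
    "love": "a score of zero",     # not affection
    "ace": "an unreturned serve",  # not a playing card
    "deuce": "a 40-40 tie",
}

def interpret(term: str, domain: str = "general") -> str:
    """Resolve a transcribed term against the active domain glossary,
    falling back to the literal word outside that domain."""
    if domain == "tennis":
        return TENNIS_GLOSSARY.get(term.lower(), term)
    return term

print(interpret("love", domain="tennis"))  # "a score of zero"
print(interpret("love"))                   # "love" (literal meaning)
```

A real system would bias the speech-recognition language model itself toward domain vocabulary rather than post-process transcripts, but the lookup captures the contextualization the quote describes.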
Of course, IBM has had Watson technology for quite a while now, but hasn’t directly promoted it with its newer Cloud Video unit. That changed with the acquisitions of UStream in January and Clearleap in December.
McElvogue said that marrying Watson’s capability to cloud video delivery is something that IBM would never have been able to do independently.
“What we got with the acquisitions was both the ability to talk with customers about how you capture this content … and how you take that content and put it out in the market in a very attractive way. A lot of what UStream does is capturing live content … how do you take that live content and put it out to the market in an attractive way?” McElvogue said. “It’s given me a much richer conversation to have with them.”