MACHINE LEARNING AND AI SOLVING OTT CHALLENGES AND IMPROVING VIDEO QUALITY

5th Mar 2018

In a recent article from Streaming Media Global, we saw how artificial intelligence and machine learning are solving OTT challenges and improving video quality in ways that are proving revolutionary.

Some interesting statistics were discussed, released by Cisco in their Visual Networking Index:

  • By 2021, 82% of all internet traffic will be video (up from 73% in 2016).
  • Cisco also predicts that, in 2020, a million devices will be added to the network every hour.

If these predictions are right, then one of the biggest challenges for OTT video streaming will be to deliver the highest quality possible in terms of experience (QoE) and service (QoS).

With viewers abandoning a video if it takes more than two seconds to start, and a further 6% leaving for every additional second of delay*, there isn’t much room for manoeuvre.

Buffering is being addressed by the adoption of Adaptive Bitrate (ABR) streaming, which switches between renditions of different quality to minimise instances of buffering and the issues caused by bandwidth fluctuation. This addresses part of the challenge, but cannot entirely eliminate rebuffering or pixelation on mobile, which leads us to look for other solutions to these and related issues such as:

  • Rewind or playback delays
  • Forwarding and pause issues
  • Playback freezing
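The core ABR decision described above can be sketched in a few lines. This is a hypothetical illustration, not the logic of any particular player: real clients (dash.js, hls.js and the like) use considerably more sophisticated heuristics, and the bitrate ladder and safety factor here are invented for the example.

```python
# Hypothetical sketch of an ABR rendition switch: pick the highest
# rendition whose bitrate fits within a safety margin of the
# measured throughput. Ladder values and safety factor are
# illustrative only.

BITRATE_LADDER_KBPS = [400, 800, 1500, 3000, 6000]  # example renditions

def select_rendition(measured_throughput_kbps, safety_factor=0.8):
    """Return the highest bitrate (kbps) that fits the available bandwidth."""
    budget = measured_throughput_kbps * safety_factor
    viable = [b for b in BITRATE_LADDER_KBPS if b <= budget]
    # Fall back to the lowest rendition when bandwidth is very poor
    return viable[-1] if viable else BITRATE_LADDER_KBPS[0]

print(select_rendition(2500))  # 1500: 2500 * 0.8 = 2000, so 3000 is too high
```

The safety factor is what makes the scheme conservative: switching slightly below measured throughput trades a little quality for fewer stalls when bandwidth fluctuates.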

According to the article, the answers to these issues and challenges lie in Machine Learning and AI.

The Pensieve Neural Network

The Pensieve Neural Network has been developed by the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT. It is an AI system that harnesses Machine Learning to select algorithms according to network conditions. In simple terms, the system predicts potential connectivity problems, and adjusts the resolution of streams to minimise buffering. Although this only promises to minimise, and not eliminate buffering, many industry professionals believe that it is a firm step towards buffer-free viewing.
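To make the trade-off Pensieve is learning more concrete, a reward of roughly this shape is what such a controller optimises: favour high bitrates, but penalise rebuffering and abrupt quality switches. The weights below are hypothetical stand-ins, not the values used by CSAIL.

```python
# Illustrative QoE reward for a learning-based bitrate controller:
# reward bitrate, penalise stalls and jarring quality changes.
# Both penalty weights are assumed for the example.

REBUFFER_PENALTY = 4.3    # assumed cost per second of stalling
SMOOTHNESS_PENALTY = 1.0  # assumed cost per Mbps of quality change

def qoe_reward(bitrate_mbps, rebuffer_s, prev_bitrate_mbps):
    return (bitrate_mbps
            - REBUFFER_PENALTY * rebuffer_s
            - SMOOTHNESS_PENALTY * abs(bitrate_mbps - prev_bitrate_mbps))

steady = qoe_reward(3.0, 0.0, 3.0)   # 3.0: steady playback, no stall
stalled = qoe_reward(6.0, 1.5, 3.0)  # about -3.45: the stall outweighs the jump to 6 Mbps
```

Under a reward like this, a controller learns that holding a lower, stable bitrate often scores better than chasing the highest resolution and risking a stall, which is exactly the behaviour described above.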

In experiments, the Pensieve Neural Network increased QoE by up to 25% and cut rebuffering by up to 30%, so it is already proving a success.

Machine Learning Technology Advancement

The article also touched on the advances in machine learning technology, already being utilised by big players like YouTube and Netflix to encode parameters dynamically, which increases both QoE and QoS, and reduces the bits required to deliver the same level of quality. It has been suggested that machine learning will play a huge role in cost optimisation, cutting down on bandwidth requirements and, more obviously, the need for manual optimisation. YouTube uses neural networks to predict quantisation levels (QL) dynamically, allowing dual-pass encoding to be cut to a single pass, which can deliver further encoding cost savings and reduce video latency.
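The idea behind cutting dual-pass encoding to a single pass is that a trained model replaces the analysis pass: instead of encoding once to measure complexity, the model estimates a quantisation level from cheap content features. The linear heuristic below is a hypothetical stand-in for such a model, with invented weights and feature names.

```python
# Toy stand-in for a learned quantisation-level predictor.
# In the real approach, a neural network would produce this
# estimate; the weights and clamping range here are assumptions.

def predict_qp(motion, texture):
    """Estimate a quantisation parameter from normalised complexity
    scores in [0, 1]. More complex content gets coarser quantisation
    at a fixed bitrate budget."""
    qp = 24 + 10 * motion + 6 * texture
    return max(18, min(45, round(qp)))  # clamp to a plausible QP range

print(predict_qp(0.0, 0.0))  # 24: easy content, fine quantisation
print(predict_qp(1.0, 1.0))  # 40: complex content, coarser quantisation
```

The saving comes from skipping the first encode entirely: the predictor's output seeds rate control directly, so each video is only encoded once.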

Machine learning algorithms are also helping to address the challenges posed by the wide range of connected devices and varying screen sizes in terms of deliverable, perceived quality of content. According to the article, machine learning algorithms can achieve ‘content-aware’ encoding, choosing encoding parameters according to screen size and intended quality, which ultimately leads to lower bandwidth consumption and reduced cost.
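A minimal sketch of that decision logic, assuming invented resolution and bitrate thresholds: bits aren't spent on detail the target screen cannot display.

```python
# Hypothetical 'content-aware' parameter choice keyed on target
# screen size. Thresholds and bitrates are illustrative only; a
# production system would also weigh content complexity per title.

def encoding_params(screen_height_px):
    """Pick an encoding profile suited to the viewer's screen."""
    if screen_height_px <= 480:
        return {"resolution": "854x480", "bitrate_kbps": 1200}
    if screen_height_px <= 720:
        return {"resolution": "1280x720", "bitrate_kbps": 2500}
    return {"resolution": "1920x1080", "bitrate_kbps": 5000}

print(encoding_params(480)["bitrate_kbps"])  # 1200: a phone doesn't need a 1080p stream
```

Serving the 480-line profile to small screens instead of a one-size-fits-all 1080p stream is where the bandwidth and cost savings come from.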

What else can we look to AI and machine learning to provide in the future?

The ability to lip-read and understand closed captioning is a fascinating benefit, and one that is reaching far beyond human capability, according to a recent study carried out by Oxford University’s Computer Science Department. Using an AI system named ‘LipNet’, researchers achieved 93.4% accuracy in word recognition, compared with just 52.3% for a human professional.

Studies have also been carried out by the Google DeepMind project, which tested AI against humans on 200 randomly selected video clips. The AI correctly recognised 46.8% of the words, compared with 12.4% for professional lip-readers. There is also ongoing work in AI and machine learning to detect lip-sync and closed-caption synchronisation problems, tracking the movement of the lips minutely in an attempt to measure video-audio synchronisation.

As we move towards the next generation of AI and machine learning, we can already see new theories and ideas emerging to enhance every area of the sector, from content production to delivery. It is an exciting time in the industry, and one that we cannot afford to ignore if we are to stay current and ahead of our competitors.

Follow The Streaming Company blog for the latest industry news, events and technology advancements.

*According to Professor Ramesh Sitaraman
