Engineering Blog

Ooyala Discovery Leveraging Five Year Investment in Big Data

by Ooyalan 12-06-2012 09:35 AM - edited 03-09-2017 10:33 AM

Written by Zhichen Xu
Systems and Architecture

With a mission to personalize every screen, we have released the Ooyala Discovery product. At the core of this release is a set of recommendation algorithms. Recommending video content poses some significant technical challenges. Unlike for text-based content, video keyword generation is manual, so the quantity and quality of keywords can vary widely across providers; techniques based on keywords alone are not sufficient. In the online world, content can quickly become popular and then just as quickly cool down, so the ability to capture user behavior data quickly and apply it in a timely manner is crucial. In addition, the engagement data for videos is much more granular than for typical text-based Web pages. Users interact with videos in different ways: they pause, fast forward, replay, and drop off at different points in a video. Hence, recommendations should be based on user engagement as opposed to simple clicks. Further, people watch videos on different devices and share computers and TVs, which complicates the problem.


Ooyala Discovery Overview

Since we launched Ooyala Discovery, we have significantly improved engagement time for our clients across all forms of content. We offer three types of video recommendation services: trending videos, related videos, and personalized recommendations.

  • Ooyala Discovery provides built-in experiences inside the Ooyala player, as well as powerful APIs with which customers can fully customize their user experience. Examples include providing Trending or Popular video widgets on their website and including similar videos on the landing pages for individual videos. The figure above shows one in-player experience, the “pause screen”. When a user clicks the pause button, interesting videos related to the current video are presented. This is designed to recapture the user’s attention when they start to lose interest in the current video.
  • Ooyala Discovery offers powerful editorial controls that customers can leverage to accomplish specific goals. For example, the promo feature allows a publisher to recommend new releases in the most desirable recommendation position during specified time periods. Publishers can model episodic relationships (e.g., “How I Met Your Mother” episode 1 and episode 2) and ensure that after a user completes episode 1, episode 2 will be recommended. They can segment content by labels, e.g., showing only football videos on certain sports sites, and also blacklist videos by labels, e.g., never showing internal videos.

Highlights of Technology


On the technology front, we have made significant advances over the state of the art, both in data mining algorithms and in the system architecture that enables real-time feedback. We highlight several of them below:

  • Most notably, Ooyala Discovery allows customers to set explicit optimization goals, such as optimizing click-through rate, total play time, play completion rate, session length, or session time. Optimizing click-through rate may work well for Web pages or ads, but not for videos. Providers have different forms of content and employ different monetization strategies: some monetize with pre-roll ads, while others rely on mid-rolls or subscriptions. For this to work, we have built sophisticated experimentation capabilities so that we can allocate traffic to multiple algorithms and measure their performance in near real time. We continuously employ an “explore and exploit” strategy, measuring the performance of various algorithms under different circumstances, and rely on higher-level algorithms to steer and pick the parameters that best suit each client’s needs. The lower-level algorithms are built with deliberate biases that allow them to perform well under some conditions but not others. The job of the higher-level algorithms is to decide when to use which lower-level algorithms and how to combine their results.
  • On the data mining front, our approach combines content metadata and collective user behavior data so that each signal is weighted according to its quantity and quality. An important challenge for recommender systems is handling the so-called new item and new user problems. A completely new piece of content has no behavior data associated with it, so recommendations need to be based on content metadata and on behavior data at the categorical level. We have devised a machine learning approach that takes behavior data and content data as raw features. For instance, when computing related videos, we use three pipelines. A content-based pipeline generates pairwise distance scores for each video pair as soon as new videos are ingested. Rather than generating a single score, it outputs distances computed for a variety of content fields such as title, description, and labels. A collaborative filtering pipeline uses user behavior to compute the conditional probability of users watching a video given that they have watched another video. A third, machine-learning pipeline takes the outputs of the first two as raw features, both for building models and for applying the models to generate final similarity scores for all video pairs. (The interactions among the three pipelines are illustrated in the figure below.) Consequently, recommendations for new content are based on content metadata initially and rely increasingly on behavior data as it accumulates. Since the distance functions used in the manually crafted pipelines (e.g., content-based filtering) are arbitrary, relying on the machine learning pipeline to combine the scores has an additional benefit: similarity scores are generated in a principled way, based on actual user responses to the recommendations. To cover enough of the parameter space, our collaborative pipeline outputs scores for multiple parameter sets. For example, one parameter set may favor new content, while another may favor content whose user activity is distributed more evenly over time. (Similarly, to deal with the new user problem, we incrementally build up each user’s preferences as we observe their behavior in real time.)
  • Our personalized recommendations draw on user interests derived from collective user behavior as well as each user’s intent derived from their recent viewing history. A given user can have long-standing interests, e.g., being a sports fan, and can also be looking for information to fulfill an immediate goal, e.g., finding relevant tour information for a trip. The ability to satisfy both needs is essential to delighting our users. The difficulty in capturing long-standing interests is building user models for hundreds of millions of users while keeping the models up to date. Our solution is to combine offline and online modeling. In particular, we build offline user models periodically based on snapshots of the entire user history, and then update the models on the fly based on real-time user behavior data. The offline process prevents the models from drifting over time, and the online component ensures that the models are up to date.
  • Our algorithms are built in such a way that they can attribute to each recommendation the underlying reasons for making it. Studies have shown that users are more receptive to recommended content when they are aware of the underlying reasons for the recommendations. For example, a recommendation can be made because a piece of content is popular in a given geographical region and time window, because it is similar to a piece of content that the user has just watched, or because it belongs to a content cluster based on collective user behavior. Exposing the underlying reason also allows for interesting UI explorations. Because our high-level “explore and exploit” algorithm takes as inputs the outputs of an ensemble of algorithms, such as content-based filtering, collaborative filtering, and trending videos, attributions can be made by assessing the contribution of each lower-level algorithm to the final ranking. When there is more than one reason, we list the reasons in order of importance.
  • Lastly, handling the sheer volume of user feedback data is a daunting task. To take advantage of real-time user data that arrives every second, we made most of our algorithms incremental. To give readers a flavor of the scale, our algorithms consume approximately 12 million relevant user engagement events every hour to incrementally update the behavior-based video-to-video similarity graph. We build user models for nearly 200 million users and update their interest profiles on the fly as we observe their viewing activity. The biggest performance challenges we had to overcome include: (1) computing pairwise distances for all related video pairs based on both content and behavior data, (2) joining massive tables to generate training and testing data for the machine learning pipeline, and (3) clustering hundreds of millions of users within a reasonable time for user modeling. For data joins, we exploited techniques such as Bloom filters to filter out irrelevant data as early as possible. We also take advantage of the characteristics of the data itself. For example, when joining content metadata with behavior data, we make use of the fact that new video content will have limited behavior data. For user clustering, we have experimented with different factorization techniques to trade off model quality against computing resource utilization.
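
To make the Bloom filter idea in the last bullet concrete, here is a minimal Python sketch of filtering behavior events before a join. It is an illustration under stated assumptions rather than our production code: the `BloomFilter` class, the `filtered_join` function, the `video_id` join key, and the row shapes are all hypothetical names chosen for this example.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive several bit positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def filtered_join(metadata_rows, event_rows):
    """Join events to metadata on video_id, dropping irrelevant events early."""
    bloom = BloomFilter()
    metadata_by_id = {}
    for row in metadata_rows:
        bloom.add(row["video_id"])
        metadata_by_id[row["video_id"]] = row

    joined = []
    for event in event_rows:
        if event["video_id"] not in bloom:
            continue  # the filter says "definitely absent": skip cheaply
        meta = metadata_by_id.get(event["video_id"])
        if meta is not None:  # exact lookup removes Bloom false positives
            joined.append({**meta, **event})
    return joined
```

In this single-process sketch the dictionary lookup would be enough on its own. The filter pays off when the small bit array can be shipped to the machines that hold the large event table, so that non-matching rows are discarded before any data is shuffled for the join.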

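The collaborative filtering signal described above, the conditional probability that a user watches one video given that they have watched another, can also be maintained incrementally as events arrive. The following Python sketch shows one simple way to do that with running counts. The `CoWatchModel` class, its method names, and the choice to count each user and video pair once are assumptions made for illustration, not a description of the actual pipeline.

```python
from collections import defaultdict


class CoWatchModel:
    """Incrementally estimates P(watched target | watched given) from events."""

    def __init__(self):
        self.watched_by_user = defaultdict(set)  # user -> videos seen so far
        self.viewers = defaultdict(int)          # video -> distinct viewers
        self.co_viewers = defaultdict(int)       # (a, b) -> users who saw both

    def observe(self, user_id, video_id):
        """Fold a single view event into the running counts."""
        seen = self.watched_by_user[user_id]
        if video_id in seen:
            return  # count each (user, video) pair only once
        for other in seen:
            self.co_viewers[(other, video_id)] += 1
            self.co_viewers[(video_id, other)] += 1
        seen.add(video_id)
        self.viewers[video_id] += 1

    def probability(self, given, target):
        """Fraction of viewers of `given` who also watched `target`."""
        total = self.viewers.get(given, 0)
        if total == 0:
            return 0.0
        return self.co_viewers.get((given, target), 0) / total


model = CoWatchModel()
for user, video in [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u3", "a"), ("u3", "b")]:
    model.observe(user, video)
print(model.probability("a", "b"))  # two of the three viewers of "a" saw "b"
```

Each event touches only the counts for the videos that one user has already watched, so no full recomputation is needed. A production system would additionally need time decay, engagement weighting, and sharding across machines, none of which are shown here.
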
Since day one, Ooyala has believed that good data is the key to better viewing. We’ve also believed that the future of television isn’t linear broadcasts, but a richer, more engaging, more personalized experience. Perhaps more than any previously released product, Ooyala Discovery represents our dedication to better viewing through data. Stay tuned for more exciting new developments from the Ooyala engineering teams.