A Recommender Engine for the IBM Watson Studio Platform

A content-based and collaborative-filtering recommendation system for content items and users on the IBM Watson Studio platform.

Here I describe the creation of a custom recommender engine that offers tailored recommendations of content items based on the activity history of each user as well as the entire pool of users in the platform. A demo of the recommender is deployed in a web app that showcases how it makes recommendations to randomly chosen users. The web app is linked at the end of this post.

In short, for each user, the engine pulls content items that:

  • Are similar to those already seen by the user
  • Have been seen by similar users
  • Have the highest popularity (number of views) on the platform

Background and rationale

The aim of any recommender is to show personalized (content) items to users to maximize the following 4 operational goals:

  • Relevance: show items the user is likely to like
  • Novelty: show items that the user has never seen
  • Serendipity: show items that surprise the user
  • Diversity: show varied items to avoid the user getting bored

When these 4 goals are achieved, business-relevant goals, whatever they are (e.g. number of transactions, time spent on the platform, etc.), are presumably attained in turn. Nevertheless, depending on the specific business context, it may be well worth measuring directly how those ultimate business goals are affected by the introduction of a content recommender system or by changes to its functioning and/or design.

Recommenders are a very active field of research and development, and many different approaches to building recommender systems exist. Some of the most common are:

  • Knowledge-based recommendations: these approaches take input about user preferences to pull items. For example, the user explicitly gives information about what content categories they are interested in. Thus, the idea here is to show the most popular articles within the categories preferred by the user, or in the absence of such specific user input (as is the case in this project), simply the most popular items overall.
  • Content-based recommendations: these approaches use information about content items, so the idea is to compute some sort of similarity parameter between content items to pull items that are similar to those already seen by the user. One possibility is to analyze how similar the item titles are (how many words they have in common, see below) using Natural Language Processing.
  • Collaborative filtering: these approaches use information about user-item interactions from the entire (or a considerable proportion of the) user base. Here I use 2 different methods:
    • Neighbour-based approach: consists of pulling items liked by similar users. Users are considered similar when they overlap in the content items they have seen on the platform. Therefore, a user-user similarity parameter must be computed.
    • Model-based: here, instead of computing a fixed score value to parametrize user closeness, the articles a given user will like are predicted and then evaluated against real data from the user base.

The recommender engine described here offers tailored recommendations for 3 different scenarios:

  • When a new user enters the platform, the engine shows the most popular items on the platform (items with the highest number of views).
  • When a user who has recently joined the platform (number of articles seen > 0 and < 6) enters the platform, the recommender shows, in addition to popular articles, articles that are similar to those already seen by the user. There are around 3500 users of this type in the dataset.
  • In the rest of the cases (i.e. when an “old” user who has seen more than 5 articles enters the platform), the engine shows, in addition to popular and similar articles, articles that have been seen by similar users. There are 1660 users of this type in the dataset.
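These three scenarios can be sketched as a simple dispatch rule on the user's view count (a minimal sketch; the strategy labels name the approaches described in this post and are not actual identifiers from the codebase):

```python
def choose_strategies(n_seen):
    """Map a user's view count to the recommendation strategies applied,
    following the thresholds given above: 0 views = new user,
    1-5 views = recent user, more than 5 views = old user."""
    if n_seen == 0:
        return ["popular"]                          # cold start
    elif n_seen < 6:
        return ["similar", "popular"]               # recent user
    return ["user-user", "similar", "popular"]      # old user

print(choose_strategies(0), choose_strategies(3), choose_strategies(42))
```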

The dataset

The dataset was provided by Udacity as part of a graded assignment for the Data Scientist Nanodegree. It consists of 2 CSV files containing 45,000 records of user-item interactions (which user saw which content item) and content data (item title, teaser and body text). There are around 1200 content items (text articles or videos) and 5000 users.

Most users have seen very few articles and most articles have been seen very few times (histograms below):

Popular articles

This is the most basic strategy used and consists of showing the articles that are most popular among the user base. To obtain those, I simply sort all articles by how many views they have received and take the top 10.
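With the interactions loaded into a pandas DataFrame, this ranking amounts to a value count (a minimal sketch on toy data; the column names are illustrative, not the actual ones from the dataset):

```python
import pandas as pd

# Toy user-item interaction log; column names are illustrative.
interactions = pd.DataFrame({
    "user_id":    [1, 1, 2, 2, 3, 3, 3, 4],
    "article_id": ["a", "b", "a", "c", "a", "b", "c", "a"],
})

def top_articles(interactions, n=10):
    """Rank articles by number of views and return the n most viewed ids."""
    return interactions["article_id"].value_counts().head(n).index.tolist()

print(top_articles(interactions, n=1))
```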

This is the only strategy I used for new users to solve the so-called “cold-start” problem (when there is no data about user preferences because the new user has not seen any content item yet).

Similar articles (content-based recommendations)

From the 3 content data fields available in the dataset (title, teaser and body text), I decided to focus on the title to compute the similarity between content items using Natural Language Processing (NLP). The steps are as follows:

  1. Obtain a corpus of tokens (“words”) from the entire dataset and compute a matrix of word relative frequencies (vectorization) for each content item. The result is a table where each row is a content item and 1828 columns, where each column is a word of the entire corpus vocabulary and its value is between 0-1 (0 if the word is absent, 1 if it has maximal frequency).
  2. To estimate item-to-item similarity, compute the cosine similarity between each pair of content items. If we consider each content item as a vector of 1828 dimensions, then the cosine of the angle between 2 vectors corresponds to the similarity between those 2 content items. Remember from trigonometry that the cosine of 2 perpendicular vectors is 0, and when the 2 vectors overlap (are identical) the cosine is 1. Thus, cosine similarity gives a value between 0 and 1: 1 when a pair of titles are identical and 0 when they have nothing in common. Below is an example of what the item-to-item cosine similarity table looks like (each cell is the similarity of one article with another; note that the top-left cell, and all the diagonal values, are 1, because they represent the similarity of each article title with itself):
  3. Finally, rank for each content item all the other items from higher to lower cosine similarity.
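The three steps above can be sketched with scikit-learn on toy titles (a minimal sketch; using TfidfVectorizer is my assumption about one way to obtain relative word frequencies, not necessarily the exact vectorization used in the project):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy article titles; the real dataset has ~1200 items and 1828 vocabulary words.
titles = [
    "using deep learning to detect objects",
    "object detection with deep learning models",
    "a beginner guide to data visualization",
]

# Step 1: vectorize the titles into word-frequency vectors.
tfidf = TfidfVectorizer().fit_transform(titles)

# Step 2: pairwise cosine similarity (the diagonal is 1: each title vs itself).
sim = cosine_similarity(tfidf)

# Step 3: for title 0, rank the other titles from highest to lowest similarity.
ranking = sim[0].argsort()[::-1][1:]
print(ranking)
```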

Below is a heatmap of the cosine similarity matrix and a histogram of the average cosine similarity of each article with the rest of the articles. In general, most articles have an average cosine similarity between 0 and 0.25 with the rest of the articles, which is low. Most articles seem unrelated to each other, as indicated by the prevalence of black in the heatmap, although there are several red spots of relatively high similarity, which could be good if we want to be picky with recommendations. Also notably, there are 100 articles that do not have any similar article in the matrix (average cosine similarity with the other articles = 0); those articles are ignored in content-based recommendations.

The above approach was used in the recommender engine as soon as a user starts reading content, that is, for recent and old users (i.e. those users that have seen at least 1 content item).

User-user-based collaborative filtering

Here the idea is to pull articles that have been seen by users that are similar to the given user to whom we want to offer recommendations.

The steps are:

  1. Compute user-to-user similarity. As above, but now we obtain a user-to-user similarity matrix (instead of an article-to-article similarity matrix). This is obtained by computing the dot product of the user-item matrix (a matrix where each row is a user and each column is a content item, encoded in a binary fashion so that 0 means “unseen” and 1 means “seen”) with its transpose. The result is a matrix with users as rows and users as columns, where each value counts the items a pair of users has seen in common (0 meaning no overlap); normalizing it yields similarity values between 0 and 1.
  2. Rank users from highest to lowest similarity and get the top 20 most similar users.
  3. Pull content items that have been seen by those similar users and offer those to the user we are giving recommendations to (after removing the articles that the recommended user has already seen).
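A minimal NumPy sketch of these steps on a toy binary user-item matrix (dimensions and values are illustrative; for simplicity the raw dot product is used directly as the overlap score):

```python
import numpy as np

# Toy binary user-item matrix: rows = users, columns = articles,
# 1 = seen, 0 = unseen.
user_item = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
])

# Step 1: the dot product with the transpose counts items seen in common.
overlap = user_item @ user_item.T

# Step 2: for user 0, rank the other users by overlap, highest first.
scores = overlap[0].copy()
scores[0] = -1                      # exclude the user themselves
most_similar = np.argsort(scores)[::-1]

# Step 3: recommend what the top similar user saw, minus what user 0 saw.
candidate = most_similar[0]
new_items = np.where((user_item[candidate] == 1) & (user_item[0] == 0))[0]
print(candidate, new_items)
```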

This approach is used with old users (users that have seen more than 5 content items) in the recommender engine presented here. Even though this approach sounds sensible, the reality is that in the given dataset it does not pull many content items, because users have very little similarity, as shown in the heatmap of the user-to-user similarity matrix (below).

Matrix factorization

Another approach, used for very old users (users that have interacted with more than 100 content items), is to predict the articles a given user will like by using the rest of the user base and their history of user-item interactions. Using Singular Value Decomposition on a fraction of the records, it is possible to build a predicted user-item matrix and then test it on the remaining records.

The idea is to choose a number of latent features and then estimate the accuracy of the prediction, which was around 0.94-0.96 with 300 latent features.
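A minimal sketch of the idea with NumPy's SVD on a toy matrix (the number of latent features k is the knob mentioned above; in the real project the reconstruction is evaluated on held-out interactions, not on the same matrix as here):

```python
import numpy as np

# Toy binary user-item matrix: rows = users, columns = items.
user_item = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

# Factorize and keep only k latent features.
u, s, vt = np.linalg.svd(user_item)
k = 2
pred = u[:, :k] @ np.diag(s[:k]) @ vt[:k, :]

# Round the reconstruction to 0/1 and compare with the known interactions.
accuracy = np.mean(np.rint(pred) == user_item)
print(accuracy)
```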

A blended approach

All the approaches described above were put together in the recommender as follows:

  • For new users: as there is no data about user preferences, recommendations are the most popular articles on the platform.
  • For recent users: the recommendations include up to 80% similar articles, collected as follows: the articles seen by the user are first sorted by number of views and then, starting with the most popular, similar articles are pulled for each of them (up to a maximum of 4 per article, to bring more variety).
  • For old users: we add to the previous approaches a layer of recommendations pulled from similar users. The 3 approaches are blended to give variety, prioritizing content similarity (40% of recommendations) and articles from similar users (40%), and adding at least 20% of popular articles. Such a proportional mixture is made on each call to the recommendation generator function.
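The blend for old users can be sketched as a simple quota over three pre-computed candidate lists (a sketch under the 40/40/20 split described above; the helper name and its inputs are illustrative):

```python
def blend(similar, from_users, popular, n=10):
    """Blend candidates: ~40% content-similar items, ~40% items seen by
    similar users, and popular items filling the remaining slots (at least
    20%), dropping duplicates while preserving order."""
    picks = similar[: int(n * 0.4)] + from_users[: int(n * 0.4)]
    recs = []
    for item in picks + popular:
        if item not in recs:
            recs.append(item)
        if len(recs) == n:
            break
    return recs

print(blend(list("abcd"), list("efgh"), list("ijkl"), n=10))
```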

Deployment and code

A demo of the recommender is deployed on Streamlit Cloud and can be accessed here. The web app showcases the recommender engine in the 3 scenarios (new users, recent users, old users) described above and simulates user activity to show how the recommendations are updated accordingly.

The code of the recommender engine is packaged in a Python class and can be seen in this GitHub repository.

Do not hesitate to visit the web app demo, reuse the code, or contact me if you are interested in further information.