Recommendation System Fundamentals

This are summaries from Andrew N.G’s course on machine learning.

Problem formulation of recommendation system

For this post we will use movie recommendation as example
We want to recommend movies to user
We will have ratings for some (user, movie) pair and would want to predict ratings for other pair
- Then for a given user we will recommend movies sorted by highest prediction ratings

Two kinds of recommendation system
- Content based
- Collaborative filtering

Content based recommendation
- Each movie has some features
- Train regression model for each user to predict ratings based on movie’s features

Collaborative Filtering

We need to know two things
- Feature values for each movie
- Weight parameter for each user
If we know one, we can calculate other
- We will of-course consider (customer, movie) ratings
- We have partially filled matrix as training set.
Can we do a joint optimisation for both ? That’s the idea behind collaborative filtering.
Why it is called collaborative filtering ?
- Users are collaborating with each other to generate predictions for each other.

Notes of collaborative filtering equations
- Both summations are equal.
- x0 = 1 is not needed. x0 is constant term in linear regression
  - If model thinks it needs a feature which is constant it will create this on its own.

Collaborative filtering is essentially a matrix factorisation. We can construct matrix for movie and user based on vectors x and theta in above formulation.

Low rank matrix factorisation
- Let number of user’s be nu
  - Number of movies nm
  - Number of movie feature k
- User matrix dimension -> nu x k
  - For each user we have vector representing feature weights
- Movie matrix dimension -> nm x k
  - For each movie we have feature value
- Original matrix dimension -> nu x nm
- We have factorised matrix into low rank k <= min(nu, nm)
- It is very popular mathematical technique to discover latent structure with sparse data. [1]
Further application
- Showing related item is different problem than recommendation
- With collaborative filtering we are obtaining feature vectors for each movie/item
- We can apply nearest neighbour on top of it

Cold start Problem

Mean normalisation helps combating that
In most ML algorithm we normalise feature (column)
Here you would like to normalise row from product perspective. Mathematically you can do both.
Example below
- 5th user has not rated any movie (blue column in first matrix)
- Regularised objective will make theta 0 for 5th user, which in turn produce 0 rating while making prediction
- Mean normalisation transform Input matrix as shown below
- During prediction time we add mean value for that movie
  - Theta will still be 0 for 5th user from matrix factorisation
  - After multiplying with 0 we are adding mean value to it

Reference

[1] https://arxiv.org/pdf/1507.00333.pdf

Data Stories

Recommendation System Fundamentals

Problem formulation of recommendation system

Collaborative Filtering

Cold start Problem

Reference

Leave a comment Cancel reply

Problem formulation of recommendation system

Collaborative Filtering

Cold start Problem

Reference

Share this:

Related

Leave a comment Cancel reply