Akin Recommendation Engine
Targeting Content with Behavioral Analytics
Mar 15, 2017
There are many types of recommendation engines. They range from straightforward ones based on time proximity to sophisticated ones based on similarity derived from content analysis. Recently, one of our customers needed a behavior-based recommendation engine to support a news feed-like application. This post provides an introduction and design overview of what we built. We provide the solution as an open-source implementation called Akin.
How does it work?
Akin uses a collaborative filtering technique to make predictions about the interests or future behaviors of an actor. For example, suppose Alice eats an apple and an orange, and Bob eats an apple. If we know that Alice and Bob historically have similar diets, we can predict that Bob may also want to eat an orange.
The first thing a recommendation engine needs is data. How do you predict the future if you don’t know about the past? To collect the metrics we need to train the engine, we built a simple system to log user actions. In this case, we wanted to note when actors view an item, comment on it, mark it as a favorite, etc. A user’s overall activity score for an item is derived from the weighted sum of their actions on it.
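The weighted sum described above can be sketched in a few lines of Python. This is a minimal illustration, not Akin's actual code: the action names, the weights in ACTION_WEIGHTS, and the (user, item, action) event shape are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical action weights; the post does not list the real values.
ACTION_WEIGHTS = {"view": 1.0, "comment": 3.0, "favorite": 5.0}


def activity_scores(events):
    """Sum weighted actions into one score per (user, item) pair.

    `events` is an iterable of (user, item, action) tuples, an assumed
    shape for the logged user actions.
    """
    scores = defaultdict(float)
    for user, item, action in events:
        # Unknown action types contribute nothing to the score.
        scores[(user, item)] += ACTION_WEIGHTS.get(action, 0.0)
    return dict(scores)


events = [
    ("alice", "item-1", "view"),
    ("alice", "item-1", "favorite"),
    ("bob", "item-1", "view"),
    ("bob", "item-2", "comment"),
]
scores = activity_scores(events)
```

Here Alice's view and favorite on the same item combine into a single score, which is the value that would later fill her cell for that item.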
Recent Activity is More Relevant
In our use case, recent user activity is a better predictor of future behavior than older activity. So, we apply a decay function that weights each activity based on its age. We’ve found that an inverted ease in/out cubic function provides an appropriate emphasis on recent items, but you can tune this to your use case.
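One way to read "inverted ease in/out cubic" is one minus the standard cubic ease in/out curve, applied to the activity's age normalized to a 0 to 1 range. The sketch below follows that reading. The function names and the 30-day window (max_age_days) are assumptions for illustration and are not taken from Akin.

```python
def ease_in_out_cubic(t):
    """Standard cubic ease in/out curve for t in [0, 1]."""
    if t < 0.5:
        return 4 * t ** 3
    return 1 - ((-2 * t + 2) ** 3) / 2


def decay_weight(age_days, max_age_days=30.0):
    """Weight in [0, 1]: 1.0 for brand-new activity, 0.0 at max_age_days or older.

    max_age_days is a hypothetical tuning parameter.
    """
    # Normalize age to [0, 1], clamping anything outside the window.
    t = min(max(age_days / max_age_days, 0.0), 1.0)
    return 1.0 - ease_in_out_cubic(t)


# Each logged action's weight would be multiplied by its decay weight.
weights_by_age = {age: decay_weight(age) for age in (0, 7.5, 15, 22.5, 30)}
```

With this shape, activity stays close to full weight while it is fresh, drops off quickly around the middle of the window, and flattens out near zero as it approaches the cutoff.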
Determine User Behavior Similarity
After processing the activity for all users across all items, we can build a matrix of weighted scores with one row per user and one column per item. Each row is a vector of that user’s item scores. Next, we can determine how closely two users predict each other’s behavior by taking the cosine similarity of their vectors.
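Cosine similarity is the dot product of two vectors divided by the product of their lengths. A minimal sketch follows. The three user rows are made-up values for illustration and are not the matrix from the original post.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length score vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # A user with no activity is treated as similar to no one.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical rows of the user-by-item score matrix (items 1 to 4).
scores = {
    "A": [1.0, 0.0, 2.0, 2.0],
    "B": [0.0, 0.0, 2.0, 2.0],
    "C": [1.0, 1.0, 0.0, 0.0],
}

# Pairwise similarity for every ordered pair of distinct users.
similarity = {
    (a, b): cosine_similarity(scores[a], scores[b])
    for a in scores
    for b in scores
    if a != b
}
```

Because activity scores are never negative, the similarities fall between 0 (no overlap) and 1 (identical proportions of activity across items).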
After calculating all of the user similarities, we use them to weight how strongly each other user’s activity counts toward a recommendation. For example, User A and B have a cosine similarity of 0.9422, and User A and C have a similarity of 0.2364. To score items for User A, multiply User B’s and User C’s rows by these similarities and sum each item column. This results in a recommendation score of 0.2364 for items 1 and 2 and a score of 1.8844 for items 3 and 4, which is consistent with User C scoring 1 on items 1 and 2 and User B scoring 2 on items 3 and 4. Higher scores mean that a user likely has a higher interest in that item.
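The arithmetic in this example can be checked with a short sketch. The two similarities are the ones quoted in the text. The rows for User B and User C are an assumption, inferred from the quoted results (B active only on items 3 and 4, C only on items 1 and 2). The function name is hypothetical.

```python
def recommendation_scores(target, similarities, scores):
    """Similarity-weighted column sums over every user except the target."""
    n_items = len(next(iter(scores.values())))
    totals = [0.0] * n_items
    for other, row in scores.items():
        if other == target:
            continue
        weight = similarities[(target, other)]
        for i, value in enumerate(row):
            totals[i] += weight * value
    return totals


# Similarities quoted in the post.
similarities = {("A", "B"): 0.9422, ("A", "C"): 0.2364}

# Assumed rows, inferred from the quoted results rather than taken
# from the post's matrix.
scores = {
    "B": [0.0, 0.0, 2.0, 2.0],
    "C": [1.0, 1.0, 0.0, 0.0],
}

result = recommendation_scores("A", similarities, scores)
# Items 1 and 2 come out near 0.2364, items 3 and 4 near 1.8844.
```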
Applying the Recommendation Scores
Now that we have item recommendation scores for each user, we have to decide how to apply them. For example, if a user is extremely active on an item, maybe we don’t want to recommend it because they are already familiar with it. This would be an example of tuning the system to help users discover new items. Additionally, we can implement user aids like filters or “do not recommend” lists, which can be used to further influence recommended items.
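A filtering step like the one described could look roughly like this. It is a sketch under assumptions: the function name, the familiarity_cap threshold, and the shape of the inputs (dicts keyed by item) are all hypothetical.

```python
def filter_candidates(rec_scores, own_activity, blocked=frozenset(),
                      familiarity_cap=5.0):
    """Drop items the user already knows well or has asked not to see.

    rec_scores:      item -> recommendation score for this user
    own_activity:    item -> this user's own activity score
    blocked:         the user's "do not recommend" list
    familiarity_cap: hypothetical threshold above which an item counts
                     as already familiar
    """
    return {
        item: score
        for item, score in rec_scores.items()
        if item not in blocked
        and own_activity.get(item, 0.0) < familiarity_cap
    }


candidates = filter_candidates(
    rec_scores={"item-1": 0.2364, "item-3": 1.8844, "item-4": 1.8844},
    own_activity={"item-3": 9.0},
    blocked={"item-1"},
)
# Only item-4 survives: item-1 is blocked and item-3 is already familiar.
```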
Very popular items can start to crowd out more niche but still potentially relevant results. To overcome this, we can make our selection based on a cumulative distribution function. This provides a random, weighted selection that favors higher recommendation scores but doesn’t eliminate the lower scores. These recommendations prove especially useful for users with low activity in the system. Even with limited activity, we’re able to provide content-relevant recommendations to populate news feeds based on the input of other users.
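Selecting from a cumulative distribution means building running totals of the scores, drawing a random point below the grand total, and picking the item whose range contains that point. A minimal sketch, with a hypothetical function name and made-up item keys:

```python
import bisect
import itertools
import random


def weighted_pick(rec_scores, rng=random):
    """Pick one item at random, with probability proportional to its score."""
    items = [item for item, score in rec_scores.items() if score > 0]
    if not items:
        return None
    # Running totals form the (unnormalized) cumulative distribution.
    cumulative = list(itertools.accumulate(rec_scores[i] for i in items))
    # Draw a point in [0, total) and find the first bucket that contains it.
    point = rng.random() * cumulative[-1]
    index = bisect.bisect_right(cumulative, point)
    # Guard against floating-point rounding at the upper edge.
    return items[min(index, len(items) - 1)]


rec_scores = {"item-1": 0.2364, "item-2": 0.2364,
              "item-3": 1.8844, "item-4": 1.8844}
pick = weighted_pick(rec_scores)
```

Items 3 and 4 are chosen most of the time, but items 1 and 2 still appear occasionally. Python's standard library offers the same behavior through random.choices with its weights argument. The manual version is shown here only to make the cumulative step visible.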
Behavior Provides Powerful Insights
Although behavioral analytics do not consider the content of items, user activity provides insight into content or topic groupings. In fact, we’ve seen these analytics bring to light the hidden variables that connect otherwise disjoint content items. This helps users find relevant content and discover new items of interest to their heart’s desire!
Overall, a recommendation engine based on behavioral analytics has proven to be a useful tool for presenting new and interesting content to users. While it isn’t a silver bullet for discovery, it can provide valuable insights into how users and items are connected based on crowd-sourced user activity information.