
Product Recommendation with Reinforcement Learning

Industry Consumer Goods and Retail

Specialization Or Business Function Customer Analytics (Product Mix Analysis, Market Segmentation and Targeting, Upsell Analysis, Recommendation Systems & Cross Sell Analysis, Product Feature Prioritization), Consumer Experience

Technical Function Data Management, Analytics (Real-time Analytics, Machine Learning, Artificial Intelligence), Marketing and Web Analytics, CRM, ERP, Accounting, Operations, Marketing Automation (Email Marketing)

Technology & Tools Programming Languages and Frameworks (R, Python)

COMPLETED Jul 05, 2017

Project Description

We are a company that produces marketing materials through mass customization and web-to-print systems. We operate globally with 6 main markets. Email is one of our main channels to reach our continually growing customer base.

Today, we can personalize the product tiles in each email at the customer level, based on order and browsing data.

We are looking to resource this project with someone who can serve as an end-to-end advisor.

Requirements description:

We want to build a recommender system that is able to:

·         Recommend the right set of products to influence customer behavior: We will measure performance by the incrementality of the targeted group over the control group. Recommendations based only on a higher likelihood of buying a product are not enough.

·         Self-improve over time: The system should be initialized and gather information over time on how to adjust recommendations to optimize performance (weekly or even daily adjustments are required).

·         Scale to product inclusion: The system should accept new products and start optimizing with current recommendations seamlessly.
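To make the incrementality measure in the first requirement concrete, here is a minimal Python sketch. It assumes each group is summarized by a count of customers emailed and a count of responders; the names `GroupStats`, `response_rate` and `incremental_lift` are hypothetical and used only for illustration, and the real definition of a "response" is still to be agreed.

```python
from dataclasses import dataclass


@dataclass
class GroupStats:
    """Aggregate counts for one experiment group (hypothetical structure)."""
    customers: int   # customers emailed in this group
    responders: int  # customers who responded (e.g. placed an order)


def response_rate(group: GroupStats) -> float:
    """Share of customers in the group who responded."""
    if group.customers <= 0:
        raise ValueError("group must contain at least one customer")
    return group.responders / group.customers


def incremental_lift(targeted: GroupStats, control: GroupStats) -> float:
    """Absolute difference in response rate: targeted minus control.

    A positive value means the recommendations changed behavior beyond
    what the control group did on its own.
    """
    return response_rate(targeted) - response_rate(control)


if __name__ == "__main__":
    # Illustrative numbers only, not project data.
    targeted = GroupStats(customers=1000, responders=60)
    control = GroupStats(customers=1000, responders=50)
    print(f"lift: {incremental_lift(targeted, control):.3f}")
```

A system that only targets likely buyers can show a high response rate in the targeted group while the lift stays close to zero, which is why the difference against control is the quantity of interest.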

In addition, a main goal of this project is to share knowledge of reinforcement learning techniques so that they can later be used in other areas of the business. This is the first use case, and we expect the consultant to share information with Vistaprint that would help our analysts add some of these skills to their toolkit.

Task description:

As a consultant, your job would be to architect the system and explain, step by step, how it would use customer data, prepare recommendations and self-improve over time. In addition, you need to deliver pseudocode for each step so that our engineers can deploy the solution. This job is for someone who can develop a good partnership with us: we will not only implement the system, but also need to understand the pros and cons of each approach, as well as future challenges related to data management, score computation, etc.

Current state:

We are about to launch a major randomization campaign that will target our entire US customer base across 16 emails. During this time, all browsing, order and basket data will be collected in HDFS.

During the last few months we have researched Markov decision process (MDP) solutions, and we believe this approach would be desirable. While we think this could be a possible solution, we are open to new ideas that meet the project requirements.


There are 4 different types of data we have about every customer:

·         Transactional data: Everything related to orders that the customer has placed in the past, up until the date before the recommendations are emailed. The data is grouped by product category and aggregated for first order, last 30 days of orders, last 15 months of orders, last 3 years of orders and last order. For each one, we have total amount spent, item count and order count.

·         Basket data: Many times our customers add items to their carts that do not end up in an order. The basket data contains the product categories in the cart over the last 30 days and the items currently in the cart. The data is binary only: whether a product is / was in the cart or not.

·         Browsing data: As with transactional data, all the information is grouped by product category, with aggregations at the following levels: last session (whenever it was), last 7 days, and between 7 and 30 days. For each one, we have the number of navigations to that product and click information for specific products.

·         Customer segmentation: We have 5 different customer segmentations with at most 5 levels in each. These customer segmentations describe visual patterns of what customers are (for example, Consumer vs Business).

Other data:

Email data: We collect information about what is sent in each email to the customer. For this project, we have access to customer id and date (to join back to the customer information as it was before the email was sent). We will have information about the product shown, its placement, type and discount. In addition, we will collect information about the creative used and the general sale in the email.

Response data:

We expect you to come up with an optimal reward function for this project. We will have access to any response associated with the email, as well as any subsequent site behavior or transactional data. The data will be streamed directly to HDFS and into the recommendation system.
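As a starting point for that discussion, the sketch below shows one possible shape for a reward that includes longer-term value: a discounted sum of the revenue observed on each day after an email. This is an assumption for illustration, not a chosen design. The function name `discounted_reward`, the use of daily revenue as the signal, and the discount factor `gamma` are all hypothetical.

```python
from typing import Sequence


def discounted_reward(daily_revenue: Sequence[float], gamma: float = 0.9) -> float:
    """Sum revenue over the days after an email, discounting later days.

    daily_revenue[0] is the revenue on the day of the email,
    daily_revenue[1] the day after, and so on. gamma close to 1 gives
    weight to long-term behavior; gamma close to 0 rewards only the
    immediate response.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must be between 0 and 1")
    return sum(revenue * gamma ** day for day, revenue in enumerate(daily_revenue))


if __name__ == "__main__":
    # Illustrative numbers only: an order on day 0 and another on day 2.
    print(discounted_reward([10.0, 0.0, 10.0], gamma=0.5))  # 12.5
```

To reflect incrementality, a reward of this kind would still need to be compared against the same quantity in the control group rather than used on its own.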

All the data is stored in HDFS, and current data for scoring will be available via HBase.

Model output:

For every email deployment, there will be some input into the recommendation, such as how many products will be recommended, which product will be featured in the email hero (and therefore cannot also be shown as a recommendation), the general sale for this email, and the product tiles to be used. The recommendation should output the list of products for each customer (optionally, in the order in which they should be shown). As one of the requirements for this project is to self-improve recommendations over time, the system should explore with a fraction of the customers, showing them new or old recommendations that could end up outperforming the current top recommendation.
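The sketch below illustrates how these inputs could fit together, using epsilon-greedy selection as one simple, well-known exploration scheme. It is an assumption for illustration only; we remain open to other approaches. The function `recommend`, its parameters, and the per-product `scores` dictionary are hypothetical names.

```python
import random
from typing import Dict, List, Optional


def recommend(
    scores: Dict[str, float],
    k: int,
    hero: Optional[str] = None,
    epsilon: float = 0.1,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Return k products for one customer, in display order.

    scores  : current estimated value of each product for this customer
    k       : number of product tiles in the email
    hero    : product featured in the email hero, excluded from the tiles
    epsilon : probability of exploring with a random set of products
    """
    rng = rng or random.Random()
    candidates = [p for p in scores if p != hero]
    if k > len(candidates):
        raise ValueError("not enough candidate products for k tiles")
    if rng.random() < epsilon:
        # Explore: a random set that may outperform the current top picks.
        return rng.sample(candidates, k)
    # Exploit: the k highest-scoring products, best first.
    return sorted(candidates, key=lambda p: scores[p], reverse=True)[:k]


if __name__ == "__main__":
    # Illustrative scores only.
    scores = {"business cards": 0.9, "flyers": 0.5, "mugs": 0.1, "banners": 1.0}
    print(recommend(scores, k=2, hero="banners", epsilon=0.0))
```

With `epsilon=0.0` the function always returns the top-scoring products; raising `epsilon` increases the share of customers who receive exploratory recommendations.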

Questions that we have:

·         How do you make sure the recommendation is generating incremental responses and not only targeting people with high likelihood of buying?

·         How do you define the reward function? Does the reward function capture any long-term value?

·         How do you select the fraction of customers used for exploring new recommendations? How does the reward function affect all the recommendations? At what rate does the system adapt to new trends in customer behavior?

·         How does the system allow new products to start being recommended?
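To give the last two questions a concrete reference point, the sketch below uses Thompson sampling with a Beta prior per product, a standard bandit technique in which a newly added product starts from an uninformative prior and is therefore explored naturally. This is one candidate approach under stated assumptions, not a proposed answer: the class `BetaBandit` and its methods are hypothetical, and it models a raw yes/no response rather than incrementality.

```python
import random
from typing import Dict, List, Optional


class BetaBandit:
    """Thompson sampling over products with Beta(successes, failures) beliefs."""

    def __init__(self, rng: Optional[random.Random] = None) -> None:
        self.params: Dict[str, List[float]] = {}
        self.rng = rng or random.Random()

    def add_product(self, product: str, prior_success: float = 1.0,
                    prior_failure: float = 1.0) -> None:
        """Register a product; Beta(1, 1) is a uniform prior for new items."""
        self.params.setdefault(product, [prior_success, prior_failure])

    def update(self, product: str, responded: bool) -> None:
        """Record one observed outcome for a recommended product."""
        if responded:
            self.params[product][0] += 1
        else:
            self.params[product][1] += 1

    def select(self, k: int) -> List[str]:
        """Sample a response rate per product and return the top k."""
        draws = {p: self.rng.betavariate(a, b) for p, (a, b) in self.params.items()}
        return sorted(draws, key=draws.get, reverse=True)[:k]


if __name__ == "__main__":
    bandit = BetaBandit(rng=random.Random(0))
    for name in ["business cards", "flyers"]:
        bandit.add_product(name)
    bandit.update("flyers", responded=True)
    bandit.add_product("mugs")  # new product joins without retraining
    print(bandit.select(2))
```

Because a new product's belief is wide, it is sampled into recommendations often at first and less often as evidence accumulates, so the exploration fraction is not set by hand.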

Project Overview

  • Posted
    March 29, 2016
  • Planned Start
    June 01, 2016
  • Delivery Date
    June 02, 2016
  • Preferred Location
    From anywhere

Client Overview

Reinforcement Learning
