Repeat Purchase Recommender System

7 min read · Jun 13, 2021

Buy It Again recommendations feature on the personalized recommendations page of the Amazon.com website

How can we know when a customer is likely to purchase the same product again in the near future?

Purchasing the same product multiple times is a common phenomenon in e-commerce. The Buy Again recommendation exists to help users who want to repurchase the same item. Crucial to the success of this recommendation strategy is predicting whether a customer is likely to repurchase a product and, if so, when the right time is to recommend it to them.

Idea

The core idea is this: given that a customer has purchased a specific set of products, can we recommend those products to them at the right time in the near future, when they are likely to purchase again?

The repeat purchasing phenomenon is certainly prevalent for consumable products (e.g., toothpaste, diapers, cat food). But our analysis shows that repeat purchase behavior varies over time, so we always need to take into account which goods can currently be categorized as repeat purchasable products.

This analysis aims to identify (1) the product categories that users repeatedly purchase, using the repeat customer probability model, and (2) the optimal time to recommend repeat purchasable products to customers, based on their historical order behavior.

Hypothesis

“Users will repurchase products using our ‘Buy Again’ widget within the predicted time interval.”

Analysis & Results

The first model we consider is a time-independent, frequency-based probabilistic model that uses aggregate repeat purchase statistics of products by customers. It determines which products are candidates to be shown to customers in the ‘Buy Again’ widget. For each product A_i, we compute its repeat customer probability (RCP) as shown below:

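RCP_{A_i} = (number of customers who bought product A_i more than once) / (number of customers who bought product A_i at least once)

Repeat Customer Probability (RCP) Calculation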

Using the RCP model, we obtain an RCP value for each product. Additionally, to ensure that the quality of repeat purchase recommendations is good, a threshold r_threshold is enforced on RCP_{A_i}. We set r_threshold to the average RCP value.
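The RCP calculation and the average-based threshold can be sketched in a few lines of Python. This is a minimal illustration, not the production implementation: it assumes RCP is the share of a product's buyers who bought it more than once, that a product passes when its RCP is strictly above the average, and that the purchase log is a list of `(customer_id, product_id)` pairs. The function names and the toy data are hypothetical.

```python
from collections import defaultdict

def repeat_customer_probability(purchases):
    """Return {product_id: RCP} from (customer_id, product_id) pairs.

    Assumed definition: customers who bought the product more than once,
    divided by customers who bought it at least once.
    """
    counts = defaultdict(lambda: defaultdict(int))  # product -> customer -> purchases
    for customer_id, product_id in purchases:
        counts[product_id][customer_id] += 1
    rcp = {}
    for product_id, per_customer in counts.items():
        buyers = len(per_customer)
        repeat_buyers = sum(1 for n in per_customer.values() if n > 1)
        rcp[product_id] = repeat_buyers / buyers
    return rcp

def repeat_purchasable(rcp):
    """Keep products whose RCP is above the average RCP (assumed threshold rule)."""
    r_threshold = sum(rcp.values()) / len(rcp)
    selected = {p: v for p, v in rcp.items() if v > r_threshold}
    return selected, r_threshold

# Toy purchase log (hypothetical data)
purchases = [
    ("c1", "toothpaste"), ("c1", "toothpaste"),
    ("c2", "toothpaste"),
    ("c4", "toothpaste"), ("c4", "toothpaste"),
    ("c1", "laptop"), ("c2", "laptop"),
    ("c3", "cat_food"), ("c3", "cat_food"), ("c3", "cat_food"),
]
rcp = repeat_customer_probability(purchases)
selected, r_threshold = repeat_purchasable(rcp)
```

In this toy log, two of the three toothpaste buyers repeat, nobody rebuys the laptop, and the only cat food buyer repeats, so toothpaste and cat food pass the threshold while the laptop does not.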

The simplifying assumption we make is that P_{A_i}(t_{k+1} | t_1, t_2, t_3, …, t_k) is approximately given by RCP_{A_i}, i.e., we assume:

P_{A_i}(t_{k+1} | t_1, t_2, t_3, …, t_k) ≈ Q_{A_i} ≈ RCP_{A_i}

and ignore the time factor altogether, i.e., we assume that R_{A_i} is a fixed constant r for all products A_i. More generally, given a customer’s history of product purchases (including repeat purchases), we estimate the probability of the customer repurchasing a product as a function of the time since their last purchase of that product:

P_{A_i}(t_{k+1} = t | t_1, t_2, t_3, …, t_k) ≈ Q_{A_i} × R_{A_i}(t_{k+1} = t | t_1, t_2, t_3, …, t_k, A_i = 1)

where Q_{A_i} is the probability of a customer buying the product a (k + 1)-th time given that they have bought it k times, and the second term, R_{A_i}, is the distribution of t_{k+1} conditioned on the customer repurchasing that product, indicated by A_i = 1.

We split our data into two sets: a training set and a test set.

Next, we estimate the user’s repeat purchase rate using an empirical Bayesian model, the negative binomial distribution (NBD) model. The NBD model is based on the following assumptions:

A customer’s repeat purchases follow a homogeneous Poisson process with repeat purchase rate λ, in which successive repeat purchases are not correlated with each other.
The prior on λ is a Gamma distribution, i.e., we assume that λ across all customers follows a Gamma distribution with shape α and rate β.
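These two assumptions are what give the model its name: averaging a Poisson count over a Gamma-distributed rate yields a negative binomial distribution. As a short derivation, writing X for the number of repeat purchases in elapsed time t (a symbol introduced here only for illustration):

```latex
\begin{aligned}
P(X = k) &= \int_0^\infty \frac{(\lambda t)^k e^{-\lambda t}}{k!}\,
            \frac{\beta^{\alpha} \lambda^{\alpha - 1} e^{-\beta \lambda}}{\Gamma(\alpha)}\, d\lambda \\
         &= \frac{\Gamma(\alpha + k)}{\Gamma(\alpha)\, k!}
            \left(\frac{\beta}{\beta + t}\right)^{\alpha}
            \left(\frac{t}{\beta + t}\right)^{k}
\end{aligned}
```

By the same conjugacy, the posterior of λ after observing k purchases in time t is again a Gamma distribution, with shape α + k and rate β + t.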

Thus, the NBD model is a Bayesian model where the evidence is distributed as a Poisson and the prior on λ is a gamma prior; hence it is also called the Poisson-Gamma (PG) model. The PG model uses the customer’s own personalized repeat behavior, so its predictions become more accurate when a longer time frame is considered. In the PG model, the parameters of the product-specific gamma distributions are estimated empirically by fitting them to the maximum likelihood estimates of the purchase rates of repeat purchasing customers. Then, a Bayesian estimate of the customer’s repeat purchase rate is obtained by combining the prior distribution with the customer’s own past purchase history using the following formula:

λ_{A_i, C_j} = (k + α_{A_i}) / (t + β_{A_i})

where α_{A_i} and β_{A_i} are the shape and rate parameters of the gamma prior of product A_i, k is the number of purchases of product A_i by customer C_j, and t is the elapsed time between the first purchase of product A_i by customer C_j and the current time. This leads to a recommendation model in which R_{A_i} is assumed to be a Poisson distribution whose rate parameter λ_{A_i, C_j} is estimated using the formula above and whose probability mass is estimated using:

P(m | λ_{A_i, C_j}) = (λ_{A_i, C_j})^m e^{-λ_{A_i, C_j}} / m!

where m is the number of expected future purchases. The PG model’s second assumption is that Q_{A_i} is a fixed constant q for all products A_i at any given time t; in this analysis, however, we set Q_{A_i} to the RCP value of each product recommended to users. Finally, recommendations are generated by considering all the repeat purchasable products previously bought by a customer and ranking them in descending order of their estimated probability density P_{A_i}(t) at a given time t.
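The steps above can be sketched end to end in plain Python. This is an illustration under stated assumptions rather than the exact pipeline used in the analysis. It takes the standard Gamma-Poisson posterior mean (k + α) / (t + β) as the rate estimate, fits the gamma prior by the method of moments (the fitting procedure is not specified here, so this is a stand-in), measures time in days, and scores with m = 1. All function names and numbers are hypothetical.

```python
import math

def fit_gamma_prior(rates):
    """Method-of-moments fit of Gamma(shape=alpha, rate=beta) to per-customer rates.

    Uses mean = alpha / beta and variance = alpha / beta**2.
    """
    n = len(rates)
    mean = sum(rates) / n
    var = sum((r - mean) ** 2 for r in rates) / (n - 1)  # sample variance
    beta = mean / var
    alpha = mean * beta
    return alpha, beta

def posterior_rate(k, t, alpha, beta):
    """Posterior-mean purchase rate after k purchases over elapsed time t."""
    return (k + alpha) / (t + beta)

def poisson_pmf(m, lam):
    """Probability of exactly m events for a Poisson with mean lam."""
    return lam ** m * math.exp(-lam) / math.factorial(m)

def rank_buy_again(history, product_params, m=1):
    """Rank one customer's products by Q * R, highest first.

    history:        {product_id: (k, t)} purchases and elapsed days
    product_params: {product_id: (rcp, alpha, beta)}, with Q set to the RCP
    """
    scored = []
    for product_id, (k, t) in history.items():
        rcp, alpha, beta = product_params[product_id]
        lam = posterior_rate(k, t, alpha, beta)
        scored.append((rcp * poisson_pmf(m, lam), product_id))
    scored.sort(reverse=True)
    return [product_id for _, product_id in scored]

# Hypothetical per-customer MLE rates (purchases per day) for one product
alpha, beta = fit_gamma_prior([0.02, 0.04, 0.06])

# Hypothetical customer history and product-level parameters
history = {"cat_food": (5, 60.0), "toothpaste": (2, 90.0)}
product_params = {
    "cat_food": (0.8, alpha, beta),
    "toothpaste": (0.6, 2.0, 120.0),
}
ranking = rank_buy_again(history, product_params)
```

With these toy numbers the prior for cat food comes out at roughly α = 4 and β = 100, and cat food ranks above toothpaste because both its RCP and its posterior purchase rate are higher.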

Evaluation by Precision

We use precision to measure the fraction of recommendations that are relevant. This evaluation is also used to compare our new Buy Again model with the current model. We look only at the top-ranked recommendation, which indicates when we should recommend the product to our users. We calculate the model precision using the following formula:

Precision = (number of repeat purchasable product orders by each user) / (number of repeat purchase recommendations)

where the number of repeat purchasable product orders by each user counts the orders placed within the prediction interval of the repeat purchase recommendation model (10 days), and the number of repeat purchase recommendations is the total number of repeat purchase recommendations across products and users.

Furthermore, we also take into account the specific time at which users repurchase the product. We calculate this using the following formula:

Precision(t) = (number of repeat purchasable product orders by each user at time t) / (number of repeat purchase recommendations)

where the number of repeat purchasable product orders by each user at a specific time t counts the orders placed at the time the user actually repurchased the product, and the number of repeat purchase recommendations is the total number of repeat purchase recommendations across products and users.
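Both precision variants can be computed with one small function. The sketch below makes several assumptions: days are integer indices, a recommendation counts as relevant when the same user orders the same product inside the window that starts on the recommendation day, and a one-day window stands in for "at a specific time t". The data and names are hypothetical.

```python
def precision(recommendations, orders, window_days):
    """Fraction of recommendations followed by a matching order.

    recommendations: list of (user_id, product_id, rec_day)
    orders:          list of (user_id, product_id, order_day)
    A recommendation is relevant if the same user ordered the same product
    on a day d with rec_day <= d < rec_day + window_days.
    """
    if not recommendations:
        return 0.0
    order_days = {}
    for user_id, product_id, day in orders:
        order_days.setdefault((user_id, product_id), []).append(day)
    hits = 0
    for user_id, product_id, rec_day in recommendations:
        days = order_days.get((user_id, product_id), [])
        if any(rec_day <= d < rec_day + window_days for d in days):
            hits += 1
    return hits / len(recommendations)

# Hypothetical top-ranked recommendations issued on day 100
recommendations = [
    ("u1", "cat_food", 100),
    ("u1", "toothpaste", 100),
    ("u2", "cat_food", 100),
    ("u3", "diapers", 100),
]
# Hypothetical orders observed afterwards
orders = [
    ("u1", "cat_food", 100),
    ("u2", "cat_food", 104),
    ("u3", "diapers", 115),
]

interval_precision = precision(recommendations, orders, window_days=10)  # 0.5
exact_day_precision = precision(recommendations, orders, window_days=1)  # 0.25
```

The same function can score a random-recommendation baseline, which is one way to make the comparison between models concrete.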

Impact

The recommendations generated by the new ‘Buy It Again’ model can improve the time relevancy of repeat purchase recommendations, which means the new algorithm matches users’ repeat purchase behavior better than the current algorithm.

The new ‘Buy It Again’ model also improves the cost-efficiency of Buy Again push notifications. Every 10 days, we obtain the unique users and products that are predicted to result in repeat purchase orders, based on historical purchases. From these recommendations, we can narrow down the products that are predicted to be repurchased. Furthermore, the new model yields more relevant orders than random recommendation. This indicates that the new model not only keeps down the cost of rolling out this widget to our users but also delivers more orders.

Next Action

This Poisson-Gamma model can also be applied in other areas, mainly to answer the “when” question. Promo automation is one example. By applying this model, we can answer questions such as “When should we send promo or email marketing to users to win them back?” Based on the probability of rebuying, we can segment users and deliver personalized campaigns based on their engagement. Combined with predicted lifetime value, we can send more efficient messages, which helps optimize ROI on retention campaigns.
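As a rough sketch of what such segmentation could look like: under the Poisson assumption, the probability of at least one purchase within a horizon of h days is 1 − e^(−λh), where λ is the customer's estimated daily purchase rate. The segment names and the cut-offs of 0.5 and 0.2 are purely hypothetical and would need tuning per campaign.

```python
import math

def rebuy_probability(lam, horizon_days):
    """Probability of at least one purchase within the horizon (Poisson process)."""
    return 1.0 - math.exp(-lam * horizon_days)

def segment(prob, high=0.5, low=0.2):
    """Map a rebuy probability to a campaign segment (hypothetical cut-offs)."""
    if prob >= high:
        return "high"
    if prob >= low:
        return "medium"
    return "low"

# Hypothetical estimated daily purchase rates per user
rates = {"u1": 0.1, "u2": 0.03, "u3": 0.005}
segments = {
    user: segment(rebuy_probability(lam, horizon_days=10))
    for user, lam in rates.items()
}
```

Users in the high segment are likely to come back on their own, so a campaign budget might be better spent on the medium segment, though that is a business decision the model itself does not make.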

Thank you!

You can reach me through:

GitHub: https://github.com/auliamuthia
LinkedIn: https://www.linkedin.com/in/brojid/
Email: aulia.muthia@gmail.com
