Introduction

This package contains functions for applying choice models based on economic theory. Models can be applied to data from choice experiments (‘Conjoint Analysis’), from purchase transaction histories or a combination of both. The package contains functions for estimation and demand prediction.

By and large, it uses ‘tidy’ data and integrated nicely into many ‘tidyverse’ functions. The idea is to facilitate easy and readable choice modeling workflows.

All estimation and prediction functions are implemented in c++ and use openMP for multithreading support (i.e., this package is fast).

Overview of functions

Functions that relate to discrete demand start in dd_, while functions for volumetric demand start in vd_. Estimation functions continue in est, demand simulators in dem. Universal functions (discrete and volumetric choice) start in ec_.

Here is a quick overview of key functions:

Estimation

  • dd_est_hmnl: Discrete Choice (HMNL)
  • dd_est_hmnl_screen: Discrete Choice, attribute-based screening (not including price)
  • dd_est_hmnl_screenpr: Discrete Choice, attribute-based screening (including price)
  • vd_est_vd: Volumetric demand, EV1 errors
  • vd_est_vdmn: Volumetric demand, Normal errors
  • vd_est_vdm_screen: Volumetric demand, attribute-based screening, Normal errors
  • vd_est_vdm_screenpr: Volumetric demand, attribute-based screening including price, Normal errors
  • vd_est_vdm_ss: Volumetric demand, accounting for set-size variation (1st order), EV1 errors
  • vd_est_vdm_ssq: Volumetric demand, accounting for set-size variation (2nd order), EV1 errors

Demand

  • dd_dem: Discrete Choice (HMNL)
  • dd_dem_sr: Discrete Choice, attribute-based screening (not including price)
  • vd_dem_vdm: Volumetric demand, EV1 errors
  • vd_dem_vdmn: Volumetric demand, Normal errors
  • vd_dem_vdmnsr: Volumetric demand, attribute-based screening, Normal errors
  • vd_dem_vdmsrpr: Volumetric demand, attribute-based screening including price, Normal errors
  • vd_dem_vdmss: Volumetric demand, accounting for set-size variation (1st order), EV1 errors
  • vd_dem_vdmssq: Volumetric demand, accounting for set-size variation (2nd order), EV1 errors

Other key functions

  • ec_estimates_MU: Upper-level summaries (Part-Worths)
  • ec_trace_MU: Part-Worth traceplots
  • ec_boxplot_MU: Part-Worth boxplots
  • ec_dem_eval: Evaluate hold-out predictions

To-do-list

The package is work in progress. Some of the next things to do on the list are:

  • Priors: Specifying hyperparameters, providing LKJ Prior alternative. Currently weakly-informative conjugate priors are assumed.
  • Upper level regression: Provide a somewhat fail-safe way to put in upper-level covariates
  • Continuous attributes alongside categorical attributes
  • Optimization (find optimal product configurations)
  • Economic value measures
  • Maybe: Mixture of Normal Heterogeneity

Volumetric demand modeling

The MCMC algorithm for the volumetric demand model is implemented in vd_est_vdm. The model can be applied to data from volumetric conjoint studies, or from unit-level demand data, such as household panel data. This vignette provides exposition of the model, input data structure, and an example, including model estimation and evaluation.

Model

We assume that subjects maximize their utility from consuming inside goods \(x\) and the outside good \(z\)

\[u\left( {{\bf{x}},z} \right) = \sum\limits_k {\frac{{{\psi _k}}}{\gamma }} \ln \left( {\gamma {x_k} + 1} \right) + \ln (z)\] subject to a budget constraint \[\sum\limits_k^{} {{p_k}} {x_k} + z = E\]

where \(\psi_k\) is the baseline utlity of the \(k\)-th good, and \(\gamma\) is a parameter that governs the rate of satiation of inside goods. It is assumed that the rate of satiation \(\gamma\) is the same for all inside goods.

In this implementation, we assume that \[\psi_j = \exp[a_j'\beta + \varepsilon_j]\] where \[\varepsilon \sim \text{Type I EV}(0,\sigma)\]

While the assumption of the EV1 error term is common, there are also implementations assuming Normal-distributed errors.

The marginal utility for the inside and outside goods is:

\[ \begin{array}{*{20}{l}} {{u_j}}&{ = \frac{{\partial u\left( {{\bf{x}},z} \right)}}{{\partial {x_j}}} = \frac{{{\psi _j}}}{{\gamma {x_j} + 1}}{\rm{ }}}\\ {{u_z}}&{ = \frac{{\partial u\left( {{\bf{x}},z} \right)}}{{\partial z}} = \frac{1}{z}} \end{array} \]

Solving for \(\varepsilon_j\) leads to the following expression for the KT conditions:

\[\begin{array}{*{20}{l}} {}&{{\varepsilon _j} = {g_j}\quad {\rm{if}}\quad {x_j} > 0}\\ {}&{{\varepsilon _j} < {g_j}\quad {\rm{if}}\quad {x_j} = 0} \end{array}\]

where \[{g_j} = - {{\bf{a}}_{j'}}\beta + \ln (\gamma {x_j} + 1) + \ln \left( {\frac{{{p_j}}}{{E - {{\bf{p}}^\prime }{\bf{x}}}}} \right)\]

We are able to obtain a closed-form expression for the probability that \(R\) of \(N\) goods are chosen.

\[ x_1,x_2,\dots,x_R > 0, \qquad x_{R+1}, x_{R+2}, \dots, x_N = 0. \]

The error scale \((\sigma\)) is identified in this model because price enters the specification without a separate price coefficient The likelihood \(\ell(\theta)\) of the model parameters is proportional to the probability of observing \(n_1\) chosen goods and \(n_2\) goods with zero demand The contribution to the likelihood of the chosen goods is in the form of a probability density while the goods not chosen contribute as a probability mass.

\[\begin{array}{l} \ell (\theta ) \propto p({x_{{n_1}}} > 0,{x_{{n_2}}} = 0|\theta )\\ = |{J_R}|\int_{ - \infty }^{{g_N}} \cdots \int_{ - \infty }^{{g_{R + 1}}} f ({g_1}, \ldots ,{g_R},{\varepsilon _{R + 1}}, \ldots ,{\varepsilon _N})d{\varepsilon _{R + 1}}, \ldots ,d{\varepsilon _N}\\ = |{J_R}|\left\{ {\prod\limits_{j = 1}^R {\frac{{\exp ( - {g_j}/\sigma )}}{\sigma }} \exp \left( { - {e^{ - {g_j}/\sigma }}} \right)} \right\}\left\{ {\prod\limits_{i = R + 1}^N {\exp } \left( { - {e^{ - {g_i}/\sigma }}} \right)} \right\}\\ = |{J_R}|\left\{ {\prod\limits_{j = 1}^R {\frac{{\exp ( - {g_j}/\sigma )}}{\sigma }} } \right\}\exp \left\{ { - \sum\limits_{i = 1}^N {\exp } ( - {g_i}/\sigma )} \right\} \end{array}\]

where \(|J_{R}|\) is the Jacobian of the transformation from random-utility error (\(\varepsilon\)) to the likelihood of the observed data. In this model, the Jacobian is: \[\left| {{J_R}} \right| = \prod\limits_{k = 1}^R {\left( {\frac{\gamma }{{\gamma {x_k} + 1}}} \right)} \left\{ {\sum\limits_{k = 1}^R {\frac{{\gamma {x_k} + 1}}{\gamma }} \cdot\frac{{{p_k}}}{{E - {{\bf{p}}^\prime }{\bf{x}}}} + 1} \right\}\]

The ‘lower level’ of the hierarchical model applies the direct utility model to a specific respondent’s choice data, and the ‘upper level’ of the model incorporates heterogeneity in respondent coefficients.

Respondent heterogeneity can be incorporated into conjoint analysis using a variety of random-effect models. We use a Normal model for heterogeneity by default.

Denoting all individual-level parameters \(\theta_h\) for respondent \(h\) we have: \[ \theta_h \sim \text{Normal}(\bar\theta, \Sigma) \] where \[\theta_h =\left\{ {\beta_h', \gamma_h, E_h,\sigma_h} \right\}\]It is possible to add covariates (\(z\)) to the mean of the heterogeneity distribution as in a regression model: \[ \theta_h \sim \text{Normal}(\Gamma z_h, \Sigma) \]The parameters \(\bar\theta\), \(\Gamma\) and \(\Sigma\) are referred to as hyper-parameters because they describe the distribution of other parameters.
Covariates in the above expression might include demographic variables or other variables collected as part of the conjoint survey, e.g., variables describing reasons to purchase.
rVDev implements a multivariate Regression prior. If no covariates are provided, \(z\) simply contains a vector of 1s, and \(\Gamma\) is equivalent to \(\bar \theta\) in the MVN heterogeneity model.

Priors

Prior is stored in a list. It’s elements are (Gammabar, AGamma, nu, V).

Natural conjugate priors are specified:

\[ \theta \sim MVN({\Gamma}z_h, \Sigma) \] \[\Gamma \sim (\bar \Gamma, A_{\Gamma}^{-1})\]

\[ \Sigma \sim IW(\nu, V) \]

This specification of priors assumes that \(\Gamma\) is independent of \(\Sigma\) and that, conditional on the hyperparameters, the \(\theta_i\)’s are a priori independent.

By default, Priors are set to be weakly informative, with an emphasis on ‘weakly’.

Example: Ice cream conjoint

Here we demonstrate the implementation using the icecream ice cream data available within the package. The dataset contains volumetric choice data for 300 respondents who evaluated packaged ice cream.

Data

The package is built around the idea of using long format data. This makes it easy to select and manipulate data, whether it is coming from a choice experiment or transaction data. The structure of choice data in echoice2 is simple:

Choice data data.frames or tibbles need to contain the following columns:

  • id (integer; respondent identifier)
  • task (integer; task number)
  • alt (integer; alternative number within task)
  • x (double; quantity purchased)
  • p (double; price)
  • attributes defining the choice alternatives (factor)

Continuous attributes are currently not supported, but forthcoming.

The icecream dataset is stored in a tibble:

  data(icecream)
  icecream %>% head
## # A tibble: 6 x 8
##      id  task   alt     x     p Brand     Flavor      Size 
##   <int> <int> <int> <dbl> <dbl> <fct>     <fct>       <ord>
## 1     1     1     1     8 0.998 Store     Neapolitan  16   
## 2     1     1     2     0 0.748 Store     VanillaBean 16   
## 3     1     1     3     0 1.25  BenNJerry Oreo        16   
## 4     1     1     4     0 0.748 BenNJerry Neapolitan  16   
## 5     1     1     5     0 2.49  HaagenDa  RockyRoad   4    
## 6     1     1     6     0 1.25  HaagenDa  Oreo        16

ec_summarize_attrlvls provides a quick glance at the attributes and levels (for categorical attributes).

icecream %>% ec_summarize_attrlvls
## # A tibble: 3 x 2
##   attribute levels                                                              
##   <chr>     <chr>                                                               
## 1 Brand     BenNJerry, BlueBell, BlueBunny, Breyers, Dryers, HaagenDa, Store    
## 2 Flavor    Chocolate, ChocChip, ChocDough, CookieCream, Neapolitan, Oreo, Rock~
## 3 Size      4, 8, 16

Storing choice data in a ‘long’ format allows leveraging all the functionality of dplyr and other tidyverse packages. We can, for instance, quickly generate a summary of average choice quantities per task and plot a histogram. Most respondents choose 5 or less packs of 4 servings per task.

icecream %>% 
  arrange(id,task,alt) %>%
  group_by(id,task) %>%  
  summarise(units=sum(x)) %>%
  group_by(id) %>% summarise(mean_units=mean(units)) %>%
  ggplot(aes(x=mean_units)) + geom_histogram()

Holdout

For hold-out validation, it is common to keep 1 or 2 tasks per respondent. Here, we select 1 task per respondent. Identifiers of the hold-out tasks are stored in ho_tasks. ice_cal and ice_ho contain the calibration and holdout portion of the icecream choice dataset.

#select holdout tasks
  set.seed(8438453)
      ho_tasks=
      icecream %>%
        distinct(id,task) %>%
        mutate(id=as.integer(id))%>%
        group_by(id) %>%
        summarise(task=sample(task,1), .groups = 'drop')
  set.seed(NULL)

#split data
  ice_cal= icecream %>% mutate(id=as.integer(id)) %>%
    anti_join(ho_tasks, by=c('id','task'))
  
  ice_ho= icecream %>% mutate(id=as.integer(id)) %>%
    semi_join(ho_tasks, by=c('id','task'))

Estimation

The standard volumetric demand model is implemented in vd_est_vdm. Other volumetric demand models are implemented in functions starting in vd_est_. Discrete choice models are implemented in functions starting in dd_est_.

In this example, only 20,000 draws are generated to speed up vignette compilation. It takes about a half minute on a modern 16-core PC.

Estimation uses the simple Metropolis-Hastings algorithm, which is very fast but may require thinning. The function automatically determines the number of physical CPU cores and uses openMP for parallel computation. By default, 100,000 draws are generated and every 10th draw is saved. For real-world applications, more draws (R) and more thinning (keep) is recommended.

 icecream_est = ice_cal %>% vd_est_vdm(R=20000)
##  MCMC in progress 
##  Total Time Elapsed: 0.32 minutes

Estimates

Summaries of the upper-level coefficients can be obtained using the ec_estimates_MU function.

icecream_est %>% ec_estimates_MU 
## # A tibble: 21 x 12
##    attribute lvl   par      mean     sd `CI-5%` `CI-95%` sig   model error
##    <chr>     <chr> <chr>   <dbl>  <dbl>   <dbl>    <dbl> <lgl> <chr> <chr>
##  1 <NA>      <NA>  int   -3.25   0.384   -3.47   -3.07   TRUE  VD-c~ EV1  
##  2 Brand     Blue~ Bran~ -0.815  0.151   -1.02   -0.603  TRUE  VD-c~ EV1  
##  3 Brand     Blue~ Bran~ -0.706  0.140   -0.911  -0.500  TRUE  VD-c~ EV1  
##  4 Brand     Brey~ Bran~ -0.0996 0.0857  -0.242   0.0473 FALSE VD-c~ EV1  
##  5 Brand     Drye~ Bran~ -0.645  0.116   -0.816  -0.497  TRUE  VD-c~ EV1  
##  6 Brand     Haag~ Bran~ -0.386  0.0940  -0.534  -0.244  TRUE  VD-c~ EV1  
##  7 Brand     Store Bran~ -0.581  0.115   -0.745  -0.407  TRUE  VD-c~ EV1  
##  8 Flavor    Choc~ Flav~ -0.396  0.119   -0.592  -0.194  TRUE  VD-c~ EV1  
##  9 Flavor    Choc~ Flav~ -0.418  0.124   -0.616  -0.218  TRUE  VD-c~ EV1  
## 10 Flavor    Cook~ Flav~ -0.413  0.111   -0.590  -0.238  TRUE  VD-c~ EV1  
## # ... with 11 more rows, and 2 more variables: reference_lvl <chr>,
## #   parameter <chr>

Diagnostics

Diagnostics start with a look at traceplots. Upper-level means can be checked using the ec_trace_MU function.

icecream_est %>% ec_trace_MU()

Fit

We should check in-sample and out-of-sample fit.

LMD

First, we compare in-sample fit. ec_lmd_NR computes a rough approximation of the marginal Likelihood. It does so for 4 equally-sized chunks of draws, which let’s us double-check convergence.

icecream_est %>% ec_lmd_NR
## # A tibble: 4 x 2
##    part     lmd
##   <dbl>   <dbl>
## 1  0.25   -Inf 
## 2  0.5  -12178.
## 3  0.75 -12143.
## 4  1    -12068.

Holdout

Now, we compare out of sample fit. These are the steps:

  • Ensure the hold-out data is compatible with the in-sample data. This means that the factor levels of the categorical attributes should be identical. Using the prep_newprediction we ensure that factor levels match.

  • Demand simulators are implemented in functions starting in vd_dem_. For the standard volumetric demand model, we use the vd_dem_vdm function.

  • Removing burn-in from draws and additional thinning might be required. vd_thin_draw helps do that. By default, it removes the first half of the draws.

  • vd_dem_eval helps assess fit. It compared demand predictions relative to true values. For volumetric models, MAE, MSE, RAE and bias are computed. The RAE (relative absolute error) is not widely used, but it is the closest to an absolute fit statistic. It should be smaller than 1.

#generate predictions
ho_demand=
    ice_ho %>%   
      prep_newprediction(ice_cal) %>% 
        vd_dem_vdm(icecream_est %>% vd_thin_draw) 
##  Computation in progress

ho_demand adds the .demdraws column to the hold-out data (ice_ho). Each cell in that column contains posterior draws of predicted demand.

ho_demand
## # A tibble: 3,600 x 9
##       id  task   alt     x     p Brand     Flavor       Size  .demdraws  
##  * <int> <int> <int> <dbl> <dbl> <fct>     <fct>        <ord> <list>     
##  1     1     7     1     0  1.25 Store     Neapolitan   16    <dbl [501]>
##  2     1     7     2     0  1.25 Store     CookieCream  16    <dbl [501]>
##  3     1     7     3     0  1.74 BenNJerry Neapolitan   8     <dbl [501]>
##  4     1     7     4     0  1.12 BenNJerry Oreo         16    <dbl [501]>
##  5     1     7     5     0  3.49 BlueBell  VanillaFudge 4     <dbl [501]>
##  6     1     7     6     0  1.74 BlueBell  RockyRoad    8     <dbl [501]>
##  7     1     7     7     0  1.25 BlueBunny VanillaFudge 16    <dbl [501]>
##  8     1     7     8     0  1.25 BlueBunny RockyRoad    16    <dbl [501]>
##  9     1     7     9     0  1.25 Dryers    CookieCream  16    <dbl [501]>
## 10     1     7    10     0  2.00 Dryers    Chocolate    8     <dbl [501]>
## # ... with 3,590 more rows
ho_demand %>%
  ec_dem_eval
## # A tibble: 1 x 4
##     MSE   MAE   bias   RAE
##   <dbl> <dbl>  <dbl> <dbl>
## 1  2.74 0.459 0.0310 0.856

Choice Simulator

Simulating choice quantities and shares is a key feature of echoice2. Demand simulators are implemented in functions starting in vd_dem_.

Defining a scenario

Let’s consider the Haagen-Dasz brand. They line-price most of their classic flavors. Let’s say they usually charge around $4. Overall, we have 10 flavors in this study.

my_scenario = tibble(
  id=1L,task=1L,alt=1:10, 
  Brand=  'HaagenDa',
  Flavor=c("Chocolate", "ChocChip", "ChocDough", "CookieCream", "Neapolitan", "Oreo", "RockyRoad", "Vanilla", "VanillaBean", "VanillaFudge"),
  Size=8,
  p=4)

We need to make sure factor levels match. The prep_newprediction function does that. Notice that attribute levels in my_scenario are just strings. prep_newprediction converts them to factors, with levels matching those in the calibration dataset that was used to estimate the model.

This makes it easy to define whatever scenario for ‘market’ simulation.

my_scenario <- my_scenario %>% prep_newprediction(ice_cal)

Before we can simulate choices for all our respondents, we need to apply this scenario to all 300 respondents from our study.

N = n_distinct(icecream$id)

my_market=
  tibble(
      id = rep(seq_len(N),each=nrow(my_scenario)),
      task = 1,
      alt = rep(1:nrow(my_scenario),N)
    ) %>% 
  bind_cols(
      my_scenario[rep(1:nrow(my_scenario),N),-(1:3)]
    )

Simulating choices

Now we can simulate choices:

my_market_demand <-  my_market %>%
                     vd_dem_vdm(est = icecream_est)
##  Computation in progress

The my_market_demand object contains draws of demand quantities in column .demdraws.

my_market_demand %>% head
## # A tibble: 6 x 8
##      id  task   alt     p Brand    Flavor      Size  .demdraws    
##   <int> <dbl> <int> <dbl> <fct>    <fct>       <fct> <list>       
## 1     1     1     1     4 HaagenDa Chocolate   8     <dbl [2,000]>
## 2     1     1     2     4 HaagenDa ChocChip    8     <dbl [2,000]>
## 3     1     1     3     4 HaagenDa ChocDough   8     <dbl [2,000]>
## 4     1     1     4     4 HaagenDa CookieCream 8     <dbl [2,000]>
## 5     1     1     5     4 HaagenDa Neapolitan  8     <dbl [2,000]>
## 6     1     1     6     4 HaagenDa Oreo        8     <dbl [2,000]>

ec_dem_aggregate helps aggregate demand draws - from individual-task-alternative level to Brand or Brand-Flavor-Size-level. We are not computing posterior means just yet.

my_market_demand %>%
  ec_dem_aggregate(c('Brand','Flavor','Size'))
## # A tibble: 10 x 4
## # Groups:   Brand, Flavor [10]
##    Brand    Flavor       Size  .demdraws    
##    <fct>    <fct>        <fct> <list>       
##  1 HaagenDa Chocolate    8     <dbl [2,000]>
##  2 HaagenDa ChocChip     8     <dbl [2,000]>
##  3 HaagenDa ChocDough    8     <dbl [2,000]>
##  4 HaagenDa CookieCream  8     <dbl [2,000]>
##  5 HaagenDa Neapolitan   8     <dbl [2,000]>
##  6 HaagenDa Oreo         8     <dbl [2,000]>
##  7 HaagenDa RockyRoad    8     <dbl [2,000]>
##  8 HaagenDa Vanilla      8     <dbl [2,000]>
##  9 HaagenDa VanillaBean  8     <dbl [2,000]>
## 10 HaagenDa VanillaFudge 8     <dbl [2,000]>

ec_dem_summarise computes posterior summaries, e.g. posterior means.

my_market_demand %>%
  ec_dem_aggregate(c('Brand','Flavor','Size')) %>%
  ec_dem_summarise()
## # A tibble: 10 x 8
## # Groups:   Brand, Flavor [10]
##    Brand   Flavor     Size  .demdraws   `E(demand)` `S(demand)` `CI-5%` `CI-95%`
##    <fct>   <fct>      <fct> <list>            <dbl>       <dbl>   <dbl>    <dbl>
##  1 Haagen~ Chocolate  8     <dbl [2,00~        14.9        8.93    5.51     27.0
##  2 Haagen~ ChocChip   8     <dbl [2,00~        20.9        9.25    9.95     35.8
##  3 Haagen~ ChocDough  8     <dbl [2,00~        16.7        9.02    7.49     29.1
##  4 Haagen~ CookieCre~ 8     <dbl [2,00~        13.2        8.70    4.78     25.0
##  5 Haagen~ Neapolitan 8     <dbl [2,00~        13.6        8.61    5.05     25.2
##  6 Haagen~ Oreo       8     <dbl [2,00~        11.8        8.92    3.77     21.3
##  7 Haagen~ RockyRoad  8     <dbl [2,00~        18.6        8.83    8.83     30.5
##  8 Haagen~ Vanilla    8     <dbl [2,00~        18.0        9.08    7.62     30.8
##  9 Haagen~ VanillaBe~ 8     <dbl [2,00~        21.2        9.26   10.5      36.7
## 10 Haagen~ VanillaFu~ 8     <dbl [2,00~        18.5        8.67    8.68     30.3

We can also look at other levels of aggregation, e.g. brand. Since only one brand was included in the choice scenario, we effectively obtain aggregate demand quantities for all inside goods.

my_market_demand %>%
  ec_dem_aggregate(c('Brand')) %>%
  ec_dem_summarise()
## # A tibble: 1 x 6
##   Brand    .demdraws     `E(demand)` `S(demand)` `CI-5%` `CI-95%`
##   <fct>    <list>              <dbl>       <dbl>   <dbl>    <dbl>
## 1 HaagenDa <dbl [2,000]>        168.        64.6    132.     198.

Demand Curves

Elasticities in volumetric demand models are not constant. To better understand implications of pricing decisions, demand curves can be a helpful tool.

The ec_demcurve function can be applied to all demand simulators. Based of an initial market scenario, it generates a series of scenarios where the price of a focal product is changed over an interval. It then runs the demand simulator several times, and we can use the output to draw a demand curve.

Pre-simulating error terms using ec_gen_err_ev1 helps to smooth these demand curves. It generates one error term per draw for each of the alternatives and tasks in the corresponding design dataset.

##  Computation in progress 
##  Computation in progress 
##  Computation in progress 
##  Computation in progress 
##  Computation in progress

The output of ec_demcurve is a list, containing demand summaries for each product under the different price scenarios. The list can be stacked into a data.frame, which can be used to generate plots.

my_demcurve %>% do.call('rbind',.) %>%
  ggplot(aes(x=scenario,y=`E(demand)`,color=Flavor)) + geom_line()