Abstract
More and more brick-and-mortar retailers are opening an online channel to increase sales. Often, they use the store to fulfil online orders and to receive returned products. Uncertain product returns, however, complicate the retailer's replenishment decision. In addition, the inventory has to be rationed over the offline and online sales channels. We therefore integrate the rationing and ordering decisions of an omni-channel retailer in a Markov Decision Process (MDP) that maximises the retailer's profit. Contrary to previous studies, we explicitly model multi-period sales-dependent returns, which is more realistic and leads to higher profit and service levels. With Value Iteration (VI), an exact solution can be computed only for relatively small-scale instances. To solve large-scale instances, we construct a Deep Reinforcement Learning (DRL) algorithm. The different methods are compared in an extensive numerical study of small-scale instances to gain insights. The results show that the running time of VI increases exponentially in the problem size, whereas the running time of DRL is high but scales well. DRL has a low optimality gap, but its performance drops when there is a higher level of uncertainty or when the profit trade-off between different actions is small. Our approach of modelling multi-period sales-dependent product returns outperforms other methods. Furthermore, based on large-scale instances, we find that increasing online returns lowers the profit and the service level in the offline channel. Longer return windows, however, do not influence the retailer's profit.
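To illustrate the exact solution approach at toy scale, the following is a minimal value-iteration sketch for a heavily simplified version of the problem: the state is the on-hand inventory only (the paper's MDP additionally tracks past online sales to model multi-period sales-dependent returns), the action is a joint order quantity and offline rationing level, and all parameters (`MAX_INV`, `PRICE_OFF`, the demand distributions, and so on) are hypothetical placeholders, not values from the paper.

```python
import itertools
import numpy as np

# Toy parameters -- all hypothetical, not taken from the paper.
MAX_INV = 10                     # inventory capacity
MAX_ORDER = 5                    # maximum order quantity per period
PRICE_OFF, PRICE_ON = 10.0, 9.0  # unit revenue per channel
COST = 4.0                       # unit purchasing cost
HOLD = 0.5                       # per-unit holding cost per period
GAMMA = 0.99                     # discount factor

# Truncated demand distributions per channel (illustrative pmfs).
d_off = {0: 0.3, 1: 0.4, 2: 0.3}
d_on = {0: 0.4, 1: 0.4, 2: 0.2}

def step(inv, order, reserve_off, dem_off, dem_on):
    """One-period transition: the order arrives, then stock is
    rationed by reserving up to `reserve_off` units for the store."""
    stock = min(inv + order, MAX_INV)
    sold_off = min(dem_off, reserve_off, stock)
    sold_on = min(dem_on, stock - sold_off)
    next_inv = stock - sold_off - sold_on
    reward = (PRICE_OFF * sold_off + PRICE_ON * sold_on
              - COST * order - HOLD * next_inv)
    return next_inv, reward

# Value iteration over the one-dimensional inventory state.
V = np.zeros(MAX_INV + 1)
for sweep in range(1000):
    V_new = np.empty_like(V)
    for inv in range(MAX_INV + 1):
        best = -np.inf
        # Joint action: order quantity and offline rationing level.
        for order, reserve in itertools.product(range(MAX_ORDER + 1),
                                                range(MAX_INV + 1)):
            q = 0.0
            for dem_off, p_off in d_off.items():
                for dem_on, p_on in d_on.items():
                    nxt, r = step(inv, order, reserve, dem_off, dem_on)
                    q += p_off * p_on * (r + GAMMA * V[nxt])
            best = max(best, q)
        V_new[inv] = best
    delta = np.max(np.abs(V_new - V))
    V = V_new
    if delta < 1e-6:
        break
```

At this toy scale the full sweep over states, joint actions, and demand outcomes is trivial; it is exactly this enumeration, growing exponentially once the state also carries the multi-period returns pipeline, that motivates the DRL approximation for large instances.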
| Original language | English |
|---|---|
| Pages (from-to) | 1248-1263 |
| Journal | European Journal of Operational Research |
| Volume | 306 |
| Issue number | 3 |
| Early online date | 2022 |
| DOIs | |
| Publication status | Published - May 2023 |
Keywords
- Deep reinforcement learning
- Inventory
- Markov decision process
- Online returns
- Replenishment & rationing