Adaptive Convex Enveloping for Multivariate Convex Dynamic Programming Open Access
Downloadable ContentDownload PDF
We propose a new method, called Adaptive Convex Enveloping, for solving convex stochastic dynamic programs (DP) with multidimensional continuous state and decision variables. The new method approximates the value functions with supporting hyperplanes, which is convexity-preserving. As a result, the method is able to handle large numbers of decision variables and constraints with the speed and reliability of convex optimizations. To construct a supporting hyperplane approximation, we use a recursive partitioning algorithm, which "learns" the shape of the underlying function and adds supporting hyperplanes adaptively. The approximation so constructed uses supporting hyperplanes economically, and guarantees that the error is less than a tolerance for all points in a continuous state space.Error bounds of the solution can be developed using convexity. Both the optimal value function and the value function of the obtained policy are bounded below by the solution of Adaptive Convex Enveloping, and bounded above by the solution plus the total tolerance. These bounds are directly available without having to do any simulation, and they are simple numbers, rather than analysis terms, thus they provide confidence. The performance of the obtained policy can also be simulated and used as a tighter upper bound of the optimal value function.An exciting feature of Adaptive Convex Enveloping is that it is very much standardized. It saves the user the time spent on trial and error for designing the approximation functions and tuning the parameters. For the optimization, our method calls existing mathematical optimization libraries. So to solve a DP, the user only needs to write models for the libraries to read, which is a similar experience as doing mathematical optimizations. In all, the method is convenient for modeling and solving real world problems.We apply Adaptive Convex Enveloping to management of battery exchange stations to find the optimal policy for charging a large number of electric vehicle batteries. We solve two variants of the problem: one with known but changing electricity prices and stochastic customer demands, the other with an additional option for the station to discharge the batteries in the inventory and sell electricity back to the grid.