Stochastic dynamic programming is a technique for multi-period decision making under uncertainty. A classic illustration is the case of a risk-neutral gambler who enters a game with a stake x and a goal of leaving the game with a stake N > x. In each round the gambler places a wager from his or her current stake, then either wins an equal amount with probability p or loses the wager with probability q = 1 - p. The game continues until the gambler either reaches the goal or goes bust. The gambler is said to be risk neutral because he or she cares only about the expected value of the outcome, not about the range of possible outcomes. The problem is to find a betting strategy that maximizes the expected value of the game.
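The game dynamics are easy to simulate directly. The following Python sketch (illustrative only, and not part of the GMPL models; the strategy names and the parameter values p = 0.4, x = 4, N = 10 are chosen here for demonstration) estimates the win probability of two simple strategies by Monte Carlo:

```python
import random

def play_game(stake, goal, p, bet_rule, rng):
    """Play one game of the gambler's problem; return the final stake (goal or 0)."""
    while 0 < stake < goal:
        wager = bet_rule(stake, goal)
        if rng.random() < p:
            stake += wager   # win: gain an amount equal to the wager
        else:
            stake -= wager   # lose: forfeit the wager
    return stake

# "Timid play" always bets one unit; "bold play" bets everything
# needed to reach the goal (or the whole stake, if that is less).
timid = lambda stake, goal: 1
bold = lambda stake, goal: min(stake, goal - stake)

rng = random.Random(42)
trials = 20000
p, x, N = 0.4, 4, 10   # an unfavorable game: p < 0.5

win_timid = sum(play_game(x, N, p, timid, rng) == N for _ in range(trials)) / trials
win_bold = sum(play_game(x, N, p, bold, rng) == N for _ in range(trials)) / trials
```

In an unfavorable game (p < 0.5) the simulation shows bold play reaching the goal far more often than timid play, which is the classic result for subfair games.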
The usual setup for solving this problem relies on Bellman's 'principle of optimality', which establishes a recursive expression for the expected value of the game as a function of the current stake. The resulting 'Bellman equation' can be solved for an optimal strategy by a well-known technique called 'policy iteration'. Alternatively, the same equation can be solved by linear programming, as demonstrated in the GMPL file RNGambling.mod.
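For concreteness, the Bellman equation for the risk-neutral gambler can also be solved by value iteration, a close relative of policy iteration. The sketch below (not taken from RNGambling.mod) assumes integer wagers capped at min(x, N - x) so the stake never overshoots the goal, with boundary conditions V(0) = 0 and V(N) = N:

```python
def solve_gambler(N, p, tol=1e-10, max_iter=100000):
    """Value iteration for the risk-neutral gambler's Bellman equation:

        V(x) = max over wagers u in 1..min(x, N-x) of p*V(x+u) + (1-p)*V(x-u),

    with V(0) = 0 and V(N) = N. Returns the value function and a greedy policy.
    """
    q = 1.0 - p
    V = [0.0] * (N + 1)
    V[N] = float(N)
    for _ in range(max_iter):
        delta = 0.0
        for x in range(1, N):
            best = max(p * V[x + u] + q * V[x - u]
                       for u in range(1, min(x, N - x) + 1))
            delta = max(delta, abs(best - V[x]))
            V[x] = best
        if delta < tol:
            break
    # Greedy policy: the first (smallest) maximizing wager at each stake
    policy = [0] * (N + 1)
    for x in range(1, N):
        policy[x] = max(range(1, min(x, N - x) + 1),
                        key=lambda u: p * V[x + u] + q * V[x - u])
    return V, policy

V, policy = solve_gambler(N=10, p=0.4)
```

With p = 0.4 the computed value at a stake of 5 is 4.0 (bold play wins with probability p), and every V(x) stays at or below x, consistent with wealth being a supermartingale in a subfair game.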
A related example is the case of a risk-averse gambler, who enters the game for a predetermined number of rounds with a goal of maximizing the expected 'utility' of the final outcome. Utility is a concave function of wealth that penalizes losses more heavily than it rewards gains. The GMPL model RAGambling.mod demonstrates that this problem can also be solved with linear programming. The example is closely related to the well-known 'Kelly criterion'.
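For an even-money bet won with probability p > 1/2, the Kelly criterion prescribes wagering the fraction f* = 2p - 1 of current wealth, the value that maximizes the expected logarithm of wealth (a concave utility). A short Python sketch, illustrative and separate from RAGambling.mod, checks this numerically with p = 0.55:

```python
from math import log

def expected_log_growth(f, p):
    """Expected log-growth per round when betting fraction f of wealth
    on an even-money outcome that wins with probability p."""
    return p * log(1 + f) + (1 - p) * log(1 - f)

# Grid search over betting fractions in [0, 1); for even odds the
# Kelly formula gives f* = 2p - 1, which the grid maximizer matches.
p = 0.55
fracs = [i / 1000 for i in range(0, 1000)]
best_f = max(fracs, key=lambda f: expected_log_growth(f, p))
```

The concavity of log utility is what makes the gambler risk averse: over-betting beyond f* reduces expected log-growth even though each individual bet is favorable.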