Production scheduling in a wide range of batch plants involves minimizing the tardiness of already-scheduled batches when new orders are inserted. This problem is addressed here as learning an "order insertion policy" through intensive simulation in the framework of approximate dynamic programming (ADP). Simple insertion operators are defined, and the values of choosing them in the different schedule states encountered by an incoming order are learnt using a Q-learning algorithm. To generalize the values of insertion operators across schedule states, a locally weighted regression technique is used. The results obtained highlight that simulation-based heuristic learning is very appealing for increasing the responsiveness of scheduling and planning systems when handling disruptive events.
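As a rough illustration of the two ingredients named above, the following sketch shows a standard Q-learning backup and a simple way to generalize stored values across schedule states. It is not the authors' implementation: the function names, the feature-vector encoding of a schedule state, and the parameter defaults are all hypothetical, and the generalization step uses a Gaussian kernel-weighted average, which is the simplest (locally constant) form of locally weighted regression.

```python
import math


def q_update(q_old, reward, next_q_values, alpha=0.1, gamma=0.9):
    """One Q-learning backup: Q <- Q + alpha * (r + gamma * max Q' - Q).

    Assumption: `reward` could be, for example, the negative tardiness
    caused by applying an insertion operator; the abstract does not say.
    """
    # A terminal state (no further operators to choose) contributes 0.
    best_next = max(next_q_values) if next_q_values else 0.0
    return q_old + alpha * (reward + gamma * best_next - q_old)


def lwr_predict(query, samples, bandwidth=1.0):
    """Estimate the value of one insertion operator at a new schedule state.

    `query` is a hypothetical numeric feature vector describing the state;
    `samples` is a list of (feature_vector, q_value) pairs already learnt
    for that operator. Nearby states get larger Gaussian weights.
    """
    num = 0.0
    den = 0.0
    for features, q_value in samples:
        dist2 = sum((a - b) ** 2 for a, b in zip(query, features))
        weight = math.exp(-dist2 / (2.0 * bandwidth ** 2))
        num += weight * q_value
        den += weight
    # With no samples (or all weights underflowing), fall back to 0.
    return num / den if den > 0.0 else 0.0
```

In such a scheme, the insertion operator chosen for an incoming order would be the one with the highest predicted value at the current schedule state.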