C6 – Fast, Safe, and Perspicuous Run-Time Planning

In dynamic environments, it is hard or impossible to model and address at design time all circumstances that may arise; run-time planning serves to take decisions flexibly in response to current events. To do so under time pressure, a trained action policy π (an ML classifier mapping states to actions, ideally a deep neural network, DNN) can be employed. Our mission is to help tackle the perspicuity and safety issues arising in this context. We aim at a form of model-based “sandboxing”, safeguarding π via run-time lookahead search and synergy with design-time analysis, debugging π via static and dynamic analyses of the behaviours it induces, supporting the manual understanding of π through zoomable policy-behaviour visualisations, and making π’s decisions plausible through plan dependencies and the effects that alternative choices would have. To these ends, we are advancing techniques for, among others, nogood learning, supervised learning of classifiers respecting safety constraints, predicate abstractions over ML-policy behaviour, test-case generation, and subgoal structure analysis. For visualisation, we are cooperating closely with Project E4.
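
To illustrate the idea of safeguarding π via run-time lookahead, the following is a minimal sketch and not the project's actual method. It assumes a deterministic toy model, and every name in it (`shielded_action`, `is_safe_within`, the `toy_*` helpers, `greedy_policy`) is hypothetical. Before the policy's preferred action is executed, a bounded-depth search checks whether the successor state still admits some action sequence that avoids unsafe states; if not, another applicable action that passes the check is chosen instead.

```python
from typing import Callable, Iterable, Optional

State = int
Action = int


def is_safe_within(state: State, depth: int,
                   actions: Callable[[State], Iterable[Action]],
                   step: Callable[[State, Action], State],
                   unsafe: Callable[[State], bool]) -> bool:
    """True if `state` is safe and some action sequence of length
    `depth` starting from it visits only safe states."""
    if unsafe(state):
        return False
    if depth == 0:
        return True
    return any(is_safe_within(step(state, a), depth - 1, actions, step, unsafe)
               for a in actions(state))


def shielded_action(state: State, policy: Callable[[State], Action], depth: int,
                    actions: Callable[[State], Iterable[Action]],
                    step: Callable[[State, Action], State],
                    unsafe: Callable[[State], bool]) -> Optional[Action]:
    """Return the policy's action if it passes the lookahead check,
    otherwise the first alternative that does, or None if none does."""
    preferred = policy(state)
    candidates = [preferred] + [a for a in actions(state) if a != preferred]
    for a in candidates:
        if is_safe_within(step(state, a), depth, actions, step, unsafe):
            return a
    return None


# Hypothetical toy model: positions 0..5 on a line. Position 5 is unsafe,
# and position 4 is a slope from which every action slides into 5.
def toy_actions(state: State) -> Iterable[Action]:
    return (-1, +1)


def toy_step(state: State, action: Action) -> State:
    if state >= 4:
        return 5
    return max(0, state + action)


def toy_unsafe(state: State) -> bool:
    return state == 5


def greedy_policy(state: State) -> Action:
    # Stand-in for a trained policy: always move right.
    return +1


if __name__ == "__main__":
    for s in (1, 3):
        a = shielded_action(s, greedy_policy, 1, toy_actions, toy_step, toy_unsafe)
        print(f"state {s}: execute action {a}")
```

In this toy model, the policy's move from position 3 onto the slope is overridden when the lookahead depth is 1, whereas a depth of 0 only checks the immediate successor and lets the move through. The lookahead depth therefore trades decision time against the range of unsafe outcomes that can be anticipated.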