Generative AI Planning Robustness

Authors

  • Venkata Harikishan Koppuravuri

Keywords:

autoregressive forecasting, multimodal agents, self-reflection, hallucination mitigation, information theory.

Abstract

Recent studies of Claude have shown that large language models can perform implicit forward simulation, pre-emptively planning rhymes, goals, and syntactic paths before emitting tokens. Despite this evidence of internal planning, the cumulative deviation between a model's initial latent plan and its final output, which we call planning drift, has not been formally defined or quantified in prior work. This paper presents a rigorous framework for analyzing and regulating planning drift in autoregressive generative systems. Drift Entropy, defined as δ_t = Σ_{k=1}^{t} KL(P_k^(0) ‖ P_k^(t)), quantifies distributional drift in the predicted token probabilities across generation horizons and modalities. Empirical validation uses the MPC-1k benchmark, which comprises 1,000 expert-annotated multimodal planning chains combining text, image, and code with ground-truth plan representations and hallucination annotations. The Reflection-as-Constraint (RaC) protocol, which periodically injects self-reflective tokens to constrain policy execution, is proposed and evaluated on LLaMA-3, LLaVA, CodeLlama, and Chameleon-70B, achieving an average 41% reduction in drift. Drift correlates strongly with hallucination (ρ=0.87, p<0.001), and early drift measured at step 50 predicts hallucination with an AUC of 0.91. RaC consistently improves drift reduction on seven of eight multimodal tasks, outperforming Chain-of-Thought and Tree-of-Thought baselines. The framework characterizes the nature of internal planning and its failure modes, enabling drift-safe alignment and improved robustness in agentic multimodal systems.
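As an illustration only (not code from the paper), the sketch below shows how the Drift Entropy quantity could be computed from per-position token distributions using PyTorch; the function name, tensor shapes, and logit inputs are assumptions introduced here for clarity.

import torch
import torch.nn.functional as F

def drift_entropy(initial_logits: torch.Tensor, current_logits: torch.Tensor) -> torch.Tensor:
    """Sketch of Drift Entropy: δ_t = Σ_{k=1}^{t} KL(P_k^(0) ‖ P_k^(t)).

    initial_logits: logits implied by the initial latent plan, shape (t, vocab_size)
    current_logits: logits at generation step t for the same positions, shape (t, vocab_size)
    """
    log_p0 = F.log_softmax(initial_logits, dim=-1)   # log P_k^(0)
    log_pt = F.log_softmax(current_logits, dim=-1)   # log P_k^(t)
    # Per-position KL(P^(0) || P^(t)), summed over the vocabulary dimension
    kl_per_pos = F.kl_div(log_pt, log_p0, log_target=True, reduction="none").sum(dim=-1)
    # Cumulative drift over positions k = 1..t
    return kl_per_pos.sum()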

Published

2025-11-06