DeepMind's AlphaProof combines a language model with AlphaZero-style reinforcement learning to produce formal proofs that are machine-checked in Lean. Together with AlphaGeometry 2, it reached silver-medal level on the problems of the 2024 International Mathematical Olympiad.
Elements of recursive AI self-improvement are already in production (Constitutional AI, reinforcement learning from AI feedback, automated evaluators), but the full recursive loop, in which AI autonomously generates stronger successors, remains constrained by evaluation reliability and alignment gaps.
The scarcest resource in embodied AI isn't compute or algorithms — it's high-quality demonstration data recorded in real physical environments at scale.