001 BC Baseline
Behavior cloning collection and supervised warm-start training metrics.
Reproducible summaries for the public AscensionAI snapshot. Raw logs, checkpoints, and rollouts stay out of git; this page links to the publishable reports and registry.
Parallel rollout collection, offline trainer updates, and stale-rollout behavior.
Heuristic, BC, and PPO checkpoint comparison on deterministic seeds.
~2,500 PPO games, 311 trainer updates, and the May 14 fixed-seed comparison.
4,136 PPO games, 515 trainer updates, and the May 16 150-game evaluation.

| Run | Games | Average Floor | Average Reward | Notes |
|---|---|---|---|---|
| Heuristic 150-game eval | 150 | 15.78 | 8.44 | Current reference policy on the wider eval. |
| BC checkpoint 150-game eval | 150 | 12.81 | -0.55 | Playable warm start, but below the heuristic. |
| PPO checkpoint 150-game eval | 150 | 14.70 | 2.37 | Still above BC, but below heuristic on the wider sample. |
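
To make the gaps in the table explicit, here is a minimal sketch that recomputes each checkpoint's difference from the heuristic reference. The values are copied from the table above; the names `RESULTS` and `gap_to_heuristic` are hypothetical and are not taken from the AscensionAI codebase.

```python
# Values copied from the 150-game evaluation table.
# RESULTS and gap_to_heuristic are illustrative names, not project APIs.
RESULTS = {
    "heuristic": {"avg_floor": 15.78, "avg_reward": 8.44},
    "bc": {"avg_floor": 12.81, "avg_reward": -0.55},
    "ppo": {"avg_floor": 14.70, "avg_reward": 2.37},
}


def gap_to_heuristic(run: str) -> dict:
    """Return the run's metrics minus the heuristic's, rounded to 2 decimals."""
    ref = RESULTS["heuristic"]
    return {key: round(RESULTS[run][key] - ref[key], 2) for key in ref}


if __name__ == "__main__":
    for run in ("bc", "ppo"):
        print(run, gap_to_heuristic(run))
```

On these numbers, the PPO checkpoint trails the heuristic by 1.08 floors and 6.07 reward, while the BC checkpoint trails it by 2.97 floors and 8.99 reward.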