AscensionAI

Experiment Reports

Reproducible summaries for the public AscensionAI snapshot. Raw logs, checkpoints, and rollouts stay out of git; this page links to the publishable reports and registry.

001 BC Baseline

Behavior cloning collection and supervised warm-start training metrics.

Read report

002 Parallel PPO

Parallel rollout collection, offline trainer updates, and stale-rollout behavior.

Read report

003 Fixed-Seed Eval

Heuristic, BC, and PPO checkpoint comparison on deterministic seeds.

Read report

004 Long PPO Eval

~2,500 PPO games, 311 trainer updates, and the May 14 fixed-seed comparison.

Read report

005 4,136 PPO Eval

4,136 PPO games, 515 trainer updates, and the May 16 150-game evaluation.

Read report

Snapshot Summary

| Run | Games | Average Floor | Average Reward | Notes |
| --- | --- | --- | --- | --- |
| Heuristic 150-game eval | 150 | 15.78 | 8.44 | Current reference policy on the wider eval. |
| BC checkpoint 150-game eval | 150 | 12.81 | -0.55 | Playable warm start below heuristic. |
| PPO checkpoint 150-game eval | 150 | 14.70 | 2.37 | Still above BC, but below heuristic on the wider sample. |
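
The notes column compares each checkpoint against the heuristic reference. As a minimal sketch of that comparison, the snippet below recomputes the gaps from the three summary rows. The `EvalRow` type, the short run keys, and the `gaps_to_reference` helper are hypothetical names chosen for illustration; they are not taken from the AscensionAI code or its registry format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalRow:
    """One row of the snapshot summary (hypothetical structure)."""
    run: str
    games: int
    avg_floor: float
    avg_reward: float


# Values copied from the snapshot summary; the run keys are illustrative.
ROWS = [
    EvalRow("heuristic", 150, 15.78, 8.44),
    EvalRow("bc", 150, 12.81, -0.55),
    EvalRow("ppo", 150, 14.70, 2.37),
]


def gaps_to_reference(rows, reference="heuristic"):
    """Return {run: (floor_gap, reward_gap)} relative to the reference run."""
    ref = next(r for r in rows if r.run == reference)
    return {
        r.run: (
            round(r.avg_floor - ref.avg_floor, 2),
            round(r.avg_reward - ref.avg_reward, 2),
        )
        for r in rows
        if r.run != reference
    }


gaps = gaps_to_reference(ROWS)
for run, (floor_gap, reward_gap) in gaps.items():
    print(f"{run}: floor {floor_gap:+.2f}, reward {reward_gap:+.2f} vs heuristic")
```

Negative gaps mean the checkpoint trails the heuristic. These are differences of averages over 150 games each, so they say nothing about variance or statistical significance.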

Open machine-readable experiment registry