Promote cap48 guidance once the third seed confirmed the stable winner
Constraint: Strategy guidance had to wait until the full seed=999 report landed and all three cap48 runs could be aggregated consistently
Rejected: Keep treating cap48 as unresolved | The third seed now confirms high_energy repeats the same score while hybrid remains volatile
Confidence: high
Scope-risk: narrow
Directive: Treat high_energy as the cap48 default only within the documented FMA smoke condition until larger cap64 and bucketed benchmarks either confirm or overturn it
Tested: Verified seed=999 report.json, high_energy eval.json, hybrid eval.json, and computed three-seed aggregate showing high_energy mean_top1=0.9167 with zero variance versus hybrid mean_top1=0.8750
Not-tested: cap64-or-larger benchmarks, bucket/style-aware evaluations, and any future hybrid redesign
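
Note: the three-seed aggregate cited under Tested can be reproduced along the following lines. This is a minimal sketch, not the script used for this commit. The function name `aggregate_top1` and the seed labels `seed_a` and `seed_b` are hypothetical (only seed=999 is named here). The high_energy values follow from the reported mean of 0.9167 with zero variance, which implies every seed scored the same. Per-seed hybrid scores are not recorded in this message, so they are left out rather than guessed.

```python
from statistics import mean, pvariance


def aggregate_top1(scores_by_seed):
    """Return (mean, population variance) of top-1 scores across seeds."""
    values = list(scores_by_seed.values())
    return mean(values), pvariance(values)


# Assumption: seed labels other than 999 are placeholders.
# Zero variance with mean 0.9167 means each seed reported 0.9167.
high_energy = {"seed_a": 0.9167, "seed_b": 0.9167, "999": 0.9167}

mean_top1, var_top1 = aggregate_top1(high_energy)
print(f"high_energy mean_top1={mean_top1:.4f} variance={var_top1:.4f}")
```

Running the same function over the hybrid per-seed scores from each eval.json would give the comparison figure; a non-zero variance there is what "volatile" refers to.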
Showing 4 changed files with 48 additions and 13 deletions