song_0019.wav
469 KB
Make the benchmark pipeline produce reusable release artifacts from actual evaluation results so model iterations can be tracked, reviewed, and shipped with evidence.

Constraint: Continuous training only helps if each stage emits durable reports and release metadata
Rejected: Keep artifact generation as a disconnected smoke utility | would block repeatable release discipline
Confidence: high
Scope-risk: moderate
Directive: Next iterations should improve hard-case metrics on real/whitelisted datasets and keep artifact generation on every training milestone
Tested: synthetic_v2 data regeneration; 2-epoch CPU training; index build; fast evaluation JSON export; artifact generation to reports/smoke-v2/synthetic_v2
Not-tested: full melody-aware slow evaluation as release default; real external dataset benchmark generation
cnb.bofCdSsphPA authored