Preserve the first cap64 score before the second strategy finishes
Constraint: The cap64 run has only produced the high_energy leg so far, so any larger conclusion must wait for hybrid and the final report
Rejected: Wait for report.json before checkpointing | Would lose the verified cap64 high_energy score and the proof that execution has already switched into the hybrid branch
Confidence: high
Scope-risk: narrow
Directive: Do not compare cap64 strategy winners until both legs and the final report land; treat the current 0.625 high_energy score as an intermediate checkpoint only
Tested: Verified high_energy eval.json reports num_queries=32, top1=0.625, topk=1.0; verified progress.json records the same result; verified the active process has switched to the hybrid smoke-local branch and report.json is still absent
Not-tested: Final cap64 hybrid metrics, final report.json, and any cap64-based strategy conclusion
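The checks listed under Tested could be scripted roughly as follows. This is a minimal sketch, not the project's tooling: the helper names `read_leg_metrics` and `checkpoint_status` are hypothetical, and it assumes that `eval.json` is a flat JSON object holding `num_queries`, `top1`, and `topk`, and that it sits at `<benchmark dir>/<leg>/fma_reports_smoke/eval.json`. The schema of `progress.json` is not shown in this commit, so it is not read here.

```python
import json
from pathlib import Path


def read_leg_metrics(eval_path):
    """Return (num_queries, top1, topk) from one eval.json file.

    Assumes a flat JSON object with exactly these keys (an assumption).
    """
    data = json.loads(Path(eval_path).read_text(encoding="utf-8"))
    return data["num_queries"], data["top1"], data["topk"]


def checkpoint_status(bench_dir, leg="high_energy"):
    """Summarize one finished leg and whether the final report exists yet."""
    bench = Path(bench_dir)
    # Assumed layout: <bench_dir>/<leg>/fma_reports_smoke/eval.json
    eval_path = bench / leg / "fma_reports_smoke" / "eval.json"
    num_queries, top1, topk = read_leg_metrics(eval_path)
    return {
        "leg": leg,
        "num_queries": num_queries,
        "top1": top1,
        "topk": topk,
        # Number of queries whose top-1 match was correct.
        "top1_hits": round(top1 * num_queries),
        "report_present": (bench / "report.json").is_file(),
    }
```

With the values recorded in this commit (`num_queries=32`, `top1=0.625`), `top1_hits` comes out as 20, and `report_present` stays `False` until the final `report.json` is written.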
Showing 4 changed files with 28 additions and 5 deletions
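| 1 | ## 2026-06-02 cap64 high_energy first-result checkpoint | ||
| 2 | |||
| 3 | Completed: | ||
| 4 | - Obtained the first evaluation result for `high_energy` in cap64. | ||
| 5 | - The main pipeline has switched from `high_energy` to the `hybrid` branch, which shows cap64 is still running. | ||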
| 6 | |||
| 7 | Verification evidence: | ||
| 8 | - `high_energy/fma_reports_smoke/eval.json`: | ||
| 9 | - `num_queries=32` | ||
| 10 | - `top1=0.625` | ||
| 11 | - `topk=1.0` | ||
| 12 | - `progress.json` records the same result. | ||
| 13 | - The active process shows: | ||
| 14 | - `external_adapters.py smoke-local ... /tmp/ab_smoke_seg_cap64_top2/hybrid` | ||
| 15 | - `manifest_tools.py audio-dir-to-splits ... --query-strategy hybrid` | ||
| 16 | |||
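| 17 | Notes: | ||
| 18 | - As of this checkpoint, the `hybrid` result has not been generated, and the overall `report.json` has not been generated either. | ||
| 19 | |||
| 1 | ## 2026-06-02 cap64 indexing complete, evaluation started checkpoint | 20 | ## 2026-06-02 cap64 indexing complete, evaluation started checkpoint |
| 2 | 21 | ||
| 3 | Completed: | 22 | Completed: | ... | ... |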
| ... | @@ -68,3 +68,5 @@ cd /workspace/acr-engine | ... | @@ -68,3 +68,5 @@ cd /workspace/acr-engine |
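| 68 | - Recorded that the cap64 benchmark has started and confirmed it entered the `high_energy` training stage. | 68 | - Recorded that the cap64 benchmark has started and confirmed it entered the `high_energy` training stage. |
| 69 | 69 | ||
| 70 | - Added fresh cap64 evidence: `high_energy` indexing complete (`64 refs / 657 windows / 192-d`) and `evaluate.py` started. | 70 | - Added fresh cap64 evidence: `high_energy` indexing complete (`64 refs / 657 windows / 192-d`) and `evaluate.py` started. |
| 71 | |||
| 72 | - Added the first cap64 result: `high_energy = top1 0.625 / topk 1.0 / num_queries 32`, and recorded the main pipeline's switch to `hybrid`. | ... | ... |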
| ... | @@ -60,5 +60,6 @@ test -f /tmp/ab_smoke_seg_cap48_top2_seed999/report.json && cat /tmp/ab_smoke_se | ... | @@ -60,5 +60,6 @@ test -f /tmp/ab_smoke_seg_cap48_top2_seed999/report.json && cat /tmp/ab_smoke_se |
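| 60 | ## Next round started | 60 | ## Next round started |
| 61 | 61 | ||
| 62 | - New benchmark: `/tmp/ab_smoke_seg_cap64_top2` | 62 | - New benchmark: `/tmp/ab_smoke_seg_cap64_top2` |
| 63 | - Current stage: `high_energy` indexing is complete; evaluation is in progress | 63 | - Current stage: `high_energy` evaluation is complete, with `top1=0.625 / topk=1.0 / num_queries=32` |
| 64 | - The next session should first check whether `report.json` has been generated | 64 | - Execution has now switched to the `hybrid` branch |
| 65 | - The next session should first check whether the `hybrid` result and `report.json` have been generated | ... | ... |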
| ... | @@ -240,7 +240,7 @@ | ... | @@ -240,7 +240,7 @@ |
| 240 | - `hybrid`: `mean_top1=0.8750, min=0.7917, max=0.9583, stdev=0.0680` | 240 | - `hybrid`: `mean_top1=0.8750, min=0.7917, max=0.9583, stdev=0.0680` |
| 241 | 241 | ||
| 242 | ### Top-priority to-dos | 242 | ### Top-priority to-dos |
| 243 | 1. Follow up on the running cap64 benchmark: `/tmp/ab_smoke_seg_cap64_top2/report.json`. | 243 | 1. Follow up on the cap64 `hybrid` result and the final `/tmp/ab_smoke_seg_cap64_top2/report.json`. |
| 244 | 2. After cap64 finishes, update `open-dataset-workflow.md / session-handoff.md / CHANGELOG.md`. | 244 | 2. After cap64 finishes, update `open-dataset-workflow.md / session-handoff.md / CHANGELOG.md`. |
| 245 | 3. Next, add a bucket/style-aware benchmark. | 245 | 3. Next, add a bucket/style-aware benchmark. |
| 246 | 4. Keep optimizing `hybrid`, focusing on reducing variance and improving hard-case stability. | 246 | 4. Keep optimizing `hybrid`, focusing on reducing variance and improving hard-case stability. |
| ... | @@ -675,6 +675,7 @@ seed123 final conclusion: | ... | @@ -675,6 +675,7 @@ seed123 final conclusion: |
| 675 | - Started: `/tmp/ab_smoke_seg_cap64_top2` | 675 | - Started: `/tmp/ab_smoke_seg_cap64_top2` |
| 676 | - Config: `subset_size=64`, `max_test_queries=32`, `seed=42` | 676 | - Config: `subset_size=64`, `max_test_queries=32`, `seed=42` |
| 677 | - Latest evidence: | 677 | - Latest evidence: |
| 678 | - `high_energy` reference index complete: `64 refs / 657 windows / 192-d` | 678 | - `high_energy` evaluation complete: `num_queries=32, top1=0.625, topk=1.0` |
| 679 | - Now in `evaluate.py`; the final `report.json` has not been produced yet | 679 | - The cap64 main pipeline has switched to the `hybrid` branch |
| 680 | - The overall `report.json` has not been generated yet | ||
| 680 | 681 | ... | ... |
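The directive in the commit message (no strategy comparison until both legs and the final report exist) could be enforced with a small guard like the one below. It is a sketch under assumptions: the function names are hypothetical, and it assumes the `hybrid` leg writes its `eval.json` at the same relative path as `high_energy`, which this commit does not confirm.

```python
from pathlib import Path

# Both legs must finish before any cap64 winner is declared.
REQUIRED_LEGS = ("high_energy", "hybrid")


def missing_artifacts(bench_dir, legs=REQUIRED_LEGS):
    """List the expected result files that do not exist yet."""
    bench = Path(bench_dir)
    # Assumed layout: every leg mirrors high_energy/fma_reports_smoke/eval.json
    expected = [bench / leg / "fma_reports_smoke" / "eval.json" for leg in legs]
    expected.append(bench / "report.json")
    return [p.relative_to(bench).as_posix() for p in expected if not p.is_file()]


def can_compare_strategies(bench_dir):
    """True only when every leg and the final report are on disk."""
    return not missing_artifacts(bench_dir)
```

Called on `/tmp/ab_smoke_seg_cap64_top2` at the time of this checkpoint, `missing_artifacts` would be expected to list the `hybrid` evaluation file and `report.json`, so `can_compare_strategies` would return `False`.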