Checkpoint the larger cap32 benchmark before results land
Preserve the new 32-track top-two benchmark entry point and current build-index phase so a later session can continue the stronger validation run without losing runtime context.

Constraint: The cap32 benchmark is still running, so only execution-state evidence is available
Rejected: Wait for cap32 results before recording anything | Risks losing the larger-benchmark checkpoint if the session ends first
Confidence: high
Scope-risk: narrow
Directive: Replace the cap32 running-state section with measured scores once hybrid eval.json and report.json land
Tested: Verified active cap32 processes; verified handoff records work-root, subset size, query cap, and current build-index phase
Not-tested: cap32 strategy scores because the run is still in progress
Showing 2 changed files with 64 additions and 0 deletions
| ... | @@ -2,6 +2,28 @@ | ... | @@ -2,6 +2,28 @@ |
| 2 | 2 | ||
| 3 | ## 2026-06-02 | 3 | ## 2026-06-02 |
| 4 | 4 | ||
| 5 | ### Stage: Start the cap32 top2 real-FMA comparison and record the run phase | ||
| 6 | |||
| 7 | Completed: | ||
| 8 | - Started a larger real-FMA top2 benchmark: | ||
| 9 | - `work_root = /tmp/ab_smoke_seg_cap32_top2` | ||
| 10 | - `subset_size = 32` | ||
| 11 | - `max_test_queries = 20` | ||
| 12 | - Strategies: `hybrid` vs `high_energy` | ||
| 13 | - Updated [session-handoff.md](./session-handoff.md) | ||
| 14 | |||
| 15 | Current fresh evidence: | ||
| 16 | - `scripts/ab_smoke_segmentation.py ... --work-root /tmp/ab_smoke_seg_cap32_top2` has been started | ||
| 17 | - Current first lane: | ||
| 18 | - `hybrid` | ||
| 19 | - Current phase: | ||
| 20 | - `run_demo.py build-index --resume --checkpoint-every-refs 100` | ||
| 21 | - `report.json` has not been written to disk yet | ||
| 22 | |||
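| 23 | Conclusions: | ||
| 24 | - Validation has begun on whether the cap24 conclusion still holds on the larger `subset=32` | ||
| 25 | - Even if the current session ends, a new session can resume monitoring directly from the cap32 entry point in the handoff | ||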
| 26 | |||
| 5 | ### Stage: Wrap up the cap24 top2 real-FMA comparison and confirm the default strategy | 27 | ### Stage: Wrap up the cap24 top2 real-FMA comparison and confirm the default strategy |
| 6 | 28 | ||
| 7 | Completed: | 29 | Completed: | ... | ... |
| ... | @@ -413,6 +413,48 @@ cap24 top2 final conclusion: | ... | @@ -413,6 +413,48 @@ cap24 top2 final conclusion: |
| 413 | - `hybrid`: `16 / 1.0 / 1.0` | 413 | - `hybrid`: `16 / 1.0 / 1.0` |
| 414 | - `high_energy`: `16 / 0.8125 / 1.0` | 414 | - `high_energy`: `16 / 0.8125 / 1.0` |
| 415 | - This result is more conclusive than cap16: **the current default strategy should be explicitly fixed to `hybrid`** | 415 | - This result is more conclusive than cap16: **the current default strategy should be explicitly fixed to `hybrid`** |
| 416 | |||
| 417 | --- | ||
| 418 | |||
| 419 | ## 11. cap32 top2 comparison experiment (in progress) | ||
| 420 | |||
| 421 | To confirm that the cap24 conclusion is not a fluke, a larger real-FMA top2 comparison has been started: | ||
| 422 | |||
| 423 | ```bash | ||
| 424 | cd /workspace/acr-engine | ||
| 425 | /usr/local/miniconda3/bin/python scripts/ab_smoke_segmentation.py \ | ||
| 426 | --dataset fma \ | ||
| 427 | --input-dir data/raw/fma_small_audio \ | ||
| 428 | --work-root /tmp/ab_smoke_seg_cap32_top2 \ | ||
| 429 | --subset-size 32 \ | ||
| 430 | --query-duration 8 \ | ||
| 431 | --train-epochs 1 \ | ||
| 432 | --batch-size 2 \ | ||
| 433 | --device cpu \ | ||
| 434 | --strategies hybrid high_energy \ | ||
| 435 | --max-test-queries 20 \ | ||
| 436 | --output-json /tmp/ab_smoke_seg_cap32_top2/report.json | ||
| 437 | ``` | ||
| 438 | |||
| 439 | Fresh evidence confirmed so far: | ||
| 440 | |||
| 441 | | Item | Status | | ||
| 442 | |---|---| | ||
| 443 | | `subset_size` | `32` | | ||
| 444 | | `max_test_queries` | `20` | | ||
| 445 | | First strategy to run | `hybrid` | | ||
| 446 | | Current phase | `run_demo.py build-index --resume --checkpoint-every-refs 100` | | ||
| 447 | | `report.json` | Not yet generated | | ||
| 448 | |||
| 449 | Recovery check command: | ||
| 450 | |||
| 451 | ```bash | ||
| 452 | pgrep -af 'ab_smoke_seg_cap32_top2|external_adapters.py smoke-local fma /tmp/ab_smoke_seg_cap32_top2|evaluate.py --data /tmp/ab_smoke_seg_cap32_top2|run_demo.py build-index --data /tmp/ab_smoke_seg_cap32_top2|train.py --data /tmp/ab_smoke_seg_cap32_top2' | ||
| 453 | ``` | ||
| 454 | |||
| 455 | Files to wait for first: | ||
| 456 | - `/tmp/ab_smoke_seg_cap32_top2/hybrid/fma_reports_smoke/eval.json` | ||
| 457 | - `/tmp/ab_smoke_seg_cap32_top2/report.json` | ||
| 416 | - `b766c74` Make open-dataset manifests trainable end to end | 458 | - `b766c74` Make open-dataset manifests trainable end to end |
| 417 | - `fa23144` Add a single-page open dataset workflow for training prep | 459 | - `fa23144` Add a single-page open dataset workflow for training prep |
| 418 | - `af33be3` Condense docs and add manifest validation before training | 460 | - `af33be3` Condense docs and add manifest validation before training | ... | ... |
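
The handoff section in this diff names two result files to wait for. As a minimal sketch of how a later session might check for them, the following helper lists which of those files are still missing under the work root. The helper name `missing_outputs` and the constant `EXPECTED_OUTPUTS` are hypothetical and not part of the repository; only the two paths come from the diff.

```python
from pathlib import Path

# Relative result paths copied from the handoff notes in the diff.
# The names below (EXPECTED_OUTPUTS, missing_outputs) are illustrative only.
EXPECTED_OUTPUTS = (
    "hybrid/fma_reports_smoke/eval.json",
    "report.json",
)


def missing_outputs(work_root):
    """Return the expected result files that do not exist yet under work_root."""
    root = Path(work_root)
    return [rel for rel in EXPECTED_OUTPUTS if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_outputs("/tmp/ab_smoke_seg_cap32_top2")
    if missing:
        print("still waiting for:", ", ".join(missing))
    else:
        print("all result files present")
```

An empty result only means the files exist; it says nothing about whether the benchmark processes have exited, so the `pgrep` check from the diff is still worth running alongside it.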