Commit 5dadbae3 5dadbae34777735c0eeadf12c825fb47d38ae6a8 by cnb.bofCdSsphPA

Checkpoint the larger cap32 benchmark before results land

Preserve the new 32-track top-two benchmark entry point and current build-index phase so a later session can continue the stronger validation run without losing runtime context.

Constraint: The cap32 benchmark is still running, so only execution-state evidence is available
Rejected: Wait for cap32 results before recording anything | Risks losing the larger-benchmark checkpoint if the session ends first
Confidence: high
Scope-risk: narrow
Directive: Replace the cap32 running-state section with measured scores once hybrid eval.json and report.json land
Tested: Verified active cap32 processes; verified handoff records work-root, subset size, query cap, and current build-index phase
Not-tested: cap32 strategy scores because the run is still in progress
1 parent 08379e56
......@@ -2,6 +2,28 @@
## 2026-06-02
### Stage: Launch the cap32 top2 real-FMA comparison and record the run phase
Completed:
- Launched a larger real-FMA top2 benchmark:
  - `work_root = /tmp/ab_smoke_seg_cap32_top2`
  - `subset_size = 32`
  - `max_test_queries = 20`
  - Strategies: `hybrid` vs `high_energy`
- Updated [session-handoff.md](./session-handoff.md)
Current fresh evidence:
- `scripts/ab_smoke_segmentation.py ... --work-root /tmp/ab_smoke_seg_cap32_top2` has been started
- The current first lane is:
  - `hybrid`
- The run has now entered:
  - `run_demo.py build-index --resume --checkpoint-every-refs 100`
- `report.json` has not been written to disk yet
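Conclusions:
- Validation is now under way to check whether the cap24 conclusion still holds on the larger `subset=32`
- Even if the current session ends, a new session can pick up directly from the cap32 entry point in the handoff and keep watching for results
### Stage: Wrap up the cap24 top2 real-FMA comparison and confirm the default strategy
Completed: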
......
......@@ -413,6 +413,48 @@ Final cap24 top2 conclusion:
- `hybrid`: `16 / 1.0 / 1.0`
- `high_energy`: `16 / 0.8125 / 1.0`
- This result is more telling than cap16: **the current default strategy should be explicitly fixed to `hybrid`**
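
As a quick sanity check on the size of that gap, the rates can be converted back into hit counts. This is only a sketch: it assumes each triple reads as `queries / top-1 rate / top-2 rate`, which the log does not state, and the helper name `hits` is made up for illustration.

```python
# Assumption: each triple is (num_queries, top1_rate, top2_rate).
# The log does not label these columns, so treat this reading as a guess.
def hits(num_queries: int, rate: float) -> int:
    """Convert an accuracy rate back into a whole number of hits."""
    return round(num_queries * rate)

# Values copied from the cap24 top2 conclusion.
results = {
    "hybrid": (16, 1.0, 1.0),
    "high_energy": (16, 0.8125, 1.0),
}

for name, (n, top1, top2) in results.items():
    print(f"{name}: top1 {hits(n, top1)}/{n}, top2 {hits(n, top2)}/{n}")
```

Under that reading, `high_energy` misses 3 of 16 queries at top-1 and recovers all of them within the top two, while `hybrid` misses none.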
---
## 11. cap32 top2 comparison experiment (in progress)
To confirm that the cap24 conclusion is not a fluke, a larger real-FMA top2 comparison has been started:
```bash
cd /workspace/acr-engine
/usr/local/miniconda3/bin/python scripts/ab_smoke_segmentation.py \
--dataset fma \
--input-dir data/raw/fma_small_audio \
--work-root /tmp/ab_smoke_seg_cap32_top2 \
--subset-size 32 \
--query-duration 8 \
--train-epochs 1 \
--batch-size 2 \
--device cpu \
--strategies hybrid high_energy \
--max-test-queries 20 \
--output-json /tmp/ab_smoke_seg_cap32_top2/report.json
```
Fresh evidence confirmed so far:
| Item | Status |
|---|---|
| `subset_size` | `32` |
| `max_test_queries` | `20` |
| First strategy to run | `hybrid` |
| Current phase | `run_demo.py build-index --resume --checkpoint-every-refs 100` |
| `report.json` | Not yet generated |
Recovery check command:
```bash
pgrep -af 'ab_smoke_seg_cap32_top2|external_adapters.py smoke-local fma /tmp/ab_smoke_seg_cap32_top2|evaluate.py --data /tmp/ab_smoke_seg_cap32_top2|run_demo.py build-index --data /tmp/ab_smoke_seg_cap32_top2|train.py --data /tmp/ab_smoke_seg_cap32_top2'
```
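
If a later session prefers to run this check from Python, a minimal sketch could look like the following. It shells out to the same `pgrep -af` and parses the result. The function names are hypothetical, and it assumes `pgrep` is on `PATH` and that the work-root name contains no regex metacharacters.

```python
from __future__ import annotations

import subprocess

# Directory name of the work root used by the cap32 run.
WORK_ROOT_NAME = "ab_smoke_seg_cap32_top2"


def parse_pgrep_output(text: str) -> list[tuple[int, str]]:
    """Split `pgrep -af` output lines into (pid, command line) pairs."""
    rows = []
    for line in text.splitlines():
        pid, _, cmd = line.strip().partition(" ")
        if pid.isdigit():
            rows.append((int(pid), cmd))
    return rows


def active_processes(name: str = WORK_ROOT_NAME) -> list[tuple[int, str]]:
    """Return processes whose full command line mentions `name`."""
    result = subprocess.run(
        ["pgrep", "-af", name],
        capture_output=True,
        text=True,
        check=False,  # pgrep exits with status 1 when nothing matches
    )
    return parse_pgrep_output(result.stdout)
```

Every alternative in the pattern above contains the work-root directory name, so matching on that name alone selects the same processes. An empty list from `active_processes()` means no cap32 process was found.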
Files to wait for first:
- `/tmp/ab_smoke_seg_cap32_top2/hybrid/fma_reports_smoke/eval.json`
- `/tmp/ab_smoke_seg_cap32_top2/report.json`
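
Waiting on these two files can be scripted with a small polling loop. The sketch below is illustrative only: the function names, poll interval, and timeout are assumptions, and it checks only that the files exist, not what they contain, because their schema is not described here.

```python
import time
from pathlib import Path

# The two result files named above.
EXPECTED_OUTPUTS = [
    "/tmp/ab_smoke_seg_cap32_top2/hybrid/fma_reports_smoke/eval.json",
    "/tmp/ab_smoke_seg_cap32_top2/report.json",
]


def missing_outputs(paths):
    """Return the subset of `paths` that do not exist as regular files yet."""
    return [p for p in paths if not Path(p).is_file()]


def wait_for_outputs(paths, poll_seconds=60.0, timeout_seconds=None):
    """Poll until every path exists.

    Returns True once all files are present, or False if
    `timeout_seconds` elapses first (None means wait indefinitely).
    """
    deadline = None if timeout_seconds is None else time.monotonic() + timeout_seconds
    while True:
        if not missing_outputs(paths):
            return True
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
```

A resuming session could call `wait_for_outputs(EXPECTED_OUTPUTS, timeout_seconds=3600)` and fall back to the process check if it returns `False`. A file that exists may still be mid-write, so it is safer to read it only after the producing process has exited.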
- `b766c74` Make open-dataset manifests trainable end to end
- `fa23144` Add a single-page open dataset workflow for training prep
- `af33be3` Condense docs and add manifest validation before training
......