Commit 75fa5e93 75fa5e932efe8d1c0980dba3081216c18c1af885 by cnb.bofCdSsphPA

Provide a runnable semantic-bucket template so the next benchmark step can start immediately

Constraint: Keep the checkpoint lightweight and avoid touching dataset or model artifacts
Rejected: Wait to add buckets until automatic semantic labeling exists | Manual curated buckets are enough to unblock the next session now
Confidence: high
Scope-risk: narrow
Directive: Use the template as a curated benchmark scaffold, not as evidence that filenames imply semantics
Tested: Parsed the new JSON template; ran ab_smoke_bucketed.py --help; rechecked targeted relative links
Not-tested: Did not launch a new semantic bucket benchmark run in this checkpoint
1 parent 1bdca61b
...@@ -53,6 +53,7 @@ ...@@ -53,6 +53,7 @@
53 ## 5. Current resume priorities 53 ## 5. Current resume priorities
54 54
55 1. Upgrade the toy prefix buckets to semantic buckets. 55 1. Upgrade the toy prefix buckets to semantic buckets.
56 - Template entry point: `acr-engine/configs/buckets/fma_semantic_bucket_template.json`
56 2. Add the cap64 multi-seed aggregate. 57 2. Add the cap64 multi-seed aggregate.
57 3. Update: 58 3. Update:
58 - `docs/open-dataset-workflow.md` 59 - `docs/open-dataset-workflow.md`
......
1 {
2 "notes": {
3 "purpose": "Template for semantic/style-aware bucket benchmarking on local FMA-like trees.",
4 "how_to_use": "Replace placeholder glob patterns with your own curated track groups before running ab_smoke_bucketed.py.",
5 "warning": "Do not treat filename prefixes as product semantics; this file is for manually curated semantic buckets."
6 },
7 "buckets": [
8 {
9 "name": "energy_dominant",
10 "patterns": [
11 "fma_small/*/REPLACE_WITH_HIGH_ENERGY_TRACKS_*.mp3"
12 ],
13 "subset_size": 16,
14 "label_hint": "chorus-heavy or consistently high-energy songs"
15 },
16 {
17 "name": "repeated_section_rich",
18 "patterns": [
19 "fma_small/*/REPLACE_WITH_REPEATED_SECTION_TRACKS_*.mp3"
20 ],
21 "subset_size": 16,
22 "label_hint": "clear repeating hook/chorus structure"
23 },
24 {
25 "name": "steady_beat_regular_meter",
26 "patterns": [
27 "fma_small/*/REPLACE_WITH_STEADY_BEAT_TRACKS_*.mp3"
28 ],
29 "subset_size": 16,
30 "label_hint": "stable beat, strong downbeat, regular meter"
31 },
32 {
33 "name": "hard_negative_confusable",
34 "patterns": [
35 "fma_small/*/REPLACE_WITH_CONFUSABLE_TRACKS_*.mp3"
36 ],
37 "subset_size": 16,
38 "label_hint": "sonically similar tracks likely to trigger confusion"
39 }
40 ]
41 }
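Since the template ships with placeholder globs, a quick check before running the benchmark can catch buckets that were never curated. The following is a minimal sketch, not part of the commit: the helper name `find_unreplaced_patterns` is hypothetical, and it assumes only the JSON layout shown above (`buckets[].name`, `buckets[].patterns`) and the `REPLACE_WITH_` marker used in the placeholder patterns.

```python
import json

# Marker that appears in every placeholder glob of the template above.
PLACEHOLDER = "REPLACE_WITH_"


def find_unreplaced_patterns(config: dict) -> dict:
    """Return {bucket name: [patterns still containing the placeholder marker]}.

    Hypothetical helper for illustration; it only inspects the JSON structure
    and never touches the filesystem.
    """
    leftovers = {}
    for bucket in config.get("buckets", []):
        stale = [p for p in bucket.get("patterns", []) if PLACEHOLDER in p]
        if stale:
            leftovers[bucket["name"]] = stale
    return leftovers


# Inline sample mirroring two template entries: the first has been curated by
# hand (the glob is an invented example), the second still has its placeholder.
SAMPLE = json.loads("""
{
  "buckets": [
    {"name": "energy_dominant",
     "patterns": ["fma_small/000/0001*.mp3"],
     "subset_size": 16},
    {"name": "repeated_section_rich",
     "patterns": ["fma_small/*/REPLACE_WITH_REPEATED_SECTION_TRACKS_*.mp3"],
     "subset_size": 16}
  ]
}
""")

if __name__ == "__main__":
    for name, patterns in find_unreplaced_patterns(SAMPLE).items():
        print(f"bucket {name!r} still has placeholder patterns: {patterns}")
```

To check the real file, load it with `json.load` and pass the resulting dict to the same function; an empty result means every placeholder glob has been replaced.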
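1 ## 2026-06-02 Semantic bucket template delivery checkpoint
2
3 Completed:
4 - Added a semantic bucket config template: `acr-engine/configs/buckets/fma_semantic_bucket_template.json`
5 - Added the template entry point and run command to the workflow / benchmark / handoff docs.
6
7 First batch of buckets covered by the template: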
8 - `energy_dominant`
9 - `repeated_section_rich`
10 - `steady_beat_regular_meter`
11 - `hard_negative_confusable`
12
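13 Conclusions:
14 - The next session no longer needs to design the bucket structure from scratch.
15 - It can replace the globs in the template directly and start running bucket benchmarks with more product relevance.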
16
1 ## 2026-06-02 bucket/style-aware benchmark summary completed checkpoint 17 ## 2026-06-02 bucket/style-aware benchmark summary completed checkpoint
2 18
3 Completed: 19 Completed:
......
...@@ -117,3 +117,9 @@ flowchart LR ...@@ -117,3 +117,9 @@ flowchart LR
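117 - At the aggregate level, both have a `mean_top1` of `1.0` 117 - At the aggregate level, both have a `mean_top1` of `1.0`
118 118
119 So the current value of the bucket benchmark is not to "pick a single winner" but to provide a reusable execution framework for the semantic and hard-case buckets that follow. 119 So the current value of the bucket benchmark is not to "pick a single winner" but to provide a reusable execution framework for the semantic and hard-case buckets that follow.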
120
121
122 Recommended template:
123 - [../acr-engine/configs/buckets/fma_semantic_bucket_template.json](../acr-engine/configs/buckets/fma_semantic_bucket_template.json)
124
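125 It is not an automatic labeler; it is an execution template for the workflow "curate buckets by hand first, then reuse the unified benchmark pipeline".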
......
...@@ -367,3 +367,32 @@ cd acr-engine ...@@ -367,3 +367,32 @@ cd acr-engine
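367 Current conclusions: 367 Current conclusions:
368 - The bucket baseline already reliably reproduces the finding that different subsets select different winners. 368 - The bucket baseline already reliably reproduces the finding that different subsets select different winners.
369 - The next step is not more prefix toy buckets but an upgrade to buckets with more product relevance. 369 - The next step is not more prefix toy buckets but an upgrade to buckets with more product relevance.
370
371 Recommended: start directly from the template: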
372 - [../acr-engine/configs/buckets/fma_semantic_bucket_template.json](../acr-engine/configs/buckets/fma_semantic_bucket_template.json)
373
374 Hand-pick a batch of tracks first, then replace the globs with your own candidate sets, covering these buckets first:
375 1. `energy_dominant`
376 2. `repeated_section_rich`
377 3. `steady_beat_regular_meter`
378 4. `hard_negative_confusable`
379
380 Corresponding command:
381
382 ```bash
383 cd /workspace/acr-engine
384 /usr/local/miniconda3/bin/python scripts/ab_smoke_bucketed.py \
385 --dataset fma \
386 --input-dir data/raw/fma_small_audio \
387 --bucket-config configs/buckets/fma_semantic_bucket_template.json \
388 --work-root /tmp/ab_smoke_bucketed_semantic \
389 --default-subset-size 16 \
390 --query-duration 8 \
391 --train-epochs 1 \
392 --batch-size 2 \
393 --device cpu \
394 --strategies high_energy hybrid \
395 --max-test-queries 8 \
396 --seed 42 \
397 --output-json /tmp/ab_smoke_bucketed_semantic/report.json
398 ```
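Before launching the command above, it may help to confirm that each curated bucket matches at least `subset_size` files. The sketch below is an assumption-laden illustration, not the behavior of `ab_smoke_bucketed.py`: the source does not say which directory the patterns are resolved against, so the root is passed in explicitly, and the helper names `count_bucket_matches` and `undersized_buckets` are hypothetical.

```python
from pathlib import Path


def count_bucket_matches(config: dict, root: Path) -> dict:
    """Count distinct files matched by each bucket's glob patterns under root.

    Assumption: patterns are relative globs such as "fma_small/*/NAME_*.mp3".
    """
    counts = {}
    for bucket in config["buckets"]:
        matched = set()
        for pattern in bucket["patterns"]:
            # Path.glob expands the relative pattern below root.
            matched.update(p for p in root.glob(pattern) if p.is_file())
        counts[bucket["name"]] = len(matched)
    return counts


def undersized_buckets(config: dict, root: Path, default_subset_size: int = 16) -> dict:
    """Return {bucket name: (matched, wanted)} for buckets with too few files.

    The fallback mirrors the --default-subset-size value in the command above;
    whether the script applies it the same way is an assumption.
    """
    counts = count_bucket_matches(config, root)
    short = {}
    for bucket in config["buckets"]:
        wanted = bucket.get("subset_size", default_subset_size)
        if counts[bucket["name"]] < wanted:
            short[bucket["name"]] = (counts[bucket["name"]], wanted)
    return short
```

A possible use is to load the edited template with `json.load`, call `undersized_buckets(config, Path("data/raw"))` (the root here is a guess), and fix any bucket it reports before spending time on a benchmark run.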
......
...@@ -255,6 +255,7 @@ ...@@ -255,6 +255,7 @@
255 255
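256 ### Top-priority to-dos 256 ### Top-priority to-dos
257 1. Upgrade the completed toy bucket baseline to semantic buckets (style / structure / hard-case). 257 1. Upgrade the completed toy bucket baseline to semantic buckets (style / structure / hard-case).
258 - Template: `acr-engine/configs/buckets/fma_semantic_bucket_template.json`
258 2. Compare the inconsistent results between cap48 and cap64 and add conclusions broken down by scale. 259 2. Compare the inconsistent results between cap48 and cap64 and add conclusions broken down by scale.
259 3. Keep adding cap64 multi-seed runs instead of retaining only a single seed. 260 3. Keep adding cap64 multi-seed runs instead of retaining only a single seed.
260 4. Keep optimizing `hybrid`, focusing on reducing variance and improving stability on hard cases. 261 4. Keep optimizing `hybrid`, focusing on reducing variance and improving stability on hard cases.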
......