Provide a runnable semantic-bucket template so the next benchmark step can start immediately

Constraint: Keep the checkpoint lightweight and avoid touching dataset or model artifacts
Rejected: Wait to add buckets until automatic semantic labeling exists | Manual curated buckets are enough to unblock the next session now
Confidence: high
Scope-risk: narrow
Directive: Use the template as a curated benchmark scaffold, not as evidence that filenames imply semantics
Tested: Parsed the new JSON template; ran ab_smoke_bucketed.py --help; rechecked targeted relative links
Not-tested: Did not launch a new semantic bucket benchmark run in this checkpoint

Commit 75fa5e93 ... 75fa5e932efe8d1c0980dba3081216c18c1af885 authored 2026-06-02 18:51:59 +0800 by cnb.bofCdSsphPA
Showing 6 changed files with 94 additions and 0 deletions
AGENT.md
acr-engine/configs/buckets/fma_semantic_bucket_template.json
docs/CHANGELOG.md
docs/industrial-benchmark-spec.md
docs/open-dataset-workflow.md
docs/session-handoff.md
--- a/AGENT.md
+++ b/AGENT.md
@@ -53,6 +53,7 @@
 ## 5. Current priorities for resuming work
 1. Upgrade the toy prefix buckets to semantic buckets.
+   - Template entry point: `acr-engine/configs/buckets/fma_semantic_bucket_template.json`
 2. Add the cap64 multi-seed aggregate.
 3. Update:
   - `docs/open-dataset-workflow.md`
--- a/acr-engine/configs/buckets/fma_semantic_bucket_template.json 0 → 100644
+++ b/acr-engine/configs/buckets/fma_semantic_bucket_template.json 0 → 100644
+{
+  "notes": {
+    "purpose": "Template for semantic/style-aware bucket benchmarking on local FMA-like trees.",
+    "how_to_use": "Replace placeholder glob patterns with your own curated track groups before running ab_smoke_bucketed.py.",
+    "warning": "Do not treat filename prefixes as product semantics; this file is for manually curated semantic buckets."
+  },
+  "buckets": [
+    {
+      "name": "energy_dominant",
+      "patterns": [
+        "fma_small/*/REPLACE_WITH_HIGH_ENERGY_TRACKS_*.mp3"
+      ],
+      "subset_size": 16,
+      "label_hint": "chorus-heavy or consistently high-energy songs"
+    },
+    {
+      "name": "repeated_section_rich",
+      "patterns": [
+        "fma_small/*/REPLACE_WITH_REPEATED_SECTION_TRACKS_*.mp3"
+      ],
+      "subset_size": 16,
+      "label_hint": "clear repeating hook/chorus structure"
+    },
+    {
+      "name": "steady_beat_regular_meter",
+      "patterns": [
+        "fma_small/*/REPLACE_WITH_STEADY_BEAT_TRACKS_*.mp3"
+      ],
+      "subset_size": 16,
+      "label_hint": "stable beat, strong downbeat, regular meter"
+    },
+    {
+      "name": "hard_negative_confusable",
+      "patterns": [
+        "fma_small/*/REPLACE_WITH_CONFUSABLE_TRACKS_*.mp3"
+      ],
+      "subset_size": 16,
+      "label_hint": "sonically similar tracks likely to trigger confusion"
+    }
+  ]
+}
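The template ships with placeholder globs that must be replaced before a run. A pre-flight check along the following lines could catch a config that is still unedited. This is a minimal sketch, not part of the commit: `check_bucket_config`, `PLACEHOLDER_MARKER`, and `REQUIRED_KEYS` are hypothetical names, and the set of required keys is assumed from the fields visible in the template, not from what `ab_smoke_bucketed.py` actually enforces.

```python
import json

# Assumption: every placeholder glob in the template contains this marker.
PLACEHOLDER_MARKER = "REPLACE_WITH_"
# Assumption: these are the keys a bucket needs, based on the template's fields.
REQUIRED_KEYS = ("name", "patterns", "subset_size")


def check_bucket_config(config: dict) -> list[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    buckets = config.get("buckets", [])
    if not buckets:
        problems.append("config has no buckets")
    seen_names = set()
    for index, bucket in enumerate(buckets):
        label = bucket.get("name", f"<bucket #{index}>")
        for key in REQUIRED_KEYS:
            if key not in bucket:
                problems.append(f"{label}: missing key '{key}'")
        if label in seen_names:
            problems.append(f"{label}: duplicate bucket name")
        seen_names.add(label)
        for pattern in bucket.get("patterns", []):
            if PLACEHOLDER_MARKER in pattern:
                problems.append(f"{label}: placeholder pattern not replaced: {pattern}")
    return problems


if __name__ == "__main__":
    # Inline sample mirroring the first bucket of the template.
    sample = json.loads("""
    {
      "buckets": [
        {
          "name": "energy_dominant",
          "patterns": ["fma_small/*/REPLACE_WITH_HIGH_ENERGY_TRACKS_*.mp3"],
          "subset_size": 16
        }
      ]
    }
    """)
    for problem in check_bucket_config(sample):
        print(problem)
```

Run against the unedited template, every bucket would be flagged, which matches the template's own warning that it is a scaffold for manually curated groups.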
--- a/docs/CHANGELOG.md
+++ b/docs/CHANGELOG.md
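+## 2026-06-02 Semantic bucket template delivery checkpoint
+Completed:
+- Added a semantic bucket config template: `acr-engine/configs/buckets/fma_semantic_bucket_template.json`
+- Added the template entry point and run command to the workflow, benchmark, and handoff docs.
+First buckets covered by the template:
+- `energy_dominant`
+- `repeated_section_rich`
+- `steady_beat_regular_meter`
+- `hard_negative_confusable`
+Conclusions:
+- The next session no longer needs to design the bucket structure from scratch.
+- Replace the globs in the template and start running more business-relevant bucket benchmarks right away.
 ## 2026-06-02 bucket/style-aware benchmark summary completed checkpoint
 Completed: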
--- a/docs/industrial-benchmark-spec.md
+++ b/docs/industrial-benchmark-spec.md
@@ -117,3 +117,9 @@ flowchart LR
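 - At the aggregate level, both have a `mean_top1` of `1.0`
 So the bucket benchmark's current value is not to "pick a single winner" but to provide a reusable execution framework for later semantic and hard-case buckets.
+Recommended template:
+- [../acr-engine/configs/buckets/fma_semantic_bucket_template.json](../acr-engine/configs/buckets/fma_semantic_bucket_template.json)
+It is not an automatic labeler; it is an execution template for "curate buckets by hand first, then reuse the unified benchmark flow".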
--- a/docs/open-dataset-workflow.md
+++ b/docs/open-dataset-workflow.md
@@ -367,3 +367,32 @@ cd acr-engine
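 Current conclusions:
 - The bucket baseline already reliably reproduces the finding that different subsets select different winners.
 - The next step is not more prefix toy buckets but an upgrade to more business-relevant buckets.
+Recommended: start directly from the template:
+- [../acr-engine/configs/buckets/fma_semantic_bucket_template.json](../acr-engine/configs/buckets/fma_semantic_bucket_template.json)
+Hand-pick a batch of songs first, then replace the globs with your own candidate sets, covering these buckets first:
+1. `energy_dominant`
+2. `repeated_section_rich`
+3. `steady_beat_regular_meter`
+4. `hard_negative_confusable`
+Corresponding command: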
+```bash
+cd /workspace/acr-engine
+/usr/local/miniconda3/bin/python scripts/ab_smoke_bucketed.py \
+  --dataset fma \
+  --input-dir data/raw/fma_small_audio \
+  --bucket-config configs/buckets/fma_semantic_bucket_template.json \
+  --work-root /tmp/ab_smoke_bucketed_semantic \
+  --default-subset-size 16 \
+  --query-duration 8 \
+  --train-epochs 1 \
+  --batch-size 2 \
+  --device cpu \
+  --strategies high_energy hybrid \
+  --max-test-queries 8 \
+  --seed 42 \
+  --output-json /tmp/ab_smoke_bucketed_semantic/report.json
+```
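Before launching the command above with edited globs, it may help to confirm that each bucket matches at least `subset_size` files. The sketch below assumes patterns are relative globs resolved against a single root directory; how `ab_smoke_bucketed.py` actually resolves patterns against `--input-dir` is not shown in this commit, so treat the root as something to adjust. `count_bucket_matches` and `undersized_buckets` are hypothetical helper names.

```python
from pathlib import Path


def count_bucket_matches(config: dict, root_dir: str) -> dict[str, int]:
    """Count distinct files matched by each bucket's glob patterns under root_dir."""
    root = Path(root_dir)
    counts = {}
    for bucket in config["buckets"]:
        matched = set()
        for pattern in bucket["patterns"]:
            # Path.glob only accepts relative patterns, as in the template.
            matched.update(p for p in root.glob(pattern) if p.is_file())
        counts[bucket["name"]] = len(matched)
    return counts


def undersized_buckets(config: dict, root_dir: str, default_subset_size: int = 16) -> list[str]:
    """Return names of buckets whose match count is below their subset_size."""
    counts = count_bucket_matches(config, root_dir)
    return [
        bucket["name"]
        for bucket in config["buckets"]
        if counts[bucket["name"]] < bucket.get("subset_size", default_subset_size)
    ]
```

A call such as `undersized_buckets(json.load(open(path)), root)` returns the buckets that still need more curated tracks; an empty list suggests every bucket can supply its requested subset under the assumed resolution rule.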
--- a/docs/session-handoff.md
+++ b/docs/session-handoff.md
@@ -255,6 +255,7 @@
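 ### Top-priority to-dos
 1. Upgrade the completed toy bucket baseline to semantic buckets (style / structure / hard-case).
+   - Template: `acr-engine/configs/buckets/fma_semantic_bucket_template.json`
 2. Compare the inconsistency between cap48 and cap64 and add per-scale conclusions.
 3. Keep adding cap64 multi-seed runs rather than keeping only a single seed.
 4. Keep optimizing `hybrid`, focusing on reducing variance and improving hard-case stability.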