Add a detailed handoff doc for future development sessions
Constraint: New sessions need a fast, durable understanding of the project state, open-dataset workflow, verified evidence, and next steps
Rejected: Rely on scattered docs and git history alone | Too slow for session handoff and easy to miss critical workflow context
Confidence: high
Scope-risk: narrow
Directive: Keep this handoff doc updated whenever a major workflow milestone or verified capability changes
Tested: existence checks for docs/session-handoff.md and docs/README.md, plus docs index link presence
Not-tested: Manual human review across multiple markdown renderers
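The `Tested:` trailer mentions existence checks and a docs index link check without showing them. A minimal sketch of what such a check might look like follows; the function name `check_handoff_docs` and the exact link string it searches for are assumptions for illustration, not the script the author ran.

```python
from pathlib import Path


def check_handoff_docs(repo_root: str) -> list[str]:
    """Return a list of problems; an empty list means every check passed.

    Hypothetical helper: the commit only states that existence and
    index-link checks were run, not how.
    """
    docs = Path(repo_root) / "docs"
    problems = []

    # Existence checks for the two files named in the commit message.
    for name in ("session-handoff.md", "README.md"):
        if not (docs / name).is_file():
            problems.append(f"missing docs/{name}")

    # Index link presence: the docs index should link to the handoff doc.
    index = docs / "README.md"
    if index.is_file():
        text = index.read_text(encoding="utf-8")
        if "(./session-handoff.md)" not in text:
            problems.append("docs/README.md does not link to session-handoff.md")

    return problems
```

Calling `check_handoff_docs(".")` from the repository root returns an empty list when both files exist and the index contains the relative link.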
Showing 2 changed files with 314 additions and 0 deletions
| ... | @@ -69,6 +69,7 @@ flowchart TD | ... | @@ -69,6 +69,7 @@ flowchart TD |
| 69 | 69 | ||
| 70 | ### C. Services and Engineering | 70 | ### C. Services and Engineering |
| 71 | - [Service API](./service-api.md) | 71 | - [Service API](./service-api.md) |
| 72 | - [Session Handoff](./session-handoff.md) | ||
| 72 | - [Changelog](./CHANGELOG.md) | 73 | - [Changelog](./CHANGELOG.md) |
| 73 | 74 | ||
| 74 | ### D. Research and Roadmap | 75 | ### D. Research and Roadmap | ... | ... |
docs/session-handoff.md
0 → 100644
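| 1 | # Session Handoff / Ongoing Development Handoff Doc | ||
| 2 | |||
| 3 | > Updated: 2026-06-02 | ||
| 4 | > Purpose: let a new session or new agent entering the repository understand the project's current state and resume development as quickly as possible. | ||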
| 5 | |||
| 6 | ## One-Page Summary | ||
| 7 | |||
| 8 | This is a **music ACR / music retrieval** project that is moving from prototype toward industrialization. | ||
| 9 | Completed so far: | ||
| 10 | |||
| 11 | 1. **Runnable prototype** | ||
| 12 | - synthetic data generation | ||
| 13 | - training | ||
| 14 | - index building | ||
| 15 | - recognition | ||
| 16 | - evaluation | ||
| 17 | |||
| 18 | 2. **Open-dataset ingestion pipeline closed end to end** | ||
| 19 | - inspect-local / inspect-batch | ||
| 20 | - prepare-local | ||
| 21 | - validate-local | ||
| 22 | - train | ||
| 23 | - build-index | ||
| 24 | - evaluate | ||
| 25 | - generate_artifacts | ||
| 26 | |||
| 27 | 3. **Docs condensed** | ||
| 28 | - the docs entry point is split into 4 groups | ||
| 29 | - relative-path links are navigable | ||
| 30 | - the open-dataset workflow has a single-page doc | ||
| 31 | |||
| 32 | Most important next steps: | ||
| 33 | - replace the synthetic stand-in with real local FMA / MTG-Jamendo audio directories | ||
| 34 | - run the smoke workflow on real open data | ||
| 35 | - keep improving accuracy, especially on `confused` / `humming_like` | ||
| 36 | |||
| 37 | --- | ||
| 38 | |||
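| 39 | ## 1. What the project is | ||
| 40 | |||
| 41 | This is an ACR engine for **music snippet recognition / music retrieval**. Its core approach: | ||
| 42 | |||
| 43 | - fingerprint retrieval (Chromaprint-like) | ||
| 44 | - embedding retrieval (ECAPA-derived) | ||
| 45 | - optional melody-aware fusion | ||
| 46 | - retrieval-first evaluation and optimization | ||
| 47 | |||
| 48 | It is no longer just a "classification model training script" but a fairly complete engineering prototype made up of: | ||
| 49 | - data layer | ||
| 50 | - training layer | ||
| 51 | - indexing layer | ||
| 52 | - recognition layer | ||
| 53 | - evaluation layer | ||
| 54 | - documentation layer | ||
| 55 | - open-dataset ingestion layer | ||
| 56 | - release artifacts layer | ||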
| 57 | |||
| 58 | --- | ||
| 59 | |||
| 60 | ## 2. Which docs to read first | ||
| 61 | |||
| 62 | ### Core entry points (4 groups) | ||
| 63 | - [docs/README.md](./README.md) | ||
| 64 | - [docs/open-dataset-workflow.md](./open-dataset-workflow.md) | ||
| 65 | - [docs/dataset-spec.md](./dataset-spec.md) | ||
| 66 | - [docs/industrialization-roadmap.md](./industrialization-roadmap.md) | ||
| 67 | |||
| 68 | ### If you work on algorithms / models | ||
| 69 | - [docs/dataset-spec.md](./dataset-spec.md) | ||
| 70 | - [docs/sota-research-2026.md](./sota-research-2026.md) | ||
| 71 | - [docs/industrial-benchmark-spec.md](./industrial-benchmark-spec.md) | ||
| 72 | |||
| 73 | ### If you work on data ingestion | ||
| 74 | - [docs/open-dataset-workflow.md](./open-dataset-workflow.md) | ||
| 75 | - [docs/dataset-sources-and-licensing.md](./dataset-sources-and-licensing.md) | ||
| 76 | - [acr-engine/data/raw/README.md](../acr-engine/data/raw/README.md) | ||
| 77 | |||
| 78 | ### If you work on engineering / services | ||
| 79 | - [docs/service-api.md](./service-api.md) | ||
| 80 | - [docs/CHANGELOG.md](./CHANGELOG.md) | ||
| 81 | |||
| 82 | --- | ||
| 83 | |||
| 84 | ## 3. Key parts of the current code structure | ||
| 85 | |||
| 86 | ### Main entry points for training and evaluation | ||
| 87 | - [acr-engine/train.py](../acr-engine/train.py) | ||
| 88 | - [acr-engine/evaluate.py](../acr-engine/evaluate.py) | ||
| 89 | - [acr-engine/run_demo.py](../acr-engine/run_demo.py) | ||
| 90 | |||
| 91 | ### Data layer | ||
| 92 | - [acr-engine/src/data/dataset.py](../acr-engine/src/data/dataset.py) | ||
| 93 | - [acr-engine/src/data/synthetic.py](../acr-engine/src/data/synthetic.py) | ||
| 94 | - [acr-engine/src/data/manifest_tools.py](../acr-engine/src/data/manifest_tools.py) | ||
| 95 | - [acr-engine/src/data/external_adapters.py](../acr-engine/src/data/external_adapters.py) | ||
| 96 | |||
| 97 | ### Retrieval and model layer | ||
| 98 | - [acr-engine/src/engines/hybrid_engine.py](../acr-engine/src/engines/hybrid_engine.py) | ||
| 99 | - [acr-engine/src/engines/ecapa_embedder.py](../acr-engine/src/engines/ecapa_embedder.py) | ||
| 100 | - [acr-engine/src/engines/chromaprint_matcher.py](../acr-engine/src/engines/chromaprint_matcher.py) | ||
| 101 | - [acr-engine/src/models/ecapa_tdnn.py](../acr-engine/src/models/ecapa_tdnn.py) | ||
| 102 | - [acr-engine/src/models/losses.py](../acr-engine/src/models/losses.py) | ||
| 103 | |||
| 104 | ### Service layer | ||
| 105 | - [acr-engine/src/service/app.py](../acr-engine/src/service/app.py) | ||
| 106 | |||
| 107 | --- | ||
| 108 | |||
| 109 | ## 4. Key capabilities already completed | ||
| 110 | |||
| 111 | ### 4.1 Prototype and synthetic data | ||
| 112 | - the synthetic dataset can be generated | ||
| 113 | - `train.py --dry-run` passes | ||
| 114 | - training produces a checkpoint | ||
| 115 | - build-index works | ||
| 116 | - recognize works | ||
| 117 | - evaluate works | ||
| 118 | |||
| 119 | ### 4.2 Open-dataset ingestion | ||
| 120 | The following commands are available: | ||
| 121 | |||
| 122 | - `inspect-local` | ||
| 123 | - `inspect-batch` | ||
| 124 | - `prepare-local` | ||
| 125 | - `validate-local` | ||
| 126 | - `smoke-local` | ||
| 127 | |||
| 128 | They all live in: | ||
| 129 | - [acr-engine/src/data/external_adapters.py](../acr-engine/src/data/external_adapters.py) | ||
| 130 | |||
| 131 | ### 4.3 Docs and release artifacts | ||
| 132 | The open-dataset smoke run also generates: | ||
| 133 | - benchmark report | ||
| 134 | - model card | ||
| 135 | - release checklist | ||
| 136 | - artifact manifest | ||
| 137 | |||
| 138 | --- | ||
| 139 | |||
| 140 | ## 5. How the open-dataset path actually works today | ||
| 141 | |||
| 142 | ### Where to put real audio | ||
| 143 | - [acr-engine/data/raw/fma_small_audio/](../acr-engine/data/raw/fma_small_audio/) | ||
| 144 | - [acr-engine/data/raw/mtg_jamendo_audio/](../acr-engine/data/raw/mtg_jamendo_audio/) | ||
| 145 | |||
| 146 | Explanatory README: | ||
| 147 | - [acr-engine/data/raw/README.md](../acr-engine/data/raw/README.md) | ||
| 148 | |||
| 149 | ### Currently recommended commands | ||
| 150 | |||
| 151 | #### FMA | ||
| 152 | ```bash | ||
| 153 | /usr/local/miniconda3/bin/python src/data/external_adapters.py smoke-local fma data/raw/fma_small_audio --output-root data/external_smoke --eval-ratio 0.2 --query-duration 8.0 --train-epochs 1 --batch-size 2 | ||
| 154 | ``` | ||
| 155 | |||
| 156 | #### MTG-Jamendo | ||
| 157 | ```bash | ||
| 158 | /usr/local/miniconda3/bin/python src/data/external_adapters.py smoke-local mtg_jamendo data/raw/mtg_jamendo_audio --output-root data/external_smoke --eval-ratio 0.2 --query-duration 8.0 --train-epochs 1 --batch-size 2 | ||
| 159 | ``` | ||
| 160 | |||
| 161 | ### What smoke-local is verified to do | ||
| 162 | `smoke-local` automatically runs: | ||
| 163 | 1. inspect-local | ||
| 164 | 2. prepare-local | ||
| 165 | 3. validate-local | ||
| 166 | 4. train | ||
| 167 | 5. build-index | ||
| 168 | 6. evaluate | ||
| 169 | 7. generate_artifacts | ||
| 170 | |||
| 171 | --- | ||
| 172 | |||
| 173 | ## 6. Most important verification evidence so far | ||
| 174 | |||
| 175 | ### 6.1 synthetic-as-open-fixed (open-data stand-in) | ||
| 176 | Successfully verified: | ||
| 177 | - `prepare-local` | ||
| 178 | - `validate-local` | ||
| 179 | - `train.py` | ||
| 180 | - `build-index` | ||
| 181 | - `evaluate.py` | ||
| 182 | - `generate_artifacts.py` | ||
| 183 | |||
| 184 | Key results: | ||
| 185 | - `num_queries=8` | ||
| 186 | - `top1=1.0` | ||
| 187 | - `topk=1.0` | ||
| 188 | |||
| 189 | Related directories: | ||
| 190 | - [acr-engine/data/external_ingested/synthetic_as_open_fixed/](../acr-engine/data/external_ingested/synthetic_as_open_fixed/) | ||
| 191 | - [acr-engine/reports/open-smoke-fixed/fma/](../acr-engine/reports/open-smoke-fixed/fma/) | ||
| 192 | |||
| 193 | ### 6.2 One-command smoke-local | ||
| 194 | Verified: | ||
| 195 | ```bash | ||
| 196 | /usr/local/miniconda3/bin/python src/data/external_adapters.py smoke-local fma data/synthetic_v2/songs --output-root data/external_smoke --eval-ratio 0.2 --query-duration 5.0 --train-epochs 1 --batch-size 2 | ||
| 197 | ``` | ||
| 198 | |||
| 199 | Key results: | ||
| 200 | - `num_audio_files=24` | ||
| 201 | - `catalog=24` | ||
| 202 | - `train_queries=16` | ||
| 203 | - `test_queries=8` | ||
| 204 | - `top1=1.0` | ||
| 205 | - `topk=1.0` | ||
| 206 | |||
| 207 | Related directories: | ||
| 208 | - [acr-engine/data/external_smoke/](../acr-engine/data/external_smoke/) | ||
| 209 | |||
| 210 | --- | ||
| 211 | |||
| 212 | ## 7. Most important open tasks | ||
| 213 | |||
| 214 | ### Priority A: switch to real open data | ||
| 215 | Goal: | ||
| 216 | - replace the synthetic stand-in with real local FMA / MTG-Jamendo audio | ||
| 217 | |||
| 218 | Steps: | ||
| 219 | 1. Put real audio into: | ||
| 220 | - `acr-engine/data/raw/fma_small_audio/` | ||
| 221 | - or `acr-engine/data/raw/mtg_jamendo_audio/` | ||
| 222 | 2. Run `smoke-local` directly | ||
| 223 | 3. Record: | ||
| 224 | - corpus size reported by inspect | ||
| 225 | - train/test query counts | ||
| 226 | - top1/topk | ||
| 227 | - artifact bundle | ||
| 228 | |||
| 229 | ### Priority B: keep improving hard-case accuracy | ||
| 230 | Findings so far: | ||
| 231 | - naive oversampling: failed | ||
| 232 | - type-aware weighting: partially effective | ||
| 233 | - sample-level weighting: improved `confused` | ||
| 234 | - retrieval fusion tuning: more consistently effective | ||
| 235 | |||
| 236 | Focus for the next phase: | ||
| 237 | - `confused` | ||
| 238 | - `humming_like` | ||
| 239 | - hard-case buckets on real open data | ||
| 240 | |||
| 241 | ### Priority C: foundation model / SOTA baseline | ||
| 242 | Already recorded in the docs: | ||
| 243 | - MERT | ||
| 244 | - MuQ | ||
| 245 | - a stronger retrieval-first approach | ||
| 246 | |||
| 247 | Possible follow-ups: | ||
| 248 | - frozen backbone baseline | ||
| 249 | - adapter fine-tune | ||
| 250 | |||
| 251 | --- | ||
| 252 | |||
| 253 | ## 8. Recent key commits (to help a new session orient quickly) | ||
| 254 | |||
| 255 | Recent key commits worth reading first: | ||
| 256 | |||
| 257 | - `d221852` Add explicit drop zones for real open-music corpora | ||
| 258 | - `eee15ac` Automate the full open-dataset smoke workflow behind one command | ||
| 259 | - `8795907` Generate release artifacts for the open-dataset smoke path | ||
| 260 | - `dc9ef1b` Close the open-dataset smoke loop through evaluation | ||
| 261 | - `b766c74` Make open-dataset manifests trainable end to end | ||
| 262 | - `fa23144` Add a single-page open dataset workflow for training prep | ||
| 263 | - `af33be3` Condense docs and add manifest validation before training | ||
| 264 | |||
| 265 | These commits cover most of the main line of recent open-dataset and documentation work. | ||
| 266 | |||
| 267 | --- | ||
| 268 | |||
| 269 | ## 9. Recommended actions when a new session takes over | ||
| 270 | |||
| 271 | If you are a new session, the suggested order is: | ||
| 272 | |||
| 273 | 1. Read: | ||
| 274 | - [docs/README.md](./README.md) | ||
| 275 | - [docs/open-dataset-workflow.md](./open-dataset-workflow.md) | ||
| 276 | - [docs/session-handoff.md](./session-handoff.md) | ||
| 277 | |||
| 278 | 2. Check whether real data is in place: | ||
| 279 | - `acr-engine/data/raw/fma_small_audio/` | ||
| 280 | - `acr-engine/data/raw/mtg_jamendo_audio/` | ||
| 281 | |||
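| 282 | 3. If real audio is already present: | ||
| 283 | - run `smoke-local` directly | ||
| 284 | |||
| 285 | 4. If real audio is not there yet: | ||
| 286 | - keep optimizing synthetic-as-open-fixed | ||
| 287 | - or continue building out open-dataset download/cleaning automation | ||
| 288 | |||
| 289 | 5. After completing each stage: | ||
| 290 | - update [docs/CHANGELOG.md](./CHANGELOG.md) | ||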
| 291 | - `git commit` | ||
| 292 | - `git push` | ||
| 293 | |||
| 294 | --- | ||
| 295 | |||
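| 296 | ## 10. Caveats | ||
| 297 | |||
| 298 | - This repository contains tracked `__pycache__` files; when committing, take care that they do not pollute your changes. | ||
| 299 | - The most reliable improvement directions right now are not blind tuning of training weights, but: | ||
| 300 | - retrieval-time fusion | ||
| 301 | - more realistic open data | ||
| 302 | - more realistic evaluation | ||
| 303 | - The open-dataset layout now relies on a "self-contained output root": | ||
| 304 | - `audio/` | ||
| 305 | - `manifests/` | ||
| 306 | Do not break this in later changes. | ||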
| 307 | |||
| 308 | --- | ||
| 309 | |||
| 310 | ## Sources | ||
| 311 | - [README.md](./README.md) | ||
| 312 | - [open-dataset-workflow.md](./open-dataset-workflow.md) | ||
| 313 | - [CHANGELOG.md](./CHANGELOG.md) |
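
Section 9 of the handoff doc asks a new session to check whether real audio has landed in the two drop zones before choosing what to do next. A minimal sketch of that decision follows, using only the directory names from the doc. The helper names and the set of audio extensions are assumptions for illustration and are not taken from `external_adapters.py`.

```python
from pathlib import Path

# Assumption: which suffixes count as audio; the doc does not specify this.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".ogg"}

# Drop-zone directory names as listed in the handoff doc.
DROP_ZONES = ("fma_small_audio", "mtg_jamendo_audio")


def count_audio_files(directory: Path) -> int:
    """Count audio files under a directory, recursively; 0 if it is absent."""
    if not directory.is_dir():
        return 0
    return sum(
        1
        for path in directory.rglob("*")
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS
    )


def next_action(repo_root: str) -> str:
    """Suggest the next step from section 9 based on drop-zone contents."""
    raw = Path(repo_root) / "acr-engine" / "data" / "raw"
    ready = [name for name in DROP_ZONES if count_audio_files(raw / name) > 0]
    if ready:
        return "run smoke-local on: " + ", ".join(ready)
    return "no real audio yet: continue with synthetic-as-open-fixed or download/cleaning automation"
```

Calling `next_action(".")` from the repository root returns a one-line suggestion; a `README.md` placed in a drop zone is not counted as audio.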