


Changelog


2026-06-02


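Stage: confused targeted optimization v6 (sample-level weighting)

Completed:


Changed the hard-case loss from a batch-level average weight to sample-level weighting


SongPairDataset now applies different sampling strengths to confused and humming_like samples

Raised the weight of confused samples to give them higher priority
Retrained models_v6, rebuilt index_v6, and reran the smoke-v6 evaluation
Generated release artifacts under reports/smoke-v6/synthetic_v2/
Added the hard-case input specification to docs/dataset-spec.md
Added the v4/v5/v6 comparison conclusions to docs/sota-research-2026.md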


Verification:


train.py --dry-run succeeded

py_compile succeeded

run_demo.py build-index succeeded

evaluate.py --fast-eval --output-json reports/smoke-v6/synthetic_v2/eval.json succeeded
Current results:


overall top1=0.65, top5=0.95
humming_like top1=0.25
confused top1=0.25


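Conclusions:


Compared with smoke-v5, overall top1 rose from 0.60 to 0.65

confused top1 rose from 0.00 to 0.25, which indicates that sample-level weighting is effective

humming_like top1 fell from 0.50 to 0.25, which indicates that the two hard-case types need to be handled separately rather than through a single weighting axis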
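
The difference between a batch-level average weight and sample-level weighting can be sketched as follows. This is a minimal illustration, not the code in train.py: the names `TYPE_WEIGHTS`, `batch_level_loss`, and `sample_level_loss` are hypothetical, and the weight values are assumptions, since this changelog does not state the real ones.

```python
from typing import Dict, List

# Hypothetical per-type weights; the values actually used in v6 are not given here.
TYPE_WEIGHTS: Dict[str, float] = {
    "clean": 1.0,
    "augmented": 1.0,
    "humming_like": 1.5,
    "confused": 2.0,
}


def batch_level_loss(losses: List[float], types: List[str]) -> float:
    """Scale the mean loss of the batch by one scalar: the mean type weight."""
    mean_loss = sum(losses) / len(losses)
    mean_weight = sum(TYPE_WEIGHTS[t] for t in types) / len(types)
    return mean_weight * mean_loss


def sample_level_loss(losses: List[float], types: List[str]) -> float:
    """Weight each sample's loss by its own type weight, then normalise."""
    weights = [TYPE_WEIGHTS[t] for t in types]
    weighted_sum = sum(w * l for w, l in zip(weights, losses))
    return weighted_sum / sum(weights)


if __name__ == "__main__":
    losses = [0.2, 0.2, 0.2, 1.0]
    types = ["clean", "clean", "clean", "confused"]
    print(batch_level_loss(losses, types))
    print(sample_level_loss(losses, types))
```

With a single scalar per batch, every sample's gradient is scaled equally, so the one confused sample in the example still contributes 1.0 / 1.6 = 62.5% of the loss. With per-sample weights its share rises to 2.0 / 2.6, roughly 77%, which is the kind of shift in emphasis the v6 change aims for.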


2026-06-02


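Stage: Documentation completion + minimal runnable ACR pipeline

Completed:


Added the project responsibility map: docs/project-responsibility-map.md

Added the system architecture diagram: docs/acr-architecture.md

Added the stage roadmap: docs/roadmap.md

Added the runbook: docs/runbook.md

Added the engine README: acr-engine/README.md

Added the dependency list: acr-engine/requirements.txt

Added the demo CLI: acr-engine/run_demo.py

Fixed a dataset read-path issue: acr-engine/src/data/dataset.py

Fixed the first training run not writing a best checkpoint: acr-engine/train.py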
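
The changelog does not describe the root cause of the missing best checkpoint. One common cause is a "best so far" value initialised such that the first epoch never counts as an improvement. The sketch below shows the usual guard under that assumption; `track_best` and its `save` callback are hypothetical names, not functions from train.py.

```python
import math
from typing import Callable, List, Optional


def track_best(
    val_losses: List[float],
    save: Callable[[int, float], None],
) -> Optional[int]:
    """Call save() whenever the validation loss improves; return the best epoch."""
    # Start at +inf so the very first epoch is always treated as an improvement.
    best_loss = math.inf
    best_epoch: Optional[int] = None
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            save(epoch, loss)  # e.g. write best_model.pt in a real training loop
    return best_epoch


if __name__ == "__main__":
    saved: List[int] = []
    track_best([0.9, 0.7, 0.8], lambda epoch, loss: saved.append(epoch))
    print(saved)
```

Even a one-epoch run triggers `save` exactly once here, which matches the verification item about a 1-epoch CPU run producing best_model.pt.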


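Verification:


Generated the synthetic dataset
Passed train.py --dry-run

Completed 1 epoch of CPU training and produced best_model.pt

Built the fingerprint index and the embedding index
Ran the recognition command and got JSON candidate results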


2026-06-02


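Stage: Accuracy optimization v2 (128 Mel / band-split / retrieval evaluation / dataset spec / SOTA survey)

Completed:


Added the dataset / input-output specification: docs/dataset-spec.md

Added the open dataset integration plan: docs/open-dataset-plan.md

Added the 2026 SOTA research notes: docs/sota-research-2026.md

Switched input features from a low-dimensional, speaker-style configuration to 128 Mel

Added the band-split module BandSplitBlock

Introduced a pro-WGAN-style engineering approximation for balancing (stronger augmentation for hard samples)
Added confused and humming_like sample types to the synthetic data
Introduced catalog.json as the searchable reference list
Changed the index from a single vector per song to a window-level embedding index
Added evaluate.py for retrieval evaluation
Changed the training logic to use more retrieval-oriented song-pair training inputs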
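
The move from one vector per song to a window-level embedding index can be illustrated with a small sketch. It assumes that each reference song contributes several window embeddings and that a song's score is the maximum cosine similarity over its windows; that aggregation rule and the names `build_window_index` and `search` are assumptions for illustration, not the project's actual implementation.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_window_index(catalog: Dict[str, List[Vector]]) -> List[Tuple[str, Vector]]:
    """Flatten {song_id: [window embeddings]} into one (song_id, window) list."""
    return [(song_id, w) for song_id, windows in catalog.items() for w in windows]


def search(
    index: List[Tuple[str, Vector]], query: Vector, top_k: int = 5
) -> List[Tuple[str, float]]:
    """Score each song by its best-matching window and return the top_k songs."""
    best: Dict[str, float] = {}
    for song_id, window in index:
        score = cosine(query, window)
        if score > best.get(song_id, -math.inf):
            best[song_id] = score
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


if __name__ == "__main__":
    catalog = {"song_a": [[1.0, 0.0], [0.0, 1.0]], "song_b": [[-1.0, 0.0]]}
    index = build_window_index(catalog)
    print(search(index, [0.0, 1.0], top_k=2))
```

A short query clip only needs to match one window of the reference, which a single whole-song vector cannot capture as well.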


Verification:


synthetic_v2 ran end to end again
build-index succeeded
evaluate succeeded
test split metrics: top1=0.65, top5=0.95
Per-type metrics:


clean top1=1.00
augmented top1=0.75
humming_like top1=0.25
confused top1=0.25
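
Overall and per-type top-k figures like the ones above can be computed along these lines. The record layout (`type`, `truth`, `ranked`) and the function name `topk_by_type` are hypothetical; the actual schema used by evaluate.py is not documented in this changelog.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def topk_by_type(
    results: Iterable[dict], ks: Tuple[int, ...] = (1, 5)
) -> Dict[str, Dict[str, float]]:
    """Return {group: {"top1": ..., "top5": ...}} for 'overall' and each type.

    Each result is assumed to look like:
    {"type": "confused", "truth": "song_id", "ranked": ["best", "second", ...]}
    """
    hits: Dict[str, Dict[int, int]] = defaultdict(lambda: {k: 0 for k in ks})
    counts: Dict[str, int] = defaultdict(int)
    for r in results:
        for group in ("overall", r["type"]):
            counts[group] += 1
            for k in ks:
                # A hit at k means the true song is among the first k candidates.
                if r["truth"] in r["ranked"][:k]:
                    hits[group][k] += 1
    return {
        group: {f"top{k}": hits[group][k] / counts[group] for k in ks}
        for group in counts
    }


if __name__ == "__main__":
    demo: List[dict] = [
        {"type": "clean", "truth": "s1", "ranked": ["s1", "s2"]},
        {"type": "confused", "truth": "s2", "ranked": ["s3", "s2"]},
    ]
    print(topk_by_type(demo))
```

Reporting each type separately is what exposes weak categories that an overall top1 would hide.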


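Conclusions:


Structural defects (missing catalog / index / fusion / evaluation) have clearly improved
The main remaining weakness is robust recognition of humming_like / confused samples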


2026-06-02


Stage: Production service skeleton + external manifest conversion templates

Completed:


Added the FastAPI service skeleton: acr-engine/src/service/app.py

Added the manifest conversion tool: acr-engine/src/data/manifest_tools.py

Added the industrial benchmark document: docs/industrial-benchmark-spec.md

Extended the external dataset adapter CLI: acr-engine/src/data/external_adapters.py

Added the service API document: docs/service-api.md

Added FastAPI / uvicorn / pydantic to requirements


Verification:


external_adapters.py registry succeeded

external_adapters.py describe ccmusic succeeded

external_adapters.py init modelscope_music succeeded

manifest_tools.py csv-to-catalog generated a catalog successfully

service.app health() returned {"status":"ok"}

API build_index(...) returned the number of reference windows
API recognize(...) returned candidate results

train.py --dry-run succeeded
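
As a rough picture of what a csv-to-catalog conversion involves, here is a minimal standard-library sketch. The CSV columns (`song_id`, `title`, `audio_path`), the `{"songs": [...]}` layout, and the function name `csv_to_catalog` are all assumptions; the real catalog.json schema is defined in docs/dataset-spec.md and may differ.

```python
import csv
import io
import json
from typing import Dict, List

# Hypothetical column set; the real manifest format may use other fields.
REQUIRED_COLUMNS = ("song_id", "title", "audio_path")


def csv_to_catalog(csv_text: str) -> Dict[str, List[Dict[str, str]]]:
    """Convert CSV text with a header row into a catalog-style dictionary."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    songs = [
        {column: row[column].strip() for column in REQUIRED_COLUMNS}
        for row in reader
    ]
    return {"songs": songs}


if __name__ == "__main__":
    text = "song_id,title,audio_path\n001,Demo Song,audio/001.wav\n"
    print(json.dumps(csv_to_catalog(text), indent=2))
```

Validating the header up front makes a malformed external manifest fail early instead of producing a partial catalog.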


2026-06-02


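Stage: Documentation governance loop (navigation / references / templates)

Completed:


Added docs/README.md as the main entry point for documentation
Added docs/references-and-sources.md as the overall map of reference sources
Added docs/benchmark-report-template.md

Added docs/model-card-template.md

Added docs/release-checklist.md

Added a Sources section to all core documents
Brought all core documents to a consistent executive summary / mermaid / table / appendix style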


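Verification:


docs entry-point structure check passed
references map structure check passed
Core docs existence check passed
benchmark/model/release template structure check passed
All core documents now have a Sources section; the SOTA document now includes a Mermaid diagram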


2026-06-02


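Stage: End-to-end pipeline from real evaluation to release artifacts

Completed:


evaluate.py now supports --output-json

Added docs/report-layout.md

Added scripts/generate_artifacts.py

Connected eval.json -> benchmark-report.md / model-card.md / release-checklist.md / artifact-manifest.json

Added --fast-eval for the fast release pipeline (disables melody reranking to speed up report generation)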
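
The eval.json to release-artifact step could look roughly like the sketch below. It assumes eval.json exposes top-level `top1` and `top5` fields and writes placeholder bodies; the real scripts/generate_artifacts.py, its templates, and the eval.json schema are not shown in this changelog, so `generate_artifacts` here is a hypothetical stand-in.

```python
import json
from pathlib import Path
from typing import List

# The three Markdown artifacts named in this changelog entry.
MARKDOWN_ARTIFACTS = ("benchmark-report.md", "model-card.md", "release-checklist.md")


def generate_artifacts(eval_json: Path, out_dir: Path) -> List[str]:
    """Read metrics from eval_json and write the four release artifacts."""
    metrics = json.loads(eval_json.read_text(encoding="utf-8"))
    # Assumed schema: top-level "top1" and "top5" floats.
    summary = f"top1={metrics['top1']:.2f}, top5={metrics['top5']:.2f}"
    out_dir.mkdir(parents=True, exist_ok=True)

    written: List[str] = []
    for name in MARKDOWN_ARTIFACTS:
        (out_dir / name).write_text(f"# {name}\n\n{summary}\n", encoding="utf-8")
        written.append(name)

    # The manifest records which files were produced and from which source.
    manifest = {"source": eval_json.name, "artifacts": list(written)}
    (out_dir / "artifact-manifest.json").write_text(
        json.dumps(manifest, indent=2), encoding="utf-8"
    )
    written.append("artifact-manifest.json")
    return written
```

Deriving every artifact from the same eval.json keeps the report, model card, and checklist consistent with one evaluation run.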


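Verification:


synthetic_v2 rebuild, training, and index build succeeded

evaluate.py --fast-eval --output-json ... wrote the JSON output
The artifact generator produced all 4 types of release artifacts

Artifact existence check for reports/smoke-v2/synthetic_v2/ passed
Current fast-eval metrics: top1=0.60, top5=0.75; hard cases still need further optimization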


2026-06-02


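Stage: External dataset bootstrap + hard-case oversampling experiment

Completed:


Added src/data/bootstrap_external.py

It can automatically generate bootstrap catalog manifests for fma / ccmusic
Added a hard-sample oversampling experiment (confused / humming_like) to SongPairDataset
Retrained models_v4, rebuilt index_v4, and reran the smoke-v4 evaluation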


Verification:


data/external_bootstrap/fma/manifests/catalog.bootstrap.json generated successfully

data/external_bootstrap/ccmusic/manifests/catalog.bootstrap.json generated successfully

reports/smoke-v4/synthetic_v2/eval.json generated successfully
Current experiment results: top1=0.40, top5=0.80
Hard-case results did not improve:


humming_like top1=0.00
confused top1=0.00


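Conclusions:


The simple oversampling strategy in this round was ineffective, and overall accuracy dropped
The next round should switch to finer-grained hard-negative / melody-aware regularization rather than further increasing sample repetition weights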


2026-06-02


Stage: MTG-Jamendo / ModelScope bootstrap + type-aware hard-case weighting

Completed:


Added bootstrap manifest generation for mtg_jamendo and modelscope_music
Added type-aware hard-case weighting (for confused / humming_like) to the training pipeline
Retrained models_v5, rebuilt index_v5, and reran the smoke-v5 evaluation


Verification:


data/external_bootstrap/mtg_jamendo/manifests/catalog.bootstrap.json generated successfully

data/external_bootstrap/modelscope_music/manifests/catalog.bootstrap.json generated successfully

reports/smoke-v5/synthetic_v2/eval.json generated successfully
Current results: top1=0.60, top5=0.90
Hard-case results:


humming_like top1=0.50 (improved over v4)
confused top1=0.00 (still unresolved)


Conclusions:


type-aware weighting is more effective than naive oversampling
The next round should design stronger negative mining / confusion-aware signals specifically for the confused class