Why the handoff must distinguish current model coverage from historical test rows
Constraint: The live schema contains historical placeholder and fallback rows that must not be mistaken for the current baseline
Rejected: Relying on informal memory of which rows are current | too easy for future sessions to misread the database
Confidence: high
Scope-risk: narrow
Directive: When live schemas keep historical test data, always document the exact SQL that identifies the current baseline coverage
Tested: markdown link check under docs; live SQL counts for chromaprint_matcher, mert-v1-95m, and muq-large-msd-iter
Not-tested: Full schema cleanup of historical test rows
Showing 2 changed files with 39 additions and 0 deletions
| ... | @@ -4,6 +4,7 @@ | ... | @@ -4,6 +4,7 @@ |
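| 4 | - Continue converging the docs on the current live main-chain definition: fill in the binding notes for `feature_fact.object_id -> audio_object(window)`, `window.parent_object_id -> asset`, and `feature_fact.song_id -> media_entity(song)`, and add paired manifest/SQL samples that specifically answer how the Phase-1 open-source model set should be stored and how features are linked to audio objects. | 4 | - Continue converging the docs on the current live main-chain definition: fill in the binding notes for `feature_fact.object_id -> audio_object(window)`, `window.parent_object_id -> asset`, and `feature_fact.song_id -> media_entity(song)`, and add paired manifest/SQL samples that specifically answer how the Phase-1 open-source model set should be stored and how features are linked to audio objects. |
| 5 | - Fix stale semantic-lane status left in `docs/session-handoff.md` so that it matches the current facts: the live default has landed `chromaprint_matcher + mert-v1-95m`, and MuQ remains the next-stage challenger. | 5 | - Fix stale semantic-lane status left in `docs/session-handoff.md` so that it matches the current facts: the live default has landed `chromaprint_matcher + mert-v1-95m`, and MuQ remains the next-stage challenger. |
| 6 | - Continue adding verifiable live samples: write the real PostgreSQL trace-back result for `feature_id = 34 -> window_id = 22 -> asset_id = 20 -> song_beta` into the handoff and schema sample docs, so the next session can manually verify the binding chain directly. | 6 | - Continue adding verifiable live samples: write the real PostgreSQL trace-back result for `feature_id = 34 -> window_id = 22 -> asset_id = 20 -> song_beta` into the handoff and schema sample docs, so the next session can manually verify the binding chain directly. |
| 7 | - Continue documenting the live coverage definition: record the current default main chain's counts `chromaprint_matcher=5`, `mert-v1-95m=5`, and `muq-large-msd-iter=0` in the handoff, and state explicitly that the schema still holds historical placeholder / fallback test rows, so the next session does not mistake old data for the current baseline. | ||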
| 7 | 8 | ||
| 8 | ## 2026-06-04 | 9 | ## 2026-06-04 |
| 9 | - Fresh runtime progress: `torch-2.12.0+cpu`, `torchaudio-2.11.0+cpu`, and `transformers-5.10.1` are now installed on the current host. After rerunning the song-centric main chain we confirmed `semantic_runtime_available = true`, `semantic_runtime_ready_count = 5`, and `semantic_fallback_count = 0`. Semantic has moved from fallback to `mert-v1-95m`; the next step is to attach the `MuQ` adapter without breaking the current MERT baseline. | 10 | - Fresh runtime progress: `torch-2.12.0+cpu`, `torchaudio-2.11.0+cpu`, and `transformers-5.10.1` are now installed on the current host. After rerunning the song-centric main chain we confirmed `semantic_runtime_available = true`, `semantic_runtime_ready_count = 5`, and `semantic_fallback_count = 0`. Semantic has moved from fallback to `mert-v1-95m`; the next step is to attach the `MuQ` adapter without breaking the current MERT baseline. | ... | ... |
| ... | @@ -191,6 +191,44 @@ flowchart TD | ... | @@ -191,6 +191,44 @@ flowchart TD |
| 191 | - exact: `chromaprint_matcher / phase1_local / chromaprint_matcher_5s` | 191 | - exact: `chromaprint_matcher / phase1_local / chromaprint_matcher_5s` |
| 192 | - semantic baseline: `mert-v1-95m / hf-main / mert_5s_hop2.5_v1` | 192 | - semantic baseline: `mert-v1-95m / hf-main / mert_5s_hop2.5_v1` |
| 193 | 193 | ||
| 194 | ### Coverage facts for the current default main chain | ||
| 195 | |||
| 196 | Per the current live counts, model coverage for the current default song-centric main chain is: | ||
| 197 | - `chromaprint_matcher / fingerprint = 5 rows` | ||
| 198 | - `mert-v1-95m / embedding = 5 rows` | ||
| 199 | - `muq-large-msd-iter = 0 rows` | ||
| 200 | |||
| 201 | For the 5 current new windows, the default main chain already provides at least: | ||
| 202 | - `chromaprint_matcher:fingerprint` | ||
| 203 | - `mert-v1-95m:embedding` | ||
| 204 | |||
| 205 | So the simplest way for the next session to tell whether MuQ has been integrated is to run: | ||
| 206 | |||
| 207 | ```sql | ||
| 208 | select model_name, feature_type, count(*) as rows | ||
| 209 | from acr_songcentric_test.feature_fact | ||
| 210 | where model_name in ('chromaprint_matcher', 'mert-v1-95m', 'muq-large-msd-iter') | ||
| 211 | group by model_name, feature_type | ||
| 212 | order by model_name, feature_type; | ||
| 213 | ``` | ||
| 214 | |||
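| 215 | If `muq-large-msd-iter` starts appearing in this result, the challenger has actually landed in the database. | ||
| 216 | |||
| 217 | ### A note on historical dirty data | ||
| 218 | |||
| 219 | `acr_songcentric_test` currently still holds some historical test rows, for example: | ||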
| 220 | - `local_wavehash` | ||
| 221 | - `local_wavehash_embed` | ||
| 222 | - `semantic_runtime_ready_placeholder` | ||
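| 223 | - older `chromaprint / mert / muq` samples | ||
| 224 | |||
| 225 | These rows **do not represent the current default main chain**. | ||
| 226 | To judge the current default main chain, look at: | ||
| 227 | - the manifest produced by the new-directory runner | ||
| 228 | - the `chromaprint_matcher` row count | ||
| 229 | - the `mert-v1-95m` row count | ||
| 230 | - whether new `muq-large-msd-iter` rows have appeared | ||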
| 231 | |||
| 194 | Current MuQ status: | 232 | Current MuQ status: |
| 195 | - Target model: `OpenMuQ/MuQ-large-msd-iter` | 233 | - Target model: `OpenMuQ/MuQ-large-msd-iter` |
| 196 | - Current blocker: `import muq` raises `RuntimeError: operator torchvision::nms does not exist` | 234 | - Current blocker: `import muq` raises `RuntimeError: operator torchvision::nms does not exist` | ... | ... |
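
The coverage rule added in this diff (count only the current baseline models, treat `muq-large-msd-iter` as the challenger signal, ignore historical test rows) could be sketched in Python roughly as follows. This is a minimal illustration, not code from the repository: `summarize_coverage`, `Coverage`, and the `expected_windows` parameter are hypothetical names, and the input is assumed to be the `(model_name, feature_type, rows)` tuples returned by the handoff's SQL query, or by a variant of it without the `model_name` filter.

```python
from typing import Iterable, List, NamedTuple, Tuple

# Baseline (model_name, feature_type) pairs, taken from the handoff text.
BASELINE = {
    ("chromaprint_matcher", "fingerprint"),
    ("mert-v1-95m", "embedding"),
}
# Challenger model name, taken from the handoff text.
CHALLENGER_MODEL = "muq-large-msd-iter"


class Coverage(NamedTuple):
    """Hypothetical summary type for this sketch."""
    baseline_complete: bool
    muq_landed: bool
    ignored_models: List[str]


def summarize_coverage(
    rows: Iterable[Tuple[str, str, int]],
    expected_windows: int = 5,  # the handoff reports 5 current windows
) -> Coverage:
    """Classify grouped row counts into baseline, challenger, and ignored."""
    baseline_counts = {}
    muq_rows = 0
    ignored = set()
    for model_name, feature_type, count in rows:
        if (model_name, feature_type) in BASELINE:
            baseline_counts[(model_name, feature_type)] = count
        elif model_name == CHALLENGER_MODEL:
            muq_rows += count
        else:
            # Historical placeholder / fallback rows are not part of the
            # current default main chain, so they are only reported.
            ignored.add(model_name)
    # A matching row count is a proxy: it does not prove that every
    # window individually has both baseline features.
    baseline_complete = all(
        baseline_counts.get(pair, 0) == expected_windows for pair in BASELINE
    )
    return Coverage(baseline_complete, muq_rows > 0, sorted(ignored))


if __name__ == "__main__":
    # The local_wavehash feature type and count are made up for illustration.
    sample = [
        ("chromaprint_matcher", "fingerprint", 5),
        ("mert-v1-95m", "embedding", 5),
        ("local_wavehash", "fingerprint", 3),
    ]
    print(summarize_coverage(sample))
```

Keeping the model names in one place mirrors the commit's directive: the definition of "current baseline" is written down explicitly rather than left to memory of which rows are current.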