Why the Phase-1 docs must explain feature-to-window binding explicitly
Constraint: The current default must stay aligned with the live 4-table song-centric path and the real MERT baseline
Rejected: Re-expanding old multi-layer docs | increases onboarding cost and reintroduces stale states
Confidence: high
Scope-risk: narrow
Directive: Keep future schema docs anchored to live model_name/feature_set_name facts, not aspirational placeholders
Tested: markdown link check under docs; live PostgreSQL spot-check of feature_fact model_name/object_id/song_id lineage
Not-tested: Mermaid rendering in external markdown viewers
Showing 6 changed files with 376 additions and 89 deletions
| 1 | # Changelog | 1 | # Changelog |
| 2 | 2 | ||
| 3 | ## 2026-06-04 | 3 | ## 2026-06-04 |
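| 4 | - Continued converging the docs on the current live main-chain conventions: filled in the binding notes for `feature_fact.object_id -> audio_object(window)`, `window.parent_object_id -> asset`, and `feature_fact.song_id -> media_entity(song)`, and added paired manifest/SQL samples that specifically answer how the Phase-1 open-source model set should be stored and how features relate to audio objects. | ||
| 5 | - Fixed stale state left in `docs/session-handoff.md` about the semantic lane, aligning it with the current facts: the live default now stores `chromaprint_matcher + mert-v1-95m`, and MuQ remains the next-stage challenger. | ||
| 6 | |||
| 7 | ## 2026-06-04 | ||
| 4 | - Fresh runtime progress: `torch-2.12.0+cpu`, `torchaudio-2.11.0+cpu`, and `transformers-5.10.1` are now installed on the current host; rerunning the song-centric main chain confirmed `semantic_runtime_available = true`, `semantic_runtime_ready_count = 5`, and `semantic_fallback_count = 0`. The semantic lane has moved from fallback to `mert-v1-95m`; the next step is to add the `MuQ` adapter without breaking the current MERT baseline. | 8 | - Fresh runtime progress: `torch-2.12.0+cpu`, `torchaudio-2.11.0+cpu`, and `transformers-5.10.1` are now installed on the current host; rerunning the song-centric main chain confirmed `semantic_runtime_available = true`, `semantic_runtime_ready_count = 5`, and `semantic_fallback_count = 0`. The semantic lane has moved from fallback to `mert-v1-95m`; the next step is to add the `MuQ` adapter without breaking the current MERT baseline. |
| 5 | - Recorded MuQ integration leads: based on the repository's existing Phase-1 scripts and external model leads, `OpenMuQ/MuQ-large-msd-iter` is the preferred minimal integration target for the MuQ challenger; the official loading entry point to try first is `from muq import MuQ` + `MuQ.from_pretrained("OpenMuQ/MuQ-large-msd-iter")`. | 9 | - Recorded MuQ integration leads: based on the repository's existing Phase-1 scripts and external model leads, `OpenMuQ/MuQ-large-msd-iter` is the preferred minimal integration target for the MuQ challenger; the official loading entry point to try first is `from muq import MuQ` + `MuQ.from_pretrained("OpenMuQ/MuQ-large-msd-iter")`. |
| 6 | - Fresh MuQ progress: the `muq` package is installed on the current host, but `import muq` is still blocked by `RuntimeError: operator torchvision::nms does not exist`; the blocker has moved from "MuQ not installed" to "torchvision compatibility issue". | 10 | - Fresh MuQ progress: the `muq` package is installed on the current host, but `import muq` is still blocked by `RuntimeError: operator torchvision::nms does not exist`; the blocker has moved from "MuQ not installed" to "torchvision compatibility issue". | ... | ... |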
| ... | @@ -73,8 +73,8 @@ acr-engine/scripts/start_songcentric_shortest_path.sh 'postgres://d2:d2pass@127. | ... | @@ -73,8 +73,8 @@ acr-engine/scripts/start_songcentric_shortest_path.sh 'postgres://d2:d2pass@127. |
| 73 | 73 | ||
| 74 | - [start-here.md](./start-here.md): 10-minute onboarding entry point for newcomers | 74 | - [start-here.md](./start-here.md): 10-minute onboarding entry point for newcomers |
| 75 | - [session-handoff.md](./session-handoff.md): where to resume in the next session | 75 | - [session-handoff.md](./session-handoff.md): where to resume in the next session |
| 76 | - [postgresql-data-model.md](./postgresql-data-model.md): table design, field semantics, flowcharts, design trade-offs | 76 | - [postgresql-data-model.md](./postgresql-data-model.md): table design, field semantics, feature-to-audio_object binding, Phase-1 model storage conventions |
| 77 | - [postgres_db_schema_samples.md](./postgres_db_schema_samples.md): DDL, sample data, typical SQL, import and query chains | 77 | - [postgres_db_schema_samples.md](./postgres_db_schema_samples.md): DDL, manifest/SQL samples, typical query chains, real storage examples |
| 78 | - [CHANGELOG.md](./CHANGELOG.md): change history | 78 | - [CHANGELOG.md](./CHANGELOG.md): change history |
| 79 | 79 | ||
| 80 | --- | 80 | --- | ... | ... |
| ... | @@ -33,14 +33,39 @@ song -> asset -> window -> fingerprint / embedding | ... | @@ -33,14 +33,39 @@ song -> asset -> window -> fingerprint / embedding |
| 33 | | song | `media_entity` | `entity_type='song'` | `song_000001` | | 33 | | song | `media_entity` | `entity_type='song'` | `song_000001` | |
| 34 | asset | `audio_object` | `object_type='asset'` | a song's original wav/mp3/flac | | 34 | asset | `audio_object` | `object_type='asset'` | a song's original wav/mp3/flac | |
| 35 | | window | `audio_object` | `object_type='window'` | `0-5000ms`, `2500-7500ms` | | 35 | | window | `audio_object` | `object_type='window'` | `0-5000ms`, `2500-7500ms` | |
| 36 | | fingerprint | `feature_fact` | `feature_type='fingerprint'` | chromaprint | | 36 | | fingerprint | `feature_fact` | `feature_type='fingerprint'` | chromaprint_matcher | |
| 37 | | embedding | `feature_fact` | `feature_type='embedding'` | MERT/MuQ/fallback vector | | 37 | | embedding | `feature_fact` | `feature_type='embedding'` | MERT/MuQ/fallback vector | |
| 38 | | model | `feature_fact` | `model_name`, `model_version` | `mert-v1-95m`, `muq-base`, `local_wavehash_embed` | | 38 | | model | `feature_fact` | `model_name`, `model_version` | `chromaprint_matcher`, `mert-v1-95m`, `muq-large-msd-iter`, `local_wavehash_embed` | |
| 39 | | feature set | `feature_fact` | `feature_set_name`, `feature_schema_ver` | `mert_5s_hop2.5_v1` | | 39 | | feature set | `feature_fact` | `feature_set_name`, `feature_schema_ver` | `mert_5s_hop2.5_v1` | |
| 40 | 40 | ||
| 41 | --- | 41 | --- |
| 42 | 42 | ||
| 39 | ## 3. DDL | 43 | ## 3. Phase-1 data binding on one page |
| 44 | |||
| 45 | ```mermaid | ||
| 46 | flowchart LR | ||
| 47 | S[media_entity | ||
| 48 | song] --> A[audio_object | ||
| 49 | asset] | ||
| 50 | A --> W[audio_object | ||
| 51 | window] | ||
| 52 | W --> F1[feature_fact | ||
| 53 | chromaprint_matcher] | ||
| 54 | W --> F2[feature_fact | ||
| 55 | mert-v1-95m] | ||
| 56 | W --> F3[feature_fact | ||
| 57 | muq-large-msd-iter planned] | ||
| 58 | ``` | ||
| 59 | |||
| 60 | Key binding fields: | ||
| 61 | - `audio_object.song_id -> media_entity.entity_id` | ||
| 62 | - `window.parent_object_id -> asset.object_id` | ||
| 63 | - `feature_fact.object_id -> window.object_id` | ||
| 64 | - `feature_fact.song_id -> media_entity.entity_id` | ||
| 65 | |||
| 66 | In one sentence: `feature_fact` binds to a concrete window, not to the abstract song; `song_id` is also written redundantly so results can be returned quickly. | ||
| 67 | |||
| 68 | ## 4. DDL | ||
| 44 | 69 | ||
| 45 | ### 3.1 `media_entity` | 70 | ### 3.1 `media_entity` |
| 46 | 71 | ||
| ... | @@ -170,9 +195,63 @@ flowchart LR | ... | @@ -170,9 +195,63 @@ flowchart LR |
| 170 | 195 | ||
| 171 | --- | 196 | --- |
| 172 | 197 | ||
| 173 | ## 5. Sample data | 198 | ## 5. Manifest sample before import |
| 199 | |||
| 200 | Before importing through the current main chain, the recommendation is to place features directly under `windows[].features[]`: | ||
| 201 | |||
| 202 | ```json | ||
| 203 | { | ||
| 204 | "song": {"biz_key": "song_alpha", "title": "song alpha", "artist_name": "artist a"}, | ||
| 205 | "asset": { | ||
| 206 | "source_type": "official", | ||
| 207 | "storage_uri": "/workspace/acr-engine/data/songcentric_builder_smoke/song_alpha/artist_a/clip1.wav", | ||
| 208 | "storage_scheme": "file", | ||
| 209 | "checksum": "path:/workspace/acr-engine/data/songcentric_builder_smoke/song_alpha/artist_a/clip1.wav", | ||
| 210 | "codec": "wav", | ||
| 211 | "sample_rate": 16000, | ||
| 212 | "channels": 1, | ||
| 213 | "duration_ms": 8000 | ||
| 214 | }, | ||
| 215 | "windows": [ | ||
| 216 | { | ||
| 217 | "start_ms": 0, | ||
| 218 | "end_ms": 5000, | ||
| 219 | "features": [ | ||
| 220 | { | ||
| 221 | "feature_type": "fingerprint", | ||
| 222 | "model_name": "chromaprint_matcher", | ||
| 223 | "model_version": "phase1_local", | ||
| 224 | "feature_set_name": "chromaprint_matcher_5s", | ||
| 225 | "fingerprint_value": "dc0c731425f360787f462da693ff4a50" | ||
| 226 | }, | ||
| 227 | { | ||
| 228 | "feature_type": "embedding", | ||
| 229 | "model_name": "mert-v1-95m", | ||
| 230 | "model_version": "hf-main", | ||
| 231 | "feature_set_name": "mert_5s_hop2.5_v1", | ||
| 232 | "feature_schema_ver": "v1", | ||
| 233 | "embedding_dim": 768, | ||
| 234 | "embedding_uri": "inline-mert://19c0162d3bdde235:0:5000", | ||
| 235 | "vector_table_name": "audio_embedding_vector_768_placeholder" | ||
| 236 | } | ||
| 237 | ] | ||
| 238 | } | ||
| 239 | ], | ||
| 240 | "memberships": [ | ||
| 241 | {"set_type": "reference_set", "set_name": "phase1_hot_reference_v1", "member_type": "asset", "priority": 100} | ||
| 242 | ] | ||
| 243 | } | ||
| 244 | ``` | ||
| 245 | |||
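| 246 | What this JSON means is straightforward: | ||
| 247 | - `song` determines which `song_id` results ultimately map back to | ||
| 248 | - `asset` determines which original audio file is involved | ||
| 249 | - `windows[]` determines the slice boundaries | ||
| 250 | - `windows[].features[]` records which models have already encoded each slice | ||
| 251 | |||
| 252 | ## 6. Sample data | ||
| 174 | 253 | ||
| 175 | ### 5.1 Write song | 254 | ### 6.1 Write song |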
| 176 | 255 | ||
| 177 | ```sql | 256 | ```sql |
| 178 | insert into media_entity ( | 257 | insert into media_entity ( |
| ... | @@ -184,7 +263,7 @@ insert into media_entity ( | ... | @@ -184,7 +263,7 @@ insert into media_entity ( |
| 184 | returning entity_id; | 263 | returning entity_id; |
| 185 | ``` | 264 | ``` |
| 186 | 265 | ||
| 187 | ### 5.2 Write asset | 266 | ### 6.2 Write asset |
| 188 | 267 | ||
| 189 | ```sql | 268 | ```sql |
| 190 | insert into audio_object ( | 269 | insert into audio_object ( |
| ... | @@ -199,7 +278,7 @@ insert into audio_object ( | ... | @@ -199,7 +278,7 @@ insert into audio_object ( |
| 199 | returning object_id; | 278 | returning object_id; |
| 200 | ``` | 279 | ``` |
| 201 | 280 | ||
| 202 | ### 5.3 Write window | 281 | ### 6.3 Write window |
| 203 | 282 | ||
| 204 | ```sql | 283 | ```sql |
| 205 | insert into audio_object ( | 284 | insert into audio_object ( |
| ... | @@ -211,7 +290,7 @@ insert into audio_object ( | ... | @@ -211,7 +290,7 @@ insert into audio_object ( |
| 211 | returning object_id; | 290 | returning object_id; |
| 212 | ``` | 291 | ``` |
| 213 | 292 | ||
| 214 | ### 5.4 Write fingerprint | 293 | ### 6.4 Write fingerprint |
| 215 | 294 | ||
| 216 | ```sql | 295 | ```sql |
| 217 | insert into feature_fact ( | 296 | insert into feature_fact ( |
| ... | @@ -220,13 +299,13 @@ insert into feature_fact ( | ... | @@ -220,13 +299,13 @@ insert into feature_fact ( |
| 220 | fingerprint_value, checksum, metadata_json | 299 | fingerprint_value, checksum, metadata_json |
| 221 | ) values ( | 300 | ) values ( |
| 222 | 'fingerprint', :window_id, :song_id, | 301 | 'fingerprint', :window_id, :song_id, |
| 223 | 'chromaprint', '1.0', 'chromaprint_5s_v1', 'v1', | 302 | 'chromaprint_matcher', 'phase1_local', 'chromaprint_matcher_5s', 'v1', |
| 224 | 'AQAAE0mUaEkSZSo...', 'sha256:fp001', | 303 | 'AQAAE0mUaEkSZSo...', 'sha256:fp001', |
| 225 | '{"lane":"exact"}'::jsonb | 304 | '{"lane":"exact"}'::jsonb |
| 226 | ); | 305 | ); |
| 227 | ``` | 306 | ``` |
| 228 | 307 | ||
| 229 | ### 5.5 Write embedding | 308 | ### 6.5 Write embedding |
| 230 | 309 | ||
| 231 | ```sql | 310 | ```sql |
| 232 | insert into feature_fact ( | 311 | insert into feature_fact ( |
| ... | @@ -241,7 +320,7 @@ insert into feature_fact ( | ... | @@ -241,7 +320,7 @@ insert into feature_fact ( |
| 241 | ); | 320 | ); |
| 242 | ``` | 321 | ``` |
| 243 | 322 | ||
| 244 | ### 5.6 Write set membership | 323 | ### 6.6 Write set membership |
| 245 | 324 | ||
| 246 | ```sql | 325 | ```sql |
| 247 | insert into set_membership ( | 326 | insert into set_membership ( |
| ... | @@ -254,7 +333,7 @@ insert into set_membership ( | ... | @@ -254,7 +333,7 @@ insert into set_membership ( |
| 254 | 333 | ||
| 255 | --- | 334 | --- |
| 256 | 335 | ||
| 257 | ## 6. Typical queries | 336 | ## 7. Typical queries |
| 258 | 337 | ||
| 259 | ### 6.1 List the assets of a song | 338 | ### 6.1 List the assets of a song |
| 260 | 339 | ||
| ... | @@ -555,7 +634,7 @@ insert into feature_fact ( | ... | @@ -555,7 +634,7 @@ insert into feature_fact ( |
| 555 | fingerprint_value | 634 | fingerprint_value |
| 556 | ) values ( | 635 | ) values ( |
| 557 | 'fingerprint', :window_id, :song_id, | 636 | 'fingerprint', :window_id, :song_id, |
| 558 | 'chromaprint', '1.0', 'chromaprint_5s_v1', 'v1', | 637 | 'chromaprint_matcher', 'phase1_local', 'chromaprint_matcher_5s', 'v1', |
| 559 | 'AQAAE0mUaEkSZSo...' | 638 | 'AQAAE0mUaEkSZSo...' |
| 560 | ); | 639 | ); |
| 561 | ``` | 640 | ``` |
| ... | @@ -583,7 +662,7 @@ insert into feature_fact ( | ... | @@ -583,7 +662,7 @@ insert into feature_fact ( |
| 583 | embedding_dim, embedding_uri, vector_table_name | 662 | embedding_dim, embedding_uri, vector_table_name |
| 584 | ) values ( | 663 | ) values ( |
| 585 | 'embedding', :window_id, :song_id, | 664 | 'embedding', :window_id, :song_id, |
| 586 | 'muq-base', 'hf-main', 'muq_5s_hop2.5_v1', 'v1', | 665 | 'muq-large-msd-iter', 'hf-main', 'muq_5s_hop2.5_v1', 'v1', |
| 587 | 768, 's3://bucket/emb/demo_song_win0001_muq.npy', 'audio_embedding_vector_768' | 666 | 768, 's3://bucket/emb/demo_song_win0001_muq.npy', 'audio_embedding_vector_768' |
| 588 | ); | 667 | ); |
| 589 | ``` | 668 | ``` |
| ... | @@ -636,13 +715,50 @@ order by ff.feature_type, ff.model_name; | ... | @@ -636,13 +715,50 @@ order by ff.feature_type, ff.model_name; |
| 636 | 715 | ||
| 637 | --- | 716 | --- |
| 638 | 717 | ||
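| 639 | ## 14. A complete multi-asset / multi-window / multi-model sample | 718 | ## 14. A real binding query sample |
| 719 | |||
| 720 | The SQL below answers the question users care about most: | ||
| 721 | |||
| 722 | > How does a feature bind to an audio object and ultimately trace back to `song_id`? | ||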
| 723 | |||
| 724 | ```sql | ||
| 725 | select ff.feature_id, | ||
| 726 | ff.feature_type, | ||
| 727 | ff.model_name, | ||
| 728 | ff.model_version, | ||
| 729 | ff.feature_set_name, | ||
| 730 | w.object_id as window_id, | ||
| 731 | w.start_ms, | ||
| 732 | w.end_ms, | ||
| 733 | a.object_id as asset_id, | ||
| 734 | a.storage_uri, | ||
| 735 | s.entity_id as song_id, | ||
| 736 | s.biz_key | ||
| 737 | from feature_fact ff | ||
| 738 | join audio_object w | ||
| 739 | on w.object_id = ff.object_id | ||
| 740 | and w.object_type = 'window' | ||
| 741 | join audio_object a | ||
| 742 | on a.object_id = w.parent_object_id | ||
| 743 | and a.object_type = 'asset' | ||
| 744 | join media_entity s | ||
| 745 | on s.entity_id = ff.song_id | ||
| 746 | where ff.feature_id = :feature_id; | ||
| 747 | ``` | ||
| 748 | |||
| 749 | You can read it as four steps: | ||
| 750 | 1. Find the feature in `feature_fact` | ||
| 751 | 2. Use `object_id` to find the `window` it is bound to | ||
| 752 | 3. Use `parent_object_id` to find the `asset` that window belongs to | ||
| 753 | 4. Use `song_id` to find the `song` it ultimately belongs to | ||
| 754 | |||
| 755 | ## 15. A complete multi-asset / multi-window / multi-model sample | ||
| 640 | 756 | ||
| 641 | Assume: | 757 | Assume: |
| 642 | - a single `song_id = 1001` | 758 | - a single `song_id = 1001` |
| 643 | - two audio files: `master.wav` and `ugc_clip.mp3` | 759 | - two audio files: `master.wav` and `ugc_clip.mp3` |
| 644 | - each asset is sliced into 2 windows | 760 | - each asset is sliced into 2 windows |
| 645 | - every window runs `chromaprint + mert-v1-95m + muq-base` | 761 | - every window runs `chromaprint_matcher + mert-v1-95m + muq-large-msd-iter` |
| 646 | 762 | ||
| 647 | ### 14.1 Logical structure | 763 | ### 14.1 Logical structure |
| 648 | 764 | ||
| ... | @@ -650,22 +766,22 @@ order by ff.feature_type, ff.model_name; | ... | @@ -650,22 +766,22 @@ order by ff.feature_type, ff.model_name; |
| 650 | song(1001) | 766 | song(1001) |
| 651 | -> asset(2001, master.wav) | 767 | -> asset(2001, master.wav) |
| 652 | -> window(3001, 0-5000) | 768 | -> window(3001, 0-5000) |
| 653 | -> chromaprint | 769 | -> chromaprint_matcher |
| 654 | -> mert-v1-95m | 770 | -> mert-v1-95m |
| 655 | -> muq-base | 771 | -> muq-large-msd-iter |
| 656 | -> window(3002, 2500-7500) | 772 | -> window(3002, 2500-7500) |
| 657 | -> chromaprint | 773 | -> chromaprint_matcher |
| 658 | -> mert-v1-95m | 774 | -> mert-v1-95m |
| 659 | -> muq-base | 775 | -> muq-large-msd-iter |
| 660 | -> asset(2002, ugc_clip.mp3) | 776 | -> asset(2002, ugc_clip.mp3) |
| 661 | -> window(3003, 10000-15000) | 777 | -> window(3003, 10000-15000) |
| 662 | -> chromaprint | 778 | -> chromaprint_matcher |
| 663 | -> mert-v1-95m | 779 | -> mert-v1-95m |
| 664 | -> muq-base | 780 | -> muq-large-msd-iter |
| 665 | -> window(3004, 12500-17500) | 781 | -> window(3004, 12500-17500) |
| 666 | -> chromaprint | 782 | -> chromaprint_matcher |
| 667 | -> mert-v1-95m | 783 | -> mert-v1-95m |
| 668 | -> muq-base | 784 | -> muq-large-msd-iter |
| 669 | ``` | 785 | ``` |
| 670 | 786 | ||
| 671 | ### 14.2 How many rows this produces | 787 | ### 14.2 How many rows this produces |
| ... | @@ -706,7 +822,7 @@ order by a.object_id, w.start_ms, ff.feature_type, ff.model_name; | ... | @@ -706,7 +822,7 @@ order by a.object_id, w.start_ms, ff.feature_type, ff.model_name; |
| 706 | 822 | ||
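| 707 | ### 14.4 Find windows that are missing a given model | 823 | ### 14.4 Find windows that are missing a given model |
| 708 | 824 | ||
| 709 | This SQL is well suited to scanning for backfill work, for example checking which windows have not yet run `muq-base`: | 825 | This SQL is well suited to scanning for backfill work, for example checking which windows have not yet run `muq-large-msd-iter`: |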
| 710 | 826 | ||
| 711 | ```sql | 827 | ```sql |
| 712 | select w.object_id as window_id, | 828 | select w.object_id as window_id, |
| ... | @@ -721,7 +837,7 @@ where w.object_type = 'window' | ... | @@ -721,7 +837,7 @@ where w.object_type = 'window' |
| 721 | from feature_fact ff | 837 | from feature_fact ff |
| 722 | where ff.object_id = w.object_id | 838 | where ff.object_id = w.object_id |
| 723 | and ff.feature_type = 'embedding' | 839 | and ff.feature_type = 'embedding' |
| 724 | and ff.model_name = 'muq-base' | 840 | and ff.model_name = 'muq-large-msd-iter' |
| 725 | and ff.model_version = 'hf-main' | 841 | and ff.model_version = 'hf-main' |
| 726 | and ff.feature_set_name = 'muq_5s_hop2.5_v1' | 842 | and ff.feature_set_name = 'muq_5s_hop2.5_v1' |
| 727 | ) | 843 | ) |
| ... | @@ -746,7 +862,7 @@ order by w.start_ms; | ... | @@ -746,7 +862,7 @@ order by w.start_ms; |
| 746 | 862 | ||
| 747 | --- | 863 | --- |
| 748 | 864 | ||
| 749 | ## 15. Bulk ingestion and index-building sample | 865 | ## 16. Bulk ingestion and index-building sample |
| 750 | 866 | ||
| 751 | ### 15.1 Recommended batch order | 867 | ### 15.1 Recommended batch order |
| 752 | 868 | ||
| ... | @@ -756,7 +872,7 @@ batch-2: audio_object(asset) | ... | @@ -756,7 +872,7 @@ batch-2: audio_object(asset) |
| 756 | batch-3: audio_object(window) | 872 | batch-3: audio_object(window) |
| 757 | batch-4: feature_fact(chromaprint) | 873 | batch-4: feature_fact(chromaprint) |
| 758 | batch-5: feature_fact(mert-v1-95m) | 874 | batch-5: feature_fact(mert-v1-95m) |
| 759 | batch-6: feature_fact(muq-base) | 875 | batch-6: feature_fact(muq-large-msd-iter) |
| 760 | ``` | 876 | ``` |
| 761 | 877 | ||
| 762 | ### 15.2 Recommended supplementary indexes | 878 | ### 15.2 Recommended supplementary indexes | ... | ... |
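The manifest sample in this file's diff nests features under `windows[].features[]`, while the SQL samples write one `feature_fact` row per feature with `object_id` and `song_id` filled in. A minimal sketch of that flattening step follows. It is an illustration only: `flatten_features`, the placeholder ids, and the trimmed manifest are assumptions, not the repository's importer, which presumably obtains real ids from the `returning` clauses shown in the diff.

```python
import json

# Trimmed copy of the manifest shape shown in the diff (values shortened).
MANIFEST = json.loads("""
{
  "song": {"biz_key": "song_alpha"},
  "asset": {"storage_uri": "clip1.wav", "duration_ms": 8000},
  "windows": [
    {"start_ms": 0, "end_ms": 5000, "features": [
      {"feature_type": "fingerprint", "model_name": "chromaprint_matcher",
       "model_version": "phase1_local", "feature_set_name": "chromaprint_matcher_5s"},
      {"feature_type": "embedding", "model_name": "mert-v1-95m",
       "model_version": "hf-main", "feature_set_name": "mert_5s_hop2.5_v1"}
    ]}
  ]
}
""")

# Identity columns copied from each manifest feature into its row.
IDENTITY_FIELDS = ("feature_type", "model_name", "model_version", "feature_set_name")


def flatten_features(manifest, song_id, first_window_id):
    """Hypothetical helper: one feature_fact-like dict per windows[].features[] entry.

    song_id and first_window_id are placeholders supplied by the caller.
    """
    rows = []
    for offset, window in enumerate(manifest["windows"]):
        window_id = first_window_id + offset
        for feature in window.get("features", []):
            row = {"object_id": window_id, "song_id": song_id}
            row.update({key: feature.get(key) for key in IDENTITY_FIELDS})
            rows.append(row)
    return rows


rows = flatten_features(MANIFEST, song_id=1001, first_window_id=3001)
for row in rows:
    print(row["object_id"], row["song_id"], row["feature_type"], row["model_name"])
```

Every row carries both the window's `object_id` and the redundant `song_id`, which is the binding the commit sets out to document.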
| ... | @@ -67,7 +67,68 @@ song -> asset -> window -> fingerprint / embedding | ... | @@ -67,7 +67,68 @@ song -> asset -> window -> fingerprint / embedding |
| 67 | | feature set identity | `feature_fact` | `feature_set_name`, `feature_schema_ver` | distinguishes feature configuration, windowing strategy, and schema version | | 67 | | feature set identity | `feature_fact` | `feature_set_name`, `feature_schema_ver` | distinguishes feature configuration, windowing strategy, and schema version | |
| 68 | | reference routing | `set_membership` | `set_type`, `set_name` | controls the reference/eval/hot scope | | 68 | | reference routing | `set_membership` | `set_type`, `set_name` | controls the reference/eval/hot scope | |
| 69 | 69 | ||
| 70 | ### 4.1 A key design point | 70 | ### 4.1 How exactly features bind to audio_object |
| 71 | |||
| 72 | This is the most important layer of the current schema: | ||
| 73 | |||
| 74 | ```text | ||
| 75 | feature_fact.object_id -> audio_object.object_id | ||
| 76 | ``` | ||
| 77 | |||
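| 78 | Meaning: | ||
| 79 | - A `feature_fact` row always corresponds to one concrete audio object | ||
| 80 | - In the current Phase-1 main chain, that object is a `window` by default | ||
| 81 | - So the smallest unit of evidence for a retrieval hit is a `window`, not a whole song or a whole asset | ||
| 82 | |||
| 83 | Tracing further up: | ||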
| 84 | |||
| 85 | ```text | ||
| 86 | feature_fact.object_id -> window.object_id | ||
| 87 | window.parent_object_id -> asset.object_id | ||
| 88 | window.song_id / feature_fact.song_id -> media_entity.entity_id | ||
| 89 | ``` | ||
| 90 | |||
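| 91 | In other words: | ||
| 92 | - `object_id` binds the feature to a specific piece of audio | ||
| 93 | - `parent_object_id` traces that audio back to the asset it belongs to | ||
| 94 | - `song_id` gives a fast path back to the owning song | ||
| 95 | |||
| 96 | ### 4.2 Why `feature_fact` also stores `song_id` redundantly | ||
| 97 | |||
| 98 | Because in copyright-protection scenarios the online service ultimately has to return `song_id` quickly. | ||
| 99 | |||
| 100 | So `feature_fact.song_id` is a **deliberately redundant field** that serves three purposes: | ||
| 101 | - Reduce join cost during song-level aggregation after recall | ||
| 102 | - Allow coverage checks directly by `song_id + model_name + feature_type` | ||
| 103 | - Make it easy to later collapse `window` hits into song-level evidence | ||
| 104 | |||
| 105 | ### 4.3 Why Phase-1 binds features to `window` rather than `asset` by default | ||
| 106 | |||
| 107 | Because the Phase-1 goal is not only to know roughly which song an audio file resembles, but also to retain: | ||
| 108 | - the offset of the hit | ||
| 109 | - the specific 5 s segment that was hit | ||
| 110 | - parallel exact / semantic evidence over the same time span | ||
| 111 | |||
| 112 | The default strategy is therefore: | ||
| 113 | - `asset`: holds the original audio file | ||
| 114 | - `window`: the smallest unit for retrieval, matching, and trace-back | ||
| 115 | - `feature_fact`: attached to `window` by default | ||
| 116 | |||
| 117 | ### 4.4 A minimal chain diagram | ||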
| 118 | |||
| 119 | ```mermaid | ||
| 120 | flowchart LR | ||
| 121 | F[feature_fact | ||
| 122 | model_name=mert-v1-95m] --> W[audio_object | ||
| 123 | object_type=window] | ||
| 124 | W --> A[audio_object | ||
| 125 | object_type=asset] | ||
| 126 | W --> S[media_entity | ||
| 127 | entity_type=song] | ||
| 128 | F --> S | ||
| 129 | ``` | ||
| 130 | |||
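| 131 | ### 4.5 A key design point | ||
| 71 | 132 | ||
| 72 | Currently **model information is not kept in a separate registry table that the default main chain depends on**; it is stored directly in `feature_fact` first: | 133 | Currently **model information is not kept in a separate registry table that the default main chain depends on**; it is stored directly in `feature_fact` first: |
| 73 | - This keeps Phase-1 lighter | 134 | - This keeps Phase-1 lighter |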
| ... | @@ -610,10 +671,10 @@ flowchart TD | ... | @@ -610,10 +671,10 @@ flowchart TD |
| 610 | 671 | ||
| 611 | | lane | model_name | model_version | feature_type | purpose | | 672 | | lane | model_name | model_version | feature_type | purpose | |
| 612 | |---|---|---|---|---| | 673 | |---|---|---|---|---| |
| 613 | | exact | `chromaprint` | `1.0` | `fingerprint` | high-precision exact hits | | 674 | | exact (currently live) | `chromaprint_matcher` | `phase1_local` | `fingerprint` | current live exact baseline | |
| 614 | | semantic baseline | `mert-v1-95m` | `hf-main` | `embedding` | song semantic baseline | | 675 | | semantic baseline (currently live) | `mert-v1-95m` | `hf-main` | `embedding` | current live semantic baseline | |
| 615 | | semantic challenger | `muq-base` | `hf-main` | `embedding` | challenger for covers / BGM / complex interference | | 676 | | semantic challenger (planned) | `muq-large-msd-iter` | `hf-main` | `embedding` | next-stage challenger for covers / BGM / complex interference | |
| 616 | | semantic fallback | `local_wavehash_embed` | `phase1_local` | `embedding` | fallback when the current host lacks the runtime | | 677 | | semantic fallback | `local_wavehash_embed` | `phase1_local` | `embedding` | fallback when the runtime is unavailable | |
| 617 | | historical baseline | `ecapa-tdnn` | `baseline_only` | `embedding` | historical comparison; not recommended as the Phase-1 lead | | 678 | | historical baseline | `ecapa-tdnn` | `baseline_only` | `embedding` | historical comparison; not recommended as the Phase-1 lead | |
| 618 | 679 | ||
| 619 | ### 16.2 Which fields should pin down model identity | 680 | ### 16.2 Which fields should pin down model identity |
| ... | @@ -635,17 +696,17 @@ flowchart TD | ... | @@ -635,17 +696,17 @@ flowchart TD |
| 635 | ``` | 696 | ``` |
| 636 | 697 | ||
| 637 | For example: | 698 | For example: |
| 638 | - `chromaprint_5s_v1` | 699 | - `chromaprint_matcher_5s` (currently live) |
| 639 | - `mert_5s_hop2.5_v1` | 700 | - `mert_5s_hop2.5_v1` (currently live) |
| 640 | - `muq_5s_hop2.5_v1` | 701 | - `muq_5s_hop2.5_v1` (planned) |
| 641 | - `wavehash_5s_hop2.5_v1` | 702 | - `wavehash_5s_hop2.5_v1` (fallback) |
| 642 | 703 | ||
| 643 | ### 16.4 Recommended Phase-1 storage rules | 704 | ### 16.4 Recommended Phase-1 storage rules |
| 644 | 705 | ||
| 645 | #### exact lane | 706 | #### exact lane |
| 646 | - `feature_type = 'fingerprint'` | 707 | - `feature_type = 'fingerprint'` |
| 647 | - `fingerprint_value` is required | 708 | - `fingerprint_value` is required |
| 648 | - `model_name = 'chromaprint'` | 709 | - `model_name = 'chromaprint_matcher'` |
| 649 | - `embedding_uri / vector_table_name` are empty | 710 | - `embedding_uri / vector_table_name` are empty |
| 650 | 711 | ||
| 651 | #### semantic lane | 712 | #### semantic lane |
| ... | @@ -674,7 +735,7 @@ flowchart TD | ... | @@ -674,7 +735,7 @@ flowchart TD |
| 674 | 3. Slice into windows and write `audio_object(window)` | 735 | 3. Slice into windows and write `audio_object(window)` |
| 675 | 4. Run `chromaprint` and write `feature_fact(fingerprint)` | 736 | 4. Run `chromaprint` and write `feature_fact(fingerprint)` |
| 676 | 5. Run `mert-v1-95m` and write `feature_fact(embedding)` | 737 | 5. Run `mert-v1-95m` and write `feature_fact(embedding)` |
| 677 | 6. Run `muq-base` and write `feature_fact(embedding)` | 738 | 6. In the next stage, integrate `muq-large-msd-iter` and write `feature_fact(embedding)` |
| 678 | 7. If the runtime is unavailable, write at least the `local_wavehash_embed` fallback | 739 | 7. If the runtime is unavailable, write at least the `local_wavehash_embed` fallback |
| 679 | 740 | ||
| 680 | The end result is: | 741 | The end result is: |
| ... | @@ -683,7 +744,7 @@ flowchart TD | ... | @@ -683,7 +744,7 @@ flowchart TD |
| 683 | the same window | 744 | the same window |
| 684 | -> 1 chromaprint fingerprint row | 745 | -> 1 chromaprint fingerprint row |
| 685 | -> 1 mert embedding row | 746 | -> 1 mert embedding row |
| 686 | -> 1 muq embedding row | 747 | -> 1 muq embedding row (once integrated) |
| 687 | -> (optional) 1 fallback embedding row | 748 | -> (optional) 1 fallback embedding row |
| 688 | ``` | 749 | ``` |
| 689 | 750 | ||
| ... | @@ -693,7 +754,87 @@ flowchart TD | ... | @@ -693,7 +754,87 @@ flowchart TD |
| 693 | 754 | ||
| 694 | --- | 755 | --- |
| 695 | 756 | ||
| 696 | ## 17. Bulk ingestion and index-building strategy for 1M audio files / 300K songs | 757 | ## 17. Current live sample: how a feature traces back to song_id |
| 758 | |||
| 759 | Below are the actual main-chain conventions in the current PostgreSQL `acr_songcentric_test`: | ||
| 760 | |||
| 761 | - When `feature_type = 'fingerprint'`, the current live `model_name = 'chromaprint_matcher'` | ||
| 762 | - When `feature_type = 'embedding'`, the current live baseline `model_name = 'mert-v1-95m'` | ||
| 763 | - Older placeholder / fallback rows still show up from historical tests, but they are not the current default baseline | ||
| 764 | |||
| 765 | ### 17.1 A real manifest sample (before import) | ||
| 766 | |||
| 767 | ```json | ||
| 768 | { | ||
| 769 | "song": {"biz_key": "song_alpha", "title": "song alpha", "artist_name": "artist a"}, | ||
| 770 | "asset": {"storage_uri": ".../clip1.wav", "duration_ms": 8000}, | ||
| 771 | "windows": [ | ||
| 772 | { | ||
| 773 | "start_ms": 0, | ||
| 774 | "end_ms": 5000, | ||
| 775 | "features": [ | ||
| 776 | { | ||
| 777 | "feature_type": "fingerprint", | ||
| 778 | "model_name": "chromaprint_matcher", | ||
| 779 | "model_version": "phase1_local", | ||
| 780 | "feature_set_name": "chromaprint_matcher_5s" | ||
| 781 | }, | ||
| 782 | { | ||
| 783 | "feature_type": "embedding", | ||
| 784 | "model_name": "mert-v1-95m", | ||
| 785 | "model_version": "hf-main", | ||
| 786 | "feature_set_name": "mert_5s_hop2.5_v1", | ||
| 787 | "embedding_dim": 768 | ||
| 788 | } | ||
| 789 | ] | ||
| 790 | } | ||
| 791 | ] | ||
| 792 | } | ||
| 793 | ``` | ||
| 794 | |||
| 795 | ### 17.2 What the bindings should look like after import | ||
| 796 | |||
| 797 | ```text | ||
| 798 | media_entity(song_alpha) | ||
| 799 | -> audio_object(asset: clip1.wav) | ||
| 800 | -> audio_object(window: 0-5000) | ||
| 801 | -> feature_fact(fingerprint, chromaprint_matcher) | ||
| 802 | -> feature_fact(embedding, mert-v1-95m) | ||
| 803 | ``` | ||
| 804 | |||
| 805 | ### 17.3 Find which window / asset / song a feature is bound to | ||
| 806 | |||
| 807 | ```sql | ||
| 808 | select ff.feature_id, | ||
| 809 | ff.feature_type, | ||
| 810 | ff.model_name, | ||
| 811 | ff.feature_set_name, | ||
| 812 | w.object_id as window_id, | ||
| 813 | w.start_ms, | ||
| 814 | w.end_ms, | ||
| 815 | a.object_id as asset_id, | ||
| 816 | a.storage_uri, | ||
| 817 | s.entity_id as song_id, | ||
| 818 | s.biz_key | ||
| 819 | from feature_fact ff | ||
| 820 | join audio_object w | ||
| 821 | on w.object_id = ff.object_id | ||
| 822 | and w.object_type = 'window' | ||
| 823 | join audio_object a | ||
| 824 | on a.object_id = w.parent_object_id | ||
| 825 | and a.object_type = 'asset' | ||
| 826 | join media_entity s | ||
| 827 | on s.entity_id = ff.song_id | ||
| 828 | where ff.feature_id = :feature_id; | ||
| 829 | ``` | ||
| 830 | |||
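| 831 | This SQL answers: | ||
| 832 | - which model computed this feature | ||
| 833 | - which window it is bound to | ||
| 834 | - which asset that window belongs to | ||
| 835 | - which `song_id` it should ultimately map to | ||
| 836 | |||
| 837 | ## 18. Bulk ingestion and index-building strategy for 1M audio files / 300K songs | ||
| 697 | 838 | ||
| 698 | At the current scale, the most important principle is not to compute every model in one pass, but rather: | 839 | At the current scale, the most important principle is not to compute every model in one pass, but rather: |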
| 699 | 840 | ... | ... |
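The lineage query added in this file's diff can be tried without a PostgreSQL instance. The sketch below rebuilds the three tables involved in an in-memory SQLite database and runs the same join chain. The column subset, the SQLite types, and the literal ids are assumptions made for illustration; the real DDL lives in the docs being changed and is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-ins for the live tables: only the columns the lineage join needs.
conn.executescript("""
create table media_entity (entity_id integer primary key, entity_type text, biz_key text);
create table audio_object (
    object_id integer primary key, object_type text, song_id integer,
    parent_object_id integer, start_ms integer, end_ms integer, storage_uri text);
create table feature_fact (
    feature_id integer primary key, feature_type text, object_id integer,
    song_id integer, model_name text, feature_set_name text);
insert into media_entity values (1001, 'song', 'song_alpha');
insert into audio_object values (2001, 'asset', 1001, null, null, null, 'clip1.wav');
insert into audio_object values (3001, 'window', 1001, 2001, 0, 5000, null);
insert into feature_fact values (1, 'fingerprint', 3001, 1001, 'chromaprint_matcher', 'chromaprint_matcher_5s');
insert into feature_fact values (2, 'embedding', 3001, 1001, 'mert-v1-95m', 'mert_5s_hop2.5_v1');
""")

# Same join chain as the diff: feature -> window -> asset, plus feature -> song.
LINEAGE_SQL = """
select ff.model_name, w.object_id, w.start_ms, w.end_ms, a.storage_uri, s.biz_key
from feature_fact ff
join audio_object w on w.object_id = ff.object_id and w.object_type = 'window'
join audio_object a on a.object_id = w.parent_object_id and a.object_type = 'asset'
join media_entity s on s.entity_id = ff.song_id
where ff.feature_id = ?
"""
lineage = conn.execute(LINEAGE_SQL, (2,)).fetchone()
print(lineage)

# Consistency check on the redundant column: feature_fact.song_id should
# agree with the song_id stored on the window it is bound to.
mismatches = conn.execute("""
select count(*) from feature_fact ff
join audio_object w on w.object_id = ff.object_id
where ff.song_id <> w.song_id
""").fetchone()[0]
print(mismatches)
```

The second query is one possible guard for the intentional redundancy of `feature_fact.song_id`: if the two copies of `song_id` ever drift apart, the count becomes non-zero.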
| ... | @@ -42,7 +42,7 @@ acr-engine/scripts/start_songcentric_shortest_path.sh 'postgres://d2:d2pass@127. | ... | @@ -42,7 +42,7 @@ acr-engine/scripts/start_songcentric_shortest_path.sh 'postgres://d2:d2pass@127. |
| 42 | > **The 4-table song-centric schema has run end to end on live PostgreSQL through the host chain "real directory -> slicing -> exact/semantic feature enrichment -> import -> feature_fact".** | 42 | > **The 4-table song-centric schema has run end to end on live PostgreSQL through the host chain "real directory -> slicing -> exact/semantic feature enrichment -> import -> feature_fact".** |
| 43 | 43 | ||
| 44 | The most important next step is: | 44 | The most important next step is: |
| 45 | > **Without breaking this host chain, upgrade the semantic lane from the runtime-aware fallback to a real MERT / MuQ adapter.** | 45 | > **Without breaking this host chain, keep extending the semantic lane from the current MERT baseline to the MuQ challenger.** |
| 46 | 46 | ||
| 47 | --- | 47 | --- |
| 48 | 48 | ||
| ... | @@ -114,7 +114,7 @@ flowchart TD | ... | @@ -114,7 +114,7 @@ flowchart TD |
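| 114 | 5. Real directory -> manifest -> import has been verified | 114 | 5. Real directory -> manifest -> import has been verified |
| 115 | 6. Real directory -> fingerprint enrichment -> import has been verified | 115 | 6. Real directory -> fingerprint enrichment -> import has been verified |
| 116 | 7. The exact lane already prefers reusing the in-repo `ChromaprintMatcher` | 116 | 7. The exact lane already prefers reusing the in-repo `ChromaprintMatcher` |
| 117 | 8. The semantic lane is runtime-ready; the current host can enter the placeholder runtime branch | 117 | 8. The semantic lane has actually integrated the `mert-v1-95m` baseline; the live main chain on the current host no longer stops at the placeholder branch |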
| 118 | 118 | ||
| 119 | --- | 119 | --- |
| 120 | 120 | ||
| ... | @@ -169,48 +169,67 @@ flowchart TD | ... | @@ -169,48 +169,67 @@ flowchart TD |
| 169 | 169 | ||
| 170 | --- | 170 | --- |
| 171 | 171 | ||
| 172 | ## 10. Where the real semantic adapter should plug in next | 172 | ## 10. Data bindings and current live storage facts |
| 173 | 173 | ||
| 174 | The most direct integration point is already clear: | 174 | Only three binding relationships matter most right now: |
| 175 | 175 | ||
| 176 | - Entry script: `acr-engine/scripts/enrich_songcentric_manifest_with_local_features.py` | 176 | 1. `feature_fact.object_id -> audio_object.object_id` |
| 177 | - Key function: `build_semantic_feature(...)` | 177 | - a feature binds to a concrete audio object |
| 178 | 178 | - Phase-1 binds to `window` by default, not directly to the song |
| 179 | ### Current actual state | 179 | 2. `audio_object.parent_object_id -> audio_object.object_id` |
| 180 | 180 | - the `window -> asset` parent-child trace-back chain |
| 181 | - The exact lane already prefers reusing `ChromaprintMatcher` | 181 | 3. `feature_fact.song_id -> media_entity.entity_id` |
| 182 | - The semantic lane has not yet integrated a real `MERT / MuQ` | 182 | - used for fast song-level aggregation and for returning the final `song_id` |
| 183 | - When the runtime is ready, the current output is: | 183 | |
| 184 | - `model_name = mert-v1-95m` | 184 | In one sentence: |
| 185 | - The fallback branch is still kept: | 185 | |
| 186 | - `model_name = local_wavehash_embed` | 186 | > `audio_object` says which piece of audio this is; `feature_fact` says which model encoded that audio into which feature. |
| 187 | 187 | ||
| 188 | ### Fresh dependency-check facts | 188 | ### What the current live main chain has actually stored |
| 189 | 189 | ||
| 190 | The current host is still missing: | 190 | New live data is now actually stored as: |
| 191 | - `torch` | 191 | - exact: `chromaprint_matcher / phase1_local / chromaprint_matcher_5s` |
| 192 | - `torchaudio` | 192 | - semantic baseline: `mert-v1-95m / hf-main / mert_5s_hop2.5_v1` |
| 193 | - `transformers` | 193 | |
| 194 | 194 | Current MuQ status: |
| 195 | ### Most direct implementation order for the next session | 195 | - Target model: `OpenMuQ/MuQ-large-msd-iter` |
| 196 | 196 | - Current blocker: `import muq` raises `RuntimeError: operator torchvision::nms does not exist` |
| 197 | 1. Install `torch / torchaudio / transformers` | 197 | - Conclusion: MuQ remains the next-stage challenger, not the current live default baseline |
| 198 | 2. Plug a real `MERT` or `MuQ` adapter into `build_semantic_feature(...)` | 198 | |
| 199 | 3. Keep the current `local_wavehash_embed` fallback; do not delete it | 199 | ### Current manifest shape (before import) |
| 200 | 4. Rerun: | 200 | |
| 201 | 201 | ```json | |
| 202 | ```bash | 202 | { |
| 203 | cd /workspace | 203 | "song": {"biz_key": "song_alpha", "title": "song alpha"}, |
| 204 | /usr/local/miniconda3/bin/python acr-engine/scripts/run_songcentric_directory_pipeline_live.py \ | 204 | "asset": {"storage_uri": ".../clip1.wav"}, |
| 205 | --dsn 'postgres://d2:d2pass@127.0.0.1:5432/d2' \ | 205 | "windows": [ |
| 206 | --schema acr_songcentric_test \ | 206 | { |
| 207 | --input-root acr-engine/data/songcentric_builder_smoke \ | 207 | "start_ms": 0, |
| 208 | --output-dir acr-engine/data/pgvector_eval/music20 | 208 | "end_ms": 5000, |
| 209 | "features": [ | ||
| 210 | { | ||
| 211 | "feature_type": "fingerprint", | ||
| 212 | "model_name": "chromaprint_matcher", | ||
| 213 | "feature_set_name": "chromaprint_matcher_5s" | ||
| 214 | }, | ||
| 215 | { | ||
| 216 | "feature_type": "embedding", | ||
| 217 | "model_name": "mert-v1-95m", | ||
| 218 | "feature_set_name": "mert_5s_hop2.5_v1", | ||
| 219 | "embedding_dim": 768 | ||
| 220 | } | ||
| 221 | ] | ||
| 222 | } | ||
| 223 | ] | ||
| 224 | } | ||
| 209 | ``` | 225 | ``` |
| 210 | 226 | ||
| 211 | ### Fresh metric changes expected | 227 | ### Most direct continuation point for the next session |
| 212 | |||
| 213 | - `semantic_runtime_available = true` | ||
| 214 | - `semantic_runtime_ready_count > 0` | ||
| 215 | - `semantic_fallback_count` drops noticeably or reaches zero | ||
| 216 | 228 | ||
| 229 | 1. Do not re-verify whether MERT is integrated; it already is | ||
| 230 | 2. Go straight to the MuQ `torchvision::nms` compatibility issue | ||
| 231 | 3. Integrate the `OpenMuQ/MuQ-large-msd-iter` challenger | ||
| 232 | 4. Rerun the main-chain runner and confirm that every window ends up with all of: | ||
| 233 | - `chromaprint_matcher` | ||
| 234 | - `mert-v1-95m` | ||
| 235 | - `muq-large-msd-iter` (or whatever `model_name` is finally standardized) | ... | ... |
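The last continuation step in this file's diff asks for confirmation that each window carries all three models. One way to check that from exported `(window_id, model_name)` pairs is sketched below. The function name, the sample pairs, and the expected set are assumptions for illustration; in particular the MuQ `model_name` may still change, as the handoff notes.

```python
from collections import defaultdict

# Assumed target set, taken from the handoff checklist; the MuQ name is not final.
EXPECTED_MODELS = frozenset({"chromaprint_matcher", "mert-v1-95m", "muq-large-msd-iter"})


def missing_models(feature_rows, expected=EXPECTED_MODELS):
    """Hypothetical helper: map window_id -> sorted list of expected models not yet present.

    Only windows that have at least one feature row are visible here; windows
    with no rows at all need the anti-join style SQL from the schema samples.
    """
    seen = defaultdict(set)
    for window_id, model_name in feature_rows:
        seen[window_id].add(model_name)
    return {
        window_id: sorted(expected - models)
        for window_id, models in seen.items()
        if expected - models
    }


# Made-up pairs: window 3001 is complete, window 3002 has not run MuQ yet.
sample_rows = [
    (3001, "chromaprint_matcher"),
    (3001, "mert-v1-95m"),
    (3001, "muq-large-msd-iter"),
    (3002, "chromaprint_matcher"),
    (3002, "mert-v1-95m"),
]
gaps = missing_models(sample_rows)
print(gaps)
```

An empty result would mean every window with at least one feature row has the full set.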
| ... | @@ -78,6 +78,12 @@ song -> asset -> window -> fingerprint / embedding | ... | @@ -78,6 +78,12 @@ song -> asset -> window -> fingerprint / embedding |
| 78 | | model info | `feature_fact` | `model_name`, `model_version`, `feature_set_name` | | 78 | | model info | `feature_fact` | `model_name`, `model_version`, `feature_set_name` | |
| 79 | | reference/eval/hot sets | `set_membership` | `set_type`, `set_name` | | 79 | | reference/eval/hot sets | `set_membership` | `set_type`, `set_name` | |
| 80 | 80 | ||
| 81 | Additional notes: | ||
| 82 | - `feature_fact.object_id -> audio_object.object_id`: a feature binds directly to a concrete audio object; Phase-1 binds to `window` by default | ||
| 83 | - `audio_object.parent_object_id`: traces a `window` back to its `asset` | ||
| 84 | - `feature_fact.song_id -> media_entity.entity_id`: a deliberate redundancy for song-level aggregation and fast return of `song_id` | ||
| 85 | - If you only want the detailed explanation of this layer, read section 4 of [postgresql-data-model.md](./postgresql-data-model.md) and section 5 of [postgres_db_schema_samples.md](./postgres_db_schema_samples.md). | ||
| 86 | |||
| 81 | --- | 87 | --- |
| 82 | 88 | ||
| 83 | ## 5. Current main-chain flowchart | 89 | ## 5. Current main-chain flowchart |
| ... | @@ -99,8 +105,9 @@ flowchart TD | ... | @@ -99,8 +105,9 @@ flowchart TD |
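| 99 | - The live PostgreSQL schema tables have actually been created successfully | 105 | - The live PostgreSQL schema tables have actually been created successfully |
| 100 | - Real directory -> manifest -> import works end to end | 106 | - Real directory -> manifest -> import works end to end |
| 101 | - Real directory -> fingerprint enrichment -> import works end to end | 107 | - Real directory -> fingerprint enrichment -> import works end to end |
| 102 | - The semantic lane has been made runtime-ready | 108 | - The semantic lane has actually integrated the `mert-v1-95m` baseline |
| 103 | - The current host can enter the runtime-ready placeholder branch; the next step is to add `MuQ` without breaking the current MERT baseline | 109 | - The live main chain on the current host already stores `chromaprint_matcher + mert-v1-95m` |
| 110 | - The next step is to add the `MuQ` challenger without breaking the current MERT baseline | ||
| 104 | - The exact lane already prefers reusing the in-repo `ChromaprintMatcher` | 111 | - The exact lane already prefers reusing the in-repo `ChromaprintMatcher` |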
| 105 | 112 | ||
| 106 | --- | 113 | --- | ... | ... |