Commit 6ee8c576 6ee8c576b04b4538dcf21a113600f08e1f08adbb by cnb.bofCdSsphPA

Why the Phase-1 docs must explain feature-to-window binding explicitly

Constraint: The current default must stay aligned with the live 4-table song-centric path and the real MERT baseline
Rejected: Re-expanding old multi-layer docs | increases onboarding cost and reintroduces stale states
Confidence: high
Scope-risk: narrow
Directive: Keep future schema docs anchored to live model_name/feature_set_name facts, not aspirational placeholders
Tested: markdown link check under docs; live PostgreSQL spot-check of feature_fact model_name/object_id/song_id lineage
Not-tested: Mermaid rendering in external markdown viewers
1 parent 08d24bd4
# Changelog
## 2026-06-04
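- Continued aligning the docs with the current live main-chain conventions: documented the `feature_fact.object_id -> audio_object(window)`, `window.parent_object_id -> asset`, and `feature_fact.song_id -> media_entity(song)` bindings, and added paired manifest/SQL samples that specifically answer how the Phase-1 open-source model set should be stored and how a feature is linked to an audio object.
- Fixed stale semantic-lane state left over in `docs/session-handoff.md` and unified it with the current facts: the live default now writes `chromaprint_matcher + mert-v1-95m`, and MuQ remains the next-stage challenger.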
## 2026-06-04
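- Fresh runtime progress: `torch-2.12.0+cpu`, `torchaudio-2.11.0+cpu`, and `transformers-5.10.1` are now installed on the current host. Rerunning the song-centric main chain confirmed `semantic_runtime_available = true`, `semantic_runtime_ready_count = 5`, and `semantic_fallback_count = 0`. The semantic lane has moved from fallback to `mert-v1-95m`; the next step is to wire in the `MuQ` adapter without breaking the current MERT baseline.
- Recorded leads for the MuQ integration: based on the existing Phase-1 scripts in the repo and external model leads, the next step can first try `OpenMuQ/MuQ-large-msd-iter` as the minimal integration target for the MuQ challenger; the official loading entry point to try first is `from muq import MuQ` + `MuQ.from_pretrained("OpenMuQ/MuQ-large-msd-iter")`.
- Fresh MuQ progress: the `muq` package is installed on the current host, but `import muq` is still blocked by `RuntimeError: operator torchvision::nms does not exist`; the blocker has moved from "MuQ not installed" to "torchvision compatibility issue".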
......
......@@ -73,8 +73,8 @@ acr-engine/scripts/start_songcentric_shortest_path.sh 'postgres://d2:d2pass@127.
- [start-here.md](./start-here.md): 10-minute onboarding entry point for new team members
- [session-handoff.md](./session-handoff.md): where to pick up at the next session
- [postgresql-data-model.md](./postgresql-data-model.md): table design, field semantics, flowcharts, design trade-offs
- [postgres_db_schema_samples.md](./postgres_db_schema_samples.md): DDL, sample data, typical SQL, import/query path
- [postgresql-data-model.md](./postgresql-data-model.md): table design, field semantics, feature-to-audio_object binding, Phase-1 model storage conventions
- [postgres_db_schema_samples.md](./postgres_db_schema_samples.md): DDL, manifest/SQL samples, typical query paths, real storage examples
- [CHANGELOG.md](./CHANGELOG.md): change history
---
......
......@@ -33,14 +33,39 @@ song -> asset -> window -> fingerprint / embedding
| song | `media_entity` | `entity_type='song'` | `song_000001` |
| asset | `audio_object` | `object_type='asset'` | the original wav/mp3/flac file of a song |
| window | `audio_object` | `object_type='window'` | `0-5000ms`, `2500-7500ms` |
| fingerprint | `feature_fact` | `feature_type='fingerprint'` | chromaprint |
| fingerprint | `feature_fact` | `feature_type='fingerprint'` | chromaprint_matcher |
| embedding | `feature_fact` | `feature_type='embedding'` | MERT/MuQ/fallback vector |
| model | `feature_fact` | `model_name`, `model_version` | `mert-v1-95m`, `muq-base`, `local_wavehash_embed` |
| model | `feature_fact` | `model_name`, `model_version` | `chromaprint_matcher`, `mert-v1-95m`, `muq-large-msd-iter`, `local_wavehash_embed` |
| feature set | `feature_fact` | `feature_set_name`, `feature_schema_ver` | `mert_5s_hop2.5_v1` |
---
## 3. DDL
## 3. Phase-1 data binding at a glance
```mermaid
flowchart LR
    S["media_entity<br/>song"] --> A["audio_object<br/>asset"]
    A --> W["audio_object<br/>window"]
    W --> F1["feature_fact<br/>chromaprint_matcher"]
    W --> F2["feature_fact<br/>mert-v1-95m"]
    W --> F3["feature_fact<br/>muq-large-msd-iter (planned)"]
```
Key binding fields:
- `audio_object.song_id -> media_entity.entity_id`
- `window.parent_object_id -> asset.object_id`
- `feature_fact.object_id -> window.object_id`
- `feature_fact.song_id -> media_entity.entity_id`
In one sentence: a `feature_fact` row is bound to a specific window, not to the abstract song; `song_id` is nevertheless written redundantly onto the row so that results can be returned quickly.
## 4. DDL
### 3.1 `media_entity`
......@@ -170,9 +195,63 @@ flowchart LR
---
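## 5. Sample data
## 5. Manifest sample before import
Before importing into the current main chain, the recommended approach is to place features directly under `windows[].features[]`: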
```json
{
"song": {"biz_key": "song_alpha", "title": "song alpha", "artist_name": "artist a"},
"asset": {
"source_type": "official",
"storage_uri": "/workspace/acr-engine/data/songcentric_builder_smoke/song_alpha/artist_a/clip1.wav",
"storage_scheme": "file",
"checksum": "path:/workspace/acr-engine/data/songcentric_builder_smoke/song_alpha/artist_a/clip1.wav",
"codec": "wav",
"sample_rate": 16000,
"channels": 1,
"duration_ms": 8000
},
"windows": [
{
"start_ms": 0,
"end_ms": 5000,
"features": [
{
"feature_type": "fingerprint",
"model_name": "chromaprint_matcher",
"model_version": "phase1_local",
"feature_set_name": "chromaprint_matcher_5s",
"fingerprint_value": "dc0c731425f360787f462da693ff4a50"
},
{
"feature_type": "embedding",
"model_name": "mert-v1-95m",
"model_version": "hf-main",
"feature_set_name": "mert_5s_hop2.5_v1",
"feature_schema_ver": "v1",
"embedding_dim": 768,
"embedding_uri": "inline-mert://19c0162d3bdde235:0:5000",
"vector_table_name": "audio_embedding_vector_768_placeholder"
}
]
}
],
"memberships": [
{"set_type": "reference_set", "set_name": "phase1_hot_reference_v1", "member_type": "asset", "priority": 100}
]
}
```
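What this JSON means is straightforward:
- `song` determines which `song_id` the result ultimately resolves to
- `asset` identifies the original audio file
- `windows[]` defines the slice boundaries
- `windows[].features[]` records which models have already encoded each slice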
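As a rough illustration of how this manifest shape maps onto future `feature_fact` rows, the sketch below flattens `windows[].features[]` into one record per feature. The trimmed manifest and the `flatten_features` helper are hypothetical and only mirror the field names shown in the sample; this is not the repo's importer.

```python
import json

# Hypothetical, trimmed-down manifest in the same shape as the sample.
MANIFEST = json.loads("""
{
  "song": {"biz_key": "song_alpha"},
  "asset": {"storage_uri": "clip1.wav", "duration_ms": 8000},
  "windows": [
    {"start_ms": 0, "end_ms": 5000, "features": [
      {"feature_type": "fingerprint", "model_name": "chromaprint_matcher"},
      {"feature_type": "embedding", "model_name": "mert-v1-95m"}
    ]}
  ]
}
""")


def flatten_features(manifest):
    """Yield one dict per future feature_fact row, tagged with its window span."""
    song_key = manifest["song"]["biz_key"]
    for window in manifest["windows"]:
        for feature in window.get("features", []):
            yield {
                "song_biz_key": song_key,
                "start_ms": window["start_ms"],
                "end_ms": window["end_ms"],
                "feature_type": feature["feature_type"],
                "model_name": feature["model_name"],
            }


rows = list(flatten_features(MANIFEST))
for row in rows:
    print(row)
```

Each yielded record carries both the window span and the song key, which is the same pairing that `object_id` and `song_id` provide once the rows are stored.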
## 6. Sample data
### 5.1 Write song
### 6.1 Write song
```sql
insert into media_entity (
......@@ -184,7 +263,7 @@ insert into media_entity (
returning entity_id;
```
### 5.2 Write asset
### 6.2 Write asset
```sql
insert into audio_object (
......@@ -199,7 +278,7 @@ insert into audio_object (
returning object_id;
```
### 5.3 Write window
### 6.3 Write window
```sql
insert into audio_object (
......@@ -211,7 +290,7 @@ insert into audio_object (
returning object_id;
```
### 5.4 Write fingerprint
### 6.4 Write fingerprint
```sql
insert into feature_fact (
......@@ -220,13 +299,13 @@ insert into feature_fact (
fingerprint_value, checksum, metadata_json
) values (
'fingerprint', :window_id, :song_id,
'chromaprint', '1.0', 'chromaprint_5s_v1', 'v1',
'chromaprint_matcher', 'phase1_local', 'chromaprint_matcher_5s', 'v1',
'AQAAE0mUaEkSZSo...', 'sha256:fp001',
'{"lane":"exact"}'::jsonb
);
```
### 5.5 Write embedding
### 6.5 Write embedding
```sql
insert into feature_fact (
......@@ -241,7 +320,7 @@ insert into feature_fact (
);
```
### 5.6 Write set membership
### 6.6 Write set membership
```sql
insert into set_membership (
......@@ -254,7 +333,7 @@ insert into set_membership (
---
## 6. Typical queries
## 7. Typical queries
### 6.1 List the assets of a song
......@@ -555,7 +634,7 @@ insert into feature_fact (
fingerprint_value
) values (
'fingerprint', :window_id, :song_id,
'chromaprint', '1.0', 'chromaprint_5s_v1', 'v1',
'chromaprint_matcher', 'phase1_local', 'chromaprint_matcher_5s', 'v1',
'AQAAE0mUaEkSZSo...'
);
```
......@@ -583,7 +662,7 @@ insert into feature_fact (
embedding_dim, embedding_uri, vector_table_name
) values (
'embedding', :window_id, :song_id,
'muq-base', 'hf-main', 'muq_5s_hop2.5_v1', 'v1',
'muq-large-msd-iter', 'hf-main', 'muq_5s_hop2.5_v1', 'v1',
768, 's3://bucket/emb/demo_song_win0001_muq.npy', 'audio_embedding_vector_768'
);
```
......@@ -636,13 +715,50 @@ order by ff.feature_type, ff.model_name;
---
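## 14. A complete multi-asset / multi-window / multi-model example
## 14. A real binding query example
The SQL below answers the question users care about most:
> How is a feature bound to an audio object, and how does it ultimately resolve to a `song_id`?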
```sql
select ff.feature_id,
ff.feature_type,
ff.model_name,
ff.model_version,
ff.feature_set_name,
w.object_id as window_id,
w.start_ms,
w.end_ms,
a.object_id as asset_id,
a.storage_uri,
s.entity_id as song_id,
s.biz_key
from feature_fact ff
join audio_object w
on w.object_id = ff.object_id
and w.object_type = 'window'
join audio_object a
on a.object_id = w.parent_object_id
and a.object_type = 'asset'
join media_entity s
on s.entity_id = ff.song_id
where ff.feature_id = :feature_id;
```
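You can read it as four steps:
1. Start from `feature_fact` to find the feature row
2. Follow `object_id` to the `window` it is bound to
3. Follow `parent_object_id` to the `asset` that window belongs to
4. Follow `song_id` to the `song` it ultimately belongs to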
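The four steps can be tried end to end with a small in-memory stand-in. The sketch below uses Python's built-in `sqlite3` instead of PostgreSQL and a heavily simplified, assumed set of columns (only the ones the join touches); the IDs and the `clip1.wav` value are illustrative, and the real DDL has more fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in tables: only the columns the binding join needs.
conn.executescript("""
create table media_entity (entity_id integer primary key, entity_type text, biz_key text);
create table audio_object (
    object_id integer primary key, object_type text, song_id integer,
    parent_object_id integer, start_ms integer, end_ms integer, storage_uri text);
create table feature_fact (
    feature_id integer primary key, feature_type text, object_id integer,
    song_id integer, model_name text);
insert into media_entity values (1001, 'song', 'song_alpha');
insert into audio_object values (2001, 'asset', 1001, null, null, null, 'clip1.wav');
insert into audio_object values (3001, 'window', 1001, 2001, 0, 5000, null);
insert into feature_fact values (1, 'embedding', 3001, 1001, 'mert-v1-95m');
""")

# feature -> window (object_id) -> asset (parent_object_id) -> song (song_id)
row = conn.execute("""
select ff.model_name, w.object_id, w.start_ms, w.end_ms, a.storage_uri, s.biz_key
from feature_fact ff
join audio_object w on w.object_id = ff.object_id and w.object_type = 'window'
join audio_object a on a.object_id = w.parent_object_id and a.object_type = 'asset'
join media_entity s on s.entity_id = ff.song_id
where ff.feature_id = ?
""", (1,)).fetchone()
print(row)
```

Note that the song is reached through `ff.song_id` directly rather than through the asset, which is what the redundant column is for.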
## 15. A complete multi-asset / multi-window / multi-model example
Assume:
- A single `song_id = 1001`
- Two audio files: `master.wav` and `ugc_clip.mp3`
- Each asset is cut into 2 windows
- Every window runs `chromaprint + mert-v1-95m + muq-base`
- Every window runs `chromaprint_matcher + mert-v1-95m + muq-large-msd-iter`
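These assumptions fix the number of rows per table, which is easy to check by enumeration. The sketch below is a minimal illustration using the asset and window IDs from this example's logical structure; the dictionary layout and variable names are hypothetical, and it assumes exactly one `feature_fact` row per window per model.

```python
SONG_ID = 1001

# asset_id -> list of (window_id, start_ms, end_ms), as in the example's structure
WINDOWS_BY_ASSET = {
    2001: [(3001, 0, 5000), (3002, 2500, 7500)],         # master.wav
    2002: [(3003, 10000, 15000), (3004, 12500, 17500)],  # ugc_clip.mp3
}
MODELS = [
    ("fingerprint", "chromaprint_matcher"),
    ("embedding", "mert-v1-95m"),
    ("embedding", "muq-large-msd-iter"),
]

# One feature_fact row per (window, model); song_id is copied onto every row.
feature_rows = [
    {"object_id": window_id, "song_id": SONG_ID,
     "feature_type": feature_type, "model_name": model_name}
    for windows in WINDOWS_BY_ASSET.values()
    for window_id, _start_ms, _end_ms in windows
    for feature_type, model_name in MODELS
]

row_counts = {
    "media_entity": 1,
    "audio_object(asset)": len(WINDOWS_BY_ASSET),
    "audio_object(window)": sum(len(w) for w in WINDOWS_BY_ASSET.values()),
    "feature_fact": len(feature_rows),
}
print(row_counts)
```

Under these assumptions the totals are 1 song, 2 assets, 4 windows, and 2 x 2 x 3 = 12 feature rows.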
### 14.1 Logical structure
......@@ -650,22 +766,22 @@ order by ff.feature_type, ff.model_name;
song(1001)
-> asset(2001, master.wav)
-> window(3001, 0-5000)
-> chromaprint
-> chromaprint_matcher
-> mert-v1-95m
-> muq-base
-> muq-large-msd-iter
-> window(3002, 2500-7500)
-> chromaprint
-> chromaprint_matcher
-> mert-v1-95m
-> muq-base
-> muq-large-msd-iter
-> asset(2002, ugc_clip.mp3)
-> window(3003, 10000-15000)
-> chromaprint
-> chromaprint_matcher
-> mert-v1-95m
-> muq-base
-> muq-large-msd-iter
-> window(3004, 12500-17500)
-> chromaprint
-> chromaprint_matcher
-> mert-v1-95m
-> muq-base
-> muq-large-msd-iter
```
### 14.2 How many rows this produces
......@@ -706,7 +822,7 @@ order by a.object_id, w.start_ms, ff.feature_type, ff.model_name;
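### 14.4 Find windows that are missing a given model
This SQL is well suited to backfill scans, for example checking which windows have not yet run `muq-base`:
This SQL is well suited to backfill scans, for example checking which windows have not yet run `muq-large-msd-iter`: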
```sql
select w.object_id as window_id,
......@@ -721,7 +837,7 @@ where w.object_type = 'window'
from feature_fact ff
where ff.object_id = w.object_id
and ff.feature_type = 'embedding'
and ff.model_name = 'muq-base'
and ff.model_name = 'muq-large-msd-iter'
and ff.model_version = 'hf-main'
and ff.feature_set_name = 'muq_5s_hop2.5_v1'
)
......@@ -746,7 +862,7 @@ order by w.start_ms;
---
## 15. Batch ingestion and index-building examples
## 16. Batch ingestion and index-building examples
### 15.1 Recommended batch order
......@@ -756,7 +872,7 @@ batch-2: audio_object(asset)
batch-3: audio_object(window)
batch-4: feature_fact(chromaprint)
batch-5: feature_fact(mert-v1-95m)
batch-6: feature_fact(muq-base)
batch-6: feature_fact(muq-large-msd-iter)
```
### 15.2 Recommended additional indexes
......
......@@ -67,7 +67,68 @@ song -> asset -> window -> fingerprint / embedding
| feature set identity | `feature_fact` | `feature_set_name`, `feature_schema_ver` | distinguishes feature configuration, windowing strategy, and schema version |
| reference routing | `set_membership` | `set_type`, `set_name` | controls the reference/eval/hot scope |
### 4.1 A key design point
### 4.1 How exactly feature and audio_object are bound
This is the most critical layer of the current schema:
```text
feature_fact.object_id -> audio_object.object_id
```
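Meaning:
- A `feature_fact` row always corresponds to one specific audio object
- In the current Phase-1 main chain, that object is a `window` by default
- So the smallest unit of evidence for a retrieval hit is a `window`, not the whole song and not the whole asset
Tracing further up: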
```text
feature_fact.object_id -> window.object_id
window.parent_object_id -> asset.object_id
window.song_id / feature_fact.song_id -> media_entity.entity_id
```
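In other words:
- `object_id` binds the feature to "exactly which stretch of audio"
- `parent_object_id` leads back to "which asset this audio belongs to"
- `song_id` leads quickly back to "which song it ultimately belongs to"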
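### 4.2 Why `feature_fact` also stores `song_id` redundantly
Because in copyright-protection scenarios the online service ultimately has to output a `song_id` quickly.
So `feature_fact.song_id` is a **deliberately redundant field**, serving three purposes:
- Reduce join cost during song-level aggregation after recall
- Allow coverage audits directly by `song_id + model_name + feature_type`
- Make it easy to fold `window` hits into song-level evidence later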
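To make the third point concrete, the sketch below folds window-level hits into per-song evidence using only the `song_id` carried on each hit, with no lookup back through `audio_object`. The hit tuples, the scores, and the "best score wins" ranking are all assumptions made for illustration; the actual aggregation policy is not specified here.

```python
from collections import defaultdict

# Hypothetical recall output: (window_id, song_id, model_name, score).
hits = [
    (3001, 1001, "chromaprint_matcher", 1.00),
    (3002, 1001, "mert-v1-95m", 0.91),
    (3105, 1002, "mert-v1-95m", 0.62),
]


def fold_to_songs(window_hits):
    """Group window hits by the song_id stored on the feature row itself."""
    songs = defaultdict(lambda: {"windows": set(), "models": set(), "best_score": 0.0})
    for window_id, song_id, model_name, score in window_hits:
        evidence = songs[song_id]
        evidence["windows"].add(window_id)
        evidence["models"].add(model_name)
        evidence["best_score"] = max(evidence["best_score"], score)
    return dict(songs)


evidence = fold_to_songs(hits)
# Rank songs by their strongest single window hit (an assumed policy).
ranked = sorted(evidence, key=lambda sid: evidence[sid]["best_score"], reverse=True)
print(ranked)
```

Without the redundant column, the same grouping would first need a join from each hit's `object_id` to its window row to recover the song.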
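### 4.3 Why Phase-1 binds features to `window` rather than `asset` by default
Because the Phase-1 goal is not just to know "roughly whose audio this resembles"; it also needs to preserve:
- The offset of the hit
- The specific 5s segment that was hit
- Parallel exact / semantic evidence over the same time span
The default policy is therefore:
- `asset`: carries the original audio file
- `window`: the smallest unit for retrieval, matching, and trace-back
- `feature_fact`: attached to a `window` by default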
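For reference, the window boundaries implied by a 5s window with a 2.5s hop (as in the feature set name `mert_5s_hop2.5_v1`) can be sketched as below. The `slice_windows` helper is hypothetical, and dropping a trailing partial window is an assumption; the real slicer may pad or keep the tail instead.

```python
def slice_windows(duration_ms, window_ms=5000, hop_ms=2500):
    """Return (start_ms, end_ms) spans that fit fully inside the asset.

    Assumption: a trailing partial window is dropped rather than padded.
    """
    if window_ms <= 0 or hop_ms <= 0:
        raise ValueError("window_ms and hop_ms must be positive")
    last_start = duration_ms - window_ms
    return [(start, start + window_ms) for start in range(0, last_start + 1, hop_ms)]


spans = slice_windows(8000)
print(spans)
```

For an 8000 ms asset this yields the `0-5000` and `2500-7500` windows, and each span becomes one `audio_object(window)` row that features can attach to.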
### 4.4 A minimal chain diagram
```mermaid
flowchart LR
    F["feature_fact<br/>model_name=mert-v1-95m"] --> W["audio_object<br/>object_type=window"]
    W --> A["audio_object<br/>object_type=asset"]
    W --> S["media_entity<br/>entity_type=song"]
    F --> S
```
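### 4.5 A key design point
Currently, **model information is not placed in a separate registry table that the default main chain depends on**; it is recorded directly in `feature_fact`:
- This keeps Phase-1 lighter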
......@@ -610,10 +671,10 @@ flowchart TD
| lane | model_name | model_version | feature_type | purpose |
|---|---|---|---|---|
| exact | `chromaprint` | `1.0` | `fingerprint` | high-precision exact hits |
| semantic baseline | `mert-v1-95m` | `hf-main` | `embedding` | song semantic baseline |
| semantic challenger | `muq-base` | `hf-main` | `embedding` | cover / bgm / complex-interference challenger |
| semantic fallback | `local_wavehash_embed` | `phase1_local` | `embedding` | fallback when the current host lacks the runtime |
| exact (current live) | `chromaprint_matcher` | `phase1_local` | `fingerprint` | current live exact baseline |
| semantic baseline (current live) | `mert-v1-95m` | `hf-main` | `embedding` | current live semantic baseline |
| semantic challenger (planned) | `muq-large-msd-iter` | `hf-main` | `embedding` | next-stage cover / bgm / complex-interference challenger |
| semantic fallback | `local_wavehash_embed` | `phase1_local` | `embedding` | fallback when the runtime is unavailable |
| historical baseline | `ecapa-tdnn` | `baseline_only` | `embedding` | historical comparison only; not recommended as the Phase-1 lead |
### 16.2 Recommended fields for pinning model identity
......@@ -635,17 +696,17 @@ flowchart TD
```
For example:
- `chromaprint_5s_v1`
- `mert_5s_hop2.5_v1`
- `muq_5s_hop2.5_v1`
- `wavehash_5s_hop2.5_v1`
- `chromaprint_matcher_5s` (current live)
- `mert_5s_hop2.5_v1` (current live)
- `muq_5s_hop2.5_v1` (planned)
- `wavehash_5s_hop2.5_v1` (fallback)
### 16.4 Recommended Phase-1 storage rules
#### exact lane
- `feature_type = 'fingerprint'`
- `fingerprint_value` is required
- `model_name = 'chromaprint'`
- `model_name = 'chromaprint_matcher'`
- `embedding_uri / vector_table_name` are left empty
#### semantic lane
......@@ -674,7 +735,7 @@ flowchart TD
3. Cut windows and write `audio_object(window)`
4. Run `chromaprint`, then write `feature_fact(fingerprint)`
5. Run `mert-v1-95m`, then write `feature_fact(embedding)`
6. Run `muq-base`, then write `feature_fact(embedding)`
6. In the next stage, wire in `muq-large-msd-iter` and write `feature_fact(embedding)`
7. If the runtime is unavailable, at least write the `local_wavehash_embed` fallback
This ultimately yields:
......@@ -683,7 +744,7 @@ flowchart TD
The same window
-> 1 chromaprint fingerprint row
-> 1 mert embedding row
-> 1 muq embedding row
-> 1 muq embedding row (once integrated)
-> (optional) 1 fallback embedding row
```
......@@ -693,7 +754,87 @@ flowchart TD
---
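## 17. Batch ingestion and index-building strategy for 1M audio files / 300K songs
## 17. Current live example: how a feature resolves back to song_id
Below are the actual main-chain conventions in the current PostgreSQL schema `acr_songcentric_test`:
- When `feature_type = 'fingerprint'`, the current live `model_name = 'chromaprint_matcher'`
- When `feature_type = 'embedding'`, the current live baseline `model_name = 'mert-v1-95m'`
- Historical test data still contains old placeholder / fallback rows, but they are not the current default baseline
### 17.1 A real manifest sample (before import)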
```json
{
"song": {"biz_key": "song_alpha", "title": "song alpha", "artist_name": "artist a"},
"asset": {"storage_uri": ".../clip1.wav", "duration_ms": 8000},
"windows": [
{
"start_ms": 0,
"end_ms": 5000,
"features": [
{
"feature_type": "fingerprint",
"model_name": "chromaprint_matcher",
"model_version": "phase1_local",
"feature_set_name": "chromaprint_matcher_5s"
},
{
"feature_type": "embedding",
"model_name": "mert-v1-95m",
"model_version": "hf-main",
"feature_set_name": "mert_5s_hop2.5_v1",
"embedding_dim": 768
}
]
}
]
}
```
### 17.2 What the binding result should look like after import
```text
media_entity(song_alpha)
-> audio_object(asset: clip1.wav)
-> audio_object(window: 0-5000)
-> feature_fact(fingerprint, chromaprint_matcher)
-> feature_fact(embedding, mert-v1-95m)
```
### 17.3 Query which window / asset / song a feature is bound to
```sql
select ff.feature_id,
ff.feature_type,
ff.model_name,
ff.feature_set_name,
w.object_id as window_id,
w.start_ms,
w.end_ms,
a.object_id as asset_id,
a.storage_uri,
s.entity_id as song_id,
s.biz_key
from feature_fact ff
join audio_object w
on w.object_id = ff.object_id
and w.object_type = 'window'
join audio_object a
on a.object_id = w.parent_object_id
and a.object_type = 'asset'
join media_entity s
on s.entity_id = ff.song_id
where ff.feature_id = :feature_id;
```
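This SQL answers:
- Which model computed this feature
- Which window it is bound to
- Which asset that window belongs to
- Which `song_id` it should ultimately be attributed to
## 18. Batch ingestion and index-building strategy for 1M audio files / 300K songs
At the current scale, the most important principle is not to compute every model in one pass, but rather: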
......
......@@ -42,7 +42,7 @@ acr-engine/scripts/start_songcentric_shortest_path.sh 'postgres://d2:d2pass@127.
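> **The 4-table song-centric schema has been proven end to end on live PostgreSQL as the host chain "real directory -> slicing -> exact/semantic feature enrichment -> import -> feature_fact".**
The most important next step is:
> **Without breaking this host chain, upgrade the semantic lane from the runtime-aware fallback to real MERT / MuQ adapters.**
> **Without breaking this host chain, continue extending the semantic lane from the current MERT baseline to the MuQ challenger.**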
---
......@@ -114,7 +114,7 @@ flowchart TD
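5. Real directory -> manifest -> import has been verified
6. Real directory -> fingerprint enrichment -> import has been verified
7. The exact lane now prefers reusing the in-repo `ChromaprintMatcher`
8. The semantic lane is runtime-ready; the current host can enter the placeholder runtime branch
8. The semantic lane now uses the real `mert-v1-95m` baseline; the live main chain on the current host is no longer stuck on the placeholder branch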
---
......@@ -169,48 +169,67 @@ flowchart TD
---
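## 10. Where the real semantic adapter should be wired in next
The most direct integration point is already clear:
- Entry script: `acr-engine/scripts/enrich_songcentric_manifest_with_local_features.py`
- Key function: `build_semantic_feature(...)`
### Current actual state
- The exact lane now prefers reusing `ChromaprintMatcher`
- The semantic lane has not yet been wired to real `MERT / MuQ`
- When the runtime is ready, it currently produces:
- `model_name = mert-v1-95m`
- The fallback branch is still retained:
- `model_name = local_wavehash_embed`
### Fresh dependency check facts
The current host still lacks:
- `torch`
- `torchaudio`
- `transformers`
### Most direct implementation order for the next session
1. Install `torch / torchaudio / transformers`
2. Wire real `MERT` and `MuQ` adapters into `build_semantic_feature(...)`
3. Keep the current `local_wavehash_embed` fallback; do not remove it
4. Rerun: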
```bash
cd /workspace
/usr/local/miniconda3/bin/python acr-engine/scripts/run_songcentric_directory_pipeline_live.py \
--dsn 'postgres://d2:d2pass@127.0.0.1:5432/d2' \
--schema acr_songcentric_test \
--input-root acr-engine/data/songcentric_builder_smoke \
--output-dir acr-engine/data/pgvector_eval/music20
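## 10. Data relationships and current live storage facts
Only three binding relationships matter most right now:
1. `feature_fact.object_id -> audio_object.object_id`
   - A feature is bound to a specific audio object
   - Phase-1 binds to `window` by default, not directly to the song
2. `audio_object.parent_object_id -> audio_object.object_id`
   - The `window -> asset` parent-child trace-back chain
3. `feature_fact.song_id -> media_entity.entity_id`
   - Used for fast song-level aggregation and for returning the final `song_id`
In one sentence:
> `audio_object` says "which audio this is"; `feature_fact` says "which model encoded this audio into which feature".
### What the current live main chain has actually written
New live data is currently written as:
- exact: `chromaprint_matcher / phase1_local / chromaprint_matcher_5s`
- semantic baseline: `mert-v1-95m / hf-main / mert_5s_hop2.5_v1`
Current MuQ status:
- Target model: `OpenMuQ/MuQ-large-msd-iter`
- Current blocker: `import muq` raises `RuntimeError: operator torchvision::nms does not exist`
- Conclusion: MuQ remains the next-stage challenger, not the current live default baseline
### Current manifest shape (before import)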
```json
{
"song": {"biz_key": "song_alpha", "title": "song alpha"},
"asset": {"storage_uri": ".../clip1.wav"},
"windows": [
{
"start_ms": 0,
"end_ms": 5000,
"features": [
{
"feature_type": "fingerprint",
"model_name": "chromaprint_matcher",
"feature_set_name": "chromaprint_matcher_5s"
},
{
"feature_type": "embedding",
"model_name": "mert-v1-95m",
"feature_set_name": "mert_5s_hop2.5_v1",
"embedding_dim": 768
}
]
}
]
}
```
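### Expected changes in fresh metrics
- `semantic_runtime_available = true`
- `semantic_runtime_ready_count > 0`
- `semantic_fallback_count` drops significantly or reaches zero
### Most direct continuation points for the next session
1. Do not re-verify whether MERT is wired in; it already is
2. Go straight to the MuQ `torchvision::nms` compatibility issue
3. Wire in the `OpenMuQ/MuQ-large-msd-iter` challenger
4. Rerun the main-chain runner and confirm that every window ends up with all of:
   - `chromaprint_matcher`
   - `mert-v1-95m`
   - `muq-large-msd-iter` (or whatever `model_name` is finally standardized)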
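One way to keep the main chain running while the MuQ blocker is open is to probe the import before choosing which semantic models to run. The sketch below is only an illustration: `probe_muq` is a hypothetical helper, it catches `ImportError` (package missing) and `RuntimeError` (the failure type reported for `torchvision::nms`), and it deliberately never calls `MuQ.from_pretrained(...)`, so no weights are downloaded.

```python
import importlib


def probe_muq():
    """Return (available, reason) for the optional `muq` package.

    Only the import is attempted; model weights are never loaded here.
    """
    try:
        importlib.import_module("muq")
    except (ImportError, RuntimeError) as exc:
        # ImportError: package not installed.
        # RuntimeError: import-time failure such as the torchvision::nms issue.
        return False, f"{type(exc).__name__}: {exc}"
    return True, ""


muq_available, muq_reason = probe_muq()

# The MERT baseline stays first; the challenger is added only when importable.
semantic_models = ["mert-v1-95m"]
if muq_available:
    semantic_models.append("muq-large-msd-iter")
else:
    print(f"MuQ challenger skipped: {muq_reason}")
print(semantic_models)
```

Keeping the baseline unconditional mirrors the constraint that MuQ work must not disturb the current MERT path.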
......
......@@ -78,6 +78,12 @@ song -> asset -> window -> fingerprint / embedding
| Model information | `feature_fact` | `model_name`, `model_version`, `feature_set_name` |
| reference/eval/hot sets | `set_membership` | `set_type`, `set_name` |
Additional notes:
- `feature_fact.object_id -> audio_object.object_id`: a feature is bound directly to a specific audio object; Phase-1 binds to `window` by default
- `audio_object.parent_object_id`: traces a `window` back to its `asset`
- `feature_fact.song_id -> media_entity.entity_id`: redundancy kept for song-level aggregation and for returning `song_id` quickly
- For a detailed explanation of just this layer, read section 4 of [postgresql-data-model.md](./postgresql-data-model.md) and section 5 of [postgres_db_schema_samples.md](./postgres_db_schema_samples.md).
---
## 5. 当前主链流程图
......@@ -99,8 +105,9 @@ flowchart TD
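- The live PostgreSQL schema has been created successfully
- Real directory -> manifest -> import works end to end
- Real directory -> fingerprint enrichment -> import works end to end
- The semantic lane has been made runtime-ready
- The current host can enter the runtime-ready placeholder branch; the next step is to wire in `MuQ` without breaking the current MERT baseline
- The semantic lane now uses the real `mert-v1-95m` baseline
- The live main chain on the current host now writes `chromaprint_matcher + mert-v1-95m`
- The next step is to wire in the `MuQ` challenger without breaking the current MERT baseline
- The current exact lane already prefers reusing the in-repo `ChromaprintMatcher`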
---
......