Commit 8e2d4852 8e2d485235ca77929d668ec68cd1329745c062c4 by cnb.bofCdSsphPA

Close the planner validation loop across all four live entrypoints

Constraint: Partial execution proof for planner validation commands still left room for manual reconstruction risk, so the remaining entrypoints had to be exercised too.
Rejected: Stop after two executed planner commands | It would leave the negative matrix and asset-upsert entrypoints unproven.
Confidence: high
Scope-risk: narrow
Directive: Treat phase1_validation_commands_execution_report.json as the authoritative proof that the planner artifact is executable end-to-end.
Tested: git diff --check; /usr/local/miniconda3/bin/python - <<'PY' ... execute validation_commands.semantic_vector_negative_matrix and validation_commands.asset_level_upsert_validation from data/pgvector_eval/music20/phase1_extraction_plan_report.json ... PY
Not-tested: Individual extraction jobs still remain environment-blocked; this commit proves validation entrypoints, not successful feature extraction.
1 parent fa33c3a1
...@@ -12,5 +12,28 @@ ...@@ -12,5 +12,28 @@
12 "stdout_tail": "{\n \"schema\": \"acr_test\",\n \"dsn_redacted\": \"postgres://d2:***@127.0.0.1:5432/d2\",\n \"exact_lane\": {\n \"job_id\": 1,\n \"returncode\": 0,\n \"job_status\": \"failed\",\n \"failure_reason\": \"unreadable_audio_assets\",\n \"missing_asset_count\": 20,\n \"artifact\": \"data/pgvector_eval/music20/phase1_worker_contract_smoke_exact.json\"\n },\n \"semantic_lane\": {\n \"returncode\": 0,\n \"semantic_job_count\": 4,\n \"failed_jobs\": 4,\n \"unique_blockers\": [\n \"model_runtime_unavailable\",\n \"unreadable_audio_assets\"\n ],\n \"artifact\": \"data/pgvector_eval/music20/phase1_worker_contract_smoke_semantic_matrix.json\"\n },\n \"summary\": {\n \"exact_status\": \"failed\",\n \"semantic_failed_jobs\": 4,\n \"shared_environment_blockers\": [\n \"missing /workspace/downloads mount\",\n \"missing semantic model runtime dependencies\"\n ]\n }\n}\n", 12 "stdout_tail": "{\n \"schema\": \"acr_test\",\n \"dsn_redacted\": \"postgres://d2:***@127.0.0.1:5432/d2\",\n \"exact_lane\": {\n \"job_id\": 1,\n \"returncode\": 0,\n \"job_status\": \"failed\",\n \"failure_reason\": \"unreadable_audio_assets\",\n \"missing_asset_count\": 20,\n \"artifact\": \"data/pgvector_eval/music20/phase1_worker_contract_smoke_exact.json\"\n },\n \"semantic_lane\": {\n \"returncode\": 0,\n \"semantic_job_count\": 4,\n \"failed_jobs\": 4,\n \"unique_blockers\": [\n \"model_runtime_unavailable\",\n \"unreadable_audio_assets\"\n ],\n \"artifact\": \"data/pgvector_eval/music20/phase1_worker_contract_smoke_semantic_matrix.json\"\n },\n \"summary\": {\n \"exact_status\": \"failed\",\n \"semantic_failed_jobs\": 4,\n \"shared_environment_blockers\": [\n \"missing /workspace/downloads mount\",\n \"missing semantic model runtime dependencies\"\n ]\n }\n}\n",
13 "stderr_tail": "", 13 "stderr_tail": "",
14 "passed": true 14 "passed": true
15 },
16 "semantic_vector_negative_matrix": {
17 "command": "cd /workspace/acr-engine && PG_DSN=\"${PG_DSN:?set PG_DSN}\" /usr/local/miniconda3/bin/python scripts/run_embedding_vector_table_negative_matrix_live.py --dsn \"$PG_DSN\" --output data/pgvector_eval/music20/embedding_vector_table_negative_matrix_report.json",
18 "returncode": 0,
19 "stdout_tail": "dding_vector_table_not_allowlisted_attempt.json\"\n },\n {\n \"case\": \"vector_table_missing_in_schema\",\n \"schema\": \"acr_vector_table_missing_test\",\n \"vector_table\": \"audio_embedding_vector_768\",\n \"job_status\": \"failed\",\n \"failure_reason\": \"preflight_failed\",\n \"preflight_blockers\": [\n \"unreadable_audio_assets\",\n \"vector_table_missing_in_schema\",\n \"model_runtime_unavailable\"\n ],\n \"vector_table_report\": {\n \"reason\": \"vector_table_missing_in_schema\",\n \"resolved\": false,\n \"expected_dim\": 768,\n \"table_exists\": false,\n \"allowed_vector_tables\": [\n \"audio_embedding_vector_192\",\n \"audio_embedding_vector_768\"\n ],\n \"requested_vector_table\": \"audio_embedding_vector_768\"\n },\n \"artifact\": \"data/pgvector_eval/music20/embedding_vector_table_missing_in_schema_attempt.json\"\n }\n ],\n \"summary\": {\n \"expected_reasons\": {\n \"vector_table_dim_mismatch\": \"vector_table_dim_mismatch\",\n \"vector_table_not_allowlisted\": \"vector_table_not_allowlisted\",\n \"vector_table_missing_in_schema\": \"vector_table_missing_in_schema\"\n },\n \"all_failed\": true\n }\n}\n",
20 "stderr_tail": "",
21 "passed": true
22 },
23 "asset_level_upsert_validation": {
24 "command": "cd /workspace/acr-engine && PG_DSN=\"${PG_DSN:?set PG_DSN}\" /usr/local/miniconda3/bin/python scripts/validate_audio_embedding_asset_upsert_live.py --dsn \"$PG_DSN\" --schema acr_asset_upsert_test --output data/pgvector_eval/music20/audio_embedding_asset_upsert_live_report.json",
25 "returncode": 0,
26 "stdout_tail": "\n \"upsert_embedding_id\": 1,\n \"same_embedding_id_reused\": true,\n \"counts\": {\n \"audio_embedding\": 1,\n \"audio_embedding_vector_192\": 1\n },\n \"final_state\": {\n \"embedding_id\": 1,\n \"asset_id\": 1,\n \"window_id\": null,\n \"checksum\": \"checksum-v2\",\n \"embedding_uri\": \"inline://asset-probe-upsert\",\n \"metadata_json\": {\n \"probe\": \"asset_level_upsert_v2\"\n },\n \"vector_literal\": \"[0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2,0.2]\"\n },\n \"passed\": true\n}\n",
27 "stderr_tail": "",
28 "passed": true
29 },
30 "summary": {
31 "executed_commands": [
32 "asset_level_upsert_validation",
33 "prereq_audit",
34 "semantic_vector_negative_matrix",
35 "worker_contract_smoke"
36 ],
37 "all_passed": true
15 } 38 }
16 } 39 }
...\ No newline at end of file ...\ No newline at end of file
......
1 ## 2026-06-04 1 ## 2026-06-04
2 2
3 - Updated `phase1_validation_commands_execution_report.json` with direct execution evidence for the planner's remaining two validation commands: `semantic_vector_negative_matrix` and `asset_level_upsert_validation` now also return `returncode=0`, so all 4 validation entrypoints are verified as directly consumable by scripts.
3 - Added `phase1_validation_commands_execution_report.json`, which reads `validation_commands.prereq_audit` and `validation_commands.worker_contract_smoke` directly from `phase1_extraction_plan_report.json` and executes them; both commands return `0`, proving that the planner output can be consumed directly by scripts. 4 - Added `phase1_validation_commands_execution_report.json`, which reads `validation_commands.prereq_audit` and `validation_commands.worker_contract_smoke` directly from `phase1_extraction_plan_report.json` and executes them; both commands return `0`, proving that the planner output can be consumed directly by scripts.
4 - Updated `scripts/plan_phase1_extraction_jobs_live.py` and `phase1_extraction_plan_report.json`: in addition to the per-job `command_suggestions`, the plan now includes `validation_commands` (`prereq_audit`, `worker_contract_smoke`, `semantic_vector_negative_matrix`, `asset_level_upsert_validation`), so the planner itself also serves as the execution entrypoint for the next session. 5 - Updated `scripts/plan_phase1_extraction_jobs_live.py` and `phase1_extraction_plan_report.json`: in addition to the per-job `command_suggestions`, the plan now includes `validation_commands` (`prereq_audit`, `worker_contract_smoke`, `semantic_vector_negative_matrix`, `asset_level_upsert_validation`), so the planner itself also serves as the execution entrypoint for the next session.
5 - Added `scripts/run_phase1_prereq_audit_live.py` and `phase1_prereq_audit_report.json`, which consolidate the `/workspace/downloads` mount status, the `torch/torchaudio/transformers/speechbrain` dependency status, and the readiness of the 5 Phase-1 jobs into a single live audit report; the current result is `ready_jobs=0`, `blocked_jobs=5`. 6 - Added `scripts/run_phase1_prereq_audit_live.py` and `phase1_prereq_audit_report.json`, which consolidate the `/workspace/downloads` mount status, the `torch/torchaudio/transformers/speechbrain` dependency status, and the readiness of the 5 Phase-1 jobs into a single live audit report; the current result is `ready_jobs=0`, `blocked_jobs=5`.
......
...@@ -482,5 +482,12 @@ cd /workspace/acr-engine && PG_DSN="${PG_DSN:?set PG_DSN}" EXTRACTION_JOB_ID=2 F ...@@ -482,5 +482,12 @@ cd /workspace/acr-engine && PG_DSN="${PG_DSN:?set PG_DSN}" EXTRACTION_JOB_ID=2 F
482 482
483 - `validation_commands.prereq_audit` -> `returncode = 0` 483 - `validation_commands.prereq_audit` -> `returncode = 0`
484 - `validation_commands.worker_contract_smoke` -> `returncode = 0` 484 - `validation_commands.worker_contract_smoke` -> `returncode = 0`
485 - `validation_commands.semantic_vector_negative_matrix` -> `returncode = 0`
486 - `validation_commands.asset_level_upsert_validation` -> `returncode = 0`
487
488 `phase1_validation_commands_execution_report.json` currently reports:
489
490 - `executed_commands = 4`
491 - `all_passed = true`
485 492
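486 This shows that the planner report no longer just "displays commands"; scripts can consume it as a real execution entrypoint. 493 This shows that the planner report no longer just "displays commands"; scripts can consume it as a real execution entrypoint.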
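
As a rough illustration of what "consuming the planner report as an execution entrypoint" could look like, here is a minimal sketch. It is not the script used for this commit. It assumes that `validation_commands` in the plan is a flat mapping from command name to shell command string, and that the execution report stores one entry per command plus a `summary` block, mirroring the fields visible in the diff above. `TAIL_CHARS`, `run_validation_commands`, and `write_execution_report` are hypothetical names introduced only for this example.

```python
import json
import subprocess
from pathlib import Path

# Assumption: how many trailing characters of output to keep; the real limit is not shown.
TAIL_CHARS = 2000


def run_validation_commands(plan: dict) -> dict:
    """Run every entry under plan["validation_commands"] and collect the results.

    Assumes each value is a shell command string (hypothetical plan shape).
    """
    report = {}
    for name, command in plan["validation_commands"].items():
        # shell=True because the planner commands use `cd ... &&` and env expansion.
        proc = subprocess.run(command, shell=True, capture_output=True, text=True)
        report[name] = {
            "command": command,
            "returncode": proc.returncode,
            "stdout_tail": proc.stdout[-TAIL_CHARS:],
            "stderr_tail": proc.stderr[-TAIL_CHARS:],
            "passed": proc.returncode == 0,
        }
    # Build the summary before adding it, so it only covers command entries.
    names = sorted(report)
    report["summary"] = {
        "executed_commands": names,
        "all_passed": all(report[n]["passed"] for n in names),
    }
    return report


def write_execution_report(plan_path: str, report_path: str) -> dict:
    """Load the plan JSON, run its validation commands, and write the report JSON."""
    plan = json.loads(Path(plan_path).read_text(encoding="utf-8"))
    report = run_validation_commands(plan)
    Path(report_path).write_text(json.dumps(report, indent=2), encoding="utf-8")
    return report
```

Under these assumptions, calling `write_execution_report` with the path to `phase1_extraction_plan_report.json` and a destination path would yield a report whose `summary.all_passed` is true only when every command exits with `0`. As the commit message notes, a passing exit code here proves the entrypoint runs, not that feature extraction succeeded.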
......
...@@ -197,7 +197,7 @@ sed -n '1,320p' acr-engine/sql/acr_pg_schema_v2.sql ...@@ -197,7 +197,7 @@ sed -n '1,320p' acr-engine/sql/acr_pg_schema_v2.sql
197 - `scripts/run_embedding_vector_table_negative_matrix_live.py` has completed the semantic vector-table negative matrix on live PostgreSQL: all three error classes, `vector_table_dim_mismatch`, `vector_table_not_allowlisted`, and `vector_table_missing_in_schema`, are reliably written to `vector_table_report.reason` 197 - `scripts/run_embedding_vector_table_negative_matrix_live.py` has completed the semantic vector-table negative matrix on live PostgreSQL: all three error classes, `vector_table_dim_mismatch`, `vector_table_not_allowlisted`, and `vector_table_missing_in_schema`, are reliably written to `vector_table_report.reason`
198 - `scripts/run_phase1_prereq_audit_live.py` now provides a prerequisite audit for the current host: `downloads_root_exists=false`, `ready_jobs=0/5`, and it records the missing status of `torch/torchaudio/transformers/speechbrain` per job in a JSON report 198 - `scripts/run_phase1_prereq_audit_live.py` now provides a prerequisite audit for the current host: `downloads_root_exists=false`, `ready_jobs=0/5`, and it records the missing status of `torch/torchaudio/transformers/speechbrain` per job in a JSON report
199 - `phase1_extraction_plan_report.json` now ships with `validation_commands`, so the next session can copy the four command types `prereq_audit / worker_contract_smoke / semantic_vector_negative_matrix / asset_level_upsert_validation` straight from the planner 199 - `phase1_extraction_plan_report.json` now ships with `validation_commands`, so the next session can copy the four command types `prereq_audit / worker_contract_smoke / semantic_vector_negative_matrix / asset_level_upsert_validation` straight from the planner
200 - `phase1_validation_commands_execution_report.json` has proven that the planner's two validation commands, `prereq_audit` and `worker_contract_smoke`, can be consumed directly by scripts with `returncode=0` 200 - `phase1_validation_commands_execution_report.json` has proven that all 4 of the planner's validation commands can be consumed directly by scripts with `returncode=0`: `prereq_audit`, `worker_contract_smoke`, `semantic_vector_negative_matrix`, `asset_level_upsert_validation`
201 - `phase1_hot_reference_v1` now has `20` reference members actually populated in `acr_test`, so the scope the worker dry-run currently sees is `20 recordings / 20 assets / 20 windows` 201 - `phase1_hot_reference_v1` now has `20` reference members actually populated in `acr_test`, so the scope the worker dry-run currently sees is `20 recordings / 20 assets / 20 windows`
202 - The worker contract now has basic precondition-state protection; re-running the same chromaprint dry-run job is explicitly rejected by `expected_status=pending`, with evidence in `phase1_worker_double_claim_guard_report.json` 202 - The worker contract now has basic precondition-state protection; re-running the same chromaprint dry-run job is explicitly rejected by `expected_status=pending`, with evidence in `phase1_worker_double_claim_guard_report.json`
203 - The exact lane's `run_chromaprint_job.py` now has a non-dry-run write path; its current live result in `acr_test` is an explicit `failed` because `/workspace/downloads/...` is missing, rather than a pretended `completed` 203 - The exact lane's `run_chromaprint_job.py` now has a non-dry-run write path; its current live result in `acr_test` is an explicit `failed` because `/workspace/downloads/...` is missing, rather than a pretended `completed`
......