Commit d1e1a2b7 d1e1a2b76dce134c8d9334dcd76a52491c24457c by cnb.bofCdSsphPA

Let future sessions wait on archive completion without manual polling

Constraint: The real FMA archive still needs time, but once it finishes the workflow should transition into extraction and readiness with minimal operator attention
Rejected: Keep completion detection as an entirely manual loop | Wastes attention and slows handoff at the exact moment the archive becomes useful
Confidence: high
Scope-risk: narrow
Directive: Use wait_for_fma_and_prepare.py as the passive bridge from long-running download to active dataset onboarding whenever unattended waiting is acceptable
Tested: /usr/local/miniconda3/bin/python -m py_compile acr-engine/scripts/wait_for_fma_and_prepare.py; /usr/local/miniconda3/bin/python acr-engine/scripts/wait_for_fma_and_prepare.py --interval 2 --max-cycles 2
Not-tested: The completed-path handoff into extraction remains pending full archive completion
1 parent 46b9d8d4
#!/usr/bin/env python3
"""Wait for the FMA archive to finish, then run post-download readiness."""
from __future__ import annotations

import argparse
import json
import subprocess
import time

PYTHON = "/usr/local/miniconda3/bin/python"
# Relative paths: run this script from the acr-engine directory.
INSPECT = [PYTHON, "scripts/prepare_fma_archive.py", "inspect"]
POST = [PYTHON, "scripts/fma_postdownload_ready.py"]


def inspect() -> dict:
    """Return the JSON snapshot printed by the inspect command."""
    return json.loads(subprocess.check_output(INSPECT, text=True))


def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--interval", type=float, default=30.0)
    parser.add_argument("--max-cycles", type=int, default=3)
    args = parser.parse_args()
    snapshots = []
    for cycle in range(args.max_cycles):
        snap = inspect()
        snapshots.append(snap)
        expected = snap.get("archive_bytes_expected", 0)
        # Require a positive expected size so a snapshot with missing fields
        # is not mistaken for a finished download (0 >= 0 would be true).
        if expected > 0 and snap.get("archive_size", 0) >= expected:
            result = json.loads(subprocess.check_output(POST, text=True))
            print(json.dumps({"status": "completed", "snapshots": snapshots, "postdownload": result}, indent=2, ensure_ascii=False))
            return
        # No need to sleep after the final check.
        if cycle < args.max_cycles - 1:
            time.sleep(args.interval)
    print(json.dumps({"status": "waiting", "snapshots": snapshots}, indent=2, ensure_ascii=False))


if __name__ == "__main__":
    main()
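The commit message lists the completed path as not yet tested, because it depends on the real archive finishing. One way to exercise it earlier is to inject the inspect and post-download steps as callables. The following is a minimal sketch under that assumption: `wait_loop`, `fake_inspect`, and `fake_post` are hypothetical names that are not part of the committed script, and the snapshot values are invented for illustration.

```python
from typing import Callable


def wait_loop(
    inspect: Callable[[], dict],
    post: Callable[[], dict],
    max_cycles: int,
    sleep: Callable[[float], None],
    interval: float = 30.0,
) -> dict:
    """Hypothetical variant of the polling loop with injectable steps."""
    snapshots = []
    for _ in range(max_cycles):
        snap = inspect()
        snapshots.append(snap)
        expected = snap.get("archive_bytes_expected", 0)
        if expected > 0 and snap.get("archive_size", 0) >= expected:
            return {"status": "completed", "snapshots": snapshots, "postdownload": post()}
        sleep(interval)
    return {"status": "waiting", "snapshots": snapshots}


# Invented snapshots: partial on the first check, full on the second.
_fake_snaps = iter([
    {"archive_size": 50, "archive_bytes_expected": 100},
    {"archive_size": 100, "archive_bytes_expected": 100},
])


def fake_inspect() -> dict:
    return next(_fake_snaps)


def fake_post() -> dict:
    return {"ready": True}  # placeholder payload, not the real script's output


# A no-op sleep keeps the check instantaneous.
outcome = wait_loop(fake_inspect, fake_post, max_cycles=5, sleep=lambda _s: None)
print(outcome["status"], len(outcome["snapshots"]))
```

This only checks the control flow of the handoff. It says nothing about whether `fma_postdownload_ready.py` itself succeeds on a real archive.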
......@@ -237,6 +237,30 @@
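### Stage: Wait for FMA completion and switch over automatically
Completed:
- Added [acr-engine/scripts/wait_for_fma_and_prepare.py](../acr-engine/scripts/wait_for_fma_and_prepare.py)
- Supports:
  - Periodically checking whether the FMA archive is complete
  - Automatically calling `fma_postdownload_ready.py` once it is complete
  - Returning a structured `waiting` snapshot while it is still incomplete
- Wired the script into [docs/open-dataset-workflow.md](./open-dataset-workflow.md) and [docs/session-handoff.md](./session-handoff.md)
Verification:
- `/usr/local/miniconda3/bin/python -m py_compile scripts/wait_for_fma_and_prepare.py` succeeded
- `/usr/local/miniconda3/bin/python scripts/wait_for_fma_and_prepare.py --interval 2 --max-cycles 2` succeeded
- Current result:
  - Snapshot 1: `archive_size=2110291968`
  - Snapshot 2: `archive_size=2115256320`
  - `status=waiting`
  - Latest progress: `27.5439%`
Conclusion:
- The repository can now bridge "wait for completion -> automatically move into extraction/readiness checks"
- Later sessions can block on a single command instead of polling by hand repeatedly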
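The two recorded snapshots also allow a rough download-rate estimate. This is a minimal sketch with several assumptions: the snapshots are taken to be about `--interval` seconds apart (time spent inside each inspect call is ignored), the reported percentage is taken to belong to the second snapshot, and the total size is back-computed from that percentage. `estimate` is a hypothetical helper, not part of the repository, and a two-second sample is a noisy basis for any forecast.

```python
def estimate(first: int, second: int, interval_s: float, progress_pct: float) -> dict:
    """Estimate rate and remaining time from two archive_size readings."""
    rate = (second - first) / interval_s         # bytes per second
    total = second / (progress_pct / 100.0)      # back-computed, approximate
    remaining_s = (total - second) / rate if rate > 0 else float("inf")
    return {"rate_bytes_per_s": rate, "total_bytes": total, "remaining_s": remaining_s}


# Values from the verification run above, taken with --interval 2.
est = estimate(2110291968, 2115256320, interval_s=2.0, progress_pct=27.5439)
print(f"{est['rate_bytes_per_s'] / 1e6:.2f} MB/s, about {est['remaining_s'] / 60:.0f} min left")
```

An estimate like this can help pick a `--max-cycles` value large enough to cover the remaining download time.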
### Stage: Automatic readiness after the FMA download completes
Completed:
......
......@@ -147,6 +147,20 @@ cd acr-engine
If the archive has not finished downloading, a structured `archive_not_complete` is returned
### Wait for FMA completion and switch over automatically
```bash
cd acr-engine
/usr/local/miniconda3/bin/python scripts/wait_for_fma_and_prepare.py --interval 30 --max-cycles 120
```
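What it does:
- Periodically checks whether `fma_small.zip` is complete
- Once it is complete, automatically proceeds to [scripts/fma_postdownload_ready.py](../acr-engine/scripts/fma_postdownload_ready.py)
- If it is not yet complete, returns `waiting` along with the most recent progress snapshots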
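Because the script prints a single JSON object, a follow-up step can branch on `status`. The sketch below assumes the output shape produced by the script (`status`, `snapshots`, and `postdownload` on completion). `summarize` is a hypothetical helper, and the sample payload is illustrative.

```python
import json


def summarize(raw: str) -> str:
    """Turn the script's JSON output into a one-line summary."""
    payload = json.loads(raw)
    snapshots = payload.get("snapshots", [])
    latest = snapshots[-1] if snapshots else {}
    if payload.get("status") == "completed":
        return "archive complete; post-download readiness ran"
    size = latest.get("archive_size", 0)
    return f"still waiting after {len(snapshots)} checks; archive_size={size}"


# Illustrative payload in the same shape as the script's "waiting" output.
sample = json.dumps({
    "status": "waiting",
    "snapshots": [{"archive_size": 2110291968}, {"archive_size": 2115256320}],
})
print(summarize(sample))
```

In practice `raw` would be the captured standard output of the command above, for example from `subprocess.run(..., capture_output=True, text=True).stdout`.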
## Sources
- See [dataset-spec.md](./dataset-spec.md)
- See [dataset-sources-and-licensing.md](./dataset-sources-and-licensing.md)
......
......@@ -319,3 +319,5 @@
- [CHANGELOG.md](./CHANGELOG.md)
- After the FMA download completes, run directly: [acr-engine/scripts/fma_postdownload_ready.py](../acr-engine/scripts/fma_postdownload_ready.py)
- To wait for the download to finish and then switch automatically to extraction/readiness checks, run directly: [acr-engine/scripts/wait_for_fma_and_prepare.py](../acr-engine/scripts/wait_for_fma_and_prepare.py)
......