Commit 730d9b90 730d9b908e8cb220e28b581ae65b4b185fd388d3 by cnb.bofCdSsphPA

Make long-running FMA archive progress legible at a glance

Constraint: Multi-session continuation gets brittle when large real-data downloads require manual byte math to estimate progress
Rejected: Leave inspect output as raw archive size only | Forces every future session to recalculate completion state by hand
Confidence: high
Scope-risk: narrow
Directive: Keep progress fields stable so handoff tooling and humans can rely on them during long archive transfers
Tested: /usr/local/miniconda3/bin/python -m py_compile acr-engine/scripts/prepare_fma_archive.py; /usr/local/miniconda3/bin/python acr-engine/scripts/prepare_fma_archive.py inspect
Not-tested: Completion of the full archive and downstream extraction remain pending
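
As an illustration of the directive above, a handoff tool might consume the stable progress fields roughly as follows. This is a minimal sketch, not code from the repository: `summarize_progress` is a hypothetical helper, and it assumes the caller already holds the dictionary returned by `inspect()`. The sample values are the ones recorded for this commit.

```python
def summarize_progress(report: dict) -> str:
    """Render a one-line status from an inspect() report (hypothetical helper)."""
    expected = report["archive_bytes_expected"]
    size = report["archive_size"]
    # Clamp at zero in case the file on disk is larger than expected.
    remaining = max(expected - size, 0)
    percent = report["archive_progress_percent"]
    return f"{percent:.4f}% downloaded, {remaining} bytes remaining"


# Sample report built from the values recorded for this commit.
sample = {
    "archive_bytes_expected": 7679594875,
    "archive_size": 61550592,
    "archive_progress_percent": 0.8015,
}
print(summarize_progress(sample))
```

Because the helper reads only the three field names added or kept by this commit, renaming any of them would break it, which is the reason for keeping them stable.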
1 parent d1d7a512
@@ -9,6 +9,7 @@ import subprocess
 from pathlib import Path
 FMA_SMALL_URL = "https://modelscope.cn/datasets/pengzhendong/fma/resolve/master/fma_small.zip"
+FMA_SMALL_BYTES = 7679594875
 ARCHIVE_PATH = Path("data/raw/fma_small.zip")
 EXTRACT_DIR = Path("data/raw/fma_small_audio")
@@ -42,12 +43,17 @@ def inspect() -> dict:
     num_audio = 0
     if extract_exists:
         num_audio = len([p for p in EXTRACT_DIR.rglob('*') if p.suffix.lower() in {'.mp3', '.wav', '.flac', '.ogg'}])
+    archive_size = ARCHIVE_PATH.stat().st_size if archive_exists else 0
+    progress_ratio = (archive_size / FMA_SMALL_BYTES) if archive_exists and FMA_SMALL_BYTES else 0.0
     return {
         "action": "inspect",
         "archive_url": FMA_SMALL_URL,
+        "archive_bytes_expected": FMA_SMALL_BYTES,
         "archive_path": str(ARCHIVE_PATH.resolve()),
         "archive_exists": archive_exists,
-        "archive_size": ARCHIVE_PATH.stat().st_size if archive_exists else 0,
+        "archive_size": archive_size,
+        "archive_progress_ratio": round(progress_ratio, 6),
+        "archive_progress_percent": round(progress_ratio * 100, 4),
         "extract_dir": str(EXTRACT_DIR.resolve()),
         "extract_exists": extract_exists,
         "num_audio_files": num_audio,
......
@@ -230,6 +230,27 @@
### Stage: FMA download progress visibility
Completed:
- Enhanced the `inspect` output of [acr-engine/scripts/prepare_fma_archive.py](../acr-engine/scripts/prepare_fma_archive.py)
- Added:
  - `archive_bytes_expected`
  - `archive_progress_ratio`
  - `archive_progress_percent`
Verification:
- `/usr/local/miniconda3/bin/python -m py_compile scripts/prepare_fma_archive.py` succeeded
- `/usr/local/miniconda3/bin/python scripts/prepare_fma_archive.py inspect` succeeded
- Current result:
  - `archive_size=61550592`
  - `archive_progress_percent=0.8015`
Conclusion:
- A new session no longer needs to convert the large archive's download progress by hand
- The handoff cost of long-running FMA downloads is further reduced
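
The recorded percentage can be re-derived from the two byte counts. The sketch below is only an arithmetic check that mirrors the rounding used in `inspect()` (6 decimal places for the ratio, 4 for the percent); the variable names `ratio_rounded` and `percent_rounded` are illustrative and do not appear in the script.

```python
FMA_SMALL_BYTES = 7679594875  # expected archive size defined in the script
archive_size = 61550592       # on-disk size recorded in this log entry

# Same arithmetic as inspect(): fraction of the expected bytes present so far.
progress_ratio = archive_size / FMA_SMALL_BYTES
ratio_rounded = round(progress_ratio, 6)
percent_rounded = round(progress_ratio * 100, 4)

print(ratio_rounded, percent_rounded)
```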
### Stage: Switch FMA source to ModelScope
Completed:
......