Record the transition from waiting on real FMA bytes to running a real smoke train
Constraint: The user asked for continuous staged commits, and the real milestone is the pipeline crossing from download-gated to actual dataset execution.
Rejected: Waiting for the entire smoke pipeline to finish before checkpointing | The phase transition itself is significant and already verified.
Confidence: high
Scope-risk: narrow
Directive: Keep the smoke run going, then checkpoint again with concrete train/index/eval results once the real-data pipeline completes.
Tested: Verified the archive reached full expected size, confirmed local FMA readiness with 8000 audio files and 7994 eligible queries, and observed the real smoke pipeline enter epoch-1 training with 6381 classes.
Not-tested: The full smoke pipeline outcome (final training artifact, index, and evaluation metrics) is still in progress.
Showing 1 changed file with 264 additions and 0 deletions
| ... | @@ -2,6 +2,39 @@ | ... | @@ -2,6 +2,39 @@ |
| 2 | 2 | ||
| 3 | ## 2026-06-02 | 3 | ## 2026-06-02 |
| 4 | 4 | ||
| 5 | ### Stage: Real FMA local data gate opened; smoke training started | ||
| 6 | |||
| 7 | Completed: | ||
| 8 | - Rechecked the archive download status and confirmed that `fma_small.zip` has reached its full byte count | ||
| 9 | - Verified that the local FMA audio directory is ready for a real smoke run | ||
| 10 | - Launched the real FMA `smoke-local` run directly, entering the main train/index/eval path | ||
| 11 | |||
| 12 | Verification results: | ||
| 13 | - `/usr/local/miniconda3/bin/python scripts/prepare_fma_archive.py inspect` returned: | ||
| 14 | - `archive_size=7679594875` | ||
| 15 | - `archive_progress_percent=100.0` | ||
| 16 | - `num_audio_files=3025` (at the inspect stage) | ||
| 17 | - Recheck of the local extraction directory: | ||
| 18 | - `find data/raw/fma_small_audio ... | wc -l` returned `5827` | ||
| 19 | - `check-local-ready` / `inspect-local` returned: | ||
| 20 | - `ready_for_smoke=true` | ||
| 21 | - `num_audio_files=8000` | ||
| 22 | - `eligible_query_files=7994` | ||
| 23 | - `recommended_train_queries=6395` | ||
| 24 | - `recommended_test_queries=1599` | ||
| 25 | - Real smoke run started: | ||
| 26 | - `/usr/local/miniconda3/bin/python src/data/external_adapters.py smoke-local fma data/raw/fma_small_audio --output-root data/external_smoke --eval-ratio 0.2 --query-duration 8.0 --train-epochs 1 --batch-size 2` | ||
| 27 | - Live evidence from the training side: | ||
| 28 | - `Device: cpu` | ||
| 29 | - `Classes: 6381` | ||
| 30 | - `Train songs: 6381` | ||
| 31 | - `Epoch 1` has started | ||
| 32 | - Total batches in the current epoch: `3191` | ||
| 33 | |||
| 34 | Conclusion: | ||
| 35 | - The real FMA data download gate is now officially open | ||
| 36 | - The project has moved from "waiting for real data" to "real-data smoke run in progress" | ||
| 37 | |||
| 5 | ### Stage: Real FMA download passes 85% | 38 | ### Stage: Real FMA download passes 85% |
| 6 | 39 | ||
| 7 | Completed: | 40 | Completed: |
| ... | @@ -1287,6 +1320,39 @@ | ... | @@ -1287,6 +1320,39 @@ |
| 1287 | 1320 | ||
| 1288 | ## 2026-06-02 | 1321 | ## 2026-06-02 |
| 1289 | 1322 | ||
| 1323 | ### Stage: Real FMA local data gate opened; smoke training started | ||
| 1324 | |||
| 1325 | Completed: | ||
| 1326 | - Rechecked the archive download status and confirmed that `fma_small.zip` has reached its full byte count | ||
| 1327 | - Verified that the local FMA audio directory is ready for a real smoke run | ||
| 1328 | - Launched the real FMA `smoke-local` run directly, entering the main train/index/eval path | ||
| 1329 | |||
| 1330 | Verification results: | ||
| 1331 | - `/usr/local/miniconda3/bin/python scripts/prepare_fma_archive.py inspect` returned: | ||
| 1332 | - `archive_size=7679594875` | ||
| 1333 | - `archive_progress_percent=100.0` | ||
| 1334 | - `num_audio_files=3025` (at the inspect stage) | ||
| 1335 | - Recheck of the local extraction directory: | ||
| 1336 | - `find data/raw/fma_small_audio ... | wc -l` returned `5827` | ||
| 1337 | - `check-local-ready` / `inspect-local` returned: | ||
| 1338 | - `ready_for_smoke=true` | ||
| 1339 | - `num_audio_files=8000` | ||
| 1340 | - `eligible_query_files=7994` | ||
| 1341 | - `recommended_train_queries=6395` | ||
| 1342 | - `recommended_test_queries=1599` | ||
| 1343 | - Real smoke run started: | ||
| 1344 | - `/usr/local/miniconda3/bin/python src/data/external_adapters.py smoke-local fma data/raw/fma_small_audio --output-root data/external_smoke --eval-ratio 0.2 --query-duration 8.0 --train-epochs 1 --batch-size 2` | ||
| 1345 | - Live evidence from the training side: | ||
| 1346 | - `Device: cpu` | ||
| 1347 | - `Classes: 6381` | ||
| 1348 | - `Train songs: 6381` | ||
| 1349 | - `Epoch 1` has started | ||
| 1350 | - Total batches in the current epoch: `3191` | ||
| 1351 | |||
| 1352 | Conclusion: | ||
| 1353 | - The real FMA data download gate is now officially open | ||
| 1354 | - The project has moved from "waiting for real data" to "real-data smoke run in progress" | ||
| 1355 | |||
| 1290 | ### Stage: Real FMA download passes 85% | 1356 | ### Stage: Real FMA download passes 85% |
| 1291 | 1357 | ||
| 1292 | Completed: | 1358 | Completed: |
| ... | @@ -1782,6 +1848,39 @@ | ... | @@ -1782,6 +1848,39 @@ |
| 1782 | 1848 | ||
| 1783 | ## 2026-06-02 | 1849 | ## 2026-06-02 |
| 1784 | 1850 | ||
| 1851 | ### Stage: Real FMA local data gate opened; smoke training started | ||
| 1852 | |||
| 1853 | Completed: | ||
| 1854 | - Rechecked the archive download status and confirmed that `fma_small.zip` has reached its full byte count | ||
| 1855 | - Verified that the local FMA audio directory is ready for a real smoke run | ||
| 1856 | - Launched the real FMA `smoke-local` run directly, entering the main train/index/eval path | ||
| 1857 | |||
| 1858 | Verification results: | ||
| 1859 | - `/usr/local/miniconda3/bin/python scripts/prepare_fma_archive.py inspect` returned: | ||
| 1860 | - `archive_size=7679594875` | ||
| 1861 | - `archive_progress_percent=100.0` | ||
| 1862 | - `num_audio_files=3025` (at the inspect stage) | ||
| 1863 | - Recheck of the local extraction directory: | ||
| 1864 | - `find data/raw/fma_small_audio ... | wc -l` returned `5827` | ||
| 1865 | - `check-local-ready` / `inspect-local` returned: | ||
| 1866 | - `ready_for_smoke=true` | ||
| 1867 | - `num_audio_files=8000` | ||
| 1868 | - `eligible_query_files=7994` | ||
| 1869 | - `recommended_train_queries=6395` | ||
| 1870 | - `recommended_test_queries=1599` | ||
| 1871 | - Real smoke run started: | ||
| 1872 | - `/usr/local/miniconda3/bin/python src/data/external_adapters.py smoke-local fma data/raw/fma_small_audio --output-root data/external_smoke --eval-ratio 0.2 --query-duration 8.0 --train-epochs 1 --batch-size 2` | ||
| 1873 | - Live evidence from the training side: | ||
| 1874 | - `Device: cpu` | ||
| 1875 | - `Classes: 6381` | ||
| 1876 | - `Train songs: 6381` | ||
| 1877 | - `Epoch 1` has started | ||
| 1878 | - Total batches in the current epoch: `3191` | ||
| 1879 | |||
| 1880 | Conclusion: | ||
| 1881 | - The real FMA data download gate is now officially open | ||
| 1882 | - The project has moved from "waiting for real data" to "real-data smoke run in progress" | ||
| 1883 | |||
| 1785 | ### Stage: Real FMA download passes 85% | 1884 | ### Stage: Real FMA download passes 85% |
| 1786 | 1885 | ||
| 1787 | Completed: | 1886 | Completed: |
| ... | @@ -2287,6 +2386,39 @@ | ... | @@ -2287,6 +2386,39 @@ |
| 2287 | 2386 | ||
| 2288 | ## 2026-06-02 | 2387 | ## 2026-06-02 |
| 2289 | 2388 | ||
| 2389 | ### Stage: Real FMA local data gate opened; smoke training started | ||
| 2390 | |||
| 2391 | Completed: | ||
| 2392 | - Rechecked the archive download status and confirmed that `fma_small.zip` has reached its full byte count | ||
| 2393 | - Verified that the local FMA audio directory is ready for a real smoke run | ||
| 2394 | - Launched the real FMA `smoke-local` run directly, entering the main train/index/eval path | ||
| 2395 | |||
| 2396 | Verification results: | ||
| 2397 | - `/usr/local/miniconda3/bin/python scripts/prepare_fma_archive.py inspect` returned: | ||
| 2398 | - `archive_size=7679594875` | ||
| 2399 | - `archive_progress_percent=100.0` | ||
| 2400 | - `num_audio_files=3025` (at the inspect stage) | ||
| 2401 | - Recheck of the local extraction directory: | ||
| 2402 | - `find data/raw/fma_small_audio ... | wc -l` returned `5827` | ||
| 2403 | - `check-local-ready` / `inspect-local` returned: | ||
| 2404 | - `ready_for_smoke=true` | ||
| 2405 | - `num_audio_files=8000` | ||
| 2406 | - `eligible_query_files=7994` | ||
| 2407 | - `recommended_train_queries=6395` | ||
| 2408 | - `recommended_test_queries=1599` | ||
| 2409 | - Real smoke run started: | ||
| 2410 | - `/usr/local/miniconda3/bin/python src/data/external_adapters.py smoke-local fma data/raw/fma_small_audio --output-root data/external_smoke --eval-ratio 0.2 --query-duration 8.0 --train-epochs 1 --batch-size 2` | ||
| 2411 | - Live evidence from the training side: | ||
| 2412 | - `Device: cpu` | ||
| 2413 | - `Classes: 6381` | ||
| 2414 | - `Train songs: 6381` | ||
| 2415 | - `Epoch 1` has started | ||
| 2416 | - Total batches in the current epoch: `3191` | ||
| 2417 | |||
| 2418 | Conclusion: | ||
| 2419 | - The real FMA data download gate is now officially open | ||
| 2420 | - The project has moved from "waiting for real data" to "real-data smoke run in progress" | ||
| 2421 | |||
| 2290 | ### Stage: Real FMA download passes 85% | 2422 | ### Stage: Real FMA download passes 85% |
| 2291 | 2423 | ||
| 2292 | Completed: | 2424 | Completed: |
| ... | @@ -2782,6 +2914,39 @@ | ... | @@ -2782,6 +2914,39 @@ |
| 2782 | 2914 | ||
| 2783 | ## 2026-06-02 | 2915 | ## 2026-06-02 |
| 2784 | 2916 | ||
| 2917 | ### Stage: Real FMA local data gate opened; smoke training started | ||
| 2918 | |||
| 2919 | Completed: | ||
| 2920 | - Rechecked the archive download status and confirmed that `fma_small.zip` has reached its full byte count | ||
| 2921 | - Verified that the local FMA audio directory is ready for a real smoke run | ||
| 2922 | - Launched the real FMA `smoke-local` run directly, entering the main train/index/eval path | ||
| 2923 | |||
| 2924 | Verification results: | ||
| 2925 | - `/usr/local/miniconda3/bin/python scripts/prepare_fma_archive.py inspect` returned: | ||
| 2926 | - `archive_size=7679594875` | ||
| 2927 | - `archive_progress_percent=100.0` | ||
| 2928 | - `num_audio_files=3025` (at the inspect stage) | ||
| 2929 | - Recheck of the local extraction directory: | ||
| 2930 | - `find data/raw/fma_small_audio ... | wc -l` returned `5827` | ||
| 2931 | - `check-local-ready` / `inspect-local` returned: | ||
| 2932 | - `ready_for_smoke=true` | ||
| 2933 | - `num_audio_files=8000` | ||
| 2934 | - `eligible_query_files=7994` | ||
| 2935 | - `recommended_train_queries=6395` | ||
| 2936 | - `recommended_test_queries=1599` | ||
| 2937 | - Real smoke run started: | ||
| 2938 | - `/usr/local/miniconda3/bin/python src/data/external_adapters.py smoke-local fma data/raw/fma_small_audio --output-root data/external_smoke --eval-ratio 0.2 --query-duration 8.0 --train-epochs 1 --batch-size 2` | ||
| 2939 | - Live evidence from the training side: | ||
| 2940 | - `Device: cpu` | ||
| 2941 | - `Classes: 6381` | ||
| 2942 | - `Train songs: 6381` | ||
| 2943 | - `Epoch 1` has started | ||
| 2944 | - Total batches in the current epoch: `3191` | ||
| 2945 | |||
| 2946 | Conclusion: | ||
| 2947 | - The real FMA data download gate is now officially open | ||
| 2948 | - The project has moved from "waiting for real data" to "real-data smoke run in progress" | ||
| 2949 | |||
| 2785 | ### Stage: Real FMA download passes 85% | 2950 | ### Stage: Real FMA download passes 85% |
| 2786 | 2951 | ||
| 2787 | Completed: | 2952 | Completed: |
| ... | @@ -3275,6 +3440,39 @@ | ... | @@ -3275,6 +3440,39 @@ |
| 3275 | 3440 | ||
| 3276 | ## 2026-06-02 | 3441 | ## 2026-06-02 |
| 3277 | 3442 | ||
| 3443 | ### Stage: Real FMA local data gate opened; smoke training started | ||
| 3444 | |||
| 3445 | Completed: | ||
| 3446 | - Rechecked the archive download status and confirmed that `fma_small.zip` has reached its full byte count | ||
| 3447 | - Verified that the local FMA audio directory is ready for a real smoke run | ||
| 3448 | - Launched the real FMA `smoke-local` run directly, entering the main train/index/eval path | ||
| 3449 | |||
| 3450 | Verification results: | ||
| 3451 | - `/usr/local/miniconda3/bin/python scripts/prepare_fma_archive.py inspect` returned: | ||
| 3452 | - `archive_size=7679594875` | ||
| 3453 | - `archive_progress_percent=100.0` | ||
| 3454 | - `num_audio_files=3025` (at the inspect stage) | ||
| 3455 | - Recheck of the local extraction directory: | ||
| 3456 | - `find data/raw/fma_small_audio ... | wc -l` returned `5827` | ||
| 3457 | - `check-local-ready` / `inspect-local` returned: | ||
| 3458 | - `ready_for_smoke=true` | ||
| 3459 | - `num_audio_files=8000` | ||
| 3460 | - `eligible_query_files=7994` | ||
| 3461 | - `recommended_train_queries=6395` | ||
| 3462 | - `recommended_test_queries=1599` | ||
| 3463 | - Real smoke run started: | ||
| 3464 | - `/usr/local/miniconda3/bin/python src/data/external_adapters.py smoke-local fma data/raw/fma_small_audio --output-root data/external_smoke --eval-ratio 0.2 --query-duration 8.0 --train-epochs 1 --batch-size 2` | ||
| 3465 | - Live evidence from the training side: | ||
| 3466 | - `Device: cpu` | ||
| 3467 | - `Classes: 6381` | ||
| 3468 | - `Train songs: 6381` | ||
| 3469 | - `Epoch 1` has started | ||
| 3470 | - Total batches in the current epoch: `3191` | ||
| 3471 | |||
| 3472 | Conclusion: | ||
| 3473 | - The real FMA data download gate is now officially open | ||
| 3474 | - The project has moved from "waiting for real data" to "real-data smoke run in progress" | ||
| 3475 | |||
| 3278 | ### Stage: Real FMA download passes 85% | 3476 | ### Stage: Real FMA download passes 85% |
| 3279 | 3477 | ||
| 3280 | Completed: | 3478 | Completed: |
| ... | @@ -3766,6 +3964,39 @@ | ... | @@ -3766,6 +3964,39 @@ |
| 3766 | 3964 | ||
| 3767 | ## 2026-06-02 | 3965 | ## 2026-06-02 |
| 3768 | 3966 | ||
| 3967 | ### Stage: Real FMA local data gate opened; smoke training started | ||
| 3968 | |||
| 3969 | Completed: | ||
| 3970 | - Rechecked the archive download status and confirmed that `fma_small.zip` has reached its full byte count | ||
| 3971 | - Verified that the local FMA audio directory is ready for a real smoke run | ||
| 3972 | - Launched the real FMA `smoke-local` run directly, entering the main train/index/eval path | ||
| 3973 | |||
| 3974 | Verification results: | ||
| 3975 | - `/usr/local/miniconda3/bin/python scripts/prepare_fma_archive.py inspect` returned: | ||
| 3976 | - `archive_size=7679594875` | ||
| 3977 | - `archive_progress_percent=100.0` | ||
| 3978 | - `num_audio_files=3025` (at the inspect stage) | ||
| 3979 | - Recheck of the local extraction directory: | ||
| 3980 | - `find data/raw/fma_small_audio ... | wc -l` returned `5827` | ||
| 3981 | - `check-local-ready` / `inspect-local` returned: | ||
| 3982 | - `ready_for_smoke=true` | ||
| 3983 | - `num_audio_files=8000` | ||
| 3984 | - `eligible_query_files=7994` | ||
| 3985 | - `recommended_train_queries=6395` | ||
| 3986 | - `recommended_test_queries=1599` | ||
| 3987 | - Real smoke run started: | ||
| 3988 | - `/usr/local/miniconda3/bin/python src/data/external_adapters.py smoke-local fma data/raw/fma_small_audio --output-root data/external_smoke --eval-ratio 0.2 --query-duration 8.0 --train-epochs 1 --batch-size 2` | ||
| 3989 | - Live evidence from the training side: | ||
| 3990 | - `Device: cpu` | ||
| 3991 | - `Classes: 6381` | ||
| 3992 | - `Train songs: 6381` | ||
| 3993 | - `Epoch 1` has started | ||
| 3994 | - Total batches in the current epoch: `3191` | ||
| 3995 | |||
| 3996 | Conclusion: | ||
| 3997 | - The real FMA data download gate is now officially open | ||
| 3998 | - The project has moved from "waiting for real data" to "real-data smoke run in progress" | ||
| 3999 | |||
| 3769 | ### Stage: Real FMA download passes 85% | 4000 | ### Stage: Real FMA download passes 85% |
| 3770 | 4001 | ||
| 3771 | Completed: | 4002 | Completed: |
| ... | @@ -4262,6 +4493,39 @@ | ... | @@ -4262,6 +4493,39 @@ |
| 4262 | 4493 | ||
| 4263 | ## 2026-06-02 | 4494 | ## 2026-06-02 |
| 4264 | 4495 | ||
| 4496 | ### Stage: Real FMA local data gate opened; smoke training started | ||
| 4497 | |||
| 4498 | Completed: | ||
| 4499 | - Rechecked the archive download status and confirmed that `fma_small.zip` has reached its full byte count | ||
| 4500 | - Verified that the local FMA audio directory is ready for a real smoke run | ||
| 4501 | - Launched the real FMA `smoke-local` run directly, entering the main train/index/eval path | ||
| 4502 | |||
| 4503 | Verification results: | ||
| 4504 | - `/usr/local/miniconda3/bin/python scripts/prepare_fma_archive.py inspect` returned: | ||
| 4505 | - `archive_size=7679594875` | ||
| 4506 | - `archive_progress_percent=100.0` | ||
| 4507 | - `num_audio_files=3025` (at the inspect stage) | ||
| 4508 | - Recheck of the local extraction directory: | ||
| 4509 | - `find data/raw/fma_small_audio ... | wc -l` returned `5827` | ||
| 4510 | - `check-local-ready` / `inspect-local` returned: | ||
| 4511 | - `ready_for_smoke=true` | ||
| 4512 | - `num_audio_files=8000` | ||
| 4513 | - `eligible_query_files=7994` | ||
| 4514 | - `recommended_train_queries=6395` | ||
| 4515 | - `recommended_test_queries=1599` | ||
| 4516 | - Real smoke run started: | ||
| 4517 | - `/usr/local/miniconda3/bin/python src/data/external_adapters.py smoke-local fma data/raw/fma_small_audio --output-root data/external_smoke --eval-ratio 0.2 --query-duration 8.0 --train-epochs 1 --batch-size 2` | ||
| 4518 | - Live evidence from the training side: | ||
| 4519 | - `Device: cpu` | ||
| 4520 | - `Classes: 6381` | ||
| 4521 | - `Train songs: 6381` | ||
| 4522 | - `Epoch 1` has started | ||
| 4523 | - Total batches in the current epoch: `3191` | ||
| 4524 | |||
| 4525 | Conclusion: | ||
| 4526 | - The real FMA data download gate is now officially open | ||
| 4527 | - The project has moved from "waiting for real data" to "real-data smoke run in progress" | ||
| 4528 | |||
| 4265 | ### Stage: Real FMA download passes 85% | 4529 | ### Stage: Real FMA download passes 85% |
| 4266 | 4530 | ||
| 4267 | Completed: | 4531 | Completed: | ... | ... |
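The logged numbers in the added stage entry are mutually consistent, which can be checked with a short sketch. This is not the adapter's actual code: `split_queries` and `batches_per_epoch` are hypothetical helpers, and the rounding rules (round-to-nearest for the eval split, ceiling for the batch count) are assumptions chosen because they reproduce the logged values.

```python
import math


def split_queries(eligible: int, eval_ratio: float) -> tuple[int, int]:
    """Split eligible queries into (train, test) counts.

    Assumption: the test share is rounded to the nearest integer;
    the real adapter may round differently.
    """
    test = round(eligible * eval_ratio)
    return eligible - test, test


def batches_per_epoch(num_items: int, batch_size: int) -> int:
    """Number of batches when the last partial batch is kept (assumed)."""
    return math.ceil(num_items / batch_size)


# Values taken from the log entry: 7994 eligible queries, --eval-ratio 0.2,
# 6381 train songs, --batch-size 2.
train, test = split_queries(7994, 0.2)
print(train, test)                 # compare with recommended_train/test_queries
print(batches_per_epoch(6381, 2))  # compare with the logged batch total
```

Under these assumptions the split gives 6395 train and 1599 test queries, and 6381 songs at batch size 2 give 3191 batches, matching the entry. Note that `Train songs: 6381` differs from `recommended_train_queries=6395`; the log does not explain the gap, so the sketch does not try to.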