1. 03 Jun, 2026 1 commit
    • Document the production decision to stabilize the embedding space before onboarding a 300k-song catalog, and record the migration rules for future encoder upgrades.
      
      Constraint: 300k-song production rollout makes embedding churn expensive and risky
      Rejected: keep iterating encoder before defining a production embedding version | would force repeated full-vector rebuilds and unstable rollout criteria
      Confidence: high
      Scope-risk: narrow
      Directive: Treat encoder changes as versioned index migrations, not in-place model swaps
      Tested: reviewed rendered markdown content, docs index link, changelog entry, and git diff for the three touched docs
      Not-tested: git push / remote sync outcome depends on repository remote state
      cnb.bofCdSsphPA authored
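
The directive above, treating encoder changes as versioned index migrations, could look roughly like the following sketch. Everything in it is an assumption made for illustration: `IndexRegistry`, the `emb-v1`/`emb-v2` version labels, the toy encoders, and the validation callback are hypothetical and do not come from the repository.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IndexRegistry:
    """Hypothetical registry: one index per embedding version, one active version."""
    indexes: dict = field(default_factory=dict)  # version -> {song_id: vector}
    active_version: Optional[str] = None

    def build(self, version, encoder, catalog):
        # Build the new index beside the old one; nothing is overwritten in place.
        if version in self.indexes:
            raise ValueError(f"index for {version!r} already exists")
        self.indexes[version] = {
            song_id: encoder(audio) for song_id, audio in catalog.items()
        }

    def promote(self, version, passes_validation):
        # Switch serving only after the caller-supplied check accepts the new index.
        if not passes_validation(self.indexes[version]):
            raise RuntimeError(
                f"{version!r} failed validation; still serving {self.active_version!r}"
            )
        self.active_version = version


# Toy catalog and encoders standing in for real audio and real models.
catalog = {"song_a": [1.0, 2.0], "song_b": [3.0, 4.0]}
registry = IndexRegistry()
registry.build("emb-v1", lambda audio: [x * 0.5 for x in audio], catalog)
registry.promote("emb-v1", lambda index: len(index) == len(catalog))

# An encoder upgrade becomes a new version, validated before it is served.
registry.build("emb-v2", lambda audio: [x * 2.0 for x in audio], catalog)
registry.promote("emb-v2", lambda index: len(index) == len(catalog))
```

Keeping the previous version's index in the registry is what makes a rollback cheap in this sketch; an in-place swap would discard it.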
2. 02 Jun, 2026 39 commits
    • Capture the latest sweep evidence so the next session can resume cleanly.
      
      Constraint: docs only; keep large data and checkpoints out of git
      Rejected: leaving hum_guard unrecorded | would lose the newest verification evidence
      Confidence: high
      Scope-risk: narrow
      Directive: continue the next search from hum_focus
      Tested: reviewed the eval.json evidence and diff
      Not-tested: no code or model changes in this commit
      cnb.bofCdSsphPA authored
    • Keep the current optimization state resumable and concise.
      
      Constraint: docs only; avoid raw data, checkpoints, and __pycache__
      Rejected: continuing implementation now | user requested a fast delivery package first
      Confidence: high
      Scope-risk: narrow
      Directive: resume from the handoff docs on the next session
      Tested: reviewed diff and confirmed only four docs changed
      Not-tested: no code or training pipeline changes in this commit
      cnb.bofCdSsphPA authored
    • …doff restart-safe and avoid staging temporary sweep artifacts
      Rejected: Switch back to v6 or continue blind search | Fresh evidence shows hum_focus is the current best candidate and the right anchor for finer tuning
      Confidence: high
      Scope-risk: narrow
      Directive: Use hum_focus as the baseline for the next micro-search, preserving humming_like gains while keeping confused at 0.25
      Tested: Verified hum_focus versus hum_balanced with fresh eval results and updated docs accordingly
      Not-tested: Whether a further micro-tuned variant beats hum_focus
      cnb.bofCdSsphPA authored
    • …f must record the fresh dual-axis metric outcome without staging temporary smoke artifacts
      Rejected: Keep tuning weights before checkpointing | The first end-to-end dual-axis result is already a meaningful evidence point and restart-safe boundary
      Confidence: high
      Scope-risk: narrow
      Directive: Continue with finer-grained dual-axis weight search, targeting humming_like recovery while preserving confused gains
      Tested: Verified dual-axis smoke completed train, build-index, and evaluate with top1 0.5 / topk 0.9 and updated handoff/changelog docs
      Not-tested: Improved dual-axis weight combinations beyond this first balanced trial
      cnb.bofCdSsphPA authored
    • …t: Keep the training pipeline behavior stable while exposing humming_like and confused controls through config only
      Rejected: Add a brand-new sampler framework first | The smallest useful step is config-driven control on the existing dataset weighting path
      Confidence: high
      Scope-risk: narrow
      Directive: Run weight-search experiments through training.sample_type_weights and training.pair_type_weights before attempting broader training-stack refactors
      Tested: py_compile passed, train.py dry-run on synthetic_v2 passed, and custom SongPairDataset weighting instantiation produced expected hard_weight output
      Not-tested: End-to-end retraining and metric improvements from new dual-axis weight combinations
      cnb.bofCdSsphPA authored
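
As a rough illustration of config-driven weighting, the sketch below maps a sample's type to a sampling weight. Only the key name `training.sample_type_weights` and the type names `humming_like`, `confused`, and `clean` appear in the commit history; the weight values, the sample record layout, and the `sampling_weights` helper are assumptions, and the real `SongPairDataset` path may work differently.

```python
import random

# Hypothetical config fragment; the numeric weights are made up for the example.
config = {
    "training": {
        "sample_type_weights": {"clean": 1.0, "humming_like": 2.0, "confused": 3.0},
    }
}


def sampling_weights(samples, type_weights, default=1.0):
    """Return one sampling weight per sample, looked up by its 'type' field."""
    return [type_weights.get(sample["type"], default) for sample in samples]


samples = [
    {"id": "q1", "type": "clean"},
    {"id": "q2", "type": "humming_like"},
    {"id": "q3", "type": "confused"},
    {"id": "q4", "type": "unlisted_type"},  # falls back to the default weight
]

weights = sampling_weights(samples, config["training"]["sample_type_weights"])

# Draw a batch with replacement, favouring the more heavily weighted types.
rng = random.Random(0)
batch = rng.choices(samples, weights=weights, k=8)
```

Changing the emphasis between `humming_like` and `confused` then only requires editing the config values, which matches the stated goal of avoiding a new sampler framework.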
    • … handoff must convert baseline metrics into an actionable causal explanation without staging report artifacts
      Rejected: Start a new weighting experiment immediately | Source-backed explanation of the existing split is cheaper and reduces blind iteration risk
      Confidence: high
      Scope-risk: narrow
      Directive: Treat dual-axis hard-case weighting as the next design lane, using v6 as the base and v5 as the humming_like reference
      Tested: Verified source-backed v5/v6 definitions from changelog and smoke-v6 config artifacts, then updated handoff/changelog docs
      Not-tested: A new merged weighting strategy or its downstream metric impact
      cnb.bofCdSsphPA authored
    • … Handoff must encode the new baseline decision without staging temporary sweep artifacts
      Rejected: Jump straight into retraining without baseline comparison | Fresh sweep evidence now makes a targeted v6-vs-v5 optimization path cheaper and safer
      Confidence: high
      Scope-risk: narrow
      Directive: Use v6 as the overall baseline and treat v5 as the humming_like comparison target before changing training or segmentation logic
      Tested: Ran a synthetic_v2 hard-case sweep across v3-v6, verified summary metrics, and updated handoff/changelog docs with the baseline decision
      Not-tested: Whether a merged v6-plus-v5 strategy improves real open-data derived hard cases
      cnb.bofCdSsphPA authored
    • …off must distinguish clean real-path evidence from hard-case evidence without staging temporary evaluation artifacts
      Rejected: Keep scaling clean-only FMA smoke first | Fresh evidence shows the next highest-yield work is hard-case top1 improvement
      Confidence: high
      Scope-risk: narrow
      Directive: Treat humming_like and confused as the primary optimization targets before investing more cycles in larger clean-only smoke runs
      Tested: Audited manifest type coverage, verified synthetic_v2 hard-case evaluate results, and updated handoff/changelog docs with the gap analysis
      Not-tested: Post-optimization hard-case improvements on real open-data derived hard cases
      cnb.bofCdSsphPA authored
    • …ate must reflect fresh evaluate evidence without staging temporary eval assets
      Rejected: Wait for larger-scale or hard-case metrics | The first explicit evaluate closure is already a meaningful milestone and restart-safe handoff point
      Confidence: high
      Scope-risk: narrow
      Directive: Reuse /tmp/fma_realpath_small_rerun_index2 and /tmp/fma_realpath_small_rerun_eval as the next validation baseline before scaling up
      Tested: Verified eval_top50.json at num_queries 35 with top1 0.8571 and topk 1.0, confirmed query-count explanation, and updated handoff/changelog docs
      Not-tested: Larger query caps, hard-case buckets, and full-scale FMA evaluate runs
      cnb.bofCdSsphPA authored
    • …livery docs must reflect fresh post-fix completion evidence and exclude data/index artifacts
      Rejected: Delay until evaluate evidence exists | Completed reference index is already a distinct stage milestone the user asked us to checkpoint
      Confidence: high
      Scope-risk: narrow
      Directive: Use /tmp/fma_realpath_small_rerun_index2 as the primary handoff artifact and validate evaluate or identify next before expanding sample size
      Tested: Verified reference_progress.json complete at 200/200, reference_embs.npy and reference_ids.npy present, embedding_shape [2068, 192], and handoff/changelog docs updated
      Not-tested: Automatic evaluate chaining and retrieval quality on the completed 200-ref index
      cnb.bofCdSsphPA authored
    • … must reflect fresh observable evidence before restart and avoid staging data artifacts
      Rejected: Wait for full reference completion | User asked for immediate delivery package and current checkpoint is already a meaningful stage transition
      Confidence: high
      Scope-risk: narrow
      Directive: Treat session 19709 and /tmp/fma_realpath_small_rerun_index2 as the primary continuation path until final reference artifacts or a new traceback appear
      Tested: Verified chromaprint 200/200 complete, reference_progress.json 25/200 checkpoint, partial reference numpy artifacts, and updated handoff/changelog files
      Not-tested: Full reference completion and downstream evaluate stage on the active rerun
      cnb.bofCdSsphPA authored
    • Constraint: Real-path investigation exposed decode failures from mpg123/librosa on some MP3s during long index runs
      Rejected: Abort the entire job on first decode error | it turns one bad asset into total index failure
      Confidence: high
      Scope-risk: narrow
      Directive: Keep per-file skip logging and skipped_refs accounting while continuing the real-path root-cause run
      Tested: Verified /tmp/chroma_skip_repro with 1 good MP3 + 1 bad MP3 completes RC=0, logs skip decode failure, writes reference outputs, and records skipped_refs=1
      Not-tested: Full real-path FMA rerun after tolerance change is still pending
      cnb.bofCdSsphPA authored
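
The skip-and-count behavior described in this commit can be sketched as below. The `skipped_refs` counter and the "skip decode failure" log wording come from the commit message; the `index_references` function, the logger name, and the `fake_decode` stand-in (used here instead of a real mpg123/librosa call) are hypothetical.

```python
import logging

logger = logging.getLogger("build_index")  # assumed logger name


def index_references(paths, decode):
    """Decode each reference; skip and count files that fail instead of aborting."""
    decoded = {}
    skipped_refs = 0
    for path in paths:
        try:
            decoded[path] = decode(path)
        except Exception as exc:  # broad on purpose: decoder error types vary
            skipped_refs += 1
            logger.warning("skip decode failure: %s (%s)", path, exc)
    return decoded, skipped_refs


def fake_decode(path):
    # Stand-in decoder: one path is treated as a corrupt MP3.
    if path.endswith("bad.mp3"):
        raise ValueError("corrupt frame header")
    return [0.0, 0.1, 0.2]


decoded, skipped_refs = index_references(["good.mp3", "bad.mp3"], fake_decode)
```

With one good and one bad input, the loop finishes normally and reports a single skipped reference, mirroring the 1 good + 1 bad repro mentioned in the Tested trailer.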
    • Constraint: The live build-index investigation was blocked by stdout/stderr buffering that left log files at 0 bytes during long runs
      Rejected: Keep diagnosing from progress files alone | they do not preserve traceback or stage-transition context
      Confidence: high
      Scope-risk: narrow
      Directive: Preserve flush-on-progress behavior while chasing the remaining real-path build-index root cause
      Tested: Verified tiny repro /tmp/chroma_repro_tiny12 writes live logs and traceback with RC=1 after flush=True change
      Not-tested: No final fix for the real-path build-index exit yet
      cnb.bofCdSsphPA authored
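
For context on the `flush=True` change: when stdout is redirected to a file, Python block-buffers it by default, so a long run can leave the log empty until the buffer fills or the process exits. A minimal sketch of flush-on-progress follows; the `report_progress` helper and its message format are assumptions, not the project's actual logging code.

```python
import io
import sys


def report_progress(stage, done, total, stream=sys.stdout):
    """Emit one progress line and push it to the underlying file immediately."""
    print(f"[{stage}] {done}/{total}", file=stream, flush=True)


# Demonstration against an in-memory stream so the output can be inspected.
buffer = io.StringIO()
report_progress("chromaprint", 25, 200, stream=buffer)
```

Running the interpreter with `python -u` or setting `PYTHONUNBUFFERED=1` is an alternative that needs no code change, at the cost of unbuffering every write.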
    • Constraint: Both observable and legacy build-index jobs exited without producing reference_* or evaluate artifacts
      Rejected: Keep treating the run as slow linear progress | no longer matches the fresh ps/pgrep evidence
      Confidence: high
      Scope-risk: narrow
      Directive: Start the next cycle with build-index exit-path diagnosis before launching more long runs
      Tested: Verified ps/pgrep show no active build/evaluate process; verified observable directory still only has chromaprint progress/cache files; reviewed updated handoff docs
      Not-tested: No root-cause reproduction or fix yet
      cnb.bofCdSsphPA authored
    • Constraint: Long-running CPU-only chromaprint indexing has not reached evaluate yet
      Rejected: Keep appending linear refs_done updates | produces noise without a stage transition
      Confidence: high
      Scope-risk: narrow
      Directive: Do not create the next handoff commit until chromaprint completes, reference_* appears, evaluate starts, or the process fails
      Tested: Verified /tmp/chroma_index_observable_smoke progress snapshot; reviewed updated handoff/changelog files
      Not-tested: No new model/evaluation result because build-index has not reached the next stage
      cnb.bofCdSsphPA authored
    • Constraint: the live 8000-reference FMA run is already in flight, so observability had to be added as forward-safe progress and partial-cache outputs for future runs instead of altering the active process
      Rejected: keep waiting on blind build-index runs | hides whether chromaprint is advancing and blocks operational debugging
      Confidence: high
      Scope-risk: narrow
      Directive: Prefer progress JSON plus partial cache evidence for future large-index investigations before assuming a stall
      Tested: py_compile on chromaprint_matcher.py and run_demo.py, verified chromaprint_progress.json and chromaprint.pkl appear in /tmp/chroma_index_observable_smoke at refs_done=50
      Not-tested: end-to-end completion of the new observable build-index flow has not finished yet
      cnb.bofCdSsphPA authored
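
A sketch of the "progress JSON plus partial cache" idea is shown below. The file names `chromaprint_progress.json` and `chromaprint.pkl` and the `refs_done` field come from the commit message; the `refs_total` field, the checkpoint interval, the helper names, and the use of `len` as a stand-in fingerprint function are assumptions.

```python
import json
import os
import pickle
import tempfile


def write_atomic(path, data):
    """Write bytes to a temp file in the same directory, then rename over the target."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    os.replace(tmp_path, path)  # readers never observe a half-written file


def build_with_checkpoints(refs, fingerprint, out_dir, every=50):
    """Fingerprint refs, persisting the partial cache and a progress file periodically."""
    cache = {}
    for done, ref in enumerate(refs, start=1):
        cache[ref] = fingerprint(ref)
        if done % every == 0 or done == len(refs):
            write_atomic(os.path.join(out_dir, "chromaprint.pkl"), pickle.dumps(cache))
            progress = {"refs_done": done, "refs_total": len(refs)}
            write_atomic(
                os.path.join(out_dir, "chromaprint_progress.json"),
                json.dumps(progress).encode("utf-8"),
            )
    return cache


out_dir = tempfile.mkdtemp()
refs = [f"ref_{i}" for i in range(120)]
build_with_checkpoints(refs, fingerprint=len, out_dir=out_dir, every=50)

with open(os.path.join(out_dir, "chromaprint_progress.json")) as handle:
    progress = json.load(handle)
```

Because both files are replaced atomically, an outside observer can poll them during a long run and tell a slow job apart from a stalled one.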
    • Constraint: the live FMA smoke is still running, so the optimization had to preserve existing hash semantics rather than adopt the faster non-equivalent peak picker
      Rejected: maximum_filter-based peak picking | changed peak/hash outputs despite much larger speedup
      Confidence: high
      Scope-risk: narrow
      Directive: Keep future chromaprint optimizations hash-equivalent unless evaluation baselines are intentionally regenerated
      Tested: compared old vs new peaks and hashes on fma_00000.mp3, measured 2.02x speedup, py_compile passed, rechecked live FMA smoke still in build-index
      Not-tested: full build-index completion on the live 8000-reference FMA run has not finished yet
      cnb.bofCdSsphPA authored
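
The hash-equivalence requirement amounts to checking that an optimized peak picker returns exactly the same peaks, and therefore the same hashes, as the original. The toy sketch below shows the shape of such a check; the one-dimensional signal, both peak pickers, and the pairwise hash are simplified stand-ins and not the project's chromaprint code.

```python
def peaks_reference(signal):
    """Original-style loop: indices strictly greater than both neighbours."""
    peaks = []
    for i in range(1, len(signal) - 1):
        if signal[i] > signal[i - 1] and signal[i] > signal[i + 1]:
            peaks.append(i)
    return peaks


def peaks_candidate(signal):
    """Rewritten variant that must return the same indices to be accepted."""
    triples = zip(signal, signal[1:], signal[2:])
    return [i for i, (a, b, c) in enumerate(triples, start=1) if b > a and b > c]


def hashes_from_peaks(peaks):
    """Toy hash: each peak paired with the gap to the next peak."""
    return [(p, q - p) for p, q in zip(peaks, peaks[1:])]


def is_hash_equivalent(signal):
    """True only if both pickers agree on peaks and on the derived hashes."""
    old_peaks = peaks_reference(signal)
    new_peaks = peaks_candidate(signal)
    return (
        old_peaks == new_peaks
        and hashes_from_peaks(old_peaks) == hashes_from_peaks(new_peaks)
    )


signal = [0, 3, 1, 4, 4, 2, 5, 0]
equivalent = is_hash_equivalent(signal)
```

A faster picker that fails this check may still be usable, but only together with an intentional regeneration of the evaluation baselines, as the directive states.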
    • Constraint: CPU-only real FMA smoke is still running, so delivery must emphasize resumable evidence instead of final metrics
      Rejected: wait for evaluate completion | would block handoff and delay resumable delivery with no new guaranteed result
      Confidence: high
      Scope-risk: narrow
      Directive: Keep future commits limited to explicit doc files unless index/evaluate artifacts are intentionally being reported
      Tested: verified running PIDs, checked best_model.pt and song_to_idx.json existence, revalidated manifests with validate-splits
      Not-tested: final index artifact emission and evaluate metrics are not available yet
      cnb.bofCdSsphPA authored
    • Capture the newest verified downstream state so restart docs show that the real FMA smoke still remains in build-index at 13:34 UTC with no evaluate stage or emitted index artifacts yet.
      
      Constraint: Downstream completion evidence is still unavailable because build-index has not produced artifacts or switched to evaluate
      Rejected: Leave the 13:34 observation only in chat | would break restart continuity and phase-by-phase commit discipline
      Confidence: high
      Scope-risk: narrow
      Directive: Next capture either the first index artifact file or the transition into evaluate before changing the downstream status summary again
      Tested: process scan showing build-index and no evaluate; presence of /tmp/fma_real_smoke_stopcheck/fma_index_smoke directory; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests
      Not-tested: Completed build-index output, evaluate, final metrics/report generation
      cnb.bofCdSsphPA authored
    • Capture the newest verified downstream state so restart docs show that the real FMA smoke still remains in build-index at 13:28 UTC with no evaluate stage or emitted index artifacts yet.
      
      Constraint: Downstream completion evidence is still unavailable because build-index has not produced artifacts or switched to evaluate
      Rejected: Leave the 13:28 observation only in chat | would break restart continuity and phase-by-phase commit discipline
      Confidence: high
      Scope-risk: narrow
      Directive: Next capture either the first index artifact file or the transition into evaluate before changing the downstream status summary again
      Tested: process scan showing build-index and no evaluate; presence of /tmp/fma_real_smoke_stopcheck/fma_index_smoke directory; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests
      Not-tested: Completed build-index output, evaluate, final metrics/report generation
      cnb.bofCdSsphPA authored
    • Capture the newest verified downstream state so restart docs show that the real FMA smoke still remains in build-index at 13:22 UTC with no evaluate stage or emitted index artifacts yet.
      
      Constraint: Downstream completion evidence is still unavailable because build-index has not produced artifacts or switched to evaluate
      Rejected: Leave the 13:22 observation only in chat | would break restart continuity and phase-by-phase commit discipline
      Confidence: high
      Scope-risk: narrow
      Directive: Next capture either the first index artifact file or the transition into evaluate before changing the downstream status summary again
      Tested: process scan showing build-index and no evaluate; presence of /tmp/fma_real_smoke_stopcheck/fma_index_smoke directory; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests
      Not-tested: Completed build-index output, evaluate, final metrics/report generation
      cnb.bofCdSsphPA authored
    • Capture the newest verified downstream state so restart docs show that the real FMA smoke still remains in build-index at 13:16 UTC with no evaluate stage or emitted index artifacts yet.
      
      Constraint: Downstream completion evidence is still unavailable because build-index has not produced artifacts or switched to evaluate
      Rejected: Leave the 13:16 observation only in chat | would break restart continuity and phase-by-phase commit discipline
      Confidence: high
      Scope-risk: narrow
      Directive: Next capture either the first index artifact file or the transition into evaluate before changing the downstream status summary again
      Tested: process scan showing build-index and no evaluate; presence of /tmp/fma_real_smoke_stopcheck/fma_index_smoke directory; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests
      Not-tested: Completed build-index output, evaluate, final metrics/report generation
      cnb.bofCdSsphPA authored
    • Capture the newest verified downstream state so restart docs show that the real FMA smoke still remains in build-index at 13:10 UTC with no evaluate stage or emitted index artifacts yet.
      
      Constraint: Downstream completion evidence is still unavailable because build-index has not produced artifacts or switched to evaluate
      Rejected: Leave the latest observation only in chat | would break restart continuity and phase-by-phase commit discipline
      Confidence: high
      Scope-risk: narrow
      Directive: Next capture either the first index artifact file or the transition into evaluate before changing the downstream status summary again
      Tested: process scan showing build-index and no evaluate; presence of /tmp/fma_real_smoke_stopcheck/fma_index_smoke directory; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests
      Not-tested: Completed build-index output, evaluate, final metrics/report generation
      cnb.bofCdSsphPA authored
    • Capture the newest verified downstream state so restart docs show that the real FMA smoke still remains in build-index at 13:04 UTC with no evaluate stage or emitted index artifacts yet.
      
      Constraint: Downstream completion evidence is still unavailable because build-index has not produced artifacts or switched to evaluate
      Rejected: Reuse the 12:59 UTC checkpoint | would leave the handoff behind the latest verified downstream state
      Confidence: high
      Scope-risk: narrow
      Directive: Next capture either the first index artifact file or the transition into evaluate before changing the downstream status summary again
      Tested: process scan showing build-index and no evaluate; presence of /tmp/fma_real_smoke_stopcheck/fma_index_smoke directory; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests
      Not-tested: Completed build-index output, evaluate, final metrics/report generation
      cnb.bofCdSsphPA authored
    • Capture the newest verified downstream state so restart docs show that the real FMA smoke still remains in build-index at 12:59 UTC with no evaluate stage or emitted index artifacts yet.
      
      Constraint: Downstream completion evidence is still unavailable because build-index has not produced artifacts or switched to evaluate
      Rejected: Reuse the 12:55 UTC checkpoint | would leave the handoff behind the latest verified downstream state
      Confidence: high
      Scope-risk: narrow
      Directive: Next capture either the first index artifact file or the transition into evaluate before changing the downstream status summary again
      Tested: process scan showing build-index and no evaluate; presence of /tmp/fma_real_smoke_stopcheck/fma_index_smoke directory; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests
      Not-tested: Completed build-index output, evaluate, final metrics/report generation
      cnb.bofCdSsphPA authored
    • Capture the newest verified downstream state so restart docs show that the real FMA smoke still remains in build-index at 12:55 UTC with no evaluate stage or emitted index artifacts yet.
      
      Constraint: Downstream completion evidence is still unavailable because build-index has not produced artifacts or switched to evaluate
      Rejected: Reuse the 12:51 UTC checkpoint | would leave the handoff behind the latest verified downstream state
      Confidence: high
      Scope-risk: narrow
      Directive: Next capture either the first index artifact file or the transition into evaluate before changing the downstream status summary again
      Tested: process scan showing build-index and no evaluate; presence of /tmp/fma_real_smoke_stopcheck/fma_index_smoke directory; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests
      Not-tested: Completed build-index output, evaluate, final metrics/report generation
      cnb.bofCdSsphPA authored
    • Capture the newest verified downstream state so restart docs show that the real FMA smoke still remains in build-index with no evaluate stage or emitted index artifacts yet.
      
      Constraint: Downstream completion evidence is still unavailable because build-index has not produced artifacts or switched to evaluate
      Rejected: Reuse the 12:43 UTC checkpoint | would leave the handoff behind the latest verified downstream state
      Confidence: high
      Scope-risk: narrow
      Directive: Next capture either the first index artifact file or the transition into evaluate before changing the downstream status summary again
      Tested: process scan showing build-index and no evaluate; presence of /tmp/fma_real_smoke_stopcheck/fma_index_smoke directory; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests
      Not-tested: Completed build-index output, evaluate, final metrics/report generation
      cnb.bofCdSsphPA authored
    • Persist the latest downstream checkpoint so restart docs show that the real FMA smoke still remains in build-index, with no evaluate stage or emitted index artifacts yet.
      
      Constraint: Downstream completion evidence is still unavailable because build-index has not produced artifacts or switched to evaluate
      Rejected: Reuse the 12:39 UTC checkpoint | would leave the handoff behind the latest verified downstream state
      Confidence: high
      Scope-risk: narrow
      Directive: Next capture either the first index artifact file or the transition into evaluate before changing the downstream status summary again
      Tested: process scan showing build-index and no evaluate; presence of /tmp/fma_real_smoke_stopcheck/fma_index_smoke directory; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests
      Not-tested: Completed build-index output, evaluate, final metrics/report generation
      cnb.bofCdSsphPA authored
    • Capture the next downstream checkpoint so restart docs reflect that the real FMA smoke remains in build-index, with no evaluate stage yet and no emitted index artifacts so far.
      
      Constraint: Final downstream evidence is still unavailable because build-index has not produced artifacts or switched to evaluate
      Rejected: Wait silently for evaluate to start | would leave the handoff missing the latest verified downstream state
      Confidence: high
      Scope-risk: narrow
      Directive: Next verify the first index artifact file or the transition into evaluate before changing the delivery summary again
      Tested: process scan showing build-index and no evaluate; presence of /tmp/fma_real_smoke_stopcheck/fma_index_smoke directory; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests
      Not-tested: Completed build-index output, evaluate, final metrics/report generation
      cnb.bofCdSsphPA authored
    • Update the handoff package with the next downstream checkpoint so a restarted session knows training is done, build-index is active, and evaluate has not started yet.
      
      Constraint: Final evaluation evidence is still unavailable because build-index has not completed
      Rejected: Wait silently for evaluate to start | would lose a useful downstream checkpoint for restart continuity
      Confidence: high
      Scope-risk: narrow
      Directive: Next capture either the first index artifact file or the transition into evaluate
      Tested: process scan showing build-index and no evaluate; presence of /tmp/fma_real_smoke_stopcheck/fma_index_smoke directory; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests
      Not-tested: Completed build-index output, evaluate, final metrics/report generation
      cnb.bofCdSsphPA authored
    • Record the first decisive runtime milestone so restart docs show that the real FMA smoke has finished training, produced a model, and moved into build-index.
      
      Constraint: Final evaluation metrics are not available yet because the smoke is still running downstream of training
      Rejected: Keep describing the run as training-only | would now be materially inaccurate
      Confidence: high
      Scope-risk: narrow
      Directive: Next verify the transition from build-index into evaluate and then capture the final report artifacts
      Tested: process scan showing build-index; absence of train.py PID 311629; presence of best_model.pt and song_to_idx.json; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests
      Not-tested: Completed build-index, evaluate, and final metrics/report generation
      cnb.bofCdSsphPA authored
    • Persist a wider observation checkpoint so restart docs show continued forward motion across a 180-second window while the real FMA smoke remains inside Epoch 1.
      
      Constraint: Verification is still limited to runtime evidence and manifest revalidation because Epoch 1 has not completed
      Rejected: Stop at the 120-second checkpoint | would miss stronger evidence from the longer observation window
      Confidence: high
      Scope-risk: narrow
      Directive: Keep monitoring until the first saved model file or transition into build-index/evaluate appears
      Tested: ps on PID 311629 after 180s wait; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests; find on /tmp/fma_real_smoke_stopcheck/fma_models_smoke
      Not-tested: End-of-epoch artifacts, build-index, evaluate, final metrics
      cnb.bofCdSsphPA authored
    • Persist a wider observation checkpoint so restart docs demonstrate continued forward motion across a longer interval while the real FMA smoke remains inside Epoch 1.
      
      Constraint: Verification is still limited to runtime evidence and manifest revalidation because Epoch 1 has not completed
      Rejected: Stop at the 30-second checkpoint | would miss stronger evidence from a longer observation window
      Confidence: high
      Scope-risk: narrow
      Directive: Keep monitoring until the first saved model file or transition into build-index/evaluate appears
      Tested: ps on PID 311629 after 120s wait; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests; find on /tmp/fma_real_smoke_stopcheck/fma_models_smoke
      Not-tested: End-of-epoch artifacts, build-index, evaluate, final metrics
      cnb.bofCdSsphPA authored
    • Capture a more meaningful follow-up checkpoint after an added wait window so the restart docs show continued forward motion rather than trivial second-to-second sampling.
      
      Constraint: Epoch 1 still has not completed, so verification is limited to runtime evidence and manifest revalidation
      Rejected: Skip the wider-window checkpoint | would miss the chance to prove progress across a longer observation gap
      Confidence: high
      Scope-risk: narrow
      Directive: Keep watching for the first saved model file or transition into build-index/evaluate before changing the project status summary
      Tested: ps on PID 311629 after 30s wait; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests; find on /tmp/fma_real_smoke_stopcheck/fma_models_smoke
      Not-tested: End-of-epoch artifacts, build-index, evaluate, final metrics
      cnb.bofCdSsphPA authored
    • Persist a newer runtime checkpoint so restart docs continue to prove that the real FMA smoke is still progressing inside Epoch 1 without yet saving a model or entering downstream stages.
      
      Constraint: Verification is still limited to live runtime evidence because Epoch 1 has not completed
      Rejected: Reuse the prior 22:10 checkpoint | would leave handoff docs behind the latest verified state
      Confidence: high
      Scope-risk: narrow
      Directive: Keep monitoring until the first saved model file or stage transition appears
      Tested: ps on PID 311629; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests; find on /tmp/fma_real_smoke_stopcheck/fma_models_smoke
      Not-tested: End-of-epoch artifacts, build-index, evaluate, final metrics
      cnb.bofCdSsphPA authored
    • Record a later live checkpoint so restart docs keep proving that the real FMA smoke is still advancing inside Epoch 1 without yet producing a saved model or entering downstream stages.
      
      Constraint: Verification is still limited to live runtime evidence because Epoch 1 has not completed
      Rejected: Reuse the prior 20:08 checkpoint | would leave handoff docs behind the latest verified state
      Confidence: high
      Scope-risk: narrow
      Directive: Keep monitoring until the first saved model file or stage transition appears
      Tested: ps on PID 311629; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests; find on /tmp/fma_real_smoke_stopcheck/fma_models_smoke
      Not-tested: End-of-epoch artifacts, build-index, evaluate, final metrics
      cnb.bofCdSsphPA authored
    • Capture a newer live checkpoint so restart docs continue to prove the real FMA smoke is progressing inside Epoch 1 without yet reaching model save or downstream evaluation stages.
      
      Constraint: Verification remains limited to live runtime state because the first epoch has not completed
      Rejected: Stop at the prior 19:12 checkpoint | would leave the handoff behind the latest verified state
      Confidence: high
      Scope-risk: narrow
      Directive: Keep monitoring until the first saved model file or stage transition appears
      Tested: ps on PID 311629; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests; find on /tmp/fma_real_smoke_stopcheck/fma_models_smoke
      Not-tested: End-of-epoch artifacts, build-index, evaluate, final metrics
      cnb.bofCdSsphPA authored
    • Preserve a newer restart checkpoint so the next session inherits up-to-date proof that the real FMA smoke continues progressing inside Epoch 1 without yet saving a model or entering downstream stages.
      
      Constraint: Verification is still limited to live runtime evidence because Epoch 1 has not completed
      Rejected: Keep the prior 18:22 checkpoint only | would leave the handoff one monitoring cycle behind reality
      Confidence: high
      Scope-risk: narrow
      Directive: Continue monitoring until the first saved model file or stage transition appears before changing status conclusions
      Tested: ps on PID 311629; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests; find on /tmp/fma_real_smoke_stopcheck/fma_models_smoke
      Not-tested: End-of-epoch artifacts, build-index, evaluate, final metrics
      cnb.bofCdSsphPA authored
    • Keep the restart artifacts synchronized with the newest observed elapsed time so the next session can see that the real FMA smoke is still advancing without yet reaching model save or evaluation stages.
      
      Constraint: Training remains inside Epoch 1, so verification is limited to live runtime evidence
      Rejected: Stop at the prior 17:07 checkpoint | would leave handoff docs behind the latest verified state
      Confidence: high
      Scope-risk: narrow
      Directive: Continue monitoring until the first saved model file or stage transition appears
      Tested: ps on PID 311629; validate-splits on /tmp/fma_real_smoke_stopcheck/fma/manifests; find on /tmp/fma_real_smoke_stopcheck/fma_models_smoke
      Not-tested: End-of-epoch artifacts, build-index, evaluate, final metrics
      cnb.bofCdSsphPA authored