Commits · d1f132034b28317088dcb5b94eb8460f5a3c66f3 · wanghai-tech / hikoon-ACR

02 Jun, 2026 · 40 commits

Promote cap48 guidance once the third seed confirmed the stable winner · d1f13203 ...

Constraint: Strategy guidance had to wait until the full seed=999 report landed and all three cap48 runs could be aggregated consistently
Rejected: Keep treating cap48 as unresolved | The third seed now confirms high_energy repeats the same score while hybrid remains volatile
Confidence: high
Scope-risk: narrow
Directive: Treat high_energy as the cap48 default only within the documented FMA smoke condition until larger cap64 and bucketed benchmarks either confirm or overturn it
Tested: Verified seed=999 report.json, high_energy eval.json, and hybrid eval.json; computed the three-seed aggregate, which shows high_energy mean_top1=0.9167 with zero variance versus hybrid mean_top1=0.8750
Not-tested: cap64-or-larger benchmarks, bucket/style-aware evaluations, and any future hybrid redesign

authored 2026-06-02 18:29:00 +0800
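
The three-seed aggregate cited in this commit can be recomputed from the per-run top1 scores recorded elsewhere in this log. The following is a minimal sketch, not the project's aggregation script: the `aggregate` helper is a hypothetical name, the seed=999 high_energy score of 0.9167 is inferred from the "zero variance" statement rather than quoted directly, and population standard deviation is an assumed choice.

```python
from statistics import mean, pstdev

# top1 per cap48 run, in the order the runs appear in this log.
# The last high_energy value (seed=999) is inferred from "zero variance".
TOP1 = {
    "high_energy": [0.9167, 0.9167, 0.9167],
    "hybrid": [0.7917, 0.9583, 0.8750],
}


def aggregate(scores):
    """Return mean/min/max/stdev for one strategy's per-seed scores."""
    return {
        "mean": round(mean(scores), 4),
        "min": min(scores),
        "max": max(scores),
        "stdev": round(pstdev(scores), 4),  # population stdev (assumed)
    }


summary = {name: aggregate(values) for name, values in TOP1.items()}
for name, stats in summary.items():
    print(name, stats)
```

With three runs per strategy the spread matters as much as the mean, which is why min, max, and stdev are reported next to it.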

Preserve the hybrid seed999 score before the second strategy finishes · d13a3b8b ...

Constraint: The cap48 seed=999 run has only completed the hybrid leg, so the three-seed aggregate is still incomplete
Rejected: Wait for high_energy to finish before checkpointing | Would risk losing the verified hybrid seed999 score from the active Ralph session
Confidence: high
Scope-risk: narrow
Directive: Keep recording verified partial benchmark milestones, but do not revise default-strategy guidance until both strategies and the final report are available
Tested: Verified hybrid eval.json reports num_queries=24, top1=0.875, topk=1.0; verified progress.json records the same result; verified high_energy is still running and report.json is still absent
Not-tested: Final high_energy seed999 metrics, final report.json, and updated three-seed aggregate

authored 2026-06-02 18:25:51 +0800
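
The partial-milestone checks listed under Tested (matching metrics in eval.json and progress.json, and no report.json yet) could be scripted roughly as follows. This is a sketch under assumptions: the flat directory layout, the idea that progress.json carries the same three keys, and the `check_partial_run` helper are all hypothetical; only the key names and values come from this commit.

```python
import json
import tempfile
from pathlib import Path


def check_partial_run(work_root):
    """Compare eval.json with progress.json and note whether report.json exists."""
    root = Path(work_root)
    eval_metrics = json.loads((root / "eval.json").read_text())
    progress = json.loads((root / "progress.json").read_text())
    keys = ("num_queries", "top1", "topk")
    return {
        "consistent": all(eval_metrics[k] == progress[k] for k in keys),
        "final_report_present": (root / "report.json").exists(),
    }


# Demo on a throwaway directory; the values copy the hybrid seed=999 result above.
with tempfile.TemporaryDirectory() as tmp:
    metrics = {"num_queries": 24, "top1": 0.875, "topk": 1.0}
    for name in ("eval.json", "progress.json"):
        (Path(tmp) / name).write_text(json.dumps(metrics))
    status = check_partial_run(tmp)

print(status)
```

A result with `consistent` true and `final_report_present` false corresponds to the state this commit records: a verified partial score, but no basis yet for changing strategy guidance.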

Preserve fresh benchmark evidence before the evaluation finishes · bdc04f72 ...

Constraint: The running cap48 seed=999 benchmark has not emitted its final report yet, so only in-flight evidence can be recorded safely
Rejected: Claim a new three-seed conclusion now | The aggregate would be speculative without report.json and eval outputs
Confidence: high
Scope-risk: narrow
Directive: When a long benchmark is still active, checkpoint stage evidence explicitly and wait for report.json before changing strategy guidance
Tested: Verified process tree shows hybrid moved from build-index to evaluate.py; verified reference_progress.json reports 48 refs, 491 windows, 192-d embeddings, and complete status; verified report.json is still absent
Not-tested: Final hybrid eval metrics, subsequent high_energy run, and final three-seed aggregate

authored 2026-06-02 18:22:40 +0800

Preserve restartable delivery state before the long benchmark finishes · 0d40b05c ...

Constraint: The cap48 seed=999 benchmark is still running, so this checkpoint must avoid unverified algorithm conclusions
Rejected: Wait for the CPU benchmark to finish | Would delay handoff and leave the next session without a clean restart package
Confidence: high
Scope-risk: narrow
Directive: Keep future doc-only checkpoints surgically staged and do not add data/raw, external_smoke, /tmp outputs, or model artifacts
Tested: Verified staged diff only includes AGENT memory, handoff, changelog, and changelist docs; confirmed /tmp cap48 seed=999 report is not ready yet
Not-tested: The in-flight cap48 seed=999 benchmark result and any follow-up aggregate metrics

authored 2026-06-02 18:20:30 +0800

Promote the cap48 discussion from single runs to two-seed aggregates · ae0d14a5 ...

Persist the current two-seed cap48 summary so the strategy recommendation is grounded in aggregated evidence rather than whichever single run happened most recently.

Constraint: Only documentation changes are allowed because benchmark artifacts remain outside version control
Rejected: Keep narrating cap48 one run at a time | The aggregate is now more informative than any individual cap48 run
Confidence: high
Scope-risk: narrow
Directive: Prefer reporting aggregate seed statistics once two or more runs exist; avoid re-elevating single-seed claims above the aggregate
Tested: Verified both cap48 report.json files; computed aggregate mean/min/max/stdev; verified docs now record high_energy mean_top1=0.9167 and hybrid mean_top1=0.8750
Not-tested: Aggregates beyond two seeds or style-bucketed aggregates

authored 2026-06-02 18:15:34 +0800

Reframe the cap48 finding as seed-sensitive after the second rerun · e519dab7 ...

Persist the completed seed123 benchmark showing hybrid ahead again, and update the strategy guidance from single-run winner claims to a multi-seed interpretation.

Constraint: Only documentation changes are allowed because benchmark outputs remain outside version control
Rejected: Keep framing cap48 as a stable high_energy win | The second seed materially weakens that interpretation
Confidence: high
Scope-risk: narrow
Directive: Base the hybrid vs high_energy default decision on aggregated multi-seed evidence, not any single cap48 run
Tested: Verified /tmp/ab_smoke_seg_cap48_top2_seed123/report.json; verified high_energy eval.json; verified docs now record hybrid=24/0.9583/1.0 and high_energy=24/0.9167/1.0 for seed123
Not-tested: Formal aggregation across multiple seeds beyond these two cap48 runs

authored 2026-06-02 18:13:48 +0800

Record the first cap48 seed123 hybrid score for the multi-seed check · a3a5303f ...

Persist the newly finished cap48 seed123 hybrid result so the second-seed validation run now has measured evidence instead of only a runtime checkpoint.

Constraint: seed123 high_energy and the final report are still pending
Rejected: Wait for the full seed123 report before updating docs | Would leave the multi-seed evidence stale across sessions
Confidence: high
Scope-risk: narrow
Directive: Replace the seed123 partial section with the final two-strategy ranking once high_energy eval and report.json land
Tested: Verified /tmp/ab_smoke_seg_cap48_top2_seed123/hybrid/fma_reports_smoke/eval.json; verified docs record hybrid=24/0.9583/1.0 and high_energy still in build-index
Not-tested: Final seed123 comparison because high_energy has not finished yet

authored 2026-06-02 18:10:08 +0800

Refresh the second cap48 seed checkpoint now that hybrid reached evaluation · ef7e4493 ...

Update the handoff and changelog with the newer seed123 runtime milestone so later sessions know the hybrid lane has advanced from build-index into capped evaluation.

Constraint: No measured seed123 score is available yet, only a later execution milestone
Rejected: Leave the older build-index note in place | Would make the restart handoff stale and less actionable
Confidence: high
Scope-risk: narrow
Directive: Replace the seed123 runtime note with measured scores as soon as hybrid eval.json or report.json lands
Tested: Verified active seed123 hybrid evaluate.py process; verified docs now record seed123 current phase as evaluate.py --max-queries 24
Not-tested: Seed123 strategy scores because hybrid eval.json has not landed yet

authored 2026-06-02 18:08:52 +0800

Checkpoint the second cap48 seed while the rerun is still building · 124d4612 ...

Preserve the second-seed cap48 entry point and current build-index phase so later sessions can validate whether the cap48 reversal was stable or a seed artifact.

Constraint: The second-seed run has not produced scores yet, so only execution-state evidence is available
Rejected: Wait for the seed123 scores before recording anything | Risks losing the multi-seed validation checkpoint if the session ends first
Confidence: high
Scope-risk: narrow
Directive: Replace the seed123 running-state section with measured scores once hybrid eval.json or report.json lands
Tested: Verified active cap48 seed123 processes; verified handoff records work-root, seed, subset size, query cap, and current build-index phase
Not-tested: cap48 seed123 strategy scores because the run is still in progress

authored 2026-06-02 18:04:26 +0800

Revise the default-strategy story after the cap48 reversal · d82d217a ...

Persist the larger 48-track benchmark where high_energy overtook hybrid, and downgrade the previously overconfident default-strategy claim to a conditional recommendation pending broader validation.

Constraint: Only documentation changes are allowed because benchmark outputs remain outside version control
Rejected: Keep asserting hybrid as fully settled default after cap48 | The 48-track capped benchmark materially contradicts that stronger claim
Confidence: high
Scope-risk: narrow
Directive: Resolve the hybrid vs high_energy default question with larger, multi-seed, style-aware benchmarks before making a final hard default claim
Tested: Verified /tmp/ab_smoke_seg_cap48_top2/report.json; verified high_energy eval.json; verified docs now record high_energy=24/0.9167/1.0 and hybrid=24/0.7917/1.0
Not-tested: Multi-seed or style-balanced follow-up benchmark beyond the single cap48 run

authored 2026-06-02 18:00:55 +0800

Refresh the cap48 checkpoint now that high-energy reached evaluation · 7769be8c ...

Update the handoff and changelog with the newer cap48 runtime milestone so later sessions know the high_energy lane has advanced from build-index into capped evaluation.

Constraint: No measured cap48 high_energy score is available yet, only a later execution milestone
Rejected: Leave the older build-index note in place | Would make the restart handoff stale and less actionable
Confidence: high
Scope-risk: narrow
Directive: Replace the cap48 runtime note with final top-two scores as soon as high_energy eval.json or report.json lands
Tested: Verified active cap48 high_energy evaluate.py process; verified docs now record high_energy current phase as evaluate.py --max-queries 24
Not-tested: Final cap48 comparison because high_energy eval.json has not landed yet

authored 2026-06-02 17:59:27 +0800

Record the first cap48 hybrid score while the larger run continues · 0f84d109 ...

Persist the newly finished cap48 hybrid result so the next session can continue the 48-track validation run from measured evidence instead of only a runtime checkpoint.

Constraint: cap48 high_energy and the final report are still pending
Rejected: Wait for the full cap48 report before updating docs | Would leave the largest current real-data checkpoint stale across sessions
Confidence: high
Scope-risk: narrow
Directive: Replace the cap48 partial section with the final two-strategy ranking once high_energy eval and report.json land
Tested: Verified /tmp/ab_smoke_seg_cap48_top2/hybrid/fma_reports_smoke/eval.json; verified docs record hybrid=24/0.7917/1.0 and high_energy still in build-index
Not-tested: Final cap48 comparison because high_energy has not finished yet

authored 2026-06-02 17:55:53 +0800

Refresh the cap48 checkpoint now that hybrid reached evaluation · 727f06c5 ...

Update the handoff and changelog with the newer cap48 runtime milestone so later sessions know the run has advanced from build-index into capped evaluation.

Constraint: No measured cap48 score is available yet, only a later execution milestone
Rejected: Leave the older build-index note in place | Would make the restart handoff stale and less actionable
Confidence: high
Scope-risk: narrow
Directive: Replace the cap48 runtime note with hybrid scores as soon as eval.json lands
Tested: Verified active cap48 evaluate.py process; verified docs now record cap48 current phase as evaluate.py --max-queries 24
Not-tested: cap48 strategy scores because hybrid eval.json has not landed yet

authored 2026-06-02 17:54:44 +0800

Checkpoint the cap48 benchmark while the larger run is still building · 026b5539 ...

Preserve the new 48-track top-two benchmark entry point and current build-index phase so later sessions can continue the expanding validation ladder without rediscovering runtime state.

Constraint: cap48 has not produced scores yet, so only execution-state evidence is available
Rejected: Wait for cap48 scores before recording anything | Risks losing the larger-benchmark checkpoint if the session ends first
Confidence: high
Scope-risk: narrow
Directive: Replace the cap48 running-state section with measured scores once hybrid eval.json or report.json lands
Tested: Verified active cap48 processes; verified handoff records work-root, subset size, query cap, and current build-index phase
Not-tested: cap48 strategy scores because the run is still in progress

authored 2026-06-02 17:50:57 +0800

Lock the cap32 result and harden the hybrid default recommendation · f05e7023 ...

Persist the larger 32-track benchmark showing hybrid strongly outperforming high_energy, so the default strategy decision rests on multiple larger real-data checkpoints instead of a single subset.

Constraint: Only documentation changes are allowed because benchmark artifacts stay outside version control
Rejected: Keep the default recommendation tentative after cap32 | The 24-track and 32-track capped benchmarks now agree on hybrid superiority
Confidence: high
Scope-risk: narrow
Directive: Use cap24 and cap32 together as the current strongest strategy evidence until a broader multi-style benchmark supersedes them
Tested: Verified /tmp/ab_smoke_seg_cap32_top2/report.json; verified high_energy eval.json; verified docs now record hybrid=20/0.95/1.0 and high_energy=20/0.5/1.0
Not-tested: Wider style-balanced benchmark beyond the FMA top-two subsets

authored 2026-06-02 17:46:42 +0800

Record the first cap32 hybrid score while the larger run continues · f228197d ...

Persist the newly finished cap32 hybrid result so the next session can continue the top-two validation run from measured evidence instead of only a running-state checkpoint.

Constraint: cap32 high_energy and the final report are still pending
Rejected: Wait for the full cap32 report before updating docs | Would leave the larger-subset evidence stale across sessions
Confidence: high
Scope-risk: narrow
Directive: Replace the cap32 partial section with the final two-strategy ranking once high_energy eval and report.json land
Tested: Verified /tmp/ab_smoke_seg_cap32_top2/hybrid/fma_reports_smoke/eval.json; verified docs record hybrid=20/0.95/1.0 and high_energy still training
Not-tested: Final cap32 comparison because high_energy has not finished yet

authored 2026-06-02 17:42:43 +0800

Checkpoint the larger cap32 benchmark before results land · 5dadbae3 ...

Preserve the new 32-track top-two benchmark entry point and current build-index phase so a later session can continue the stronger validation run without losing runtime context.

Constraint: The cap32 benchmark is still running, so only execution-state evidence is available
Rejected: Wait for cap32 results before recording anything | Risks losing the larger-benchmark checkpoint if the session ends first
Confidence: high
Scope-risk: narrow
Directive: Replace the cap32 running-state section with measured scores once hybrid eval.json and report.json land
Tested: Verified active cap32 processes; verified handoff records work-root, subset size, query cap, and current build-index phase
Not-tested: cap32 strategy scores because the run is still in progress

authored 2026-06-02 17:41:01 +0800

Promote hybrid to the default strategy using the stronger cap24 evidence · 08379e56 ...

Persist the larger real-FMA benchmark result showing hybrid clearly outperforming high_energy, so the project recommendation can converge on one default instead of an unresolved tie.

Constraint: Only docs change because benchmark outputs remain outside version control
Rejected: Keep treating hybrid and high_energy as co-equal defaults | The larger 24-track capped benchmark now separates them clearly
Confidence: high
Scope-risk: narrow
Directive: Use cap24 top-two as the current strongest public evidence until a larger capped benchmark supersedes it
Tested: Verified /tmp/ab_smoke_seg_cap24_top2/report.json; verified high_energy eval.json; verified docs now state hybrid=16/1.0/1.0 and high_energy=16/0.8125/1.0
Not-tested: Broader strategy comparison beyond hybrid vs high_energy on the 24-track subset

authored 2026-06-02 17:36:12 +0800

Preserve the larger cap24 top-two benchmark checkpoint · 48a5957a ...

Record the new 24-track capped benchmark setup and the first completed hybrid result so the next session can continue the stronger tie-break experiment without rediscovering runtime state.

Constraint: The cap24 benchmark is still in progress, so only partial evidence can be documented now
Rejected: Wait for high_energy to finish before updating handoff | Risks losing the fresh larger-subset evidence if the session ends first
Confidence: high
Scope-risk: narrow
Directive: Replace the partial cap24 section with the final two-strategy ranking once report.json lands
Tested: Verified /tmp/ab_smoke_seg_cap24_top2/hybrid/fma_reports_smoke/eval.json; verified active cap24 processes; verified docs include the exact work-root and resume command
Not-tested: Final cap24 top-two comparison because high_energy is still training

authored 2026-06-02 17:33:42 +0800

Lock the final cap16 FMA benchmark ranking into the workflow docs · c659380d ...

Persist the completed capped real-data benchmark results so future sessions can use the final strategy ordering and recommendation without replaying the run.

Constraint: Only documentation should change because benchmark artifacts live outside version control
Rejected: Leave the result only in /tmp report files | Would make the evidence fragile across sessions
Confidence: high
Scope-risk: narrow
Directive: Use cap16 as the current default evidence point until a larger capped benchmark supersedes it
Tested: Verified /tmp/ab_smoke_seg_cap16/report.json; verified repeated_section_aware eval.json; verified docs reflect final ranking hybrid/high_energy/beat_aware/repeated_section_aware
Not-tested: Larger real-dataset benchmark beyond the 16-track capped subset

authored 2026-06-02 17:27:36 +0800

Capture fresh high-energy benchmark evidence in the restart handoff · 29c1962c ...

Update the handoff and changelog with the newly finished capped FMA high_energy result so the next session starts from current evidence instead of stale partials.

Constraint: Benchmark is still running overall and only partial strategies are complete
Rejected: Wait for repeated_section_aware to finish before updating handoff | Risks another stale restart gap
Confidence: high
Scope-risk: narrow
Directive: Replace the partial cap16 table with the final ranking once repeated_section_aware and report.json land
Tested: Verified /tmp/ab_smoke_seg_cap16/high_energy/fma_reports_smoke/eval.json; verified docs now record high_energy = 12 / 1.0 / 1.0
Not-tested: Final cap16 multi-strategy report because repeated_section_aware is still in progress

authored 2026-06-02 17:24:47 +0800

Preserve restart-safe handoff for the capped FMA benchmark · 2c909862 ...

Record the latest delivered benchmark evidence, active work-root, partial results, and exact resume commands so a new session can continue without rediscovering context.

Constraint: User requested immediate delivery artifacts before the long benchmark fully finishes
Rejected: Wait for the entire cap16 benchmark to finish before handing off | Would delay delivery and risk losing resumable context
Confidence: high
Scope-risk: narrow
Directive: Update the handoff again once high_energy and repeated_section_aware finish on cap16
Tested: Verified partial eval files for hybrid and beat_aware; verified active cap16 benchmark processes; verified session-handoff contains resume commands and partial scores
Not-tested: Final multi-strategy cap16 ranking because high_energy and repeated_section_aware are still running

authored 2026-06-02 17:22:44 +0800

Make segmentation strategy benchmarks comparable under fixed query budgets · 62327872 ...

Clarify that the pipeline already mixes random sampling with librosa-guided candidate selection, while keeping heavier structural segmentation as a later optimization path.

Constraint: Must avoid staging local datasets and transient smoke artifacts
Rejected: Full librosa.segment.* default rollout | Too CPU-heavy and too distribution-shaping for current smoke/training stage
Confidence: high
Scope-risk: narrow
Directive: Keep future segmentation comparisons capped by equal query budgets when reporting quality deltas
Tested: py_compile for evaluate/external_adapters/ab_smoke_segmentation; evaluate.py --max-queries 5; ab_smoke_segmentation end-to-end smoke with max_test_queries=5
Not-tested: Multi-strategy medium-size capped A/B benchmark on larger real FMA subset

authored 2026-06-02 17:13:03 +0800
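
Capping every strategy at the same query budget, as the Directive asks, could look like the sketch below. It assumes the cap is applied by seeded random sampling; how `evaluate.py --max-queries` actually selects queries is not stated in this log, and `cap_queries` and the query IDs are hypothetical.

```python
import random


def cap_queries(query_ids, max_queries, seed=0):
    """Return at most max_queries IDs, chosen reproducibly for a given seed."""
    ids = sorted(query_ids)
    if max_queries is None or len(ids) <= max_queries:
        return ids
    return sorted(random.Random(seed).sample(ids, max_queries))


# Hypothetical strategies that generate different numbers of queries.
generated = {
    "hybrid": [f"q{i:03d}" for i in range(12)],
    "high_energy": [f"q{i:03d}" for i in range(9)],
    "beat_aware": [f"q{i:03d}" for i in range(5)],
}
capped = {name: cap_queries(ids, max_queries=5) for name, ids in generated.items()}
for name, ids in capped.items():
    print(name, len(ids), ids)
```

Once every strategy is evaluated on the same number of queries, a difference in top1 is no longer confounded by one strategy simply producing more usable queries.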

Benchmark segmentation strategies on a real FMA mini-smoke set · f04a314e ...

Constraint: Strategy comparisons need real-audio evidence, but the benchmark must stay cheap enough to run repeatedly on CPU during active development
Rejected: Judge winners only by top1/topk on a tiny subset | ties hide the practical value of strategies that generate far more usable queries
Confidence: medium
Scope-risk: narrow
Directive: Keep num_queries as a tie-breaker for tiny-smoke comparisons; increase subset size before promoting benchmark winners to default training policy
Tested: /usr/local/miniconda3/bin/python acr-engine/scripts/ab_smoke_segmentation.py --dataset fma --input-dir acr-engine/data/raw/fma_small_audio --work-root /tmp/ab_smoke_seg --subset-size 8 --query-duration 8 --train-epochs 1 --batch-size 2 --device cpu --output-json /tmp/ab_smoke_seg/report.json; post-run ranking verification from /tmp/ab_smoke_seg/report.json
Not-tested: Larger FMA subsets or difficult internal query mixes in the same benchmark script

authored 2026-06-02 17:01:23 +0800
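
Using num_queries as a tie-breaker when accuracy ties on a tiny subset might be expressed as a compound sort key. This is an illustrative sketch only: the strategy names and numbers are invented placeholders (not results from the run above), `rank_strategies` is a hypothetical helper, and ordering topk ahead of num_queries is an assumption.

```python
def rank_strategies(results):
    """Sort by top1, then topk, then num_queries, all descending."""
    return sorted(
        results,
        key=lambda r: (r["top1"], r["topk"], r["num_queries"]),
        reverse=True,
    )


# Invented tiny-smoke results: two strategies tie on accuracy.
results = [
    {"strategy": "strategy_a", "num_queries": 3, "top1": 1.0, "topk": 1.0},
    {"strategy": "strategy_b", "num_queries": 7, "top1": 1.0, "topk": 1.0},
    {"strategy": "strategy_c", "num_queries": 6, "top1": 0.5, "topk": 1.0},
]
ranking = [r["strategy"] for r in rank_strategies(results)]
print(ranking)
```

Here the two tied strategies are separated by how many usable queries they produced, which is the practical value the Rejected line says a pure top1/topk comparison would hide.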

Prioritize repeated chorus-like regions in music crop selection · 8ed3e34e ...

Constraint: Music retrieval should sample repeated hook-like regions without adding heavyweight structure models or breaking the existing lightweight candidate stack
Rejected: Reserve repeated-section logic for a later dedicated chorus detector | delays a practical chorus-like signal that can already improve query realism today
Confidence: medium
Scope-risk: moderate
Directive: Treat repeated_section_aware as a lightweight chorus proxy; future chorus ranking should refine rather than discard these candidates
Tested: /usr/local/miniconda3/bin/python -m py_compile acr-engine/src/data/dataset.py acr-engine/src/data/manifest_tools.py acr-engine/train.py acr-engine/src/data/external_adapters.py; synthetic_v2 dry-run with --segment-strategy repeated_section_aware; handcrafted 24s repeated-motif fixture with repeated_section_aware and hybrid offset checks
Not-tested: Full end-to-end metric impact on FMA/internal datasets with repeated_section_aware enabled

authored 2026-06-02 16:45:29 +0800
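
One lightweight way to approximate a "repeated section" without a structure model is to look for the window that recurs most closely later in the track. The sketch below swaps in a brute-force L1 comparison over a toy per-frame feature sequence; it is not the repository's repeated_section_aware implementation, and `most_repeated_offset` and the fixture values are hypothetical.

```python
def most_repeated_offset(frames, window):
    """Return (start, distance) of the window that best matches a later window."""
    best_start, best_dist = 0, float("inf")
    last = len(frames) - window
    for a in range(last + 1):
        # Only compare against later, non-overlapping windows.
        for b in range(a + window, last + 1):
            dist = sum(abs(frames[a + i] - frames[b + i]) for i in range(window))
            if dist < best_dist:
                best_start, best_dist = a, dist
    return best_start, best_dist


# Toy fixture: the motif [5, 9, 5, 7] occurs at index 3 and again at index 10.
frames = [0, 1, 2, 5, 9, 5, 7, 3, 0, 4, 5, 9, 5, 7, 1, 6]
start, distance = most_repeated_offset(frames, window=4)
print(start, distance)
```

The cost grows quadratically with the number of frames, so a sketch like this only suits coarse features on short clips.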

Align music crop sampling with rhythmic grid candidates · d7a08944 ...

Constraint: Music queries often begin near stable pulse locations, but beat tracking can fail on sparse or synthetic signals and must degrade safely
Rejected: Depend on beat tracking alone for all rhythmic sampling | too brittle when beat extraction is weak or absent
Confidence: high
Scope-risk: moderate
Directive: Keep beat_aware as a lightweight candidate generator with onset fallback; future chorus/repeated-section logic should compose with beat-aware rather than bypass it
Tested: /usr/local/miniconda3/bin/python -m py_compile acr-engine/src/data/dataset.py acr-engine/src/data/manifest_tools.py acr-engine/train.py acr-engine/src/data/external_adapters.py; synthetic_v2 dry-run with --segment-strategy beat_aware; handcrafted 20s pulse-track fixture with beat_aware and hybrid offset checks
Not-tested: Full retraining/evaluation impact on open/internal datasets using beat_aware end-to-end

authored 2026-06-02 16:41:17 +0800
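
The "beat first, onset fallback" behavior described in the Directive can be sketched as a small candidate chooser. Assumptions: beat and onset times (in seconds) are computed elsewhere, for example by a beat tracker, and the final uniform-random fallback is an added guess; `rhythmic_offset` is a hypothetical name, not the repository's function.

```python
import random


def rhythmic_offset(duration, crop, beats, onsets, rng):
    """Return (start_seconds, source) for a crop of length `crop`."""
    latest = max(0.0, duration - crop)  # last start that still fits the crop
    for source, times in (("beat", beats), ("onset", onsets)):
        valid = [t for t in times if 0.0 <= t <= latest]
        if valid:
            return rng.choice(valid), source
    # Neither cue is usable: degrade to a plain random crop.
    return rng.uniform(0.0, latest), "random"


rng = random.Random(0)
pulse = [i * 0.5 for i in range(40)]  # 20 s pulse track, one beat every 0.5 s
on_beat = rhythmic_offset(20.0, 8.0, pulse, [], rng)
fallback = rhythmic_offset(20.0, 8.0, [], [3.2, 15.0], rng)
no_cues = rhythmic_offset(20.0, 8.0, [], [], rng)
print(on_beat, fallback, no_cues)
```

Returning the source label alongside the offset makes it easy to log how often the fallback path is taken on sparse or synthetic signals.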

Bias music training crops toward salient energy and attack regions · b6cdf668 ...

Constraint: Music ACR queries should be closer to choruses, strong rhythmic sections, and attack regions without giving up the existing random and silence-aware fallbacks
Rejected: Add only heavier beat/chorus modeling first | higher complexity and more brittle than lightweight energy/onset heuristics for the current training pipeline
Confidence: high
Scope-risk: moderate
Directive: Keep high_energy/onset_aware as heuristic candidate generators; future beat/chorus logic should layer on top of them rather than replace the fallback stack
Tested: /usr/local/miniconda3/bin/python -m py_compile acr-engine/src/data/dataset.py acr-engine/src/data/manifest_tools.py acr-engine/train.py acr-engine/src/data/external_adapters.py; synthetic_v2 dry-run with --segment-strategy high_energy and onset_aware; handcrafted 20s audio fixture with high_energy/onset_aware query offset checks
Not-tested: Full retraining/evaluation impact on FMA or internal production datasets

authored 2026-06-02 16:35:02 +0800
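
A high_energy style candidate generator can be as simple as scanning fixed-size windows and keeping the one with the largest RMS energy. The sketch below is a plain-Python illustration under that assumption; the actual heuristic in dataset.py is not shown in this log, and `highest_energy_offset` and the fixture are hypothetical.

```python
import math


def highest_energy_offset(samples, window, hop):
    """Return (start_index, rms) of the window with the highest RMS energy."""
    best_start, best_rms = 0, -1.0
    for start in range(0, len(samples) - window + 1, hop):
        chunk = samples[start:start + window]
        rms = math.sqrt(sum(x * x for x in chunk) / window)
        if rms > best_rms:
            best_start, best_rms = start, rms
    return best_start, best_rms


# Toy fixture: a loud 20-sample burst between two quiet stretches.
samples = [0.01] * 40 + [0.8] * 20 + [0.01] * 40
start, rms = highest_energy_offset(samples, window=20, hop=10)
print(start, round(rms, 3))
```

If the signal is shorter than one window, the loop never runs and the function returns the sentinel `(0, -1.0)`, so a caller would need to route that case to the random fallback.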

Resume smoke indexing safely without mixing model generations · 4ceaa995 ...

Constraint: smoke-local must recover long CPU index builds automatically, but partial embeddings from an older model must never contaminate a newly trained index
Rejected: Always reuse any existing partial checkpoint | can silently blend embeddings from different model generations into one index
Confidence: high
Scope-risk: moderate
Directive: Keep model-signature checks on all future index resume paths; auto-resume should fall back to clean rebuild on any signature mismatch
Tested: /usr/local/miniconda3/bin/python -m py_compile acr-engine/src/engines/ecapa_embedder.py acr-engine/src/data/external_adapters.py acr-engine/run_demo.py; same-model partial checkpoint resume vs fresh rebuild equality; mismatched-model checkpoint rejection and clean rebuild equality
Not-tested: Reattaching the currently running real FMA smoke process after an external interruption

authored 2026-06-02 16:29:11 +0800
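
The model-signature guard described here can be illustrated with a checkpoint that stores a hash of the weights that produced it. This is a sketch under assumptions: hashing the weights file with SHA-256, the JSON checkpoint layout, and the names `model_signature` and `load_resumable` are all hypothetical, not taken from ecapa_embedder.py.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def model_signature(weights_path):
    """Fingerprint a model by hashing its weights file."""
    return hashlib.sha256(Path(weights_path).read_bytes()).hexdigest()


def load_resumable(checkpoint_path, signature):
    """Return saved embeddings only if the same model wrote them, else {}."""
    path = Path(checkpoint_path)
    if not path.exists():
        return {}
    state = json.loads(path.read_text())
    if state.get("model_signature") != signature:
        return {}  # mismatch: the caller should rebuild from scratch
    return state["embeddings"]


with tempfile.TemporaryDirectory() as tmp:
    weights = Path(tmp) / "model.bin"
    checkpoint = Path(tmp) / "index_partial.json"

    weights.write_bytes(b"weights-generation-1")
    checkpoint.write_text(json.dumps({
        "model_signature": model_signature(weights),
        "embeddings": {"ref_001": [0.1, 0.2]},
    }))
    same_model = load_resumable(checkpoint, model_signature(weights))

    weights.write_bytes(b"weights-generation-2")  # simulate retraining
    new_model = load_resumable(checkpoint, model_signature(weights))

print(same_model, new_model)
```

Returning an empty result on mismatch, rather than raising, matches the Directive that auto-resume should fall back to a clean rebuild.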

Make long CPU index builds resumable and root-path tolerant · e45896b7 ...

Constraint: Real FMA smoke indexing can run for a long time on CPU and synthetic/root-layout datasets must still use the same build-index entrypoint
Rejected: Treat build-index as all-or-nothing and require full reruns after interruption | wastes hours on CPU and obscures whether work was already completed
Confidence: high
Scope-risk: moderate
Directive: Preserve checkpoint file compatibility; future smoke-local automation should prefer resume before rebuilding from scratch
Tested: /usr/local/miniconda3/bin/python -m py_compile acr-engine/src/engines/ecapa_embedder.py acr-engine/src/engines/chromaprint_matcher.py acr-engine/run_demo.py; synthetic_v2 partial-checkpoint resume vs fresh rebuild equality check (shape/ids/embeddings/progress)
Not-tested: In-place resumption of the currently running real FMA process after an actual external kill/restart

authored 2026-06-02 16:16:23 +0800
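
The "partial-checkpoint resume vs fresh rebuild equality" check named under Tested can be demonstrated on a toy index builder. Everything here is illustrative: `embed` is a deterministic stand-in for the real embedder, the JSON checkpoint format is assumed, and `stop_after` only simulates an interruption.

```python
import json
import tempfile
from pathlib import Path


def embed(ref_id):
    """Deterministic toy vector standing in for a real embedding."""
    return [len(ref_id), sum(map(ord, ref_id)) % 97]


def build_index(ref_ids, checkpoint, stop_after=None):
    """Embed references, saving after each one so an interrupted run can resume."""
    path = Path(checkpoint)
    done = json.loads(path.read_text()) if path.exists() else {}
    pending = [r for r in ref_ids if r not in done]
    for n, ref_id in enumerate(pending):
        if stop_after is not None and n >= stop_after:
            break  # simulated interruption
        done[ref_id] = embed(ref_id)
        path.write_text(json.dumps(done))
    return done


refs = ["a", "bb", "ccc", "dddd"]
with tempfile.TemporaryDirectory() as tmp:
    partial = build_index(refs, Path(tmp) / "resume.json", stop_after=2)
    resumed = build_index(refs, Path(tmp) / "resume.json")
    fresh = build_index(refs, Path(tmp) / "fresh.json")

print(len(partial), resumed == fresh)
```

Writing the checkpoint after every item keeps the sketch short; a real long-running build would more likely flush every N items to limit I/O.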

Reduce silent-query noise in training and open-dataset preparation · 90e252b8 ...

Constraint: Real music queries often include long silence heads/tails, but the pipeline still needs random-crop generalization and simple CLI controls
Rejected: Replace all random crops with structure-aware segmentation | would overfit to curated boundaries and diverge from messy real-world query distributions
Confidence: high
Scope-risk: moderate
Directive: Keep random as fallback; layer beat/onset/chorus-aware segmentation on top instead of removing silence-aware and sliding paths
Tested: /usr/local/miniconda3/bin/python -m py_compile acr-engine/src/data/dataset.py acr-engine/src/data/manifest_tools.py acr-engine/train.py acr-engine/src/data/external_adapters.py; external_adapters.py prepare-local fma /tmp/segtest_audio --query-strategy silence_aware; train.py --data data/synthetic_v2 --dry-run --segment-strategy hybrid
Not-tested: Full FMA smoke retraining/eval with the new segmentation strategies

authored 2026-06-02 16:09:00 +0800
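
Keeping random as the fallback while avoiding silent heads and tails could work roughly as follows. This is a sketch assuming a simple amplitude threshold defines silence; the threshold value, the sample-index units, and `silence_aware_offset` are hypothetical and not taken from the repository.

```python
import random


def silence_aware_offset(samples, crop, threshold, rng):
    """Pick a crop start inside the non-silent span, else a plain random start."""
    latest = max(0, len(samples) - crop)
    loud = [i for i, x in enumerate(samples) if abs(x) >= threshold]
    if not loud:
        return rng.randint(0, latest), "random"  # nothing audible: fall back
    first, last = loud[0], loud[-1]
    low = min(first, latest)
    # Keep the whole crop inside the audible span when the span is long enough.
    high = min(latest, max(first, last - crop + 1))
    return rng.randint(low, high), "silence_aware"


rng = random.Random(0)
track = [0.0] * 30 + [0.5] * 40 + [0.0] * 30  # silent head and tail
start, source = silence_aware_offset(track, crop=10, threshold=0.1, rng=rng)
silent_start, silent_source = silence_aware_offset([0.0] * 50, 10, 0.1, rng)
print(start, source, silent_start, silent_source)
```

Sampling uniformly within the audible span, instead of always starting at its first sample, retains some of the random-crop variety the Rejected line wants to preserve.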

Preserve internal query window semantics for trainable asset exports · d61ee980 ...

Constraint: Internal assets must support both manually labeled clips and whole-track auto-window generation without breaking pgvector export
Rejected: Treat missing query duration as full audio duration | prevents multi-window query expansion for long source audio
Confidence: high
Scope-risk: narrow
Directive: Keep explicit CSV offset authoritative; only auto-expand when offset is absent and query_stride is set
Tested: /usr/local/miniconda3/bin/python -m py_compile acr-engine/scripts/internal_asset_type_mapper.py; local 30s/40s WAV fixture export with manifest + pgvector verification
Not-tested: End-to-end retraining with newly expanded internal manifests

authored 2026-06-02 15:53:57 +0800
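
The rule in the Directive (an explicit CSV offset wins; windows are auto-expanded only when the offset is absent and query_stride is set) can be written as a small function. This is a sketch under assumptions: `query_windows`, the requirement that a query duration is always supplied, and the single window at 0.0 when neither offset nor stride is given are hypothetical choices.

```python
def query_windows(audio_duration, query_duration, offset=None, query_stride=None):
    """Return a list of (offset, duration) windows for one source row."""
    if offset is not None:
        return [(offset, query_duration)]  # explicit CSV offset is authoritative
    if query_stride is None:
        return [(0.0, query_duration)]  # assumed default: one window at the start
    # Count the windows that fit entirely inside the audio.
    count = int((audio_duration - query_duration) // query_stride) + 1
    # Always emit at least one window; it may overrun audio shorter than the query.
    return [(i * query_stride, query_duration) for i in range(max(count, 1))]


expanded = query_windows(30.0, 10.0, query_stride=10.0)
labeled = query_windows(40.0, 10.0, offset=12.5, query_stride=10.0)
print(expanded)
print(labeled)
```

Because the window length stays fixed instead of defaulting to the full audio duration, a long source can yield several query windows, which is the multi-window expansion the Rejected line says would otherwise be lost.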

Fill internal query timing semantics before training on imported clips · 3e13c578 ...

Constraint: Internal short-video and demo assets need explicit duration/offset semantics before they can behave like real training or pgvector segment records
Rejected: Leave query offsets empty by default | Produces weaker provenance and less useful downstream segment metadata
Confidence: high
Scope-risk: narrow
Directive: Prefer source CSV timing when available, then fall back to inspected audio duration and conservative default offsets
Tested: Sample CSV run confirmed one query used CSV duration/offset (5.0/12.5) and another fell back to inspected duration/default offset (6.5/0.0), with pgvector segments matching
Not-tested: Complex multi-segment offset generation from long-form internal masters

authored 2026-06-02 15:45:28 +0800
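
The fallback order in the Directive (CSV timing first, then inspected audio duration and a default offset) is easy to show on the two cases quoted under Tested. A minimal sketch, assuming CSV columns named `duration` and `offset` holding strings; those column names and `resolve_timing` are hypothetical.

```python
def resolve_timing(row, inspected_duration, default_offset=0.0):
    """Prefer CSV timing; otherwise use inspected duration and a default offset."""
    duration = row.get("duration")
    offset = row.get("offset")
    return {
        "duration": float(duration) if duration not in (None, "") else inspected_duration,
        "offset": float(offset) if offset not in (None, "") else default_offset,
    }


from_csv = resolve_timing({"duration": "5.0", "offset": "12.5"}, inspected_duration=30.0)
from_audio = resolve_timing({"duration": "", "offset": ""}, inspected_duration=6.5)
print(from_csv)
print(from_audio)
```

The two fields fall back independently, so a row with a CSV duration but no offset would keep its duration and receive the default offset.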

Connect internal asset exports to pgvector preparation early · 58041e10 ...

Constraint: Internal CSV ingestion should reach a pgvector-ready payload without requiring a second custom export path
Rejected: Limit the mapper to manifest outputs only | Forces another transformation layer before database loading
Confidence: high
Scope-risk: narrow
Directive: Keep pgvector payloads aligned with the shared songs/references/segments contract while preserving internal asset metadata fields
Tested: internal_asset_type_mapper.py with --emit-pgvector-json produced songs=2 references=2 segments=2 and included audio_role/asset_type_code/validation_status in sample rows
Not-tested: Direct bulk load into PostgreSQL using a live pgvector database

authored 2026-06-02 15:41:42 +0800

Validate internal audio assets before manifest-scale training · 5334df1f ...

Constraint: Internal CSV exports should expose missing audio and usable durations before they are treated as train-ready manifests
Rejected: Defer path and duration checks to later training failures | Would make ingestion debugging slow and noisy
Confidence: high
Scope-risk: narrow
Directive: Keep internal asset validation lightweight at mapping time; surface existence and duration early, then layer richer QC rules incrementally
Tested: internal_asset_type_mapper.py with --audio-root on a 6-row sample detected missing_audio=2 and emitted durations for existing reference/query assets
Not-tested: Production-scale scans over the full internal asset repository

authored 2026-06-02 15:38:16 +0800

Bridge internal CSV exports into manifest bundles before ingestion at scale · f048e400 ...

Constraint: Internal asset exports should reach train/test-ready manifests without repeated manual reshaping
Rejected: Stop at references/queries JSON only | Still leaves each import needing custom bundle assembly and split logic
Confidence: high
Scope-risk: narrow
Directive: Keep internal manifest emission conservative and deterministic; preserve train/test query presence even on tiny exports
Tested: internal_asset_type_mapper.py sample run with --emit-manifests produced catalog/train/test/val and placed one query in each of train and test
Not-tested: Duration/offset enrichment from live source metadata and audio-path existence checks on production exports
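
One way to keep a split deterministic while guaranteeing that both train and test receive a query on tiny exports is sketched below. The function name `split_queries`, the `test_fraction` parameter, and the hash-based ordering are all assumptions for illustration; the log does not describe how the mapper actually splits.

```python
import hashlib
from typing import Dict, Iterable, List


def split_queries(query_ids: Iterable[str], test_fraction: float = 0.2) -> Dict[str, List[str]]:
    """Split query ids into train/test, independent of input order."""
    # Order by a stable hash so the split does not depend on CSV row order.
    ordered = sorted(query_ids, key=lambda q: hashlib.sha256(q.encode("utf-8")).hexdigest())
    if len(ordered) < 2:
        # A single query cannot be present in both splits.
        return {"train": ordered, "test": []}
    # Clamp so that both splits always receive at least one query.
    n_test = max(1, min(len(ordered) - 1, round(len(ordered) * test_fraction)))
    return {"train": ordered[n_test:], "test": ordered[:n_test]}
```

With two queries the clamp yields one query per split, which is consistent with the outcome reported in the Tested line.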

authored 2026-06-02 15:34:29 +0800

Make internal asset policies executable before DB-scale import · 728ef117 ...

Constraint: Internal type enums need a repeatable mapping path into manifest-ready buckets before bulk database exports begin
Rejected: Leave type handling as documentation only | Would force repeated manual filtering and inconsistent ingestion decisions
Confidence: high
Scope-risk: narrow
Directive: Keep internal asset mapping defaults conservative; conditional instrumental variants should stay opt-in until version-aware training is ready
Tested: internal_asset_type_mapper.py on a 6-row sample CSV produced references=2 queries=2 metadata_only=1 excluded=1 with expected type routing
Not-tested: Direct SQL export integration against the live source database
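
The routing behavior described above (four buckets, with instrumental variants opt-in) can be illustrated with a small sketch. The type codes, the `DEFAULT_ROUTING` table, and the `include_instrumental` flag are hypothetical; the real internal enum and mapper options are not shown in this log.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical type codes; the real internal enum is not shown in this log.
DEFAULT_ROUTING: Dict[str, str] = {
    "original": "references",
    "short_video_clip": "queries",
    "lyrics": "metadata_only",
}


def route_asset(asset_type: str, include_instrumental: bool = False) -> str:
    """Map an asset type to a manifest bucket, excluding anything unrecognized."""
    if asset_type == "instrumental":
        # Conservative default: instrumental variants stay out unless opted in.
        return "references" if include_instrumental else "excluded"
    return DEFAULT_ROUTING.get(asset_type, "excluded")


def bucket_counts(asset_types: Iterable[str], include_instrumental: bool = False) -> Counter:
    """Count how many assets land in each bucket."""
    return Counter(route_asset(t, include_instrumental) for t in asset_types)
```

Defaulting unknown types to `excluded` keeps a new upstream enum value from silently entering training data.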

authored 2026-06-02 15:30:22 +0800

Document asset-type training policy before bulk internal ingestion · bf098870 ...

Constraint: Internal media types need a clear training whitelist and versioning policy before they are mapped into manifests and pgvector
Rejected: Treat all audio-like assets as the same training label source | Would blur original-vs-instrumental semantics and degrade retrieval quality
Confidence: high
Scope-risk: narrow
Directive: Keep original recordings, instrumental variants, and short-video clips explicitly separated by audio_role and version semantics during ingestion
Tested: Verified new documentation anchors and mapping tables in training-data-and-pgvector-guide.md
Not-tested: Automated import from the upstream SQL type enum into manifests

authored 2026-06-02 15:26:19 +0800

Expand external dataset coverage before harder real-data training · a68a7296 ...

Constraint: Open-dataset ingestion needs a way to generate multiple overlapping queries per track, otherwise training/eval coverage stays too sparse
Rejected: Keep only one random external query per track | Leaves long songs underrepresented and weakens reproducibility
Confidence: high
Scope-risk: moderate
Directive: Preserve single-query behavior as the default, but keep overlap-query generation configurable through query_stride for future corpora
Tested: manifest_tools audio-dir-to-splits --help shows --query-stride; prepare-local on data/synthetic_v2/songs with query_duration=8.0 and query_stride=4.0 produced 72 queries with query_index fields
Not-tested: Full end-to-end smoke-local completion on the still-running real FMA corpus with overlap-query mode enabled
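
Overlap-query generation with a stride can be sketched as a sliding window over each track. The function name `query_windows` and the returned keys other than `query_index` are hypothetical, and the single-query branch uses offset 0.0 as a placeholder; the actual default may choose the offset differently.

```python
from typing import Dict, List, Optional


def query_windows(
    track_duration: float,
    query_duration: float,
    query_stride: Optional[float] = None,
) -> List[Dict[str, float]]:
    """Return query windows that fit entirely inside the track."""
    if track_duration < query_duration:
        return []
    if query_stride is None:
        # Single-query default; offset 0.0 is a placeholder assumption.
        return [{"query_index": 0, "offset": 0.0, "duration": query_duration}]
    if query_stride <= 0:
        raise ValueError("query_stride must be positive")
    # Count windows up front to avoid accumulating floating-point error.
    count = int((track_duration - query_duration) // query_stride) + 1
    return [
        {"query_index": i, "offset": i * query_stride, "duration": query_duration}
        for i in range(count)
    ]


for window in query_windows(20.0, query_duration=8.0, query_stride=4.0):
    print(window)
```

With a stride smaller than the query duration, consecutive windows overlap, so long tracks contribute proportionally more queries than under the single-query default.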

authored 2026-06-02 15:21:48 +0800

Make smoke metadata explicit before more real-data comparisons · d7df0087 ...

Constraint: Real-data smoke reports must distinguish manifest query duration from training segment duration to avoid 5s-vs-8s confusion across runs
Rejected: Keep a single ambiguous query_duration field | Makes cross-run analysis and handoff error-prone
Confidence: high
Scope-risk: narrow
Directive: Preserve explicit duration semantics in future smoke/report artifacts and keep legacy aliases only for compatibility
Tested: build_smoke_config_summary() emits manifest_query_duration=8.0 and train_segment_duration=5.0 using configs/default.yaml
Not-tested: End-to-end regeneration of the still-running real FMA smoke report bundle with the new config schema
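
The explicit duration semantics can be illustrated with the sketch below. It is not the repository's `build_smoke_config_summary()`: the function name `summarize_durations` and the nested config layout are hypothetical, and mapping the legacy `query_duration` alias to the manifest value is an assumption.

```python
from typing import Any, Dict


def summarize_durations(config: Dict[str, Any]) -> Dict[str, float]:
    """Emit separately named duration fields so runs cannot be confused."""
    # Hypothetical config layout; the real configs/default.yaml may differ.
    manifest_query = float(config["manifest"]["query_duration"])
    train_segment = float(config["train"]["segment_duration"])
    return {
        "manifest_query_duration": manifest_query,
        "train_segment_duration": train_segment,
        # Legacy alias kept for compatibility only; its target is an assumption.
        "query_duration": manifest_query,
    }


summary = summarize_durations(
    {"manifest": {"query_duration": 8.0}, "train": {"segment_duration": 5.0}}
)
print(summary)
```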

authored 2026-06-02 15:14:22 +0800

Preserve repo continuity before the next session handoff · 05a2ccca ...

Constraint: Future sessions need startup memory for user preferences, real-data status, and the current FMA bottleneck without re-discovery
Rejected: Leave continuity only in transient chat context | Would force every new session to reconstruct state from scratch
Confidence: high
Scope-risk: narrow
Directive: Keep AGENTS continuity memory concise, code-true, and refreshed when project direction or bottlenecks materially change
Tested: AGENTS.md anchor search for continuity keys; verified host CUDA snapshot; verified build-index progress logs on small smoke artifacts
Not-tested: Full completion of the long-running real FMA CPU build-index stage

authored 2026-06-02 15:11:13 +0800