| Name | Last Commit | Last Update |
|---|---|---|
| .. | | |
| artifact-manifest.json | | |
| benchmark-report.md | | |
| eval.json | | |
| model-card.md | | |
| release-checklist.md | | |
Last Commit: 81704ace

capture the first real-path index-to-evaluate closure

Constraint: Delivery state must reflect fresh evaluate evidence without staging temporary eval assets
Rejected: Wait for larger-scale or hard-case metrics | The first explicit evaluate closure is already a meaningful milestone and restart-safe handoff point
Confidence: high
Scope-risk: narrow
Directive: Reuse /tmp/fma_realpath_small_rerun_index2 and /tmp/fma_realpath_small_rerun_eval as the next validation baseline before scaling up
Tested: Verified eval_top50.json at num_queries 35 with top1 0.8571 and topk 1.0, confirmed query-count explanation, and updated handoff/changelog docs
Not-tested: Larger query caps, hard-case buckets, and full-scale FMA evaluate runs
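The commit's Tested line reports top1 0.8571 and topk 1.0 at num_queries 35. The sketch below shows one common way such retrieval metrics are computed. It is not the repository's evaluation code: the schema of eval_top50.json is not shown here, the function name `top_k_accuracy` is hypothetical, and the query data is synthetic. The only figure taken from the commit is the arithmetic: 30 hits out of 35 queries is the only integer count that rounds to 0.8571.

```python
from typing import Sequence


def top_k_accuracy(
    ranked: Sequence[Sequence[str]], truth: Sequence[str], k: int
) -> float:
    """Fraction of queries whose true ID appears in the first k ranked results."""
    if len(ranked) != len(truth):
        raise ValueError("ranked and truth must have the same length")
    if not truth:
        raise ValueError("at least one query is required")
    hits = sum(1 for results, target in zip(ranked, truth) if target in results[:k])
    return hits / len(truth)


# Synthetic stand-in for 35 queries (NOT the real evaluation data):
# the first 30 queries rank the true ID first, the last 5 rank it second.
truth = [f"track_{i}" for i in range(35)]
ranked = [
    [t, "distractor"] if i < 30 else ["distractor", t]
    for i, t in enumerate(truth)
]

top1 = top_k_accuracy(ranked, truth, k=1)
top50 = top_k_accuracy(ranked, truth, k=50)
print(f"num_queries={len(truth)} top1={top1:.4f} topk={top50:.1f}")
```

Under this definition, a topk of 1.0 means every query's true match appeared somewhere in its returned list, while top1 counts only first-place matches.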