SKILL.md 39.7 KB

Raw Blame History Permalink


name: deep-interview
description: "[OMX] Socratic deep interview with mathematical ambiguity gating before execution"
argument-hint: "[--quick|--standard|--deep] [--autoresearch] <idea or vague description>"


Deep Interview is an intent-first Socratic clarification loop before planning or implementation. It turns vague ideas into execution-ready specifications by asking targeted questions about why the user wants a change, how far it should go, what should stay out of scope, and what OMX may decide without confirmation.


The request is broad, ambiguous, or missing concrete acceptance criteria
The user says "deep interview", "interview me", "ask me everything", "don't assume", or "ouroboros"
The user wants to avoid misaligned implementation from underspecified requirements
You need a requirements artifact before handing off to ralplan, autopilot, ralph, or team


The request already has concrete file/symbol targets and clear acceptance criteria
The user explicitly asks to skip planning/interview and execute immediately
The user asks for lightweight brainstorming only (use plan instead)
A complete PRD/plan already exists and execution should start


Execution quality is usually bottlenecked by intent clarity, not just missing implementation detail. A single expansion pass often misses why the user wants a change, where the scope should stop, which tradeoffs are unacceptable, and which decisions still require user approval. This workflow applies Socratic pressure + quantitative ambiguity scoring so orchestration modes begin with an explicit, testable, intent-aligned spec.


Quick (--quick): fast pre-PRD pass; target threshold <= 0.30; max rounds 5

Standard (--standard, default): full requirement interview; target threshold <= 0.20; max rounds 12

Deep (--deep): high-rigor exploration; target threshold <= 0.15; max rounds 20

Autoresearch (--autoresearch): same interview rigor as Standard, but specialized for $autoresearch mission readiness and .omx/specs/ artifact handoff


Profile max rounds is a hard cap, not a target. Do not continue only to reach a numbered round count. Extra Socratic rigor does not override the active threshold unless the profile/config changes.

If no flag is provided, use Standard.


--autoresearch: switch the interview into autoresearch-intake mode for $autoresearch handoff. In this mode, the interview should converge on a validator-ready research mission, write canonical artifacts under .omx/specs/, and preserve the explicit refine further vs launch boundary for downstream skill intake.


Ask ONE question per round (never batch multiple interview rounds into one questions[] form)
Ask about intent and boundaries before implementation detail
Target the weakest clarity dimension each round after applying the stage-priority rules below
Treat every answer as a claim to pressure-test before moving on: the next question should usually demand evidence or examples, expose a hidden assumption, force a tradeoff or boundary, or reframe root cause vs symptom
Do not rotate to a new clarity dimension just for coverage when the current answer is still vague; stay on the same thread until one layer deeper, one assumption clearer, or one boundary tighter
Before crystallizing, complete at least one explicit pressure pass that revisits an earlier answer with a deeper, assumption-focused, or tradeoff-focused follow-up
Gather codebase facts via explore before asking user about internals

omx explore is deprecated. Use normal repository inspection tools/subagents for simple read-only brownfield fact gathering; use omx sparkshell only for explicit shell-native read-only evidence, and keep ambiguous or non-shell-only investigation on the richer normal path.
Always run a preflight context intake before the first interview question
For brownfield work, preflight must include doc/context grounding before user-facing questions: inspect applicable AGENTS.md files, README/getting-started docs, relevant docs/ contracts/plans/ADRs, existing .omx/context/ snapshots, and any project-local glossary/context files such as CONTEXT.md or CONTEXT-MAP.md when present.
Treat existing repo language as evidence, not authority: if the user uses a fuzzy, overloaded, or conflicting term, surface the specific doc/code wording and ask which meaning should govern before implementation.
Cross-check user claims about current behavior against code or documented contracts when discoverable. If docs and code disagree, ask a confirmation question that names both sources instead of silently choosing one.
Use scenario-based edge-case grilling when relationships, boundaries, or handoff behavior are unclear: invent one concrete scenario that stresses the ambiguous boundary, then ask one focused question about the expected outcome.
Durable docs, glossary, ADR, or memory updates are opt-in and public-safe only. Deep-interview may recommend such updates in the handoff summary, but must not automatically create or dump public docs from interview transcripts unless the user explicitly chooses that as in-scope.
If initial context is oversized or would exceed the prompt budget, do not paste or forward the raw payload into interview prompts; request and record a prompt-safe initial-context summary first
The oversized initial-context summary gate is blocking: wait for the concise summary before ambiguity scoring, crystallizing artifacts, or any downstream execution handoff
The summary must preserve goals, constraints, success criteria, non-goals, decision boundaries, and references to any full source documents so downstream consumers receive a prompt-safe but faithful context
Keep total prompt payloads within a safe budget by summarizing or trimming retained history; preserve newest/highest-signal answers and never let raw oversized context crowd out the current question
Reduce user effort: ask only the highest-leverage unresolved question, and never ask the user for codebase facts that can be discovered directly
For brownfield work, prefer evidence-backed confirmation questions such as "I found X in Y. Should this change follow that pattern?"
Route facts before judgment in the Ouroboros style: before presenting a user-facing interview round, classify whether the needed information is a discoverable fact, a fact needing confirmation, or a human decision. The interview is with the human for judgment, not for facts the agent can inspect.
When unresolved ambiguity depends on current external best practices, official/upstream guidance, standards, or version-aware behavior, use $best-practice-research as the bounded evidence wrapper before crystallizing requirements or handing off to planning/execution.
Use these transcript/spec labels only; never use them as omx question source values, and never replace the runtime source: "deep-interview" contract for user-facing deep-interview questions:


[from-code][auto-confirmed] — exact, high-confidence codebase facts from manifests/configs or direct source evidence, with no prescription attached.

[from-code] — codebase findings that are useful but inferred, pattern-based, or low/medium confidence and therefore need a confirmation-style user-facing round before being treated as settled.

[from-research] — externally sourced facts such as API limits, compatibility, or public documentation; facts only, not decisions.

[from-user] — goals, preferences, business logic, scope, non-goals, acceptance criteria, tradeoffs, and any decision-bearing interpretation.


Treat [from-code][auto-confirmed] and other non-user fact discoveries as context/transcript updates, not interview rounds: do not call omx question, do not create a pending deep-interview question obligation, and do not increment the user-facing round number for facts the agent can safely establish.
Auto-confirm only descriptive facts. If a finding implies what the new feature should do, which pattern it should follow, which tradeoff to accept, or what should stay in/out of scope, route the entire decision-bearing question to the user as [from-user] even when code or research facts are available.
In attached-tmux Codex CLI, deep-interview uses omx question as the required OMX-owned structured questioning path for every interview round
When invoking omx question through attached-tmux Bash/tool paths, preserve the leader-pane return target by prefixing the command with OMX_QUESTION_RETURN_PANE=$TMUX_PANE (or a concrete %pane value)
If you launch omx question in a background terminal, immediately wait for that background terminal to finish and read its JSON answer before scoring ambiguity, asking another round, or handing off
Treat answers[] as the primary omx question success contract. For a single interview round, read answers[0].answer; use legacy top-level answer only as a compatibility fallback when needed.
If the current runtime is outside tmux and cannot render omx question, use the native structured question tool when available; otherwise ask exactly one concise plain-text question and wait for the answer
Re-score ambiguity after each answer and show progress transparently
Once ambiguity is at or below the active profile threshold, stop ordinary questioning. Run the practical closure audit: crystallize/handoff when readiness gates pass; otherwise ask only the final closure question needed to satisfy a named gate.
Treat max_rounds as a stop cap, not evidence that more rounds are needed.
Do not hand off to execution while ambiguity remains above threshold unless user explicitly opts to proceed with warning
Do not crystallize or hand off while Non-goals or Decision Boundaries remain unresolved, even if the weighted ambiguity threshold is met
Treat early exit as a safety valve, not the default success path
Persist mode state for resume safety with CLI-first state commands (omx state write/read --input '<json>' --json); use state_write / state_read only when explicit MCP compatibility is enabled


Phase 0: Preflight Context Intake


Parse {{ARGUMENTS}} and derive a short task slug.
Attempt to load the latest relevant context snapshot from .omx/context/{slug}-*.md.
Check whether the provided initial context or loaded snapshot is too large for safe prompt use. If it is oversized, the first interview round must ask for a concise prompt-safe summary instead of scoring ambiguity or continuing to downstream handoff.
If no snapshot exists, create a minimum context snapshot with:


Task statement
Desired outcome
Stated solution (what the user asked for)
Probable intent hypothesis (why they likely want it)
Known facts/evidence
Constraints
Unknowns/open questions
Decision-boundary unknowns
Likely codebase touchpoints
Relevant repo docs/rules/context inspected
Terminology or doc/code conflicts found
Prompt-safe initial-context summary status (not_needed, needed, or recorded)


For brownfield tasks, inspect the applicable documentation/rule surface before the first user-facing round. Prefer exact, nearby sources over broad scans:


governing AGENTS.md files and template/runtime instruction surfaces that apply to the touched paths
README/getting-started docs and relevant docs under docs/, especially contracts, plans, ADR-like records, and workflow docs
existing .omx/context/ snapshots, .omx/specs/, and planning artifacts relevant to the slug
project-local glossary/context files such as CONTEXT.md, CONTEXT-MAP.md, or context-specific docs when they exist


Save snapshot to .omx/context/{slug}-{timestamp}.md (UTC YYYYMMDDTHHMMSSZ) and reference it in mode state.


Phase 1: Initialize


Parse {{ARGUMENTS}} and depth profile (--quick|--standard|--deep).
Detect project context:


Run explore to classify brownfield (existing codebase target) vs greenfield.
For brownfield, collect relevant codebase context before questioning.


Initialize state via omx state write --input '{"mode":"deep-interview","active":true}' --json:


{
  "active": true,
  "current_phase": "deep-interview",
  "state": {
    "interview_id": "<uuid>",
    "profile": "quick|standard|deep",
    "type": "greenfield|brownfield",
    "initial_idea": "<user input>",
    "rounds": [],
    "current_ambiguity": 1.0,
    "threshold": 0.3,
    "max_rounds": 5,
    "challenge_modes_used": [],
    "codebase_context": null,
    "current_stage": "intent-first",
    "current_focus": "intent",
    "context_snapshot_path": ".omx/context/<slug>-<timestamp>.md"
  }
}


Announce kickoff with profile, threshold, and current ambiguity.


Phase 2: Socratic Interview Loop

Repeat until ambiguity <= threshold, the pressure pass is complete, the readiness gates are explicit, the user exits with warning, or max rounds are reached. This is a stop condition: below threshold, do not open a new ordinary interview branch.


2a) Generate next question

If the initial context is oversized and no prompt-safe summary has been recorded yet, the next question must be only a summary request. Do not score ambiguity, do not run readiness gates, and do not hand off to $ultragoal, $ralplan, $autopilot, $ralph, or $team until that summary answer is captured.

Use:


Original idea
Prior Q&A rounds
Current dimension scores
Brownfield context (if any)
Doc/context grounding notes, including existing terminology, governing rules, and any doc/code mismatch
Activated challenge mode injection (Phase 3)


Target the lowest-scoring dimension, but respect stage priority:


Stage 1 — Intent-first: Intent, Outcome, Scope, Non-goals, Decision Boundaries

Stage 2 — Feasibility: Constraints, Success Criteria

Stage 3 — Brownfield grounding: Context Clarity (brownfield only)


Follow-up pressure ladder after each answer:


Ask for a concrete example, counterexample, or evidence signal behind the latest claim
Probe the hidden assumption, dependency, or belief that makes the claim true
Force a boundary or tradeoff: what would you explicitly not do, defer, or reject?
Challenge fuzzy or conflicting terms against the repo's documented language and current code behavior
Stress-test the boundary with one concrete scenario or edge case when a relationship or handoff remains ambiguous
If the answer still describes symptoms, reframe toward essence / root cause before moving on


Prefer staying on the same thread for multiple rounds when it has the highest leverage. Breadth without pressure is not progress.

Maintain a Breadth Ledger across independent ambiguity tracks: scope, constraints, outputs, verification, brownfield integration, and any user-mentioned deliverable tracks. The ledger is a guard, not a mandatory rotation rule: stay deep on the current thread until it has been pressure-tested, then zoom out only when another material track remains unresolved and would change execution.

Maintain a Docs/Terminology Ledger for brownfield interviews:


repo docs/rules/context sources inspected, with path references
canonical terms already used by the repo and terms to avoid or disambiguate
user terms that conflict with docs or current code behavior
doc/code mismatches that require a human decision before implementation
optional durable-doc follow-ups that are safe to propose but not auto-apply


Detailed dimensions:


Intent Clarity — why the user wants this
Outcome Clarity — what end state they want
Scope Clarity — how far the change should go
Constraint Clarity — technical or business limits that must hold
Success Criteria Clarity — how completion will be judged
Context Clarity — existing codebase understanding (brownfield only)


Non-goals and Decision Boundaries are mandatory readiness gates. Ask about them early and keep revisiting them until they are explicit.


2b) Ask the question

Use the surface-appropriate structured questioning path for every interview round. In attached-tmux sessions, use OMX-owned structured questioning via omx question (this is the required structured-question equivalent and required AskUserQuestion equivalent for deep-interview). Outside tmux, use native structured input when available; otherwise ask exactly one concise plain-text question and wait for the answer. Present:

Round {n} | Target: {weakest_dimension} | Ambiguity: {score}%

{question}


omx question payload guidance for interview rounds:


Deep-interview is Socratic: ask one focused round at a time. Do not use batch questions[] to combine multiple interview rounds, even though omx question supports batch forms for other workflows.
Use canonical type values instead of authoring raw multi_select flags by hand. type: "single-answerable" is the default for one-path decisions; type: "multi-answerable" is the canonical shape for bounded multi-select rounds. The runtime will keep multi_select aligned with type.
Use single-answerable when exactly one answer should drive the next branch, the options are mutually exclusive, or selecting more than one answer would blur the decision boundary. Typical cases: handoff lane selection, choosing the primary failure mode, or confirming which of several competing interpretations is correct.
Use multi-answerable when multiple options may all be true at once and you need to capture a bounded set of coexisting constraints, non-goals, risks, or acceptance checks in one round. Typical cases: selecting all out-of-scope items, all success metrics that must hold, or all deployment constraints that apply together.
If one selected option would immediately require a follow-up question to disambiguate the others, prefer a single-answerable round now and ask the follow-up next. Do not hide a branching interview tree inside one overloaded multi-select prompt.
Keep interview options bounded and concrete. If the valid answers are already known, set allow_other: false; only leave allow_other: true when the interview genuinely needs one user-supplied option that cannot be enumerated in advance.
Read answers structurally from the primary answers[] array. For a normal single-round interview response, use answers[0].answer as the source of truth; the top-level answer field is a legacy single-question projection/fallback only.
For single-answerable, expect one decisive selection in the value field of answers[0].answer plus its selected-values metadata. For multi-answerable, treat the selected-values field inside answers[0].answer as the source of truth for all chosen constraints/non-goals and preserve the full set in the transcript/spec. In legacy single-question projections, this is equivalent to: For multi-answerable, treat answer.selected_values as the source of truth.


Canonical bounded single-choice payload:

{
  "question": "Which execution lane should own this once the interview is complete?",
  "type": "single-answerable",
  "options": [
    {
      "label": "Plan first",
      "value": "ralplan",
      "description": "Need architecture and test-shape review before execution"
    },
    {
      "label": "Execute directly",
      "value": "autopilot",
      "description": "Requirements are already explicit enough for planning plus execution"
    },
    {
      "label": "Refine further",
      "value": "refine",
      "description": "Clarification is still needed before any handoff"
    }
  ],
  "allow_other": false,
  "other_label": "Other",
  "source": "deep-interview"
}


Canonical bounded multi-select payload:

{
  "question": "Which non-goals must stay out of scope for the first pass?",
  "type": "multi-answerable",
  "options": [
    {
      "label": "No UI redesign",
      "value": "no-ui-redesign",
      "description": "Keep layout and styling unchanged"
    },
    {
      "label": "No new dependencies",
      "value": "no-new-dependencies",
      "description": "Work within the existing toolchain"
    },
    {
      "label": "No API contract changes",
      "value": "no-api-contract-changes",
      "description": "Preserve external request and response shapes"
    }
  ],
  "allow_other": false,
  "other_label": "Other",
  "source": "deep-interview"
}


Canonical answer-shape reminders:

{
  "answer": {
    "kind": "option",
    "value": "ralplan",
    "selected_labels": ["Plan first"],
    "selected_values": ["ralplan"]
  }
}


{
  "answer": {
    "kind": "multi",
    "value": ["no-new-dependencies", "no-api-contract-changes"],
    "selected_labels": ["No new dependencies", "No API contract changes"],
    "selected_values": ["no-new-dependencies", "no-api-contract-changes"]
  }
}


2c) Score ambiguity

Score each weighted dimension in [0.0, 1.0] with justification + gap.

Greenfield: ambiguity = 1 - (intent × 0.30 + outcome × 0.25 + scope × 0.20 + constraints × 0.15 + success × 0.10)

Brownfield: ambiguity = 1 - (intent × 0.25 + outcome × 0.20 + scope × 0.20 + constraints × 0.15 + success × 0.10 + context × 0.10)

Readiness gate:


Non-goals must be explicit

Decision Boundaries must be explicit
A pressure pass must be complete: at least one earlier answer has been revisited with an evidence, assumption, or tradeoff follow-up
A practical closure audit must pass: another question would change execution materially, not merely polish wording or chase a narrow edge case
If either gate is unresolved, or the pressure pass is incomplete, continue below threshold only with a final closure question that names the unresolved gate and would materially change execution.
Treat a low ambiguity score as permission to audit closure, not permission to keep drilling indefinitely. If remaining uncertainty would not change implementation, crystallize the spec instead of opening a new branch.
If ambiguity is <= 0.10, another user-facing question is allowed only as that final closure question; otherwise crystallize immediately.


2d) Report progress

Show weighted breakdown table, readiness-gate status (Non-goals, Decision Boundaries), and the next focus dimension.


2e) Persist state

Append round result and updated scores via omx state write --input '<json>' --json; use state_write only when explicit MCP compatibility is enabled.


2f) Round controls


Do not offer early exit before the first explicit assumption probe and one persistent follow-up have happened
Apply a Dialectic Rhythm Guard: track consecutive non-user fact discoveries and confirmation-style answers ([from-code][auto-confirmed], [from-code], or [from-research]). After 3 consecutive non-user or confirmation answers, the next material user-facing round must solicit direct human judgment ([from-user]) unless the closure audit says the interview is ready to crystallize.
Round 4+: allow explicit early exit with risk warning
Soft warning at profile midpoint (e.g., round 3/6/10 depending on profile)
Hard cap at profile max_rounds; never treat this cap as a desired interview length or quota


Phase 3: Challenge Modes (assumption stress tests)

Use each mode once when applicable. These are normal escalation tools, not rare rescue moves:


Contrarian (round 2+ or immediately when an answer rests on an untested assumption): challenge core assumptions

Terminologist (brownfield, whenever a key term is fuzzy, overloaded, or conflicts with repo docs/code): force a canonical meaning against existing project language before implementation

Simplifier (round 4+ or when scope expands faster than outcome clarity): probe minimal viable scope

Ontologist (round 5+ and ambiguity > 0.25, or when the user keeps describing symptoms): ask for essence-level reframing


Track used modes in state to prevent repetition.


Phase 4: Crystallize Artifacts

When threshold is met (or user exits with warning / hard cap):


Write interview transcript summary to:


.omx/interviews/{slug}-{timestamp}.md

(kept for ralph PRD compatibility)


Write execution-ready spec to:


.omx/specs/deep-interview-{slug}.md


Spec should include:


Metadata (profile, rounds, final ambiguity, threshold, context type)
Context snapshot reference/path (for ralplan/team reuse)
Prompt-safe initial-context summary when oversized context was provided, plus references to any full source documents
Clarity breakdown table
Intent (why the user wants this)
Desired Outcome
In-Scope
Out-of-Scope / Non-goals
Decision Boundaries (what OMX may decide without confirmation)
Constraints
Testable acceptance criteria
Assumptions exposed + resolutions
Pressure-pass findings (which answer was revisited, and what changed)
Brownfield evidence vs inference notes for any repository-grounded confirmation questions
Docs/Terminology Ledger with inspected repo docs/rules/context, term conflicts, and any doc/code mismatch decisions
Scenario/edge-case pressure findings that materially shaped scope or acceptance criteria
Optional durable documentation recommendations, explicitly marked opt-in and public-safe; do not include raw private transcript dumps
Technical context findings
Full or condensed transcript


Autoresearch specialization

When the clarified task is specifically about $autoresearch, or the skill is invoked with --autoresearch, keep the interview domain-specific and emit skill-consumable artifacts without skipping clarification.


Accepted seed inputs: topic, evaluator, keep-policy, slug, existing mission draft text, and prior evaluator examples/templates

Required interview focus: mission clarity, evaluator readiness, keep policy, slug/session naming, and whether the draft is ready to launch now or should refine further

Canonical artifact path: .omx/specs/deep-interview-autoresearch-{slug}.md


Launch artifact bundle: .omx/specs/autoresearch-{slug}/mission.md, .omx/specs/autoresearch-{slug}/sandbox.md, and .omx/specs/autoresearch-{slug}/result.json


Launch artifact directory: .omx/specs/autoresearch-{slug}/


Required artifact sections:


Mission Draft
Evaluator Draft
Launch Readiness
Seed Inputs
Confirmation Bridge


Required launch artifacts under .omx/specs/autoresearch-{slug}/:


mission.md
sandbox.md
result.json


Launch-readiness rule: mark the draft as not launch-ready while the evaluator command still contains placeholder markers such as <...>, TODO, TBD, REPLACE_ME, CHANGEME, or your-command-here


Structured result contract: result.json should point to the draft + mission/sandbox artifacts and carry the finalized topic, evaluatorCommand, keepPolicy, slug, launchReady, and blockedReasons fields so $autoresearch can consume it directly

Confirmation bridge: after artifact generation, offer at least refine further and launch; do not run direct CLI launch or detached/split tmux launch, and only hand off to $autoresearch after explicit confirmation

Handoff rule: downstream execution must preserve the clarified mission intent, evaluator expectations, decision boundaries, and launch-readiness status from this artifact rather than bypassing the draft review step


Phase 5: Execution Bridge

Present execution options after artifact generation using explicit handoff contracts. Treat the deep-interview spec as the current requirements source of truth and preserve intent, non-goals, decision boundaries, acceptance criteria, docs/terminology grounding, and any residual-risk warnings across the handoff.


Goal-mode follow-ups

Include these product-facing suggestions when they fit the clarified spec, without removing the existing $ultragoal, $ralplan, $autopilot, $ralph, and $team handoff options:


$ultragoal — default goal-mode follow-up for implementation or general goal-oriented follow-up specs that should be converted into durable Codex/OMX goals with sequential completion tracking.

$autoresearch-goal — use when the clarified context is a research project: a research question, reference/literature gathering, evaluator-backed analysis, or professor/critic-style deliverable.

$performance-goal — use when the clarified context is an optimization or performance project with measurable speed, latency, throughput, memory, benchmark, or evaluator criteria.


Recommend $ultragoal as the default durable goal-mode follow-up because it supersedes Ralph for goal tracking. Preserve $team for coordinated parallel implementation and keep $ralph only as an explicit fallback for persistent single-owner execution/verification when the user specifically selects it.


1. $ultragoal (Default durable execution follow-up)


Input Artifact: .omx/specs/deep-interview-{slug}.md (optionally accompanied by the transcript/context snapshot for traceability)

Invocation: $ultragoal create-goals --brief-file <spec-path> followed by $ultragoal complete-goals in the active execution lane

Consumer Behavior: Convert the clarified spec into durable goal-mode work. Preserve intent, non-goals, decision boundaries, acceptance criteria, docs/terminology grounding, scenario-pressure findings, and residual-risk warnings as binding story constraints.

Skipped / Already-Satisfied Stages: Requirement interview, ambiguity clarification, doc/context preflight, and early intent-boundary elicitation

Expected Output: .omx/ultragoal/brief.md, .omx/ultragoal/goals.json, .omx/ultragoal/ledger.jsonl, implementation evidence, verification evidence, and final cleanup/review-gate evidence

Best When: The clarified spec is execution-ready or the user explicitly wants durable goal tracking as the next step

Next Recommended Step: Run the Ultragoal completion loop; launch $team only inside an active Ultragoal story when parallel lanes are warranted, and use $ralph only as an explicit fallback when the user asks for that legacy persistence mode


2. $ralplan (Recommended when architecture/test-shape review is still needed)


Input Artifact: .omx/specs/deep-interview-{slug}.md (optionally accompanied by the transcript/context snapshot for traceability)

Invocation: $plan --consensus --direct <spec-path>


Consumer Behavior: Treat the deep-interview spec as the requirements source of truth. Do not repeat the interview by default; refine architecture/feasibility around the clarified intent and boundaries instead.

Skipped / Already-Satisfied Stages: Requirements discovery, ambiguity clarification, and early intent-boundary elicitation

Expected Output: Canonical planning artifacts under .omx/plans/, especially prd-*.md and test-spec-*.md


Best When: Requirements are clear enough to stop interviewing, but architectural validation / consensus planning is still desirable

Next Recommended Step: Use the approved planning artifacts with $ultragoal as the default durable goal-mode follow-up (optionally with $team for parallel lanes); choose $autoresearch-goal for research validation or $performance-goal for measurable optimization, and use $ralph only as an explicit fallback when a narrow single-owner persistence loop is requested


3. $autopilot


Input Artifact: .omx/specs/deep-interview-{slug}.md


Invocation: $autopilot <spec-path>


Consumer Behavior: Use the deep-interview spec as the clarified execution brief. Preserve intent, non-goals, decision boundaries, and acceptance criteria as binding context for planning/execution.

Skipped / Already-Satisfied Stages: Initial requirement discovery and ambiguity reduction

Expected Output: Planning/execution progress, QA evidence, and validation artifacts produced by autopilot

Best When: The clarified spec is already strong enough for direct planning + execution without an additional consensus gate

Next Recommended Step: Continue through autopilot's execution/QA/validation flow; if coordination-heavy execution emerges, prefer $team under a leader-owned $ultragoal ledger, using $ralph only as an explicit fallback when a narrow single-owner persistence loop is requested


4. $ralph (Explicit fallback only)


Input Artifact: .omx/specs/deep-interview-{slug}.md


Invocation: $ralph <spec-path>


Consumer Behavior: Use the spec's acceptance criteria and boundary constraints as the persistence target. Do not reopen requirements discovery unless the user explicitly asks to refine further.

Skipped / Already-Satisfied Stages: Requirement interview, ambiguity clarification, and initial scope-definition work

Expected Output: Iterative execution progress and verification evidence tracked against the clarified criteria

Best When: The user explicitly asks for Ralph's persistent sequential completion pressure; otherwise use $ultragoal for durable goal tracking and completion checkpoints

Next Recommended Step: If this explicit fallback is selected, continue Ralph's persistence loop; if work expands into coordination-heavy lanes, hand off to $team under $ultragoal checkpointing rather than promoting Ralph as the next default


5. $team


Input Artifact: .omx/specs/deep-interview-{slug}.md


Invocation: $team <spec-path>


Consumer Behavior: Treat the spec as shared execution context for coordinated parallel work. Preserve the clarified intent, non-goals, decision boundaries, and acceptance criteria as common lane constraints.

Skipped / Already-Satisfied Stages: Requirement clarification and early ambiguity reduction

Expected Output: Coordinated multi-agent execution against the shared spec, with evidence that can later feed Ultragoal checkpoints by default, or an explicit Ralph verification pass only when requested

Best When: The task is large, multi-lane, or blocker-sensitive enough to justify coordinated parallel execution instead of a single persistent loop

Next Recommended Step: Follow the team verification path when the coordinated execution phase finishes; checkpoint completion through $ultragoal by default, escalating to a separate Ralph loop only when the user explicitly asks for that persistent verification/fix owner


6. Refine further


Input Artifact: Existing transcript, context snapshot, and current spec draft

Invocation: Continue the interview loop

Consumer Behavior: Re-enter questioning to resolve the highest-leverage remaining uncertainty

Skipped / Already-Satisfied Stages: None beyond already-captured context

Expected Output: A lower-ambiguity spec with tighter boundaries and fewer unresolved assumptions

Best When: Residual ambiguity is still too high, the user wants stronger clarity, or the above-threshold / early-exit warning indicates too much risk to proceed cleanly

Next Recommended Step: Return to one of the execution handoff contracts above once the spec is sufficiently clarified


Residual-Risk Rule: If the interview ended via early exit, hard-cap completion, or above-threshold proceed-with-warning, explicitly preserve that residual-risk state in the handoff so the downstream skill knows it inherited a partially clarified brief.

IMPORTANT: Deep-interview is a requirements mode. On handoff, invoke the selected skill using the contract above. Do NOT implement directly inside deep-interview.


Use explore for codebase fact gathering
Use omx question as the OMX-native structured user-input tool for each interview round when an attached tmux renderer is available
From attached-tmux Bash/tool paths, call it as OMX_QUESTION_RETURN_PANE=$TMUX_PANE omx question ... unless an explicit %pane return target is already known
If the current runtime is outside tmux and cannot render omx question, use native structured input when available; otherwise ask exactly one concise plain-text question and wait for the answer
After omx question returns JSON, prefer answers[0].answer / answers[]; use legacy answer only as a fallback for older records
Use omx state write/read --input '<json>' --json for resumable mode state; state_write / state_read are explicit MCP compatibility fallbacks only
If the interview cannot ask a required omx question round, persist the blocker as terminal state with active: false and current_phase: "blocked"; do not write a terminal blocked phase with active: true

Read/write context snapshots under .omx/context/

Read applicable repo docs/rules/context during preflight; write durable docs, glossary, ADR, or memory updates only when the user explicitly opts in and the content is public-safe
Record whether the oversized-context summary gate is not needed, pending, or satisfied before any scoring or handoff step
Save transcript/spec artifacts under .omx/interviews/ and .omx/specs/


User says stop/cancel/abort -> persist state and stop
Ambiguity stalls for 3 rounds (+/- 0.05) -> force Ontologist mode once
Max rounds reached -> proceed with explicit residual-risk warning
All dimensions >= 0.9 -> allow early crystallization even before max rounds


 Preflight context snapshot exists under .omx/context/{slug}-{timestamp}.md


 Oversized initial context, if present, has a prompt-safe summary recorded before ambiguity scoring or downstream handoff

 Ambiguity score shown each round

 Intent-first stage priority used before implementation detail

 Weakest-dimension targeting used within the active stage

 At least one explicit assumption probe happened before crystallization

 At least one persistent follow-up / pressure pass deepened a prior answer

 Challenge modes triggered at thresholds (when applicable)

 Transcript written to .omx/interviews/{slug}-{timestamp}.md


 Spec written to .omx/specs/deep-interview-{slug}.md


 Brownfield questions use evidence-backed confirmation when applicable

 Brownfield preflight inspected applicable repo docs/rules/context before user-facing questions

 Fuzzy or conflicting terminology was challenged against repo language/current code behavior when applicable

 Scenario-based edge-case grilling was used when boundary ambiguity would materially affect implementation

 Durable docs/ADR/memory updates, if any, were explicitly opted into and public-safe

 Handoff options provided ($ultragoal, $ralplan, $autopilot, $ralph, $team) plus context-sensitive goal-mode suggestions ($autoresearch-goal, $performance-goal) when applicable

 No direct implementation performed in this mode


Suggested Config (optional)

Deep-interview reads runtime defaults from the first existing config source in this order:


Repository-local .omx/config.toml

Repository-root omx.toml

User-global ~/.omx/config.toml


This section is currently a deep-interview-specific runtime override surface, not a general replacement for Codex config.toml or .omx-config.json model/env routing.
Malformed config files are ignored fail-soft so $deep-interview activation can continue with built-in defaults.
Explicit --quick, --standard, or --deep invocation flags override defaultProfile.

[omx.deepInterview]
defaultProfile = "standard"
quickThreshold = 0.30
standardThreshold = 0.20
deepThreshold = 0.15
quickMaxRounds = 5
standardMaxRounds = 12
deepMaxRounds = 20
enableChallengeModes = true


Resume

If interrupted, rerun $deep-interview. Resume from persisted mode state via omx state read --input '{"mode":"deep-interview"}' --json.


Recommended 3-Stage Pipeline

deep-interview -> ralplan -> autopilot


Stage 1 (deep-interview): clarity gate
Stage 2 (ralplan): feasibility + architecture gate
Stage 3 (autopilot): execution + QA + validation gate