harness run

Execute a suite and evaluate it with deterministic local rules

What It Does

harness run executes a stored suite against a single local command executor backend. It runs a structural preflight, executes the selected cases sequentially, evaluates their outputs with deterministic rules, writes a run record under .ragit/log/harness-runs/**, and emits a harness.run event into the operational timeline.
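The order of those steps can be sketched as a small driver. This is a minimal illustration, not RAGit's implementation: PipelineSteps, CaseResult, and runSuite are hypothetical names, the steps are injected as plain functions, and stopping on a failed preflight is an assumption.

```typescript
// Hypothetical shapes for illustration; none of these names come from RAGit.
type CaseOutcome = "passed" | "failed";

interface CaseResult {
  caseId: string;
  outcome: CaseOutcome;
}

interface PipelineSteps {
  preflight: () => string[]; // structural problems; empty when the suite is sound
  executeCase: (caseId: string) => string; // captured output of one case
  evaluate: (caseId: string, output: string) => CaseOutcome;
  writeRunRecord: (results: CaseResult[]) => void;
  emitEvent: (name: string) => void;
}

function runSuite(caseIds: string[], steps: PipelineSteps): CaseResult[] {
  // Assumption: a failed preflight stops the run before any case executes.
  const problems = steps.preflight();
  if (problems.length > 0) {
    throw new Error("preflight failed: " + problems.join("; "));
  }

  const results: CaseResult[] = [];
  // Sequential on purpose: each case finishes before the next one starts.
  for (const caseId of caseIds) {
    const output = steps.executeCase(caseId);
    results.push({ caseId, outcome: steps.evaluate(caseId, output) });
  }

  steps.writeRunRecord(results);
  steps.emitEvent("harness.run");
  return results;
}
```

Injecting the steps keeps the ordering visible and makes the sketch easy to exercise with stubs.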

When To Use / When Not To Use

When to use it

  • You already have a suite artifact and want an actual execution result.
  • You want failure evidence artifacts instead of only structural checks.
  • You need a timeline-visible operational run record.

When not to use it

  • You only want to know whether the suite wiring is valid. Use harness verify.
  • You want to materialize or review resources first. Use harness capture.

Syntax

pnpm ragit harness run --input <path|-> \
  [--dry-run] [--format text|json|both] [--cwd <path>]

Input Contract

{
  "suiteRef": "art_harness_suite_xxx",
  "executor": {
    "kind": "command",
    "argv": ["node", "scripts/run-harness-case.mjs"],
    "cwd": ".",
    "env": {},
    "timeoutMs": 60000
  },
  "cases": ["art_harness_case_xxx"],
  "concurrency": 1
}
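For callers that build this input programmatically, the shape can be sketched as TypeScript types with a small pre-check. The field types and optionality are inferred from the example above and are assumptions, as is the requirement that argv be non-empty. HarnessRunInput, CommandExecutor, and checkRunInput are hypothetical names, and the kind and concurrency checks mirror the v1 limits stated under Deterministic Evaluation Rules.

```typescript
// Field types and optionality are assumptions inferred from the example input.
interface CommandExecutor {
  kind: string;
  argv: string[];
  cwd?: string;
  env?: Record<string, string>;
  timeoutMs?: number;
}

interface HarnessRunInput {
  suiteRef: string;
  executor: CommandExecutor;
  cases?: string[];
  concurrency?: number;
}

// Returns human-readable problems; an empty list means these checks passed.
function checkRunInput(input: HarnessRunInput): string[] {
  const problems: string[] = [];
  if (input.suiteRef.trim() === "") {
    problems.push("suiteRef must not be empty");
  }
  if (input.executor.kind !== "command") {
    problems.push('executor.kind must be "command" in v1');
  }
  // Assumption: an empty argv cannot name a program to run.
  if (input.executor.argv.length === 0) {
    problems.push("executor.argv must name a program");
  }
  if (input.concurrency !== undefined && input.concurrency !== 1) {
    problems.push("concurrency must be omitted or set to 1 in v1");
  }
  return problems;
}
```

A check like this only catches obvious mistakes early; harness run --dry-run remains the authoritative validation of the executor contract.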

Deterministic Evaluation Rules

  • v1 supports only executor.kind = "command".
  • v1 supports only sequential execution. concurrency may be omitted or set to 1.
  • Rule-bearing resources are case, oracle, checker, rubric, and golden.
  • Supported rule fields are exitCode, mustInclude, mustNotInclude, stderrMustInclude, stderrMustNotInclude, and jsonSubset.
  • fixture, trace, and envAssumption are passed through as support resources. They do not carry evaluation rules of their own.
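To make the rule fields concrete, here is a minimal sketch of how such rules could be checked against one case's output. The semantics are assumptions, not confirmed by RAGit: the include and exclude fields are treated as lists of substrings, jsonSubset is matched recursively against stdout parsed as JSON, and arrays must match element by element. Rules, CaseOutput, isSubset, and evaluateRules are hypothetical names.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Assumed shapes: the list-valued fields hold substrings to look for.
interface Rules {
  exitCode?: number;
  mustInclude?: string[];
  mustNotInclude?: string[];
  stderrMustInclude?: string[];
  stderrMustNotInclude?: string[];
  jsonSubset?: Json;
}

interface CaseOutput {
  exitCode: number;
  stdout: string;
  stderr: string;
}

// True when every value in `expected` is matched in `actual`; extra object keys in `actual` are ignored.
function isSubset(expected: Json, actual: Json): boolean {
  if (expected === null || typeof expected !== "object") {
    return expected === actual;
  }
  if (actual === null || typeof actual !== "object") {
    return false;
  }
  if (Array.isArray(expected) || Array.isArray(actual)) {
    if (!Array.isArray(expected) || !Array.isArray(actual)) return false;
    if (expected.length !== actual.length) return false;
    for (let i = 0; i < expected.length; i++) {
      if (!isSubset(expected[i], actual[i])) return false;
    }
    return true;
  }
  for (const key of Object.keys(expected)) {
    if (!(key in actual) || !isSubset(expected[key], actual[key])) return false;
  }
  return true;
}

// Returns one message per violated rule; an empty list means the case passed.
function evaluateRules(rules: Rules, output: CaseOutput): string[] {
  const failures: string[] = [];
  if (rules.exitCode !== undefined && rules.exitCode !== output.exitCode) {
    failures.push("exitCode: expected " + rules.exitCode + ", got " + output.exitCode);
  }
  // Each row: rule name, needles, text to search, whether the needle must be present.
  const textChecks: Array<[string, string[] | undefined, string, boolean]> = [
    ["mustInclude", rules.mustInclude, output.stdout, true],
    ["mustNotInclude", rules.mustNotInclude, output.stdout, false],
    ["stderrMustInclude", rules.stderrMustInclude, output.stderr, true],
    ["stderrMustNotInclude", rules.stderrMustNotInclude, output.stderr, false],
  ];
  for (const [name, needles, text, mustBePresent] of textChecks) {
    for (const needle of needles ?? []) {
      const present = text.indexOf(needle) !== -1;
      if (present !== mustBePresent) failures.push(name + ": " + needle);
    }
  }
  if (rules.jsonSubset !== undefined) {
    let parsed: Json | undefined;
    try {
      parsed = JSON.parse(output.stdout) as Json;
    } catch {
      parsed = undefined; // stdout was not valid JSON
    }
    if (parsed === undefined || !isSubset(rules.jsonSubset, parsed)) {
      failures.push("jsonSubset: stdout does not contain the expected JSON");
    }
  }
  return failures;
}
```

Because every check is a pure function of the exit code, stdout, and stderr, the same output always yields the same verdict, which is the property the deterministic rules are meant to guarantee.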

Output Contract

  • JSON output includes runId, suiteId, runPath, preflight, summary, caseResults, dryRun, and warnings.
  • Failed or errored cases can emit harness failure artifacts with compact masked excerpts.
  • --dry-run resolves the suite, validates the executor contract, and plans the run without writing the run record or failure artifacts.
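A script that consumes the JSON output might first confirm that the documented top-level fields are present. The sketch below checks key presence only, since the value types are not specified here. missingOutputKeys is a hypothetical helper, and the raw string is assumed to be the JSON document produced with --format json.

```typescript
// Top-level fields listed in the output contract above.
const EXPECTED_KEYS = [
  "runId",
  "suiteId",
  "runPath",
  "preflight",
  "summary",
  "caseResults",
  "dryRun",
  "warnings",
] as const;

// Returns the documented keys that are absent from the parsed output.
function missingOutputKeys(raw: string): string[] {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    // Not a JSON object at all, so every key counts as missing.
    return [...EXPECTED_KEYS];
  }
  return EXPECTED_KEYS.filter((key) => !(key in parsed));
}
```

An empty result means all eight fields are present; it says nothing about whether their values are well-formed.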