Harness And Bench
Harness and bench both deal with verification evidence, but they are not the same tool.
Harness
Use harness when you want canonical, repeatable verification runs.
CLI:
bash
grok harness run runtime
grok harness report
grok harness diff runtime <run-id>
grok harness bless baseline runtime --run-id <run-id>
TUI:
/harness
/harness run
/harness report
/harness diff
/harness bless
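
The CLI commands above can be chained into a small wrapper script. This is only a sketch, under stated assumptions. The function names `harness_review` and `harness_bless`, the `GROK_BIN` override, and the name `target` for the argument shown as `runtime` are all hypothetical and not part of the tool. The run ID is assumed to be supplied by the caller, since this section does not say how it is obtained. Blessing is kept in a separate function so that a baseline is only updated deliberately, after the report and diff have been reviewed.

```shell
#!/usr/bin/env bash
# Sketch: chain the harness CLI commands from this section into one flow.
# Assumptions (not from the tool's docs): the run ID is passed in by the
# caller, and GROK_BIN can be overridden to preview the commands.
set -euo pipefail

GROK_BIN="${GROK_BIN:-grok}"

# Run the harness, print the report, then diff against the given run ID.
harness_review() {
  local target="$1" run_id="$2"
  "$GROK_BIN" harness run "$target"
  "$GROK_BIN" harness report
  "$GROK_BIN" harness diff "$target" "$run_id"
}

# Bless the given run as the baseline; call only after reviewing the diff.
harness_bless() {
  local target="$1" run_id="$2"
  "$GROK_BIN" harness bless baseline "$target" --run-id "$run_id"
}
```

With the functions loaded, `harness_review runtime <run-id>` followed by `harness_bless runtime <run-id>` reproduces the command sequence listed above. Setting `GROK_BIN=echo` first prints the commands instead of running them.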
Bench
Use bench when you want route scoring or evidence-backed routing decisions.
Commands:
/bench
/bench run
/bench report
/bench adopt
Practical Rule
- use harness for the main verification workflow
- use bench when you are evaluating route policy or benchmark evidence