Harness And Bench

Harness and bench both deal with verification evidence, but they are not the same tool.

Harness

Use harness when you want canonical, repeatable verification runs.

CLI:

grok harness run runtime
grok harness report
grok harness diff runtime <run-id>
grok harness bless baseline runtime --run-id <run-id>
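The diff and bless steps above can be chained so that a run is only blessed after its diff has been reviewed. The following is a minimal sketch, not part of the tool: the function name `harness_diff_then_bless` and the `GROK_BIN` variable are hypothetical, and it assumes that `grok harness diff` exits non-zero when something is wrong, which this page does not confirm. The run id is passed in by hand because this page does not say how it is reported.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the harness commands listed above.
# GROK_BIN and the function name are hypothetical helpers, not part of grok.
GROK_BIN="${GROK_BIN:-grok}"

# Diff a finished run, then bless it as the baseline only if the diff
# command succeeds. The exit-code behaviour of `harness diff` is an
# assumption; verify it before relying on this gate.
harness_diff_then_bless() {
  local target="$1" run_id="$2"
  if [ -z "$target" ] || [ -z "$run_id" ]; then
    echo "usage: harness_diff_then_bless <target> <run-id>" >&2
    return 2
  fi
  "$GROK_BIN" harness diff "$target" "$run_id" || return 1
  "$GROK_BIN" harness bless baseline "$target" --run-id "$run_id"
}
```

After sourcing the file, a call such as `harness_diff_then_bless runtime <run-id>` would issue the same two commands shown in the CLI list, in order. Setting `GROK_BIN=echo` prints the commands instead of running them, which is a cheap way to check the wiring.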

TUI:

  • /harness
  • /harness run
  • /harness report
  • /harness diff
  • /harness bless

Bench

Use bench when you want route scoring or evidence-backed routing decisions.

Commands:

  • /bench
  • /bench run
  • /bench report
  • /bench adopt

Practical Rule

  • Use harness for the main verification story: canonical, repeatable runs.
  • Use bench when you are evaluating route policy or benchmark evidence.
