Governance And Verification
This page explains how Grok Code handles review, rollback, and verification.
The Big Difference: TUI vs CLI
This is the most important behavior to understand before you let Grok Code modify real files.
In The TUI
Grok Code usually stages changes as reviewable artifacts.
You inspect them in Gate, then choose whether to apply or reject them.
In The CLI
Successful grok run and grok exec flows write their edits directly to your files.
There is no built-in Gate review step in the CLI path.
Use the TUI if you want governed review before files change.
Gate
Open it with:
/gate
Core commands:
/gate drawer
/gate apply
/gate reject
/gate rollback
What Gate is for:
- inspect staged artifacts
- decide whether to land or reject them
- recover with a checkpoint when needed
What to expect:
- the lane stays empty until a run stages an artifact
- if there is more than one candidate artifact, Grok Code opens a picker
Checkpoints
Checkpoints are rollback snapshots of changed files.
Commands:
/checkpoint
/checkpoint save
/checkpoint list
/checkpoint restore
/gate rollback
Important limits:
- checkpoints only work inside a git repository
- there must be changed files to save
- restoring a checkpoint rewrites file contents back to the saved snapshot
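The first two limits can be checked with plain git before you try to save a checkpoint. The following is a minimal sketch, not Grok Code's own logic: the function name can_checkpoint is hypothetical, and it assumes that "changed files" roughly matches what git status --porcelain reports (which also includes untracked files).

```shell
#!/bin/sh
# Hypothetical helper: reports whether the two documented preconditions
# for saving a checkpoint appear to hold in the current directory.
# Assumption: "changed files" corresponds to non-empty `git status --porcelain`
# output; Grok Code's own definition may differ.
can_checkpoint() {
    # Limit 1: checkpoints only work inside a git repository.
    if [ "$(git rev-parse --is-inside-work-tree 2>/dev/null)" != "true" ]; then
        echo "not a git repository"
        return 1
    fi
    # Limit 2: there must be changed files to save.
    if [ -z "$(git status --porcelain 2>/dev/null)" ]; then
        echo "no changed files"
        return 2
    fi
    echo "ok"
}
```

Running can_checkpoint in the project directory before opening the TUI gives a quick hint about why a checkpoint save might have nothing to do.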
Inspecting Available Checks
Use:
/verify
Grok Code detects common checks by project type.
Examples:
- Rust: cargo check, cargo test, cargo clippy -- -D warnings
- Node: npm run build, npm test, and optionally npx eslint .
- Go: go build ./..., go test ./..., go vet ./...
- Python: python -m pytest, python -m ruff check .
This inspector shows what Grok Code found. It does not automatically make every project “verified.”
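To make "detection by project type" concrete, here is a rough sketch of one way such a mapping could work. The marker files (Cargo.toml, package.json, go.mod, pyproject.toml) and the function name suggest_checks are assumptions for illustration; this page does not say how Grok Code actually identifies a project. The optional ESLint check is left out because the condition that enables it is not described.

```shell
#!/bin/sh
# Hypothetical sketch: map common marker files to the checks listed above.
# The marker files are an assumption; Grok Code's real detection rules may differ.
suggest_checks() {
    dir="${1:-.}"
    if [ -f "$dir/Cargo.toml" ]; then
        printf '%s\n' "cargo check" "cargo test" "cargo clippy -- -D warnings"
    fi
    if [ -f "$dir/package.json" ]; then
        printf '%s\n' "npm run build" "npm test"
    fi
    if [ -f "$dir/go.mod" ]; then
        printf '%s\n' "go build ./..." "go test ./..." "go vet ./..."
    fi
    if [ -f "$dir/pyproject.toml" ]; then
        printf '%s\n' "python -m pytest" "python -m ruff check ."
    fi
}
```

Like the inspector, this only lists candidate commands, one per line. It does not run them, so an empty result means nothing was detected, not that anything failed.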
Harness
Harness is the structured evidence system for verified runs.
Use it in the TUI:
/harness
/harness run
/harness report
/harness diff
/harness bless
Or in the CLI:
grok harness run runtime
grok harness report
grok harness diff runtime <run-id>
grok harness bless baseline runtime --run-id <run-id>
What harness is good for:
- repeatable evidence
- comparing current results to a blessed baseline
- release-blocking verification flows
What to expect:
- harness data stays empty until at least one run exists
- bless sets a baseline for later comparison
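The CLI commands above can be chained so that a run is only blessed after its diff has been checked. The sketch below uses only the diff and bless commands shown on this page. The function name harness_compare_and_bless is hypothetical, the run id must be supplied by the caller, and the script assumes that grok exits with a non-zero status when a command fails, which this page does not confirm.

```shell
#!/bin/sh
# Hypothetical wrapper around the CLI commands shown above.
# Assumption: `grok` returns a non-zero exit status on failure.
harness_compare_and_bless() {
    run_id="$1"
    if [ -z "$run_id" ]; then
        echo "usage: harness_compare_and_bless <run-id>" >&2
        return 64
    fi
    # Compare the given run against the current baseline; stop if the command fails.
    grok harness diff runtime "$run_id" || return 1
    # Promote the run to the baseline used for later comparisons.
    grok harness bless baseline runtime --run-id "$run_id"
}
```

Keeping the diff step in front of bless means the baseline changes only after the comparison command has completed without error.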
Bench
Bench is related to routing evidence, but it is not the main verification flow.
Use it when you want:
- a benchmark scorecard
- projected route evidence
- to adopt an evidence-backed benchmark policy
Commands:
/bench
/bench run
/bench report
/bench adopt
Practical rule:
- use harness for canonical verification
- use bench when you are comparing or adopting route decisions
When Verification Looks Empty
When a verification view looks empty, it usually means the underlying state does not exist yet.
Examples:
- no detected checks for /verify
- no harness runs to report or diff
- no checkpoints to restore
- no staged artifacts in Gate
This is not always an error. Sometimes it just means you have not generated any evidence yet.