Governance And Verification

This page explains how Grok Code handles review, rollback, and verification.

The Big Difference: TUI vs CLI

This is the most important behavior to understand before you let Grok Code modify real files.

In The TUI

Grok Code usually stages changes as reviewable artifacts.

You inspect them in Gate, then choose whether to apply or reject them.

In The CLI

Successful grok run and grok exec flows apply edits directly to your files as they complete.

There is no built-in Gate review step in the CLI path.

Use the TUI if you want governed review before files change.

Gate

Open it with:

/gate

Core commands:

/gate drawer
/gate apply
/gate reject
/gate rollback

What Gate is for:

inspect staged artifacts
decide whether to land or reject them
recover with a checkpoint when needed

What to expect:

the lane stays empty until a run stages an artifact
if there is more than one candidate artifact, Grok Code opens a picker

Checkpoints

Checkpoints are rollback snapshots of changed files.

Commands:

/checkpoint
/checkpoint save
/checkpoint list
/checkpoint restore
/gate rollback

Important limits:

checkpoints only work inside a git repository
there must be changed files to save
restoring a checkpoint rewrites file contents back to the saved snapshot
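
The limits above can be checked before you try to save a checkpoint. The following is a minimal sketch, not Grok Code's own logic: it uses plain git commands, the helper names are hypothetical, and it assumes that untracked files count as "changed files", which this page does not confirm.

```python
import subprocess
from pathlib import Path


def git_output(args: list[str], cwd: Path) -> tuple[int, str]:
    """Run a git command in cwd and return (exit code, stdout)."""
    try:
        result = subprocess.run(
            ["git", *args], cwd=cwd, capture_output=True, text=True
        )
    except FileNotFoundError:
        # git is not installed; treat it like a failed command.
        return 127, ""
    return result.returncode, result.stdout


def checkpoint_preconditions(path: Path) -> list[str]:
    """Return the reasons a checkpoint save would likely fail (empty list = OK)."""
    code, out = git_output(["rev-parse", "--is-inside-work-tree"], path)
    if code != 0 or out.strip() != "true":
        return ["not inside a git repository"]
    # --porcelain prints one line per modified, staged, or untracked file.
    # Assumption: untracked files count as changed files.
    _, status = git_output(["status", "--porcelain"], path)
    if not status.strip():
        return ["no changed files to save"]
    return []
```

An empty list means both documented preconditions appear to hold for that directory. It says nothing about what /checkpoint save will actually capture.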

Inspecting Available Checks

Use:

/verify

Grok Code detects common checks by project type.

Examples:

Rust: cargo check, cargo test, cargo clippy -- -D warnings
Node: npm run build, npm test, and optionally npx eslint .
Go: go build ./..., go test ./..., go vet ./...
Python: python -m pytest, python -m ruff check .

This inspector shows what Grok Code found. It does not automatically make every project “verified.”
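
To make "detects common checks by project type" concrete, here is a rough sketch of marker-file detection. The commands come from the examples above. The marker files (Cargo.toml, package.json, go.mod, pyproject.toml) and the function name are assumptions for illustration, and Grok Code's real detection rules may differ.

```python
from pathlib import Path

# Marker files are an assumption; the commands mirror the examples on this page.
# The optional "npx eslint ." check for Node is left out for simplicity.
CHECKS_BY_MARKER = {
    "Cargo.toml": ["cargo check", "cargo test", "cargo clippy -- -D warnings"],
    "package.json": ["npm run build", "npm test"],
    "go.mod": ["go build ./...", "go test ./...", "go vet ./..."],
    "pyproject.toml": ["python -m pytest", "python -m ruff check ."],
}


def detect_checks(project_root: Path) -> list[str]:
    """List candidate check commands for every marker file found in project_root."""
    checks: list[str] = []
    for marker, commands in CHECKS_BY_MARKER.items():
        if (project_root / marker).is_file():
            checks.extend(commands)
    return checks
```

Like the inspector, this sketch only reports candidate commands. It does not run them, and an empty result simply means no known marker file was found.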

Harness

Harness is the structured evidence system for verified runs.

Use it in the TUI:

/harness
/harness run
/harness report
/harness diff
/harness bless

Or in the CLI:

grok harness run runtime
grok harness report
grok harness diff runtime <run-id>
grok harness bless baseline runtime --run-id <run-id>

What harness is good for:

repeatable evidence
comparing current results to a blessed baseline
release-blocking verification flows

What to expect:

harness data stays empty until at least one run exists
bless sets a baseline for later comparison
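
The idea of comparing a run against a blessed baseline can be illustrated with a small sketch. This is purely conceptual: the data model (a mapping from check name to a "pass" or "fail" status) and the function name are hypothetical and do not reflect how harness stores or diffs its evidence.

```python
def diff_against_baseline(
    baseline: dict[str, str], current: dict[str, str]
) -> dict[str, list[str]]:
    """Classify checks by how their status changed relative to the baseline."""
    shared = baseline.keys() & current.keys()
    return {
        # Passed in the baseline, no longer passing now.
        "regressed": sorted(
            n for n in shared if baseline[n] == "pass" and current[n] != "pass"
        ),
        # Not passing in the baseline, passing now.
        "fixed": sorted(
            n for n in shared if baseline[n] != "pass" and current[n] == "pass"
        ),
        # Present only in the current run.
        "added": sorted(current.keys() - baseline.keys()),
        # Present only in the baseline.
        "removed": sorted(baseline.keys() - current.keys()),
    }
```

In a release-blocking flow, a non-empty "regressed" list would be the natural signal to stop. Blessing a new baseline afterwards would replace the left-hand side of the comparison.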

Bench

Bench gathers evidence for route decisions. It is not the main verification flow.

Use it when you want:

a benchmark scorecard
projected route evidence
to adopt an evidence-backed benchmark policy

Commands:

/bench
/bench run
/bench report
/bench adopt

Practical rule:

use harness for canonical verification
use bench when you are comparing or adopting route decisions

When Verification Looks Empty

When a verification view looks empty, it usually means the underlying state does not exist yet.

Examples:

no detected checks for /verify
no harness runs to report or diff
no checkpoints to restore
no staged artifacts in Gate

This is not always an error. Sometimes it just means you have not generated any evidence yet.