Policy Engine

Decision matrix & examples

The full mapping from guardrail actions and the policy action to a decision, with a worked example for every combination.

This page expands the rules from Policy engine into a lookup table and worked examples. It assumes no session rule matched first — a matching session rule always takes precedence.

The matrix

Given the guardrail(s) that fired on a turn and the policy's action, the decision is:

| Guardrail(s) that fired | policy = `block` | policy = `flag` |
| --- | --- | --- |
| deny or follow | DENY | FLAG |
| async (redact) only | MODIFY | MODIFY |
| pass only | FLAG | FLAG |
| async + deny/follow | DENY (+ redaction) | FLAG (+ redaction) |
| nothing fired | ALLOW | ALLOW |

Two rules make the table easy to remember: **async/redact never blocks** (it only ever cleans content), and **a flag policy never denies** (it caps at FLAG).
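The matrix can be sketched as a small lookup function. This is a minimal illustration, not the engine's actual code: the `Decision` enum, the `decide` function, and the string names for guardrail actions are hypothetical. The table does not cover `pass` firing alongside other guardrails, so the ordering below (deny/follow first, then pass, then async) is an assumption.

```python
from __future__ import annotations

from enum import Enum


class Decision(Enum):
    ALLOW = "ALLOW"
    MODIFY = "MODIFY"
    FLAG = "FLAG"
    DENY = "DENY"


def decide(policy_action: str, fired: set[str]) -> tuple[Decision, bool]:
    """Return (decision, redaction_applied) for one turn.

    `fired` holds the actions of the guardrails that fired, drawn from
    {"deny", "follow", "async", "pass"} (hypothetical string names).
    """
    if policy_action not in ("block", "flag"):
        raise ValueError(f"unknown policy action: {policy_action!r}")

    # Redaction is applied whenever an async guardrail fired,
    # whatever the final decision turns out to be.
    redaction = "async" in fired

    if fired & {"deny", "follow"}:
        # A flag policy never denies: it caps at FLAG.
        decision = Decision.DENY if policy_action == "block" else Decision.FLAG
    elif "pass" in fired:
        # Flag-only guardrails surface a signal under either policy.
        # Assumption: this also holds when async fired on the same turn.
        decision = Decision.FLAG
    elif redaction:
        # async only: content is cleaned and the request proceeds.
        decision = Decision.MODIFY
    else:
        decision = Decision.ALLOW

    return decision, redaction
```

For example, `decide("flag", {"async", "deny"})` returns `(Decision.FLAG, True)`, matching the "FLAG (+ redaction)" cell.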

Worked examples

Each row below reads as policy.action + the guardrails that fired ⇒ decision:

| Policy action | Guardrails that fired | Decision | Outcome |
| --- | --- | --- | --- |
| flag | Redact guardrail only (async) | MODIFY | PII redacted; request proceeds. redacted=true, flagged=false, deny=false. |
| block | Redact guardrail only (async) | MODIFY | Content cleaned; request still proceeds, because redaction alone never blocks. |
| block | Prompt-injection (deny) | DENY | Request rejected. flagged=true, deny=true. |
| flag | Prompt-injection (deny) | FLAG | Allowed but marked; a flag policy caps at FLAG. flagged=true, deny=false. |
| block | Moderation (follow) | DENY | Follows the policy → deny on a block policy. |
| flag | Moderation (follow) | FLAG | Follows the policy → flag on a flag policy. |
| block | Redact (async) + prompt-injection (deny) | DENY | Denied by the blocking guardrail; redaction is still applied to the content. |
| flag | Redact (async) + prompt-injection (deny) | FLAG | Capped at FLAG; redaction still applied as a side-effect. |
| block or flag | Flag-only guardrail (pass) | FLAG | Surfaces the signal; never denies regardless of policy. |
| block or flag | Nothing fired | ALLOW | Clean pass. All flags false. |
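The outcome flags quoted in the examples follow from the decision. The sketch below shows one way to derive them; the `Outcome` dataclass and `outcome_flags` helper are hypothetical names, and treating `redacted` as true whenever redaction was applied alongside a FLAG or DENY is an inference from "redaction is still applied", not something the examples state outright.

```python
from dataclasses import dataclass

VALID_DECISIONS = {"ALLOW", "MODIFY", "FLAG", "DENY"}


@dataclass(frozen=True)
class Outcome:
    redacted: bool
    flagged: bool
    deny: bool


def outcome_flags(decision: str, redaction_applied: bool = False) -> Outcome:
    """Map a decision (plus whether redaction ran) to the three flags."""
    if decision not in VALID_DECISIONS:
        raise ValueError(f"unknown decision: {decision!r}")
    return Outcome(
        # MODIFY always means content was redacted; for FLAG or DENY the
        # caller reports whether an async guardrail also fired (assumption).
        redacted=redaction_applied or decision == "MODIFY",
        # Both FLAG and DENY mark the turn as flagged.
        flagged=decision in {"FLAG", "DENY"},
        # Only DENY rejects the request.
        deny=decision == "DENY",
    )
```

Under these assumptions, `outcome_flags("MODIFY")` gives `redacted=True, flagged=False, deny=False`, and `outcome_flags("ALLOW")` gives all flags false, as in the first and last rows.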

Session rules override the matrix

Before the matrix is consulted, the engine evaluates session rules. A matching rule decides the outcome — for example, enough flagged turns in a session can escalate an otherwise-FLAG turn to DENY on a block policy. See Session risk & rules.
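The precedence can be illustrated with a short sketch. Everything here is hypothetical: the `resolve` function, the shape of the session dictionary, the `too_many_flags` rule, and its threshold of 3 are stand-ins chosen for illustration, not the engine's real rule format.

```python
from typing import Callable, List, Optional

# A session rule inspects session state and returns a decision if it
# matches, or None if it does not (hypothetical shape).
SessionRule = Callable[[dict], Optional[str]]


def resolve(
    session: dict,
    rules: List[SessionRule],
    matrix_decision: Callable[[], str],
) -> str:
    """Evaluate session rules first; fall back to the matrix only if none match."""
    for rule in rules:
        decision = rule(session)
        if decision is not None:
            return decision  # a matching rule decides the outcome
    # The matrix is consulted lazily, only when no rule matched.
    return matrix_decision()


def too_many_flags(session: dict) -> Optional[str]:
    """Hypothetical rule: escalate to DENY after 3 flagged turns on a block policy."""
    if session.get("policy_action") == "block" and session.get("flagged_turns", 0) >= 3:
        return "DENY"
    return None
```

With this rule, a session holding `{"policy_action": "block", "flagged_turns": 3}` resolves to `"DENY"` even when the matrix alone would return `"FLAG"`; with fewer flagged turns the matrix result is used unchanged.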