Policy and governance

Purpose

Define a portable, developer-usable reference for policy enforcement in agentic systems: clear component roles, a standard evaluation pipeline, interoperable request/response schemas, packaging and lifecycle guidance, and operational guarantees.

7.1 Components and roles

Policy Enforcement Point (PEP): Intercepts requests and enforces decisions. Typical placements: Secure Agent Gateway, service middleware, or sidecar.
Policy Decision Point (PDP): Central decision service backed by a policy engine and decision cache.
Policy Information Point (PIP): Context assembly that enriches requests with identity, tool metadata, environmental context, and recent history.
Policy Repository (PRP): Versioned policy bundles with tests and metadata.
Policy Distribution Service (PDS): Secure distribution of signed bundles to PDP nodes; supports hot-reload and rollback.
Key & Secret Management: mTLS/JWT credentials for PDP authentication, bundle signing keys, and secure cache credentials.

7.2 Enforcement patterns

Synchronous gating (default): PEP blocks request pending decision (sub-50ms typical; <5ms cached).
Asynchronous obligations: Non-blocking obligations (e.g., detailed audit) execute after decision.
Constraint-based allow: Allow with conditions[] (rate limits, field masking, time window, require-approval).
Human-in-the-loop: PEP triggers approval workflow when policies return conditions[] like require_approval:<role>.
Placement: Prefer gateway PEP for tool calls; use in-service PEP for high-assurance paths; sidecars for legacy components.
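
A minimal sketch of synchronous gating at a PEP, assuming the PDP client is any callable that returns a decision object. The names gate and FAIL_CLOSED, the pdp_unavailable code, and the 50 ms default timeout are illustrative assumptions, not part of this reference.

```python
import concurrent.futures
import logging
from typing import Any, Callable, Dict

log = logging.getLogger("pep")

# Hypothetical standardized denial used whenever no decision can be obtained.
FAIL_CLOSED: Dict[str, Any] = {
    "allow": False,
    "deny": [{"code": "pdp_unavailable", "message": "policy decision unavailable"}],
    "conditions": [],
}


def gate(request: Dict[str, Any],
         evaluate: Callable[[Dict[str, Any]], Dict[str, Any]],
         timeout_s: float = 0.05) -> Dict[str, Any]:
    """Block until the PDP answers; deny if it raises or exceeds the timeout."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(evaluate, request)
    try:
        return future.result(timeout=timeout_s)
    except Exception as exc:  # covers PDP errors and the timeout raised by result()
        log.error("PDP unavailable, failing closed: %r", exc)
        return dict(FAIL_CLOSED)
    finally:
        # Do not wait for a slow PDP call; the request has already been denied.
        pool.shutdown(wait=False)
```

The caller forwards the request only when the returned decision has allow set to true; any other outcome, including the fail-closed object, blocks it.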

7.3 Evaluation pipeline (hybrid intent-aware)

1) Normalize & validate: Schema validation, basic allow/deny lists, quota sanity checks.
2) Behavioral analysis: Analyzer produces intent_risk with behavioral_signals and risk_dimensions.
3) Final adjudication: Policy engine computes allow, deny[], conditions[], and obligations[] (see 7.5).
4) Decision caching: Keyed by stable request fingerprint; TTL policy-driven; safe invalidation on bundle or trust updates.
5) Fail-safe: On PDP errors or timeouts, the PEP fails closed with a standardized denial and logs an operational incident.

Implementation note: Any compliant policy engine may be used. Engines like OPA/Rego are suitable examples for stages (1) and (3) with an external analyzer for stage (2).
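
Stages (1) to (3) of the pipeline can be sketched as a single function. This is an illustrative outline only: the analyzer and engine are passed in as plain callables, and the set of required keys and the invalid_request code are assumptions made for the example.

```python
from typing import Any, Callable, Dict

Doc = Dict[str, Any]

# Assumed minimal set of top-level keys; the reference does not mark fields as required.
REQUIRED_KEYS = ("version", "tenant_id", "agent", "tool", "context")


def normalize(request: Doc) -> Doc:
    """Stage 1: reject structurally invalid requests before any analysis."""
    missing = [key for key in REQUIRED_KEYS if key not in request]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return request


def evaluate(request: Doc,
             analyzer: Callable[[Doc], Doc],
             engine: Callable[[Doc], Doc]) -> Doc:
    """Run normalize -> behavioral analysis -> final adjudication."""
    try:
        normalized = normalize(request)
    except ValueError as exc:
        return {"allow": False,
                "deny": [{"code": "invalid_request", "message": str(exc)}],
                "conditions": []}
    # Stage 2: the analyzer output becomes input.intent_risk for the policy engine.
    enriched = {**normalized, "intent_risk": analyzer(normalized)}
    # Stage 3: the engine returns the decision object.
    return engine(enriched)
```

Keeping the analyzer and engine behind callables lets either be swapped (for example, an external analyzer service and an OPA-backed engine) without changing the pipeline shape.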

7.4 Canonical policy input schema

Minimal, portable contract consumed by the PDP. Tools and frameworks may extend it through the attributes objects.

{
  "version": "1.0",
  "tenant_id": "string",
  "source": "mcp_proxy",
  "agent": {
    "id": "string",
    "type": "string",
    "trust_level": 0.0,
    "attributes": {"string": "any"}
  },
  "tool": {
    "id": "string",
    "name": "string",
    "operation": "string",
    "attributes": {"string": "any"}
  },
  "context": {
    "timestamp": "RFC3339",
    "request_id": "string",
    "environment": "dev|staging|prod",
    "ip": "string",
    "attributes": {"string": "any"}
  },
  "intent_risk": {
    "decision": "ALLOW|REVIEW|BLOCK",
    "explanation": "string",
    "behavioral_signals": [{"signal_type": "string", "details": {}}],
    "risk_dimensions": {"string": 0.0}
  }
}
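
A hedged sketch of how a PEP or PDP might check a document against this contract before evaluation. The function name validate_input is hypothetical, and the choice of which fields to treat as required is an assumption: the schema above does not mark any field as optional or mandatory. Note that datetime.fromisoformat accepts only a subset of RFC 3339.

```python
from datetime import datetime
from typing import Any, Dict, List

ENVIRONMENTS = ("dev", "staging", "prod")
INTENT_DECISIONS = ("ALLOW", "REVIEW", "BLOCK")


def validate_input(doc: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: List[str] = []
    for key in ("version", "tenant_id", "source"):
        if not isinstance(doc.get(key), str):
            errors.append(f"{key}: expected string")
    for section in ("agent", "tool", "context"):
        if not isinstance(doc.get(section), dict):
            errors.append(f"{section}: expected object")

    context = doc.get("context") if isinstance(doc.get("context"), dict) else {}
    env = context.get("environment")
    if env is not None and env not in ENVIRONMENTS:
        errors.append("context.environment: expected dev|staging|prod")
    timestamp = context.get("timestamp")
    if timestamp is not None:
        try:
            # Map a trailing "Z" to an explicit offset for older Python versions.
            datetime.fromisoformat(str(timestamp).replace("Z", "+00:00"))
        except ValueError:
            errors.append("context.timestamp: expected RFC3339 timestamp")

    risk = doc.get("intent_risk") if isinstance(doc.get("intent_risk"), dict) else {}
    decision = risk.get("decision")
    if decision is not None and decision not in INTENT_DECISIONS:
        errors.append("intent_risk.decision: expected ALLOW|REVIEW|BLOCK")
    return errors
```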

7.5 Decision output schema

Standardized response enables consistent PEP behavior and rich audit.

{
  "allow": true,
  "deny": [
    {"code": "string", "message": "string", "path": "policies/base/authorization"}
  ],
  "conditions": [
    {"type": "audit_level", "value": "detailed"},
    {"type": "rate_limit", "value": "10_per_minute"},
    {"type": "mask_fields", "value": ["ssn", "email"]},
    {"type": "require_approval", "value": "security_officer"}
  ],
  "obligations": [
    {"type": "log_intent_signals", "value": true}
  ],
  "cache": {"ttl_seconds": 120, "scope": "agent+tool+operation"},
  "meta": {"policy_version": "v1.3.2", "evaluation_ms": 3.4}
}
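
One way a PEP could act on this response is sketched below. It is an illustration under assumptions: the reference does not define precedence between allow and deny[], so the sketch conservatively treats any deny entry as a denial; the status strings and the "***" mask value are invented for the example.

```python
from typing import Any, Dict, List, Optional, Tuple

Decision = Dict[str, Any]


def apply_decision(decision: Decision,
                   payload: Dict[str, Any]
                   ) -> Tuple[str, Optional[Dict[str, Any]], List[Dict[str, Any]]]:
    """Return (status, payload to forward, conditions this sketch did not handle)."""
    # Assumption: a missing allow flag or any deny entry means the request is blocked.
    if decision.get("allow") is not True or decision.get("deny"):
        return "denied", None, []

    status = "allowed"
    forwarded = dict(payload)
    unhandled: List[Dict[str, Any]] = []
    for condition in decision.get("conditions", []):
        if condition["type"] == "mask_fields":
            for field in condition["value"]:
                if field in forwarded:
                    forwarded[field] = "***"
        elif condition["type"] == "require_approval":
            # Hold the request until the named role approves it.
            status = "pending_approval"
        else:
            # e.g. rate_limit or audit_level: left to the surrounding PEP.
            unhandled.append(condition)
    return status, forwarded, unhandled
```

Returning the unhandled conditions makes it explicit which constraints the surrounding PEP still has to enforce, rather than dropping them silently.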

7.6 Policy packaging and lifecycle

Structure:

policies/
├── base/          # Authentication, authorization, audit
├── tools/         # Tool-type policies
├── compliance/    # SOX/GDPR/HIPAA
├── a2a/           # Agent-to-Agent collaboration
└── intent/        # Consumers of input.intent_risk

Versioning: Semantic versioning; embed policy_version in every decision; retain the N previous bundle versions.
Testing: Unit tests per rule; golden tests per scenario; CI gate on coverage and regression.
Rollout: Signed bundles, staged rollouts, canary, auto-rollback on error budget breach.
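
The "retain N previous" rule can be sketched as follows, assuming version tags of the form vMAJOR.MINOR.PATCH as in the policy_version example (v1.3.2). The function names are hypothetical, and pre-release or build suffixes are not handled.

```python
from typing import List, Tuple


def parse_version(tag: str) -> Tuple[int, ...]:
    """Turn 'v1.3.2' into (1, 3, 2) so versions compare numerically."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def retained_versions(tags: List[str], keep_previous: int) -> List[str]:
    """Return the newest version plus the keep_previous versions before it."""
    newest_first = sorted(tags, key=parse_version, reverse=True)
    return newest_first[: keep_previous + 1]
```

Comparing parsed tuples rather than raw strings matters here: as strings, "v1.10.0" sorts before "v1.3.2", which would retire the wrong bundles.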

7.7 PDP API (reference)

Secure service exposed to PEPs via mTLS or JWT. Example endpoint:

- POST /v1/policy/evaluate
- Auth: mTLS or Authorization: Bearer <token>
- Body: Canonical input (7.4)
- Response: Decision output (7.5)

Example request:

curl -sS -X POST https://pdp.local/v1/policy/evaluate \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
        "version": "1.0",
        "tenant_id": "acme",
        "source": "mcp_proxy",
        "agent": {"id": "agent_123", "type": "analyst", "trust_level": 0.82},
        "tool": {"id": "tool_789", "name": "financial_db", "operation": "query"},
        "context": {"timestamp": "2025-09-12T10:00:00Z", "request_id": "req_abc"},
        "intent_risk": {"decision": "REVIEW", "behavioral_signals": [{"signal_type": "scope_violation"}], "risk_dimensions": {"data_exfiltration": 0.7}}
      }'
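
The same call can be sketched from a Python PEP using only the standard library. This mirrors the curl example and assumes bearer-token authentication; build_request and evaluate are hypothetical helper names, and the one-second timeout is an arbitrary placeholder rather than a recommended value.

```python
import json
import urllib.request
from typing import Any, Dict


def build_request(base_url: str, token: str, body: Dict[str, Any]) -> urllib.request.Request:
    """Build the POST /v1/policy/evaluate request without sending it."""
    return urllib.request.Request(
        url=f"{base_url}/v1/policy/evaluate",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def evaluate(base_url: str, token: str, body: Dict[str, Any],
             timeout_s: float = 1.0) -> Dict[str, Any]:
    """Send the request and parse the decision output as JSON."""
    request = build_request(base_url, token, body)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)
```

A caller would invoke evaluate("https://pdp.local", token, canonical_input) and treat any raised exception as a PDP failure.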

7.8 Performance and caching

Targets: <5ms cached, <50ms uncached (p95) on commodity nodes.
Caches: decision, context (agent/tool metadata), and compilation. Invalidate on bundle change, trust score update, tenant override, or TTL.
Use partial evaluation and policy partitioning for hot paths. Keep rules small and composable.
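
A minimal in-memory sketch of a decision cache, under stated assumptions: the fingerprint covers tenant, agent, tool, and operation (matching the agent+tool+operation scope in 7.5) plus the bundle version, so that a bundle change produces new keys; the TTL is read from the decision's cache.ttl_seconds. The class and function names are hypothetical.

```python
import hashlib
import json
import time
from typing import Any, Callable, Dict, Optional, Tuple


def fingerprint(request: Dict[str, Any], bundle_version: str) -> str:
    """Stable key: same fields in any order always hash to the same value."""
    key = {
        "tenant": request["tenant_id"],
        "agent": request["agent"]["id"],
        "tool": request["tool"]["id"],
        "operation": request["tool"]["operation"],
        "bundle": bundle_version,
    }
    return hashlib.sha256(json.dumps(key, sort_keys=True).encode("utf-8")).hexdigest()


class DecisionCache:
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._entries: Dict[str, Tuple[float, Dict[str, Any]]] = {}
        self._clock = clock

    def put(self, key: str, decision: Dict[str, Any]) -> None:
        """Store a decision only if the policy granted it a positive TTL."""
        ttl = decision.get("cache", {}).get("ttl_seconds", 0)
        if ttl > 0:
            self._entries[key] = (self._clock() + ttl, decision)

    def get(self, key: str) -> Optional[Dict[str, Any]]:
        """Return a cached decision, or None if absent or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, decision = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return None
        return decision

    def invalidate_all(self) -> None:
        """Call on trust score updates or tenant overrides."""
        self._entries.clear()
```

Clearing the whole cache on trust or tenant changes is the simplest safe choice; a production cache would more likely invalidate per agent or per tenant.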

7.9 Security of the policy plane

Mutual TLS or signed JWT between PEP and PDP; zero-trust network assumptions.
Sign and verify policy bundles; restrict who can approve/publish policies.
Least-privilege access to PRP/PDS; rotate secrets; encrypt cache at rest if multi-tenant.
Deny-by-default on PDP errors; emit high-severity ops events.
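
The verify-before-load step for bundles can be illustrated as below. This is a stand-in only: it uses an HMAC with a shared key because that is available in the Python standard library, whereas real bundle signing would normally use asymmetric signatures so that PDP nodes hold no signing key. All function names are hypothetical.

```python
import hashlib
import hmac


def sign_bundle(bundle: bytes, key: bytes) -> str:
    """Produce a hex MAC over the bundle bytes (stand-in for a real signature)."""
    return hmac.new(key, bundle, hashlib.sha256).hexdigest()


def verify_bundle(bundle: bytes, signature: str, key: bytes) -> bool:
    """Constant-time comparison avoids leaking how many characters matched."""
    return hmac.compare_digest(sign_bundle(bundle, key), signature)


def load_bundle(bundle: bytes, signature: str, key: bytes) -> bytes:
    """Refuse to activate a bundle whose signature does not verify."""
    if not verify_bundle(bundle, signature, key):
        raise PermissionError("bundle signature mismatch; keeping current policies")
    return bundle
```

The important property is the ordering: verification happens before the bundle is parsed or activated, and a failure leaves the currently loaded policies in place.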

7.10 Example policy fragments (illustrative)

Engine-agnostic logic (pseudocode):

deny "exfiltration_risk_high" if intent_risk.risk_dimensions.data_exfiltration > 0.9

condition add "require_approval:security_officer" if intent_risk.decision == "REVIEW"

allow if agent.trust_level >= 0.7 and tool.operation == "read" and within_business_hours(context.timestamp)

An example written in a concrete policy language (e.g., Rego) is acceptable as a non-normative reference implementation.
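
For readers who prefer executable logic, the three fragments can also be rendered in Python. This is a non-normative sketch: within_business_hours is a hypothetical helper that assumes Monday to Friday, 09:00 to 17:00, in the timestamp's own UTC offset, and the deny message text is invented.

```python
from datetime import datetime
from typing import Any, Dict, List


def within_business_hours(timestamp: str) -> bool:
    """Assumed window: weekdays, 09:00 to 17:00 in the timestamp's own offset."""
    moment = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    return moment.weekday() < 5 and 9 <= moment.hour < 17


def decide(policy_input: Dict[str, Any]) -> Dict[str, Any]:
    risk = policy_input.get("intent_risk", {})
    deny: List[Dict[str, str]] = []
    conditions: List[Dict[str, Any]] = []

    # deny "exfiltration_risk_high" if data_exfiltration > 0.9
    if risk.get("risk_dimensions", {}).get("data_exfiltration", 0.0) > 0.9:
        deny.append({"code": "exfiltration_risk_high",
                     "message": "data exfiltration risk above threshold"})

    # condition add "require_approval:security_officer" if decision == "REVIEW"
    if risk.get("decision") == "REVIEW":
        conditions.append({"type": "require_approval", "value": "security_officer"})

    # allow only when no deny rule fired and every allow clause holds
    allow = (
        not deny
        and policy_input["agent"]["trust_level"] >= 0.7
        and policy_input["tool"]["operation"] == "read"
        and within_business_hours(policy_input["context"]["timestamp"])
    )
    return {"allow": allow, "deny": deny, "conditions": conditions}
```

Note that allow here is deny-by-default: a request that matches no deny rule is still refused unless every clause of the allow rule holds.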