verify_action

A small, JP-jurisdiction reference verification primitive for AI agent actions. Submit a claim about what your agent did, plus structured evidence of what actually happened, and receive an independent integrity check plus an HMAC-attested receipt that downstream tooling, CI gates, and audit reviewers can reference. Open source, free, no warranty (see ToS).

Why this exists

Pre-action admission control systems (for example, policy-as-code with Cedar or Rego, or lifecycle hooks) decide "is this action allowed?" before execution. That is a different problem.

This service answers a complementary question: "after the action ran, does the evidence support the agent's claim about what it did?" AI agents commonly assert success when reality didn't match: rows that weren't deleted, files that weren't created, emails that bounced, code changes that touched five unrelated files. We catch that drift by comparing the claim against structured evidence, and we emit a content-addressed, HMAC-attested receipt that can be referenced later.

This is a small reference implementation, not a canonical standard. The receipt format is forkable; vendors and verifiers may diverge.

API

POST /verify — REST. JSON body {claim, evidence, kind?, context?, caller_context?}. Returns {verdict, reasoning, confidence, verifier_used, receipt}. caller_context is optional metadata; if provided it is echoed in the receipt for downstream consumers.
POST /mcp — MCP JSON-RPC 2.0. Methods: initialize, tools/list, tools/call (with name=verify_action). Stateless transport.
GET /spec — JSON schema for the verify_action tool and the receipt format.
GET /healthcheck — liveness probe.
GET /stats — service-internal aggregate counters (verdict distribution, no per-request data).
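
A minimal client sketch for POST /verify, using only the Python standard library. The base URL, the evidence shape ({"rows_affected": ...}), and the kind value "db_op" are illustrative assumptions; check GET /spec for the actual schema before relying on any of them.

```python
import json
import urllib.request

# Hypothetical base URL; substitute the address of your own deployment.
BASE_URL = "https://verify.example.invalid"


def build_verify_request(claim, evidence, kind=None, context=None, caller_context=None):
    """Build (but do not send) a POST /verify request."""
    body = {"claim": claim, "evidence": evidence}
    # Optional fields are left out of the body entirely when not supplied.
    for key, value in (("kind", kind), ("context", context), ("caller_context", caller_context)):
        if value is not None:
            body[key] = value
    return urllib.request.Request(
        BASE_URL + "/verify",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def verify(claim, evidence, **optional):
    """Send the request and return the decoded JSON response."""
    request = build_verify_request(claim, evidence, **optional)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling verify("Deleted 42 rows from orders", {"rows_affected": 42}, kind="db_op") against a live deployment should return an object with the verdict, reasoning, confidence, verifier_used, and receipt fields listed above.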

The receipt

Every verification produces an HMAC-attested receipt (verify_action_receipt.v0) containing: SHA-256 of claim, SHA-256 of evidence manifest, verifier id and version, key id (kid), verdict (one of verified / contradicted / insufficient_evidence / unsafe_to_verify), confidence, reason codes, issuance timestamp, and HMAC-SHA256 signature. Raw claim and evidence are not in the receipt; consumers who hold the originals can re-hash them to confirm that the receipt covers a particular claim/evidence pair. Receipts attest issuance and integrity, not factual truth or legal admissibility. They prove that a holder of a single secret key issued the receipt under our verifier version, not that the verdict is correct.
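
The re-hash and signature checks described above can be sketched as follows. This is an illustration under stated assumptions, not the service's implementation: the field names (claim_sha256, evidence_sha256, signature) and the canonical form (compact JSON with sorted keys) are hypothetical, and the real ones are defined by GET /spec.

```python
import hashlib
import hmac
import json


def canonical_bytes(value):
    # Assumption: canonical form is compact, key-sorted JSON encoded as UTF-8.
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sha256_hex(value):
    return hashlib.sha256(canonical_bytes(value)).hexdigest()


def receipt_covers(receipt, claim, evidence):
    """True if the receipt's hashes match this claim/evidence pair."""
    return (
        hmac.compare_digest(receipt["claim_sha256"], sha256_hex(claim))
        and hmac.compare_digest(receipt["evidence_sha256"], sha256_hex(evidence))
    )


def sign_receipt(fields, key):
    """HMAC-SHA256 over every receipt field except the signature itself."""
    return hmac.new(key, canonical_bytes(fields), hashlib.sha256).hexdigest()


def signature_valid(receipt, key):
    """Recompute the HMAC and compare in constant time."""
    fields = {k: v for k, v in receipt.items() if k != "signature"}
    return hmac.compare_digest(receipt.get("signature", ""), sign_receipt(fields, key))
```

Because HMAC is a symmetric construction, signature_valid can only be run by a party that holds the key. A consumer without the key can still run receipt_covers to confirm that a receipt refers to the claim and evidence they have in hand.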

Who can call this

Anyone — operators running agents (operator-funded), agents calling on their own behalf (agent-wallet via x402 / Stripe MPP / etc.), or hybrid setups. Same receipt regardless. The optional caller_context field is informational metadata for downstream consumers; it does not gate verification.

Privacy

IPs are hashed (SHA-256 + salt) as soon as a request arrives; plaintext IPs are never stored.
Personal data is rejected on arrival with HTTP 400. Submissions containing email addresses, phone numbers, postal codes, 12-digit identifiers (My Number / マイナンバー shape), passport numbers, or credit-card-shaped numbers are refused before any processing. See Privacy Policy.
Raw claim and evidence values are not stored. Trace logs retain only metadata: claim length, SHA-256 prefix, evidence type, top-level key count, byte size. Plaintext values are discarded after the response is sent.
No public aggregate dashboard. Internal counters only (see /stats for verdict distribution since process start).
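
As a rough illustration of the metadata-only logging described above, the sketch below hashes an IP with a salt and builds a trace record that carries no raw values. The function names, record keys, salt handling, and the 12-character prefix length are all assumptions made for the example; the service's actual log format may differ.

```python
import hashlib
import json


def hash_ip(ip, salt):
    """Salted SHA-256 of an IP address; the plaintext is not kept."""
    return hashlib.sha256(salt + ip.encode("utf-8")).hexdigest()


def trace_metadata(claim, evidence):
    """Build a trace record that holds only metadata, never raw values."""
    claim_bytes = claim.encode("utf-8")
    evidence_bytes = json.dumps(evidence, sort_keys=True).encode("utf-8")
    return {
        "claim_length": len(claim),
        # Prefix length of 12 hex characters is an arbitrary choice for this sketch.
        "claim_sha256_prefix": hashlib.sha256(claim_bytes).hexdigest()[:12],
        "evidence_type": type(evidence).__name__,
        "evidence_top_level_keys": len(evidence) if isinstance(evidence, dict) else 0,
        "evidence_bytes": len(evidence_bytes),
    }
```

The record is enough to correlate a log line with a known claim (via the hash prefix) without letting anyone reconstruct the claim or evidence from the log alone.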

Phase 1 limitations

The verification engine in Phase 1 is rule-based (no LLM call). Specialized verifiers handle code diffs, DB ops, file ops, and HTTP API calls. A generic fallback handles arbitrary evidence shapes, but its checks are weaker, so provide a specific kind for the strongest result. insufficient_evidence is a first-class verdict: refusing to claim certainty is information, not failure.
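
To make "rule-based" concrete, here is a toy rule in the spirit of the DB-ops verifier. It is a sketch, not the project's code: the function name, the rows_affected evidence key, and the reason code strings are hypothetical. Only the verdict values come from the receipt format above.

```python
def verify_db_delete(claimed_rows, evidence):
    """Compare a claimed delete count with the count reported in the evidence."""
    observed = evidence.get("rows_affected")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(observed, int) or isinstance(observed, bool):
        # No usable count: decline to guess rather than report a match or mismatch.
        return {"verdict": "insufficient_evidence", "reason_codes": ["rows_affected_missing"]}
    if observed == claimed_rows:
        return {"verdict": "verified", "reason_codes": ["row_count_match"]}
    return {"verdict": "contradicted", "reason_codes": ["row_count_mismatch"]}
```

For example, a claim of 42 deleted rows checked against evidence {"rows_affected": 0} yields contradicted, while evidence with no count at all yields insufficient_evidence.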

Legal

Open source. Source code and detection rules are public. This is a probe: small, forkable, no warranty. Fork it if useful.