§16 AI Accountability¶
Status: Stable · RCAN v1.3
Overview: As RCAN robots become AI-driven rather than rule-driven, the reasoning layer becomes a first-class safety surface. §6 defines what was commanded; §16 defines requirements for recording how a command was produced, gating dispatch on model confidence, and requiring human authorization for high-risk actions.
16.1 Overview¶
All features in this section are backwards-compatible with v1.1 implementations. New audit fields are SHOULD; gate behavior is opt-in via config. Implementations that do not use AI inference are unaffected by this section.
The three mechanisms defined here are complementary:
- Model Identity — record which AI model produced a command and with what confidence.
- Confidence Gates — automatically block or escalate low-confidence commands.
- HiTL Gates — require explicit human approval for high-risk action types.
16.2 Model Identity in Audit Records¶
If a COMMAND message was produced by an AI inference call, implementations SHOULD include an ai block in the audit record alongside the existing §6 fields.
Schema¶
| Field | Requirement | Type | Description |
|---|---|---|---|
| `ai.provider` | MUST | string | Non-empty string identifying the LLM provider (e.g. `"anthropic"`, `"openai"`, `"ollama"`). |
| `ai.model` | MUST | string | Model identifier as returned by or configured for the provider. |
| `ai.model_version` | SHOULD | string | Model version or snapshot date (e.g. `"2026-02"`). |
| `ai.layer` | SHOULD | string | One of `reactive`, `fast`, or `planner`; or the originating agent name for multi-agent systems. |
| `ai.confidence` | SHOULD | float [0–1] | Self-reported or harness-computed confidence. MUST be omitted (not defaulted to 1.0) if unavailable. |
| `ai.inference_latency_ms` | SHOULD | number | Wall-clock time for the inference call in milliseconds. |
| `ai.thought_id` | SHOULD | string | Unique ID enabling correlation with the full Thought log (§16.5). |
| `ai.escalated` | MUST (when `ai` block present) | bool | `true` if produced after automatic escalation from a lower brain layer; `false` otherwise (do not omit). |
Example extended audit record¶
{
  "principal": "operator@rcan://local.rcan/opencastor/rover/abc123",
  "ruri": "rcan://local.rcan/opencastor/rover/abc123",
  "timestamp_ms": 1740721482934,
  "message_id": "a3f7c1d9",
  "event": "COMMAND",
  "outcome": "ok",
  "ai": {
    "provider": "anthropic",
    "model": "claude-sonnet-4-6",
    "model_version": "2026-02",
    "layer": "planner",
    "confidence": 0.87,
    "inference_latency_ms": 1240,
    "thought_id": "th_8f3a2c...",
    "escalated": false
  }
}
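The schema above can be checked mechanically before a record is written. The following sketch (the helper name `validate_ai_block` is ours, not part of the spec) enforces the MUST constraints and the confidence bounds of §16.2:

```python
def validate_ai_block(ai):
    """Check an `ai` audit block against §16.2; returns a list of problems."""
    problems = []
    # ai.provider and ai.model are MUST: non-empty strings.
    for field in ("provider", "model"):
        value = ai.get(field)
        if not isinstance(value, str) or not value:
            problems.append(f"ai.{field} must be a non-empty string")
    # ai.confidence, when present, must be a number in [0, 1];
    # it MUST be omitted (never defaulted to 1.0) when unavailable.
    confidence = ai.get("confidence")
    if confidence is not None:
        if not isinstance(confidence, (int, float)) or isinstance(confidence, bool) \
                or not 0.0 <= confidence <= 1.0:
            problems.append("ai.confidence must be a number in [0, 1] when present")
    # ai.escalated is MUST when the ai block is present: an explicit bool.
    if not isinstance(ai.get("escalated"), bool):
        problems.append("ai.escalated must be an explicit bool (do not omit)")
    return problems
```

An empty result means the block is well-formed; anything else should be rejected before the record enters the audit chain.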
16.3 Confidence Gates¶
Implementations MAY declare confidence gates under agent.confidence_gates in the .rcan.yaml config file. A gate is a protocol-level constraint that blocks or escalates low-confidence commands before they reach the driver layer.
Gate evaluation¶
- Before dispatching any COMMAND with a gated scope, the runtime MUST evaluate the corresponding gate.
- If `ai.confidence` is present and below `min_confidence`, the gate fails.
- If `ai.confidence` is absent and a gate is declared for that scope, the gate defaults to the `on_fail` behavior (treat missing confidence as a gate miss).
on_fail behaviors¶
| Value | Behavior |
|---|---|
| `block` | Command MUST NOT be dispatched. An ERROR response with code `CONFIDENCE_GATE_FAIL` MUST be returned. |
| `escalate` | Command MUST be re-submitted to the next higher brain layer before dispatch. If no higher layer is available, equivalent to `block`. |
| `allow` | Command proceeds. The gate miss MUST be noted in the audit record with `"gate_bypassed": true`. |
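Taken together, the evaluation rules and the `on_fail` table reduce gate evaluation to a few lines. A minimal sketch (function name and return values are illustrative, not normative):

```python
def evaluate_confidence_gate(ai, gate):
    """Evaluate one §16.3 gate; returns "dispatch", "block", or "escalate".

    `gate` mirrors an agent.confidence_gates entry such as
    {"min_confidence": 0.75, "on_fail": "escalate"}.
    """
    confidence = (ai or {}).get("confidence")
    # Absent confidence on a gated scope counts as a gate miss.
    if confidence is None or confidence < gate["min_confidence"]:
        on_fail = gate.get("on_fail", "block")
        if on_fail == "allow":
            # Caller must record "gate_bypassed": true in the audit record.
            return "dispatch"
        return on_fail  # "block" or "escalate"
    return "dispatch"
```

With `on_fail: escalate` and no higher brain layer available, the caller treats the `"escalate"` result as `"block"`, per the table above.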
16.4 Human-in-the-Loop Gates¶
Implementations MAY declare Human-in-the-Loop (HiTL) gates under agent.hitl_gates in the .rcan.yaml config file. A HiTL gate requires explicit human authorization before a matched command is dispatched.
Gate trigger sequence¶
When `require_auth: true` is declared and an incoming command's action type matches an entry in `action_types`:
- The implementation MUST NOT dispatch the command immediately.
- The implementation MUST emit a `PENDING_AUTH` message to all channels listed in `notify`, including the original `message_id`, `action_type`, a human-readable description, and the `auth_timeout_ms`.
- The implementation MUST await an `AUTHORIZE` message from a principal with OWNER (level 4) or CREATOR (level 5) role within `auth_timeout_ms` milliseconds.
- On `decision: approve`, the command is dispatched and both the original command and the `AUTHORIZE` message are logged in the audit chain.
- On `decision: deny` or timeout expiry, the command is dropped, an `ERROR` is returned to the originating principal, and the event is audited.
on_timeout behaviors¶
| Value | Behavior |
|---|---|
| `block` | Command is dropped; ERROR returned; event audited. |
| `allow` | Command is dispatched after timeout with `"hitl_timeout_bypass": true` in the audit record. |
Role enforcement: AUTHORIZE messages from principals below OWNER role MUST be rejected. A rejected AUTHORIZE attempt MUST be audited (see §2). All gate events (trigger, approve, deny, timeout) MUST be written to the audit log with the original message_id.
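The trigger sequence above is essentially a bounded wait on an `AUTHORIZE` message. A minimal sketch, assuming a caller-supplied `receive_message` callback that yields the next inbound message dict (or `None` when the queue is empty); all names other than the message fields are ours:

```python
import time

def await_authorization(pending, receive_message, auth_timeout_ms, on_timeout="block"):
    """Wait for an AUTHORIZE decision on a HiTL-gated command (§16.4 sketch).

    Returns "dispatch", "deny", or "timeout".
    """
    deadline = time.monotonic() + auth_timeout_ms / 1000.0
    while time.monotonic() < deadline:
        msg = receive_message()
        if msg is None:
            time.sleep(0.005)  # nothing pending; poll again
            continue
        if msg.get("type") != "AUTHORIZE":
            continue
        if msg.get("message_id") != pending["message_id"]:
            continue  # decision for a different command
        if msg.get("role_level", 0) < 4:
            continue  # below OWNER (level 4): reject, and audit per §2
        return "dispatch" if msg.get("decision") == "approve" else "deny"
    # Timeout expiry: "block" drops the command; "allow" dispatches it
    # with "hitl_timeout_bypass": true in the audit record.
    return "dispatch" if on_timeout == "allow" else "timeout"
```

A production implementation would subscribe rather than poll and would emit the audit events for every outcome; the control flow, however, is exactly the trigger sequence above.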
16.5 Thought Log¶
Implementations SHOULD expose a queryable Thought log to enable full reasoning auditability and correlation with command audit records.
GET /api/thoughts/<thought_id>
→ {
  "id": "th_8f3a2c...",
  "timestamp_ms": 1740721482100,
  "provider": "anthropic",
  "model": "claude-sonnet-4-6",
  "layer": "planner",
  "instruction": "Move toward the open doorway",
  "context_snapshot": { "...": "..." },
  "action": { "type": "move", "linear_x": 0.4 },
  "confidence": 0.87,
  "reasoning": "<omitted — requires config scope>",
  "escalated_from": null
}
Access control¶
- The endpoint MUST require at minimum `status` scope to read any Thought record.
- The `reasoning` field MUST be omitted from the response unless the caller holds `config` scope (OWNER or higher). This prevents prompt extraction by lower-privilege principals who have operational access but not configuration access.
- Implementations that do not expose the Thought log MUST return `404` or `501 Not Implemented` at this path.
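These access rules amount to a scope check plus one field redaction. A minimal sketch (the helper name and the representation of caller scopes as a `set` are assumptions, not spec):

```python
def redact_thought(thought, caller_scopes):
    """Apply §16.5 access control to a Thought record before returning it."""
    if "status" not in caller_scopes:
        # status scope is the minimum needed to read any Thought record.
        raise PermissionError("status scope required")
    record = dict(thought)  # never mutate the stored record
    if "config" not in caller_scopes:
        # reasoning is visible only with config scope (OWNER or higher),
        # preventing prompt extraction by lower-privilege principals.
        record["reasoning"] = "<omitted — requires config scope>"
    return record
```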
16.6 AI Output Watermarking¶
Version: v1.7 · Status: Stable
EU AI Act Art. 50 requires that AI-generated content be machine-detectable. §16.6 specifies a cryptographic watermark token embedded in every AI-generated COMMAND payload and its corresponding audit record.
Token Format¶
Watermark tokens use the prefix rcan-wm-v1: followed by 32 lowercase hex characters (16 bytes of HMAC-SHA256 output). The format is machine-detectable by regex: ^rcan-wm-v1:[0-9a-f]{32}$
import hashlib
import hmac

def compute_watermark_token(rrn, thought_id, timestamp_iso, ml_dsa_private_bytes):
    # Bind the token to this robot (rrn), this inference (thought_id),
    # and this moment in time (timestamp_iso).
    message = f"{rrn}:{thought_id}:{timestamp_iso}".encode()
    # HMAC-SHA256 keyed with the robot's ML-DSA-65 private key bytes.
    digest = hmac.new(ml_dsa_private_bytes, message, hashlib.sha256).digest()
    # Truncate to 16 bytes → 32 lowercase hex characters.
    return f"rcan-wm-v1:{digest[:16].hex()}"

# Example output:
# "rcan-wm-v1:a3f9c1d2b8e47f20a3f9c1d2b8e47f20"
HMAC Key¶
The HMAC secret is the robot's ML-DSA-65 private key bytes (§1, 4032 bytes). This key is already present at runtime for message signing — no additional key material is required. The token proves the command originated from a robot with a specific identity; verification requires the robot's public verification endpoint (see below).
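Because the key never leaves the robot, verification is a server-side recomputation behind the public endpoint. A sketch mirroring `compute_watermark_token` above (the function name is ours):

```python
import hashlib
import hmac

def verify_watermark_token(token, rrn, thought_id, timestamp_iso, ml_dsa_private_bytes):
    """Recompute the §16.6 HMAC over the same inputs and compare."""
    message = f"{rrn}:{thought_id}:{timestamp_iso}".encode()
    digest = hmac.new(ml_dsa_private_bytes, message, hashlib.sha256).digest()
    expected = f"rcan-wm-v1:{digest[:16].hex()}"
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(token, expected)
```

In practice the verification endpoint looks the token up in the audit log rather than recomputing it per request, but recomputation is what makes a forged token detectable in the first place.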
Required Fields¶
Implementations at conformance level L2+ MUST include watermark_token in:
- The `COMMAND` message payload (§3)
- The corresponding audit record (§16.2)
{
  "type": "COMMAND",
  "source": "rcan://robot.local:8000/bob",
  "payload": {
    "action": "move",
    "linear": 0.3,
    "angular": 0.0,
    "watermark_token": "rcan-wm-v1:a3f9c1d2b8e47f20a3f9c1d2b8e47f20"
  },
  "sig": { "alg": "ml-dsa-65", "kid": "a3f9c1d2", "value": "..." }
}
Verification Endpoint¶
Implementations MUST expose a public (no authentication required) verification endpoint. The endpoint looks up the token in the tamper-evident audit log and returns the full audit entry, proving both token validity and that the command was logged.
# Request
GET /api/v1/watermark/verify?token=rcan-wm-v1:a3f9c1d2b8e47f20a3f9c1d2b8e47f20&rrn=RRN-000000000001
# Response 200 — token found in audit log
{
  "valid": true,
  "rrn": "RRN-000000000001",
  "watermark_token": "rcan-wm-v1:a3f9c1d2b8e47f20a3f9c1d2b8e47f20",
  "audit_entry": {
    "ts": "2026-04-10T14:32:01.123456",
    "event": "motor_command",
    "source": "brain",
    "watermark_token": "rcan-wm-v1:a3f9c1d2b8e47f20a3f9c1d2b8e47f20",
    "ai": { "thought_id": "thought-abc123", "confidence": 0.91, "model": "claude-sonnet-4-6" },
    "action": { "type": "move", "linear": 0.3, "angular": 0.0 },
    "sig": { "alg": "ml-dsa-65", "kid": "a3f9c1d2", "value": "..." }
  }
}
# Response 404 — token not in audit log
{ "error": "Watermark token not found in audit log", "code": "HTTP_404" }
# Response 400 — malformed token format
{ "error": "Invalid watermark token format", "code": "HTTP_400" }
EU AI Act Art. 50 Compliance¶
RCAN §16.6 satisfies the machine-detectability requirement for AI-generated content in robot command pipelines (EU AI Act Art. 50(2)). The rcan-wm-v1: prefix is detectable by regex without cryptographic operations. Full verification via the public endpoint proves both origin and audit chain membership, satisfying Art. 12 record-keeping requirements simultaneously.
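The detectability claim can be exercised directly: one regex, no key material. A sketch (the helper name is illustrative):

```python
import re

# Machine-detectable token format from §16.6: ^rcan-wm-v1:[0-9a-f]{32}$
WATERMARK_RE = re.compile(r"^rcan-wm-v1:[0-9a-f]{32}$")

def is_ai_generated(payload):
    """True if a COMMAND payload carries a well-formed RCAN watermark token."""
    return bool(WATERMARK_RE.fullmatch(payload.get("watermark_token", "")))
```

Note that format detection alone does not prove origin; only the verification endpoint establishes that the token is genuine and audit-logged.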
Conformance¶
| Level | Requirement |
|---|---|
| L1 Core | Not required |
| L2 Secure | MUST embed token in COMMAND payload and audit record; MUST expose verify endpoint |
| L3 Federated | L2 requirements + token preserved in forwarded COMMAND across delegation chain |
| L4 Registry | L3 requirements + RRF registry MAY cache verify results for cross-robot auditability |