The accountability gap in ROS2: where does "why did the robot do that?" get answered?

A question I keep running into and don’t have a clean answer for: when a
ROS2-based autonomous system makes a consequential decision — a mobile
robot reroutes around a person, an arm stops mid-motion, a drone aborts —
we can answer what it did. ros2 bag captures the topics. But why it
did that, in a form a safety officer, an insurance adjuster, or a regulator
can read, is almost always reconstructed after the fact, by hand.

Four specific observations, curious where I’m wrong:

1. Rule provenance is invisible. When a BehaviorTree node fires, we
log the node, not the human-authored policy that made the node legal. No
first-class link from “robot stopped” to “rule §3.2 of safety policy v4
triggered.”

2. Guardrails are one-way safety, not auditable downgrades. Most ROS2
safety layers I’ve seen are kill switches or velocity caps. They prevent
harm but produce no signed record of “planner wanted X, guardrail
downgraded to Y, here’s the chain.”

3. LLM-in-the-loop adds a new failure mode. With VLA
(vision-language-action) stacks plugging into task planning, the “why”
gets harder. Did the model suggest the action? Was it followed,
overridden, sanitized? I don’t see standard hooks for any of this in the
stack.

4. EU AI Act Articles 12 and 14 are now in force for high-risk autonomous
systems. Most teams I talk to plan to handle “logging” and “human
oversight” with ros2 bag plus a spreadsheet. That will not survive a
regulator audit, and CE marking deadlines for some categories hit in 2027.

Three questions for people deeper in this than me:

  • Is there an active REP or working group on decision provenance that I
    missed? I found scattered threads, no spec.
  • For Nav2 + BehaviorTree.CPP teams: how do you currently answer “why
    did the robot decide that?” for non-engineer stakeholders?
  • Has anyone added cryptographic signing to the rosbag pipeline, or is
    everyone trusting the filesystem and timestamps?

I’ve been building an opinionated implementation of some of this — rule
provenance, signed audit chain, guardrail-downgrade-only pattern, LLM
sanitization — outside of ROS2, and I’m trying to figure out if the pieces
that generalize are worth porting and open-sourcing.

If this resonates, drop a reply or DM. Looking for both “you’re missing
existing work X” and “yes this is broken in our deployment, here’s how.”

#1 is pretty much standard requirements traceability. There are tools for that which are widely used already.

#2 is that way because that’s what most safety standards specify in their fields, such as AMRs in a factory setting.

#3 is #2 with neural nets thrown in to make things interesting, since it’s hard, if not impossible, to understand why a neural net made the decision it did.

#4, I’m curious why you think the log file + spreadsheet approach would not survive a regulatory audit. I don’t think it’s a good or scalable approach, but it’s not wrong.


Having said that, I do agree that tracing from “what happened” to “why it happened” needs better tooling and processes in robotics. Labour- and time-intensive investigations work in aerospace because accidents are few and far between. I don’t think it’s a suitable approach in autonomous systems.

Geoff — appreciated. Some pushback, some concessions.

#1 (traceability): Fair on design-time tooling. The gap I’m pointing at is runtime: given a logged decision at t=14:32:07, can the operator deterministically recover which policy version was deployed, which rule fired, and verify neither has been tampered with? Design-time formal verification establishes correctness; runtime tamper-evident attribution is what I haven’t found packaged in the ROS2 stack. Nav2 logs the node, not the policy version it was generated from. If that bridge exists in production tooling, I’d value the pointer.
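
One way to make the “which policy version was deployed” question mechanically answerable is to derive the version identifier from the policy content itself and stamp it on every logged decision. A minimal sketch, assuming the policy can be serialized as JSON; `policy_fingerprint`, `log_decision`, and the policy layout are hypothetical names for illustration, not an existing Nav2 or ROS 2 API:

```python
import hashlib
import json


def policy_fingerprint(policy: dict) -> str:
    """Content-derived version id: SHA-256 over a canonical JSON encoding."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def log_decision(policy: dict, rule_id: str, action: str, stamp: str) -> dict:
    """Build a decision record naming the rule and the exact policy it came from."""
    if rule_id not in policy["rules"]:
        raise KeyError(f"rule {rule_id!r} is not part of the deployed policy")
    return {
        "stamp": stamp,
        "action": action,
        "rule_id": rule_id,
        "policy_version_id": policy_fingerprint(policy),
    }


# Hypothetical policy content, for illustration only.
policy_v4 = {"name": "safety_policy", "rules": {"3.2": "stop if person within 1.0 m"}}
record = log_decision(policy_v4, "3.2", "STOP", "14:32:07")

# Later, an auditor recomputes the fingerprint from the archived policy.
assert record["policy_version_id"] == policy_fingerprint(policy_v4)

# Any edit to the policy, however small, yields a different id.
edited = {"name": "safety_policy", "rules": {"3.2": "stop if person within 0.5 m"}}
assert policy_fingerprint(edited) != record["policy_version_id"]
```

The fingerprint only ties a record to a policy; it does not by itself show that the record was left unmodified, which is where signing and chaining come in.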

#2 (kill-switch): Conceded. For AMR settings under ISO 3691-4 the pattern is what’s specified and it works. Downgrade-only-with-trace matters where the safety envelope is conditional — geofence, payload, mission phase. Sub-case of your point, not counter to it.

#3 (neural nets): Right that “why did the net say X” is largely intractable. The structural move I’m betting on is keeping the model strictly advisory — it suggests, deterministic rules decide, audit captures the divergence. Doesn’t make the model explainable, just keeps it out of the action layer.
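
A minimal sketch of one reading of the advisory pattern: a deterministic rule layer sets the least conservative action allowed, the model's suggestion is accepted only if it is at least that conservative, and every outcome is recorded with its disposition. The action vocabulary, rule ids, distance thresholds, and the `decide` function are all assumptions made for illustration:

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical action vocabulary, ordered from least to most conservative.
SEVERITY = {"CONTINUE": 0, "SLOW": 1, "STOP": 2}


@dataclass(frozen=True)
class AuditEntry:
    suggested: str    # raw model output, kept verbatim
    decided: str      # action actually commanded
    rule_id: str      # rule that set the minimum required action
    disposition: str  # FOLLOWED, OVERRIDDEN, or SANITIZED


def required_action(person_distance_m: float) -> Tuple[str, str]:
    """Deterministic rule layer: least conservative action the rules allow."""
    if person_distance_m < 1.0:
        return "STOP", "r_person_near"
    if person_distance_m < 3.0:
        return "SLOW", "r_person_mid"
    return "CONTINUE", "r_default"


def decide(suggestion: str, person_distance_m: float) -> AuditEntry:
    """The model may only make the outcome more conservative, never less."""
    floor, rule_id = required_action(person_distance_m)
    if suggestion not in SEVERITY:
        # Unknown or malformed output never reaches the action layer.
        return AuditEntry(suggestion, floor, rule_id, "SANITIZED")
    if SEVERITY[suggestion] >= SEVERITY[floor]:
        return AuditEntry(suggestion, suggestion, rule_id, "FOLLOWED")
    return AuditEntry(suggestion, floor, rule_id, "OVERRIDDEN")


print(decide("CONTINUE", 0.5))  # model wanted to continue near a person
print(decide("STOP", 5.0))      # model was more cautious than required
print(decide("fly away", 2.0))  # malformed suggestion
```

The divergence between `suggested` and `decided` is the part worth auditing: it answers “did the model suggest this, and was it followed?” without needing to explain the model.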

#4 (audit): Fair pushback, I overstated. Sharper version: imagine an AMR collides with a pedestrian. A safety officer arrives and asks “show me the exact policy that was running, the rule that fired, and prove neither was modified after the incident.” Log file + spreadsheet answers that with operator trust; a signed chain answers it without. The gap is the link, not the logging. Article 12 + Article 14 turn that “trust me” gap into a compliance liability.

The closing remark, “aerospace-style investigations don’t scale for autonomy”, is exactly the lever I’m trying to pull. Hours of forensic engineering per incident have to become minutes, and the only architecturally honest way I’ve found is to make the trace fall out of runtime rather than be reconstructed after the fact. Curious what Tier IV / Autoware converged on for this in production.

If there’s a relevant REP I missed, I’ll go read it. If not and there’s interest, happy to draft something concrete for the WG to tear apart.

Hi @altunbulakemre75 and @gbiggs,

This discussion resonates with a narrow runtime case I have been working on: auditable guardrail downgrades for ROS 2 mobile bases.
I agree that rosbag tells us what happened, but rarely why a guardrail changed the command stream. To address this, I built a lightweight PoC called ros2_kinematic_guard. It sits inline between the planner and the base driver:

planner / teleop
    ↓
  /cmd_vel
    ↓
Kinematic Guard  ← (Monitors /odom consistency)
    ↓
/safe_cmd_vel
    ↓
base driver

When the robot’s physical response no longer matches the command stream (e.g., bad timing or physical slip), it downgrades the command locally (GREEN → YELLOW_SLOWDOWN → BRAKE_AND_RESYNC) and immediately emits a compact runtime record:

{
  "recordType": "GUARDRAIL_DOWNGRADE",
  "inputCommand": {
    "topic": "/cmd_vel",
    "linear_vx": 0.8
  },
  "outputCommand": {
    "topic": "/safe_cmd_vel",
    "linear_vx": 0.0
  },
  "decision": {
    "status": "RESYNCING",
    "guardAction": "BRAKE_AND_RESYNC",
    "dominantCause": "WHEEL_SLIP",
    "residual": 5.391
  }
}

This does not solve general policy provenance or cryptographic signing. But it tries to bridge one concrete runtime accountability gap:

Planner wanted X

Guardrail downgraded to Y

Because command/odom execution integrity broke (with a machine-readable root cause).
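
The downgrade-and-record step can be sketched in a few lines. This is not the ros2_kinematic_guard implementation: the thresholds, the 50% slowdown factor, the `"DEGRADED"` status, and the `guard_step` function are assumptions chosen for illustration, while the record fields follow the example above:

```python
import json

# Hypothetical residual thresholds; the thread does not give the real values.
YELLOW_THRESHOLD = 1.0
BRAKE_THRESHOLD = 3.0


def guard_step(cmd_vx: float, residual: float, cause: str):
    """Return (safe_vx, record); record is None when the command passes unchanged."""
    if residual >= BRAKE_THRESHOLD:
        action, status, safe_vx = "BRAKE_AND_RESYNC", "RESYNCING", 0.0
    elif residual >= YELLOW_THRESHOLD:
        # "DEGRADED" and the 50% scale factor are assumptions for illustration.
        action, status, safe_vx = "YELLOW_SLOWDOWN", "DEGRADED", cmd_vx * 0.5
    else:
        return cmd_vx, None  # GREEN: nothing downgraded, nothing to record

    record = {
        "recordType": "GUARDRAIL_DOWNGRADE",
        "inputCommand": {"topic": "/cmd_vel", "linear_vx": cmd_vx},
        "outputCommand": {"topic": "/safe_cmd_vel", "linear_vx": safe_vx},
        "decision": {
            "status": status,
            "guardAction": action,
            "dominantCause": cause,
            "residual": residual,
        },
    }
    return safe_vx, record


safe_vx, record = guard_step(cmd_vx=0.8, residual=5.391, cause="WHEEL_SLIP")
print(json.dumps(record, indent=2))
```

Emitting the record in the same step that changes the command keeps “planner wanted X” and “guardrail published Y” in one artifact, so nothing has to be correlated from separate topics afterwards.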

This is part of a broader concept I’m exploring called CLIM (Causal Link Integrity Middleware), but the ROS 2 package is intentionally narrow: a local pre-E-stop guard providing clear telemetry for each intervention.
I’d be interested whether a GuardDecisionRecord like this would be useful in the REP / runtime provenance discussion.

Thanks @zc_Liu — this is exactly the concrete artifact the thread
needed. GuardDecisionRecord is precisely the structured runtime
evidence a REP needs to standardize on. The field choices
(inputCommand / outputCommand / decision with dominantCause +
residual) are the right shape.

I also went back and read your CLIM post in Open-RMF Ideas and the
NARH update in ROS General — the fleet-level integrity layer
matters here, because VDA5050 deployments are exactly where “the
trace has to fall out of runtime” stops being theoretical and
becomes a compliance prerequisite.

The layer I’ve been working on (open-sourced on GitHub as
altunbulakemre75/kernel, “Decision provenance and accountability
infrastructure for autonomous systems”) sits one level up from your
kinematic guard, between policy intent and command emission. The
record shape looks like:

{
  "action": "HANDOFF",
  "rule_id": "r_003",
  "threat_level": "HIGH",
  "guardrails_triggered": ["friendly_zone_guardrail"],
  "policy_version_id": "c1fc5724f6b02970",
  "signature": "",
  "prev_hash": "",
  "chain_index": 47
}

Unsigned + local in your case, signed + chain-linked in mine.
Stacked, they give an unbroken provenance chain: kinematic event →
policy decision → cryptographic attestation. Different scopes, same
architectural commitment.
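
To make “signed + chain-linked” concrete, here is a minimal standard-library sketch of a tamper-evident chain. It is not kernel's implementation: HMAC-SHA256 with a shared demo key stands in for an Ed25519 signature (which would need a third-party library), and `append_entry` / `verify_entries` are hypothetical helper names. Only the field names `signature`, `prev_hash`, and `chain_index` come from the record above:

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64
KEY = b"demo-key-not-for-production"  # stand-in for an Ed25519 private key


def _canonical(entry: dict) -> bytes:
    """Stable byte encoding so hashes do not depend on key order."""
    return json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")


def append_entry(chain: list, payload: dict) -> dict:
    """Link the new entry to the hash of the previous one, then sign it."""
    prev_hash = hashlib.sha256(_canonical(chain[-1])).hexdigest() if chain else GENESIS
    entry = dict(payload, prev_hash=prev_hash, chain_index=len(chain))
    entry["signature"] = hmac.new(KEY, _canonical(entry), hashlib.sha256).hexdigest()
    chain.append(entry)
    return entry


def verify_entries(chain: list) -> bool:
    """Recompute every signature and link; any edit or reordering fails."""
    prev_hash = GENESIS
    for index, entry in enumerate(chain):
        body = {k: v for k, v in entry.items() if k != "signature"}
        expected = hmac.new(KEY, _canonical(body), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, entry.get("signature", "")):
            return False
        if entry["prev_hash"] != prev_hash or entry["chain_index"] != index:
            return False
        prev_hash = hashlib.sha256(_canonical(entry)).hexdigest()
    return True


chain = []
append_entry(chain, {"action": "HANDOFF", "rule_id": "r_003"})
append_entry(chain, {"action": "STOP", "rule_id": "r_001"})
assert verify_entries(chain)

chain[0]["action"] = "CONTINUE"  # simulate a post-incident edit
assert not verify_entries(chain)
```

With a symmetric key the verifier can also forge entries, so this only demonstrates the chaining logic; an asymmetric scheme such as Ed25519 is what lets a third party verify without being able to sign.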

For a REP I’d sketch three layers:

  1. Kinematic — command/execution integrity (your CLIM/NARH work)
  2. Policy — rule provenance + guardrail downgrade trace (kernel)
  3. Cryptographic — Ed25519 chain over both (kernel)

Open question for the WG: one REP across the stack, or separate
REPs per layer with a shared schema contract? I lean toward
separate-with-contract — kinematic deployments don’t always need
the policy/crypto layers, and forcing the full stack raises
adoption cost. Your fleet-telemetry framing (NARH) actually argues
for the same modularity at the upper boundary.

Happy to take this to a call or DM. I have the policy + crypto
layers working with an MCP server for natural-language audit
queries; you have the kinematic + fleet-integrity layers. An
end-to-end demo across the stack — kinematic event triggers
policy decision, both signed into one chain — is probably a week
of integration work and would give the WG something concrete to
react to rather than a pure spec doc.

@altunbulakemre75, thank you, this framing makes a lot of sense to me. I agree with your “separate REPs with a shared schema contract” direction. For adoption, the kinematic layer should not require the policy or cryptographic layers. Many robot-side deployments may only need a lightweight local guard and a machine-readable intervention record. But if a higher-level provenance layer exists, the same kinematic event should be able to flow upward into a signed policy/audit chain.

The way I currently see the layering is:

  1. Kinematic layer:
    ros2_kinematic_guard emits a local GuardDecisionRecord.
    Example:
    planner wanted /cmd_vel = X
    guard published /safe_cmd_vel = Y
    dominantCause = WHEEL_SLIP / LOCALIZATION_JUMP / TIMING
    residual = numeric execution-integrity score
  2. Policy layer:
    kernel records which rule or guardrail policy consumed that event and what decision it produced.
  3. Cryptographic layer:
    kernel signs and chain-links the resulting decision record.

That separation is important because a small AMR developer should be able to run only layer 1 in observe mode, while a regulated fleet deployment could stack all three layers.

On NARH: in the ROS 2 package I am intentionally keeping it practical. NARH-lite is used as an order-sensitive residual engine over command / feedback / timing windows. The implementation is not asking users to accept a theory first; it simply outputs whether the recent command stream and physical response still belong to the same execution episode.

I like the integration idea, but I suggest keeping the first PoC very narrow:
ros2_kinematic_guard
→ emits GuardDecisionRecord
→ kernel ingests it as a runtime evidence event
→ kernel signs it into the decision chain
→ MCP query can answer: “why was this command downgraded?”

That would give the WG something concrete without forcing a full-stack architecture too early.
I’m happy to continue by DM or email first so we can align the schema and demo boundary.

Thanks @zc_Liu — fully aligned on the narrow PoC scope. The boundary
you drew is exactly the right one: GuardDecisionRecord → ingested as
runtime evidence → signed into kernel’s decision chain → queryable via
MCP. No forced architectural commitment beyond the schema contract at
the boundary.

Three concrete things on my side, then I’ll move to DM:

  1. Schema contract. kernel’s current chain entries are Decision records
    (action, rule_id, threat_level, signature, prev_hash,
    policy_version_id, chain_index). There’s no formal upstream event
    type yet — so the integration has two clean options:

    (a) GuardDecisionRecord arrives as input to kernel’s policy engine,
    which produces a downstream Decision; the record itself isn’t
    in the chain, just referenced.

    (b) kernel grows a new UpstreamEvent type at the chain boundary;
    your record is signed verbatim into the chain alongside
    Decisions, preserving the original schema.

    I lean toward (b) because it keeps your kinematic vocabulary intact
    for the WG audience and avoids semantic collapse. Either way, the
    only field I’d want at the boundary is a stable source_id (UUID or
    ROS2 node name + namespace) so the chain can reference your guard
    instance after the fact.

  2. MCP query semantics. kernel’s MCP server already exposes
    query_events, get_event, get_stats, verify_chain, and search_events.
    Once your record is signed into the chain (via option a or b above),
    “why was this command downgraded?” is answerable today by
    search_events surfacing dominantCause + residual verbatim. No new
    MCP tool needed — I can demo on a fixture chain.

  3. Demo boundary. Smallest possible end-to-end:

    • Your ros2_kinematic_guard running on a sim or recorded bag
    • A single WHEEL_SLIP-induced downgrade event
    • kernel ingesting it via a thin ROS2 subscriber bridge
    • Signed chain entry with policy_version_id stub
    • MCP query returning the full causal trail
    Once we agree on the schema, the integration is well-scoped. A working
    demo gives the WG more to react to than a spec doc.

Moving to DM. My email: altunbulakemre75@gmail.com — happy with DM,
email, or a 30-min call, whichever you prefer.

Also: on NARH-lite as an order-sensitive residual engine over
command/feedback/timing windows — that framing is cleaner than the
broader CLIM concept for an initial REP. Worth keeping the REP scoped
to what’s directly observable, not the theoretical superstructure.
That’s an opinion, not a request.

Thanks, this is very clear and I agree with the boundary. I also lean toward option (b): treating GuardDecisionRecord as an UpstreamEvent signed verbatim into the chain. That preserves the kinematic vocabulary instead of forcing it into a policy-decision schema too early.

For the first boundary contract, I can keep the ROS 2 side narrow and stable:

  • source_id
  • timestamp
  • inputCommand
  • outputCommand
  • stateSource
  • decision.status
  • decision.guardAction
  • decision.dominantCause
  • decision.residual
  • lookbackWindowMs

That should be enough for kernel to ingest a WHEEL_SLIP downgrade event and sign it into the chain.
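
A boundary contract like this can be checked mechanically before anything is signed. The sketch below validates the field list above and wraps the original JSON string unmodified, in the spirit of option (b); `missing_fields`, `to_upstream_event`, the envelope layout, and all sample values are hypothetical, not kernel's or the guard's actual API:

```python
import json

REQUIRED_FIELDS = [
    "source_id", "timestamp", "inputCommand", "outputCommand", "stateSource",
    "decision.status", "decision.guardAction", "decision.dominantCause",
    "decision.residual", "lookbackWindowMs",
]


def missing_fields(record: dict) -> list:
    """Return the dotted paths from REQUIRED_FIELDS that the record lacks."""
    missing = []
    for path in REQUIRED_FIELDS:
        node = record
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                missing.append(path)
                break
            node = node[key]
    return missing


def to_upstream_event(raw_json: str) -> dict:
    """Validate the record, then carry the original string verbatim as the payload."""
    record = json.loads(raw_json)
    missing = missing_fields(record)
    if missing:
        raise ValueError(f"record is missing required fields: {missing}")
    return {
        "eventType": "UpstreamEvent",
        "source_id": record["source_id"],
        "payload": raw_json,  # untouched, so the kinematic vocabulary survives
    }


# Sample values below are made up for illustration.
sample = json.dumps({
    "source_id": "/robot1/kinematic_guard",
    "timestamp": "2025-01-01T00:00:00Z",
    "inputCommand": {"topic": "/cmd_vel", "linear_vx": 0.8},
    "outputCommand": {"topic": "/safe_cmd_vel", "linear_vx": 0.0},
    "stateSource": "/odom",
    "decision": {
        "status": "RESYNCING",
        "guardAction": "BRAKE_AND_RESYNC",
        "dominantCause": "WHEEL_SLIP",
        "residual": 5.391,
    },
    "lookbackWindowMs": 500,
})
event = to_upstream_event(sample)
```

Keeping the payload as the original string, rather than re-serializing the parsed record, means whatever gets signed is byte-for-byte what the guard emitted.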

I agree that the first demo should stay concrete:

ros2_kinematic_guard emits one GuardDecisionRecord
→ kernel ingests it as UpstreamEvent
→ kernel signs it
→ MCP answers: “why was this command downgraded?”

Also agreed on the REP scope: for now, NARH-lite should be presented simply as an order-sensitive residual engine over command / feedback / timing windows. No need to bring the broader CLIM theory into the initial REP discussion. I’ll follow up by email so we can align the schema without expanding the public thread too much.

If you move this discussion to private direct messages, no one else will be able to participate and you won’t benefit from insights others might have to offer.

Hi guys,

I’ve just published a project announcement for runtime_integrity (formerly ros2_kinematic_guard) that focuses exactly on translating physical-execution divergence into structured, machine-readable causal evidence for Article 12/14 compliance.
It runs as a non-invasive middleware layer below Nav2 without altering the autonomy stack. I’d love to get your insights on the planned audit event schema and how it could interface with higher-level policy engines or governance frameworks:


@gbiggs which tools would you suggest someone interested in #1 should start looking into? I am quite at home with Behavior Trees, but to trace what went wrong I usually use Groot’s visualizer and console printouts. Not the best approach, but for my project it has worked so far.