Posted by @mxgrey:
Are you noticing a problem with reporting the cancelled status? I would expect it to work fine by virtue of this line which changes the status as soon as the cancellation happens. Then we protect the cancelled status here so it should remain in effect for the rest of the task’s lifetime.
If we do what you’re suggesting we would lose the distinction between cancelled and killed status, which isn’t a big deal but I’d rather keep the distinction if we can.
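Roughly speaking, the protection amounts to something like this (just a sketch with illustrative names, not the actual source):
// Sketch only: once _cancelled is set, later status updates are ignored so
// the event keeps reporting Canceled for the rest of the task's lifetime.
// The _cancelled and _state names mirror the snippets discussed below.
void update_status_guarded(rmf_task::Event::Status new_status)
{
  if (_cancelled)
  {
    // The cancellation already set the status; don't overwrite it.
    return;
  }
  _state->update_status(new_status);
}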
Posted by @asrafhamid:
Thanks Grey for clarifying.
I suspect the issue occurs because the status is being updated to Standby (line) even after the event has already been canceled. This causes the status to get stuck in Standby instead of remaining in Canceled.
To resolve this, I suggest moving the cancellation check earlier in the on_finish block and skipping the Standby status update when the event is canceled:
if (me->_cancelled)
{
me->_finished();
return; // Skip the Standby status update
}
me->_state->update_status(rmf_task::Event::Status::Standby);
me->_publish_update();
Although this is only a local fix, I hope it addresses your concerns and aligns with the intended behavior.
Posted by @mxgrey:
Ahhhh I didn’t notice that stray status change. You’re right, it shouldn’t be doing that specific status update when it’s cancelled.
I think we should still do everything else before we reach the me->_finished() line, so I’d tweak your suggestion to this:
if (!me->_cancelled)
{
me->_state->update_status(rmf_task::Event::Status::Standby);
}
Then everything else can remain the same.
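Putting the two fragments together, the adjusted on_finish block would look roughly like this (a sketch based on the snippets above; the surrounding code may differ):
// Sketch of the adjusted on_finish handling, combining the fragments above:
// skip only the Standby status update when the event was cancelled, but still
// run the rest of the cleanup before finishing.
if (!me->_cancelled)
{
  me->_state->update_status(rmf_task::Event::Status::Standby);
}
me->_publish_update();
me->_finished();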
Edited by @mxgrey at 2025-04-10T07:19:44Z
Posted by @mxgrey:
I believe 22d407f should fix the status updates.
I’ve also added aa3a86a which introduces the stubbornness feature we discussed, so you can set a time period during which the robot will report to the traffic schedule that it is remaining in its current position. This time period is refreshed every second.
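Conceptually the refresh loop is just a one-second timer that keeps re-reporting the position until the configured period runs out. A sketch of the idea (not the actual rmf_fleet_adapter code; report_holding_position is a stand-in for whatever publishes the holding itinerary):
#include <chrono>
#include <functional>
#include <memory>
#include <rclcpp/rclcpp.hpp>

// Conceptual sketch only: every second, until the stubborn period elapses,
// re-report the robot's current position to the traffic schedule.
rclcpp::TimerBase::SharedPtr make_stubborn_timer(
  const rclcpp::Node::SharedPtr& node,
  double stubborn_period_s,
  std::function<void()> report_holding_position)
{
  const auto deadline =
    node->now() + rclcpp::Duration::from_seconds(stubborn_period_s);

  // The timer handle is shared into its own callback so it can cancel itself
  // once the deadline passes.
  auto timer = std::make_shared<rclcpp::TimerBase::SharedPtr>();
  *timer = node->create_wall_timer(
    std::chrono::seconds(1),
    [node, deadline, timer, report_holding_position]()
    {
      if (node->now() >= deadline)
      {
        (*timer)->cancel();  // the stubborn period is over
        return;
      }
      report_holding_position();  // refresh the "holding here" report
    });

  return *timer;
}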
I believe with those two changes we’ve addressed all the concerns that we discussed. Let me know if I’ve forgotten anything or if you encounter a new issue.
Edited by @mxgrey at 2025-04-15T08:18:24Z
Posted by @asrafhamid:
Sorry, is there an updated version of DynamicEvent.action? It seems to be missing a stubborn_period field.
Posted by @asrafhamid:
Thanks Grey for the status fix and stubbornness feature, will get to testing asap.
On a separate note, I noticed that out of the 3 events (next, cancel, end), only the ‘next’ event returns a succeeded goal result, but the others do not. Curious if this is intended behavior, since it seems more consistent for all events to return succeeded (with a meaningful result) when they complete their respective actions.
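(For context: with rclcpp_action the server has to close out each accepted goal explicitly, along the lines of the sketch below; MyAction and the result fields are illustrative, not the DynamicEvent definitions.)
// Generic rclcpp_action sketch (not the DynamicEvent implementation): each
// goal that completes its work should be terminated with succeed() and a
// meaningful result, otherwise the client never sees a terminal succeeded
// state for that goal.
void finish_goal(
  const std::shared_ptr<rclcpp_action::ServerGoalHandle<MyAction>>& goal_handle)
{
  auto result = std::make_shared<MyAction::Result>();
  // ... fill in fields describing what the event actually did ...
  goal_handle->succeed(result);
}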
Posted by @mxgrey:
Only the ‘next’ event returns a succeeded goal result, but the others do not. Curious if this is intended behavior
Whoops, that’s definitely not intentional! I’ll fix that up right now.
Posted by @mxgrey:
End Event goal response : Segmentation Fault, crashes fleet adapter
Whoops, that was a copy/paste error. Should be fixed by this: 6d69c47
Despite adding a stubborn_period (10s - 60s) to a next event, there’s no change in behaviour upon reaching the destination. Currently tinkering to see how it works
That’s surprising... Be sure to update the rmf_internal_msgs branch that you’re using, if you haven’t done that already.
I opened this PR that offers a --stubbornness argument so that the feature can be tested through the command line. When you assign some value for --stubbornness you should see the robot maintain its position on the traffic schedule after arriving at its destination, or even after its go_to_place gets cancelled. If you’re looking at rviz, that will show up as a large yellow dot around the robot’s purple dot.
I would suggest trying out the rmf_demos_tasks script on your machine to see whether the stubbornness works through that. If it does, then I would guess something is going wrong in the message translation.
Posted by @asrafhamid:
apr_21_issue.zip
Regarding the issue observed on Monday (April 21), I’ve attached the fleet adapter log and ROS bag for reference.
Summary (office world):
- Send tinyRobot1 to patrol_D1 (stubborn_period: 20) and wait until it arrives
- Send tinyRobot2 to patrol_A1 (stubborn_period:30)
- tinyRobot2 will get blocked by tinyRobot1
- after 30s, tinyRobot2 will attempt to replan (it might rotate in place)
- tinyRobot2’s yellow marker will cross over tinyRobot1 until it reaches patrol_A1
Relevant Log Excerpt (Replanning Attempt):
[INFO] [1745551317.451841260] [tinyRobot_fleet_adapter]: [tinyRobot/tinyRobot2] waiting for traffic
[INFO] [1745551347.452872743] [tinyRobot_fleet_adapter]: Replanning for [tinyRobot/tinyRobot2] because a traffic dependency is excessively delayed
Posted by @mxgrey:
Thanks for documenting this problem. I did manage to recreate it on my own machine.
This was quite a vexing problem to unravel because I couldn’t find any logical reason that one robot would be failing to update its schedule but not the other, especially since both are managed by the same fleet adapter.
Ultimately what I discovered was that the rclcpp timer for updating the schedule delay stopped firing for one of the robots, even though it should have kept firing every second.
I created this debug_vanishing_timer branch to narrow the problem down to this. If you run with that branch you’ll see:
[fleet_adapter-16] timer: 0x7fb0901a90c0 is_cancelled: 0 is_ready: 0 time_until_trigger: 0.775857
[fleet_adapter-16] timer: 0x7fb0901a90c0 is_cancelled: 0 is_ready: 1 time_until_trigger: -0.223956
[fleet_adapter-16] timer: 0x7fb0901a90c0 is_cancelled: 0 is_ready: 1 time_until_trigger: -1.22419
[fleet_adapter-16] timer: 0x7fb0901a90c0 is_cancelled: 0 is_ready: 1 time_until_trigger: -2.22389
[fleet_adapter-16] timer: 0x7fb0901a90c0 is_cancelled: 0 is_ready: 1 time_until_trigger: -3.22397
[fleet_adapter-16] timer: 0x7fb0901a90c0 is_cancelled: 0 is_ready: 1 time_until_trigger: -4.22382
[fleet_adapter-16] timer: 0x7fb0901a90c0 is_cancelled: 0 is_ready: 1 time_until_trigger: -5.224
[fleet_adapter-16] timer: 0x7fb0901a90c0 is_cancelled: 0 is_ready: 1 time_until_trigger: -6.22379
[fleet_adapter-16] timer: 0x7fb0901a90c0 is_cancelled: 0 is_ready: 1 time_until_trigger: -7.22402
[fleet_adapter-16] timer: 0x7fb0901a90c0 is_cancelled: 0 is_ready: 1 time_until_trigger: -8.2239
[fleet_adapter-16] timer: 0x7fb0901a90c0 is_cancelled: 0 is_ready: 1 time_until_trigger: -9.2239
and the time_until_trigger just keeps getting more and more negative forever, implying that rclcpp should know that it hasn’t fired the timer in a long time despite it being ready.
So the problem resides somewhere in rclcpp, and I don’t think there’s anything that can be done in Open-RMF to resolve this.
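If you want to reproduce that diagnostic output yourself, it can be generated with rclcpp’s public timer introspection methods, roughly like this (a sketch; the debug_vanishing_timer branch may differ in detail):
#include <chrono>
#include <iostream>
#include <rclcpp/rclcpp.hpp>

// Dump a timer's state once per call, matching the fields printed above.
// A healthy timer should not stay at is_ready == 1 while time_until_trigger
// keeps getting more negative, because the executor should be firing it.
void print_timer_state(const rclcpp::TimerBase::SharedPtr& timer)
{
  const double time_until_trigger =
    std::chrono::duration<double>(timer->time_until_trigger()).count();

  std::cout << "timer: " << timer.get()
            << " is_cancelled: " << timer->is_canceled()
            << " is_ready: " << timer->is_ready()
            << " time_until_trigger: " << time_until_trigger
            << std::endl;
}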
The good news is, I just tested this scenario on the latest rolling version of ROS 2 and the problem went away. Presumably whatever was causing this bug has been fixed in the latest rclcpp.
If you’re interested in testing this out with rolling, you’ll need the recreate_vel_cmds branch of rmf_simulation in order to be compatible with the latest gz-sim.
Edited by @mxgrey at 2025-04-25T12:57:12Z