I have a fleet with one robot operating with rmf_ros2 and rmf-web Humble.
The fleet adapter uses the FullControl API (RobotCommandHandle)
The full chain from rmf to robot is
rmf_ros2 → fleet_adapter→fleet_server → fleet_client (old DDS approach for legacy reasons)
The issue is that when a robot is shutdown (e.g. fleet_client is brought down), the api_server’s /fleet end-point keeps the robot in the fleet. The web dashboard keeps showing the stale robot and the last Updated time (Another bug) keeps ticking into current time.
[
{ "name": "omnit_ATR2", "robots": { "SIMULATION": { "name": "SIMULATION", "status": "idle", "task_id": "", "unix_millis_time": 1762423260202, "location": { "map": "L1", "x": 31.780725479125977, "y": -38.208396911621094, "yaw": -1.5786551237106323 }, "battery": 0.948580322265625, "issues": [] } } } ]
I’ve fixed the fleet_server to remove the stale robot from /fleet_states when the fleet_client disconnects or times out.
When a robot is available, it appears in /fleet_states . See below:
-
name: SIMULATION model: ATR2 [removed fields] sec: 14 nanosec: 750000000 x: 31.769018173217773 y: -37.97697448730469 yaw: -1.5786962509155273 obey_approach_speed_limit: false approach_speed_limit: 0.0 [removed fields]
When the robot is shutdown, it is removed from /fleet_states
name: omnit_ATR2
robots: []
With code below, I’m able to remove the robot in the fleet_adapter, but cannot add it in again and get a segfault.
connections->fleet_state_sub = node->create_subscription<
rmf_fleet_msgs::msg::FleetState>(
rmf_fleet_adapter::FleetStateTopicName,
rclcpp::SystemDefaultsQoS().keep_last(10),
[c = std::weak_ptr<Connections>(connections), fleet_name](
const rmf_fleet_msgs::msg::FleetState::SharedPtr msg)
{
if (msg->name != fleet_name) {
return;
}
const auto connections = c.lock();
if (!connections) {
return;
}
// Track which robots are present in the latest FleetState message
std::unordered_set<std::string> present;
present.reserve(msg->robots.size());
for (const auto & state : msg->robots) {
present.insert(state.name);
const auto insertion = connections->robots.insert(
{state.name,nullptr});
const bool new_robot = insertion.second;
if (new_robot) {
// We have not seen this robot before, so let's add it to the fleet.
connections->add_robot(fleet_name, state);
}
const auto & command = insertion.first->second;
if (command) {
// We are ready to command this robot, so let's update its state
command->update_state(state);
}
}
// Remove any robots that are no longer present in the FleetState.
// This ensures stale robots are fully deregistered from RMF, including
// their negotiators, using FleetUpdateHandle::remove_robot(name).
std::vector<std::string> to_remove;
{
auto lock = connections->lock();
to_remove.reserve(connections->robots.size());
for (const auto & [name, _] : connections->robots) {
if (present.find(name) == present.end()) {
to_remove.push_back(name);
}
}
}
for (const auto & name : to_remove) {
RCLCPP _INFO(connections->adapter->node()->get_logger(),"Robot [%s] absent from latest FleetState for fleet [%s]; removing from RMF",
name.c_str(), fleet_name.c_str());
// Best-effort cleanup: clear the participant's itinerary so it is
// removed from the traffic schedule. Full unregistration is not
// available in this RMF version.
{
auto lock = connections->lock();
const auto it = connections->robots.find(name);
if (it != connections->robots.end()) {
if (const auto updater = it->second ? it->second->get_updater() : nullptr) {
if (auto participant = updater->unstable().get_participant()) {
// Clear itinerary to remove routes from the schedule.
try {
participant->clear();
} catch (...) {
// Ignore errors from clear and continue.
}
}
}
}
}
}
});
Ran a back-trace and found this:
terminate called after throwing an instance of ‘std::runtime_error’
what(): [rmf_traffic_ros2::schedule::Negotiaton] Attempt to register a duplicate negotiator for participant [3]
Any recommendations to safely remove and re-add the same robot? I imagine its a very common scenario in real-world use cases.
Has this been fixed in an upstream branch?
