Analysis on FusionCore vs robot_localization

A few days ago I shared a benchmark where FusionCore beat the robot_localization (RL) EKF on a single NCLT sequence. Fair enough… people called out that one sequence can easily be cherry-picked. Someone also mentioned that the particular sequence I used is known to be rough for GPS-based filters. Others asked whether RL was just badly tuned, or how FusionCore could outperform it by that much if both are just nonlinear Kalman filters.


All good questions.

So I went back and ran six sequences across different weather conditions. Same config for everything. No parameter tweaks between runs. The config is in fusioncore_datasets/config/nclt_fusioncore.yaml, committed along with the results so anyone can check.


FusionCore wins 5 of 6. RL-UKF diverged with NaN on all six.


Now, the obvious question: what happened with November 2012? That’s the one where RL wins.

That sequence has sustained GPS degradation… this isn’t just occasional noise. The NCLT authors themselves mention elevated GPS noise in that session. Both filters are seeing the exact same data, so the difference really comes down to how they handle it.

Here’s what’s going on:

FusionCore has a gating mechanism. When GPS looks bad, it rejects those measurements. That’s usually a good thing… but in this case, the degradation is continuous. So FusionCore rejects a few GPS fixes → the state drifts → the next GPS measurement looks even worse relative to that drifted state → it gets rejected again → and this repeats. It kind of traps itself rejecting the very data it needs to recover.

RL, on the other hand, just accepts every GPS update. No gating, no rejection. That means it gets pulled around by noisy GPS, but it also re-anchors itself as soon as the signal improves. So in this specific case, that “always accept” behavior actually helps.

After discussing this with some hardware folks here in Kingston, we decided to add something we’re calling an inertial coast mode. The idea is simple:

  • If FusionCore sees N consecutive GPS rejections, it increases the position process noise (Q)

  • That causes the covariance (P) to grow

  • As P grows, the Mahalanobis gate naturally becomes less strict

  • Eventually, incoming GPS measurements are no longer “too far” and get accepted again

  • Once GPS is accepted, Q resets back to normal

Basically, instead of getting stuck rejecting everything, the filter “loosens up” over time and lets itself recover.
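The coast-mode steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not FusionCore's actual code: the class name, the scalar per-axis variances, and the values of `n_reject` and `q_scale` are all assumptions made up for the example. Only the chi²(2, 0.999) gate value comes from this thread.

```python
CHI2_GATE_2DOF = 13.82  # chi²(2, 0.999), the gate value quoted in this thread


class CoastGate:
    """Mahalanobis gate that inflates process noise after repeated rejections.

    Hypothetical sketch: names and defaults are illustrative only.
    """

    def __init__(self, q_nominal, n_reject=5, q_scale=100.0, gate=CHI2_GATE_2DOF):
        self.q_nominal = q_nominal  # normal position process noise (m^2 per step)
        self.n_reject = n_reject    # consecutive rejections before coasting starts
        self.q_scale = q_scale      # how much Q is inflated while coasting
        self.gate = gate            # squared Mahalanobis distance limit
        self.rejections = 0

    def process_noise(self):
        """Position process noise Q to use in the next prediction step."""
        coasting = self.rejections >= self.n_reject
        return self.q_nominal * self.q_scale if coasting else self.q_nominal

    def accept(self, innovation, p_var, r_var):
        """Gate a 2D position innovation; P and R are isotropic variances here."""
        dx, dy = innovation
        d2 = (dx * dx + dy * dy) / (p_var + r_var)  # y^T S^-1 y with S = P + R
        accepted = d2 <= self.gate
        # An accepted fix resets the counter, which also resets Q to nominal.
        self.rejections = 0 if accepted else self.rejections + 1
        return accepted


def steps_until_accept(gate, offset_m, p_var=1.0, r_var=4.0, max_steps=50):
    """Count prediction steps until a fix offset_m from the state passes the gate."""
    for step in range(1, max_steps + 1):
        p_var += gate.process_noise()  # the prediction step grows P by Q
        if gate.accept((offset_m, 0.0), p_var, r_var):
            return step
    return None
```

With these made-up numbers, `steps_until_accept(CoastGate(q_nominal=0.1), 30.0)` eventually returns a step count, while a gate that never enters coast mode (for example `n_reject=10**9`) returns `None` within the same 50 steps. The sketch only models the gate and the growth of P; it does not update the state once a fix is accepted.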

On the November 2012 sequence, this drops the error from 61.4 m → 28.7 m. RL still wins, but the gap is much smaller now, and everything is documented in the repo.

If your robot drives through tunnels, underpasses, agricultural land, and/or urban canyons with brief GPS dropouts, FC’s gate is a strength… it doesn’t get corrupted by the bad fixes during the outage. If you have GPS that is consistently mediocre (a cheap receiver that is always noisy but never totally wrong), RL’s accept-everything approach is probably safer, at least until coast mode gets smarter.

If you’ve got ideas on improving this… especially around re-acquisition or better fallback behavior… I’m all ears. Suggestions, config tweaks, PRs… all welcome.

Reproducing a run is straightforward:

git clone https://github.com/manankharwar/fusioncore.git
# Download NCLT sequence from http://robots.engin.umich.edu/nclt/
ros2 launch fusioncore_datasets nclt_benchmark.launch.py \
  data_dir:=/path/to/nclt/2012-01-08 \
  output_bag:=./bag
python3 tools/evaluate.py --gt ground_truth.tum \
  --fusioncore fusioncore.tum --rl rl_ekf.tum \
  --sequence 2012-01-08

Full pipeline in benchmarks/README.md. Results per sequence under benchmarks/nclt/*/results/BENCHMARK.md.

November 2012 is an open problem. Coast mode cuts the error by 53% but RL’s no-gate approach still wins under sustained GPS degradation. Fully closing the gap requires either a smarter re-acquisition strategy or a tunable fallback threshold. Pull requests are welcome.

If you’ve got a dataset you want me to try, just send it over (or drop a link), and I’ll run it and share the results.

FusionCore accepts nav_msgs/Odometry from any source including slam_toolbox, MOLA, ORB-SLAM3, and even VINS-Mono. Same interface as wheel odometry.

GitHub: manankharwar/fusioncore (ROS 2 sensor fusion SDK: UKF, 3D native, proper GNSS, zero manual tuning. Apache 2.0.)

Happy Building!


Adding a note for anyone who finds this: I’ve since added a migration guide (docs/migration_from_robot_localization.md) that maps every RL parameter to its FusionCore equivalent. The gravity flag inversion is the most common gotcha, so that section is worth reading specifically if you’re switching.

I am all for these open experiments. But it feels like you’re not being completely honest.

The statement that robot_localization has no gating or outlier rejection, made repeatedly in your post, is simply not true. robot_localization has had Mahalanobis rejection thresholds for…a decade? The sample configuration file in the package clearly shows them. Moreover, it looks like you attempted to specify Mahalanobis rejection thresholds in your EKF config for your experiments, but somehow managed to use a parameter name that doesn’t even exist. This suggests one of two things:

  • You knew that r_l supported gating measurements based on Mahalanobis distance thresholds, but intentionally stated in your post that it doesn’t support them
  • You didn’t read the r_l code or configuration files before making this statement

In any case, I’m not precious about r_l. I barely have time to look at the PRs that come through these days. But I think that being transparent and honest is important when doing these comparisons, especially when it seems like there’s little reason to be disingenuous - your package seems to produce good results.


@automatom: thank you for the detailed corrections. You caught real problems and I want to respond to each one properly. I’ve re-run the full benchmark with corrected configs and updated the docs.


On the benchmark config bug:

You were right. The RL config used odom0_mahal_threshold and odom1_mahal_threshold, which don’t exist in robot_localization. RL silently ignored them and ran with no outlier rejection at all. The correct parameters are odom0_twist_rejection_threshold and odom1_pose_rejection_threshold.

To confirm the correct values, I read measurement.hpp in the Jazzy RL install directly:

“The Mahalanobis distance threshold in number of sigmas”

That means the threshold is the unsquared Mahalanobis distance. The corrected values:

  • odom0_twist_rejection_threshold: 4.03 → chi²(3, 0.999) = 16.27, sqrt = 4.03

  • odom1_pose_rejection_threshold: 3.72 → chi²(2, 0.999) = 13.82, sqrt = 3.72
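The two thresholds above can be double-checked with nothing but the Python standard library. This is a minimal sketch: the function names are made up for the example, and it only handles 2 and 3 degrees of freedom, where the chi-squared CDF has a simple closed form. The quantile is found by bisection and then square-rooted to get the "number of sigmas" value.

```python
import math


def chi2_cdf(x, dof):
    """Closed-form chi-squared CDF for 2 or 3 degrees of freedom."""
    if dof == 2:
        return 1.0 - math.exp(-x / 2.0)
    if dof == 3:
        return (math.erf(math.sqrt(x / 2.0))
                - math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))
    raise ValueError("this sketch only handles dof 2 and 3")


def chi2_quantile(p, dof):
    """Invert the CDF by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sigma_threshold(p, dof):
    """Unsquared Mahalanobis threshold for confidence level p."""
    return math.sqrt(chi2_quantile(p, dof))


print(round(sigma_threshold(0.999, 3), 2))  # twist gate, 3 DOF
print(round(sigma_threshold(0.999, 2), 2))  # pose gate, 2 DOF
```

If SciPy is available, `scipy.stats.chi2.ppf(0.999, dof)` gives the same quantile without the hand-rolled bisection.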

These are set to the same statistical confidence level (99.9%) as FusionCore’s chi-squared gates. I re-ran all 6 NCLT sequences with the corrected config. FusionCore still wins 5 of 6, so the original “wins 5/6” claim stands, though the margins changed significantly.

There’s an interesting finding in the re-run: with proper gating, RL-EKF degraded on 4 of 6 sequences. On 2012-03-31, for example, it went from 10.8 m (no gating) to 54.3 m (correct gating). This is because RL takes GPS measurement covariance directly from the NavSatFix message via navsat_transform. NCLT’s GPS receiver reports covariances tighter than the actual noise, so at chi²(2, 0.999) valid fixes get rejected as outliers. FusionCore uses a user-specified gnss.base_noise_xy that’s tuned to match real sensor behavior, giving better-calibrated innovation statistics under the same threshold. This is a genuine architectural difference, not a tuning artifact.
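To make the over-confidence effect concrete, here is a small sketch with invented numbers (none of them are measured NCLT values). It computes the unsquared Mahalanobis distance of the same 2D innovation twice: once with an over-tight reported GPS variance, once with a more realistic one. The isotropic scalar variances and the function name are simplifying assumptions.

```python
import math

POSE_GATE_SIGMAS = 3.72  # sqrt(chi²(2, 0.999)), the threshold quoted above


def mahalanobis_2d(innovation, state_var, meas_var):
    """Unsquared Mahalanobis distance for isotropic P and R (S = P + R)."""
    dx, dy = innovation
    return math.sqrt((dx * dx + dy * dy) / (state_var + meas_var))


innovation = (4.0, 3.0)  # a 5 m offset between GPS fix and filter state (illustrative)
state_var = 1.0          # filter position variance in m^2 (illustrative)

# Receiver claims 0.5 m sigma, but the real noise is closer to 3 m sigma.
d_reported = mahalanobis_2d(innovation, state_var, meas_var=0.5 ** 2)
d_realistic = mahalanobis_2d(innovation, state_var, meas_var=3.0 ** 2)

print(d_reported > POSE_GATE_SIGMAS)   # tight covariance: the fix is gated out
print(d_realistic > POSE_GATE_SIGMAS)  # realistic covariance: the fix passes
```

The same 5 m innovation is about 4.5 sigmas under the tight covariance and about 1.6 sigmas under the realistic one, which is the mechanism described above: a valid fix looks like an outlier only because R is too small.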

In short, the gating that should help RL-EKF hurts it here because the reported covariances are too tight. Curious if you have a different read on this.


All config files: here’s everything so you can validate

The complete benchmark pipeline is in the repo. The relevant files:

RL-EKF config (the one with the corrected gating):
https://github.com/manankharwar/fusioncore/blob/main/fusioncore_datasets/config/rl_ekf.yaml

RL-UKF config:
https://github.com/manankharwar/fusioncore/blob/main/fusioncore_datasets/config/rl_ukf.yaml

navsat_transform config:
https://github.com/manankharwar/fusioncore/blob/main/fusioncore_datasets/config/navsat_transform.yaml

FusionCore config used for the NCLT run:
https://github.com/manankharwar/fusioncore/blob/main/fusioncore_datasets/config/nclt_fusioncore.yaml

The launch file that runs all three filters simultaneously (FusionCore + RL-EKF + RL-UKF + navsat_transform, all on the same data at the same time):
https://github.com/manankharwar/fusioncore/blob/main/fusioncore_datasets/launch/nclt_benchmark.launch.py

Per-sequence results (each has the ATE/RPE table and methodology) are under benchmarks/nclt/*/results/BENCHMARK.md.

Full reproduce instructions (download NCLT data, run benchmark, evaluate):
https://github.com/manankharwar/fusioncore/blob/main/benchmarks/README.md

Everything runs on ROS 2 Jazzy. The NCLT data is public at http://robots.engin.umich.edu/nclt/


On RL’s outlier rejection: you were right, I was wrong

The claim “RL-EKF has no rejection gate” was completely false. RL has had Mahalanobis rejection thresholds for a decade. I’ve removed that claim from the README, docs/index.md, and benchmark.md.


On RL-UKF: no change needed

The NaN divergence on all 6 sequences was independently confirmed by you (numerical instability in the UKF implementation). That part of the results stays as-is.


What’s been updated:

  • fusioncore_datasets/config/rl_ekf.yaml: correct parameter names

  • All 6 benchmarks/nclt/*/results/BENCHMARK.md: re-run results

  • docs/reference/benchmark.md: updated table + explanation of gating behavior

  • README.md: updated table

The benchmark repo and full reproduce instructions are at: https://github.com/manankharwar/fusioncore/tree/main/benchmarks

I appreciate you taking the time to look at this carefully. If you see anything else wrong, please keep flagging it.


Is this something like “minimum covariance” in case the user knows the sensor can be over-confident? And how is it applied? You create a sphere and then scale the reported covariance until the min sphere fits in? Or do you somehow distort the real covariance?


base_noise_xy is only used in the fallback path, not as a floor on top of the sensor covariance. The logic is:

  1. If the NavSatFix has a valid full 3×3 covariance (all diagonal elements positive) → FusionCore uses it directly, untouched.

  2. If not (receiver only reports HDOP/VDOP, or covariance is zero/negative) → FusionCore builds the matrix from scratch: sigma_xy = base_noise_xy × hdop, sigma_z = base_noise_z × vdop, then R = diag(sigma_xy², sigma_xy², sigma_z²).

So, it’s neither a sphere clamp nor a distortion of the real covariance. It’s a scale factor for the HDOP-based fallback… essentially “how many metres of 1-sigma noise do you trust per unit of HDOP?” The NCLT receiver apparently doesn’t populate a valid covariance, so FusionCore fell through to this path and used the HDOP-scaled estimate throughout the benchmark.
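The two paths described above can be sketched roughly like this. It is not the FusionCore source: the function name and signature are hypothetical, and how HDOP/VDOP reach the filter is left as plain arguments. The only things taken from the description are the "all diagonal elements positive" check and the sigma formulas. The 9-element row-major layout matches the `position_covariance` field of sensor_msgs/NavSatFix.

```python
def gnss_measurement_covariance(position_covariance, hdop, vdop,
                                base_noise_xy, base_noise_z):
    """Return a 3x3 measurement covariance R as nested lists.

    position_covariance: 9 floats, row-major, as in sensor_msgs/NavSatFix.
    hdop, vdop, base_noise_*: hypothetical inputs for the fallback path.
    """
    diag = (position_covariance[0], position_covariance[4], position_covariance[8])

    # Path 1: receiver supplied a usable covariance, so pass it through untouched.
    if all(v > 0.0 for v in diag):
        return [list(position_covariance[row * 3:row * 3 + 3]) for row in range(3)]

    # Path 2: build R from scratch using DOP-scaled 1-sigma noise.
    sigma_xy = base_noise_xy * hdop
    sigma_z = base_noise_z * vdop
    return [
        [sigma_xy ** 2, 0.0, 0.0],
        [0.0, sigma_xy ** 2, 0.0],
        [0.0, 0.0, sigma_z ** 2],
    ]
```

For example, an all-zero covariance with `hdop=1.5`, `vdop=2.0`, `base_noise_xy=2.0` and `base_noise_z=4.0` (made-up values) gives sigma_xy = 3 m and sigma_z = 8 m, so R = diag(9, 9, 64).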
