Ferronyx – Real-Time ROS2 Observability & Automated RCA

We’ve been building robots with ROS2 for years, and we hit the same wall every time a robot fails in production.

The debugging process:

  • SSH into the machine

  • Grep through logs

  • Check ROS2 topics (which ones stopped publishing?)

  • Replay bag files

  • Cross-reference with deployment changes

  • Try to correlate infrastructure issues with ROS state

This takes 3-4 hours. Every time.
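To make the "which ones stopped publishing?" step concrete, here is a minimal sketch of a staleness check. It is an illustration only, not Ferronyx's implementation: `TopicStatus`, `find_stalled_topics`, and the `tolerance` factor are hypothetical names, and it assumes you already record the time of the last message per topic (for example, from a subscriber callback).

```python
from dataclasses import dataclass


@dataclass
class TopicStatus:
    name: str               # topic name, e.g. "/odom"
    last_msg_time: float    # time of the last received message, in seconds
    expected_period: float  # nominal seconds between messages


def find_stalled_topics(statuses, now, tolerance=3.0):
    """Return names of topics silent for more than tolerance * expected_period."""
    return [
        s.name
        for s in statuses
        if now - s.last_msg_time > tolerance * s.expected_period
    ]


# Hypothetical snapshot taken at t = 100.0 s
statuses = [
    TopicStatus("/scan", last_msg_time=99.95, expected_period=0.1),
    TopicStatus("/odom", last_msg_time=97.0, expected_period=0.05),
    TopicStatus("/camera/image_raw", last_msg_time=99.98, expected_period=0.033),
]
stalled = find_stalled_topics(statuses, now=100.0)
print(stalled)
```

In this made-up snapshot only `/odom` has been silent for longer than three nominal periods. The threshold is a judgment call: too tight and bursty topics raise false alarms, too loose and a real stall goes unnoticed for longer.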

The problem: ROS gives you raw telemetry, but zero intelligence connecting infrastructure metrics + ROS topology + deployment history. You’re manually stitching pieces together.

So we built Ferronyx to be that intelligence layer.

What we did:

  • Real-time monitoring of ROS2 topics, nodes, actions + infrastructure (CPU, GPU, memory, network)

  • When something breaks, AI analyzes the incident chain and suggests probable root causes

  • Deployment markers show exactly which release caused the failure

  • Track sensor health degradation before failures happen
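As a rough illustration of the deployment-marker idea, the sketch below looks up the most recent release deployed before an incident. It is an assumption about how such a lookup could work, not a description of Ferronyx internals; `last_deployment_before` and the release names are hypothetical, and the most recent release is only a candidate cause, not proof.

```python
from bisect import bisect_right


def last_deployment_before(deployments, incident_time):
    """Return the release active at incident_time, or None if none was deployed yet.

    deployments: list of (timestamp, release) tuples sorted by timestamp.
    """
    times = [t for t, _ in deployments]
    i = bisect_right(times, incident_time)  # number of deployments at or before the incident
    return deployments[i - 1][1] if i else None


# Hypothetical deployment history (timestamps in seconds)
deployments = [(1000.0, "v1.4.0"), (2000.0, "v1.4.1"), (3000.0, "v1.5.0")]
suspect = last_deployment_before(deployments, incident_time=2500.0)
print(suspect)
```

Keeping the markers sorted by time makes each lookup a binary search, so the same history can be checked against many incidents cheaply.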

Real results from our beta customers:

  • MTTR: 3-4 hours → 12-15 minutes

  • One customer caught sensor drift they couldn’t see manually

  • Another correlated a specific firmware version with navigation failures

We’re looking for 8-12 more teams to beta test and help us refine this. We want teams that:

  • Run ROS2 in production (warehouses, humanoids, autonomous vehicles)

  • Actually deal with downtime/reliability issues

  • Will give honest feedback

Free beta access. You help shape the product; we learn what breaks.

If you’re dealing with robot reliability headaches, reply here or send a DM. Would genuinely love to hear your toughest debugging stories.

Links:
https://ferronyx.com/

First off, great problem to work on! Also, your link is broken.

Can you share a bit more about your stack? Is it open-source? Is it built on open-source, and if so, on which libraries and tools? Do you somehow sandbox the code running on the robot? If not, does it at least run without sudo? You say it is free during beta; any sense of what it will cost long-term? Correlating logs from multiple sources, incl. robot + infra, is useful in its own right. Will it be possible to use that without AI as well? If so, what’s the data store and what querying capabilities will you expose? How would you compare your solution to existing ones in the market? Put differently: why do you think we need another one?

Thanks for the questions — happy to clarify.
Ferronyx runs a lightweight C++ ROS 2 daemon on the robot. It currently runs with sudo to access system-level metrics, but is strictly read-only and performs no actuation or code execution. All heavy processing and correlation happens on our server, not on the robot.
The product is UI-first today; we’re not exposing public query APIs yet. Robot + infrastructure correlation works without AI; AI is layered on top for insights and summaries.
Ferronyx is free during beta, with per-robot pricing planned post-beta. It’s built to be ROS-native and robot-first, with a focus on real failure debugging and the R&D → production transition.