Thanks to everyone who attended today!
Meeting Recording
Meeting Notes
SROS2 improvement suggestions
During the meeting, we went through the different ideas and potential designs. Most of the content is captured in the tickets, but there were some additional questions / comments (see below).
Security Profile Library for AppArmor
opened 11:43PM - 03 Apr 19 UTC
ready
Many software projects releasing for Linux ship with included AppArmor profiles to secure their applications at runtime. To enable ROS package developers to succinctly write and maintain similar app policies, I previously wrote an AppArmor library that pre-configures default tunables and includable permission abstractions to simplify authored profiles.
I’d like to suggest we package this AppArmor library for release in SROS2, perhaps calling it `sros_apparmor`. This package would mainly serve as a delivery mechanism to install the necessary `include` files in `/etc/apparmor.d/`. Downstream ROS2 package developers could cite `sros_apparmor` as a runtime dependency so that the custom profiles they install to `/etc/apparmor.d/` could include the required permission abstractions.
---
## Background
> AppArmor ("Application Armor") is a Linux kernel security module that allows the system administrator to restrict programs' capabilities with per-program profiles. Profiles can allow capabilities like network access, raw socket access, and the permission to read, write, or execute files on matching paths. AppArmor supplements the traditional Unix discretionary access control (DAC) model by providing mandatory access control (MAC).
https://wikipedia.org/wiki/AppArmor
I initially started a profile library abstraction for AppArmor back with SROS1. However, given that the library had to be manually installed, adoption in ROS1 was rather limited:
https://github.com/ros-infrastructure/apparmor_profiles
http://wiki.ros.org/SROS/Tutorials/AppArmorAndROS
## Progress
I’ve recently been updating the library for ROS2, and have cpp and python nodes working while under enforced profiles, such as this for the demo node examples shown here:
https://github.com/ros-infrastructure/apparmor_profiles/pull/6
```
# opt.ros.distro.lib.demo_nodes_cpp
include <tunables/global>
include <tunables/ros>
profile demo_nodes_cpp/talker @{ROS_INSTALL_LIB}/demo_nodes_cpp/talker {
include <ros/base>
include <ros/node>
}
profile demo_nodes_cpp/listener @{ROS_INSTALL_LIB}/demo_nodes_cpp/listener {
include <ros/base>
include <ros/node>
}
```
## Next steps
Going forward, I’d like to add more permission abstractions for other ROS2 primitives and events, like process signal handling between launch tools or shared memory access between nodes, while at the same time auditing the existing policy footprint to verify which rulesets are truly necessary and where we may want to relax restrictions to generalize for common use case patterns. Iterating further by writing more example profiles for the rest of the ROS2 demos might be useful for unearthing any remaining corner cases. I’d appreciate help from folks who may also be familiar with either AppArmor debugging or testing advanced ROS2 features.
While rewriting the library, it may also be a good opportunity to re-license the tunables and abstraction files to coincide with the rest of ROS2’s development.
I’d like to eventually host this AppArmor library here with the rest of the sros2 packages, so that downstream users can readily install it and build on top of it. Perhaps packaging the installer (mostly copying files to `/etc/apparmor.d/`) using CMake would be enough of a start?
One challenge for this task is to define an appropriate strategy to bundle those files.
What is @{ROS_INSTALL_LIB} in the profile config files?
This is an AppArmor variable, expanded to the ROS install directory. It is not an environment variable.
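To make the distinction concrete, here is a minimal sketch of how such policy-level variables get substituted when a profile is processed. It is not AppArmor's parser: the `expand` helper and the tunable values are assumptions for illustration (the real values live in the library's `tunables/ros` file), and it ignores that real AppArmor variables may hold several alternative values.

```python
import re

# Hypothetical tunables, for illustration only.
TUNABLES = {
    "ROS_DISTRO": "crystal",
    "ROS_INSTALL_LIB": "/opt/ros/@{ROS_DISTRO}/lib",
}

# Matches AppArmor-style variable references such as @{ROS_INSTALL_LIB}.
_VARIABLE = re.compile(r"@\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand(text, tunables, max_depth=10):
    """Substitute @{NAME} references until none remain.

    Values come only from `tunables`, never from os.environ, which is the
    point made above: these are policy variables, not environment variables.
    """
    for _ in range(max_depth):
        expanded = _VARIABLE.sub(lambda m: tunables[m.group(1)], text)
        if expanded == text:
            return expanded
        text = expanded
    raise ValueError("variable expansion did not terminate")


print(expand("@{ROS_INSTALL_LIB}/demo_nodes_cpp/talker", TUNABLES))
```

An undefined variable raises `KeyError` in this sketch, which mirrors the idea that a profile referencing an unknown tunable should fail to load rather than silently match nothing.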
Are profiles always read from the system?
Additional profiles can also be injected using CLI tools.
Replace openssl subprocess calls with python cryptography library
opened 01:02AM - 06 Apr 19 UTC
closed 01:05AM - 26 Jun 19 UTC
enhancement
backlog
Currently, the sros2 API makes subprocess calls to the system’s OpenSSL CLI for generating public/private x509 key pairs and signing for SMIME. This is a bit hacky, as it relies upon the OpenSSL CLI being consistent across runtime targets, making it quite fragile should the CLI or implicit defaults change. Instead, we should seek to use a proper cryptography library API so we may have finer control over key material provisioning and error handling for the user.
For example, users wishing to use Connext for DDS Security must swap the environment to point to a version of OpenSSL shipped by RTI. RTI’s OpenSSL installer does not fully configure the CLI as it does the shared library, nor does the install respect the system config defaults for OpenSSL. This results in warnings from the RTI OpenSSL CLI binary that are subsequently silenced by sros2’s use of subprocess, obscuring potential errors or deviating crypto settings.
---
## Background
> `cryptography` is a package which provides cryptographic recipes and primitives to Python developers. Our goal is for it to be your "cryptographic standard library". `cryptography` includes both high level recipes and low level interfaces to common cryptographic algorithms such as symmetric ciphers, message digests, and key derivation functions.
https://cryptography.io
By far, my favorite python library for this is `cryptography`, which has a pythonic API, excellent documentation, and is readily available for most targets. This is what I also used back for SROS1 when developing its keyserver:
https://github.com/ros/ros_comm/blob/f95b4a5de2acb4fb53f0e9a4cff47dcef928eac5/tools/rosgraph/src/rosgraph/key_helper.py
## Progress
While sros2 CLI was still in its early stages, I previously prototyped a more advanced keystore workspace tool called [Keymint](https://github.com/keymint/keymint_tools), again using the python `cryptography` library:
https://github.com/keymint/keymint_keymake/tree/adc38e07ce5f16d6ba4b36294d7d2e8a361153f0/keymint_keymake/pki
This allowed me to abstract a bit more, allowing users to finally configure the asymmetric key algorithm/size/format, CA hierarchy, period of validity and expiration, file protection, etc. I’d like to port over most of these features directly into sros2, but would like to just start by swapping out the subprocess calls and establishing `cryptography` as a primary dependency.
The only road bump I foresee is that of supporting SMIME signatures when notarizing DDS Security governance and permission documents. As of writing, the `cryptography` library doesn’t yet seem to have a simple API for producing SMIME signatures. This has been an open ticket for a while:
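As a rough illustration of what swapping a subprocess call for the library API could look like, the sketch below generates a key pair and a self-signed CA certificate entirely in-process. The function name `build_self_signed_ca`, the curve, and the validity period are assumptions chosen for the example, not sros2 defaults, and it assumes a release of `cryptography` recent enough that the `backend` argument is optional.

```python
import datetime

from cryptography import x509
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.x509.oid import NameOID


def build_self_signed_ca(common_name, days_valid=365):
    """Return (private_key_pem, certificate_pem) for a self-signed CA."""
    # Key algorithm and size are explicit here instead of CLI defaults.
    key = ec.generate_private_key(ec.SECP256R1())

    name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, common_name)])
    now = datetime.datetime.now(datetime.timezone.utc)

    cert = (
        x509.CertificateBuilder()
        .subject_name(name)
        .issuer_name(name)  # self-signed: issuer equals subject
        .public_key(key.public_key())
        .serial_number(x509.random_serial_number())
        .not_valid_before(now)
        .not_valid_after(now + datetime.timedelta(days=days_valid))
        .add_extension(
            x509.BasicConstraints(ca=True, path_length=None), critical=True
        )
        .sign(key, hashes.SHA256())
    )

    key_pem = key.private_bytes(
        encoding=serialization.Encoding.PEM,
        format=serialization.PrivateFormat.PKCS8,
        encryption_algorithm=serialization.NoEncryption(),
    )
    cert_pem = cert.public_bytes(serialization.Encoding.PEM)
    return key_pem, cert_pem


key_pem, cert_pem = build_self_signed_ca("sros2 example CA")
```

Any failure surfaces as a Python exception at the call site, rather than as a warning on the stderr of a child process that the caller may never see.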
https://github.com/pyca/cryptography/issues/1621
For Keymint, I worked around this by using `M2Crypto` for this one purpose instead, allowing me to replicate the same valid SMIME signatures in pure-ish python. However, `M2Crypto` isn’t as simple to install (this may have improved since I last checked a year ago), and its API is more like OpenSSL (think: giant hairball):
https://github.com/keymint/keymint_keymake/blob/adc38e07ce5f16d6ba4b36294d7d2e8a361153f0/keymint_keymake/smime/sign.py
Still, I think we can avoid `M2Crypto` as a dependency if we temporarily drop down a little into the backend API and add some wrappers around the low-level interfaces for SMIME signing:
https://stackoverflow.com/questions/52780716/signing-s-mime-content-from-python
Dropping OpenSSL could mean dropping support for third-party OpenSSL engines providing access to the keys in HSMs?
Private keys could be flashed on HSMs by manufacturers.
Integration for DDS Security Builtin Logging Plugin
opened 11:32PM - 09 Apr 19 UTC
When designing, auditing, or monitoring security policies for complex applications comprised of many internal and external interfaces, as with distributed computation graphs in ROS2, feedback in the form of security event logging can provide valuable insight into potential misconfigurations or abnormalities during development, testing, and deployment.
To further improve our security assistive tooling for SROS2, it would be pertinent to integrate security event logging to facilitate advanced features such as: Log-driven profile generation of access control policies, interoperability testing between multiple integrated systems, as well as continuous monitoring for security abnormalities. To an extent, much of this may be readily achievable by leveraging existing DDS Security Builtin Logging Plugin Interfaces.
---
## Background
> The `Logging` plugin provides the capability to log all security events, including expected behavior and all security violations or errors. The goal is to create security logs that can be used to support audits. The rest of the security plugins will use the logging API to log events.
>
> The `Logging` plugin will add an ID to the log message that uniquely specifies the `DomainParticipant`. It will also add a time-stamp to each log message.
>
> The `Logging API` has two options for collecting log data. The first is to log all events to a local file for collection and storage. The second is to distribute log events securely over DDS.
>
> Section 8.6.1, DDS Security specification (v1.4)
> https://www.omg.org/spec/DDS-SECURITY/About-DDS-SECURITY
The DDS Security specification from OMG defines 5 Service Plugin Interfaces:
#### Table 43 – Summary of the Builtin Plugins
| SPI | Plugin Name | Description |
|----------------|---------------------------|-------------------------------------------------------------|
| Authentication | DDS:Auth:PKI-DH | Uses PKI with a pre-configured shared Certificate Authority |
| AccessControl | DDS:Access:Permissions | Permissions document signed by shared Certificate Authority |
| Cryptography | DDS:Crypto:AES-GCM-GMAC | AES-GCM for encryption. AES-GMAC for message authentication |
| DataTagging* | DDS:Tagging:DDS_Discovery | Send Tags via Endpoint Discovery |
| Logging* | DDS:Logging:DDS_LogTopic | Logs security events to a dedicated DDS Log Topic |
> (*) denotes that a default implementation of the SPI is optional
The most notable SPI here is that for Logging, as explained in sections 8.6 and 9.6 of the DDS Security spec. These sections detail the plugin model, parameter options, message IDL, and runtime behavior. A conforming logging plugin essentially captures and records security events that may crop up at runtime in other security plugins, along with helpful identifying context to triage the event.
Recorded events may either be distributed by writing them to disk locally via log file, or publishing them over the DDS network, specifically over the topic name `DDS:Security:LogTopic`, with the following governance setting:
``` xml
<topic_rule>
<topic_expression>DDS:Security:LogTopic</topic_expression>
<enable_discovery_protection>FALSE</enable_discovery_protection>
<enable_read_access_control>TRUE</enable_read_access_control>
<enable_write_access_control>FALSE</enable_write_access_control>
<metadata_protection_kind>SIGN</metadata_protection_kind>
<data_protection_kind>ENCRYPT</data_protection_kind>
</topic_rule>
```
> The above rule states that any `DomainParticipant` with permission necessary to join the DDS `Domain` shall be allowed to write the `BuiltinLoggingTopic` but in order to read the `BuiltinLoggingTopic` the `DomainParticipant` needs to have a grant for the `BuiltinLoggingTopic` in its permissions document. (Section 9.6)
## Progress
This was something I had originally started in SROS1 (distributed security logging, no DDS plugins), but I found the ROS1 logging mechanism lacked the rich context that would render such events debuggable for large graphs, and it did not have the message structure to interject supplementary context. Additionally, the all-or-nothing nature of TLS hindered SROS1’s ability to securely introspect distributed events from nodes failing to join the network to begin with.
The Logging SPI in DDS Security largely resolves these issues, via granular governance settings, structured IDL formats, abstraction over distribution (be it via files or topics), and perhaps most importantly a standard interface to introspect the internal state of the accompanying security plugins.
However, from my interpretation, while the specification is explicit about what supplementary context must be included given the message IDL, it's still a bit hand-wavy with respect to what the `Free-form message` string should contain, or how it's formatted. I also suspect the meat of the info we’d like to parse out will reside therein, e.g.:
* QoS and topic-name that was denied due to insufficient permissions?
* Requested permission type that was denied (read, write, relay)?
* How the certificate was invalid (expired validity, broken CA chain, void signature)?
I haven’t yet tested any Logging plugins from DDS vendors, primarily as I don’t think Fast-RTPS ships with a default plugin implementation, given that Logging is listed as an optional plugin. I’m not sure Fast-RTPS even has a logging SPI so that we could write our own plugin. Also, given that RTI recently released a newer version, I’ve been stalling until ROS2 compiles with Connext v6.0 before I commit any more time investigating the state of logging there. Does anyone know the progress for the Security Logging SPI on OpenSplice (Vortex, or I guess Cyclone now)? My fear is that each vendor will have some quirky way of conveying the same info, like they already do with warning messages, making the inferencing downstream tedious.
Perhaps some readily addressable tasks for improving integration could include supporting more advanced `governance.xml` file templating, as we’ll have to amend it to include the relevant permissions for the `DDS:Security:LogTopic` topic when users wish to distribute log events over DDS. Another task could be investigating the logfile format, as parsing/translating static log files seems a bit less involved than subscribing to raw DDS topics and IDL. However, any parsing interfaces exposing security events for downstream tooling, e.g. log-driven profile generation, should probably be abstract enough for both file and network stream sources.
Subscribing to the secure logging topic surfaces permission issues, etc., which can be used to improve the policy.
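A minimal sketch of the governance templating step, assuming the governance document has already been parsed and we hold its `topic_access_rules` element. The helper name `add_log_topic_rule` is hypothetical; the rule values are copied from the spec excerpt quoted earlier. The rule is inserted first on the assumption that topic rules are evaluated in document order, so a broad `*` rule would otherwise shadow it.

```python
import xml.etree.ElementTree as ET

LOG_TOPIC = "DDS:Security:LogTopic"

# Values taken from the governance excerpt in the spec (section 9.6).
LOG_TOPIC_SETTINGS = [
    ("topic_expression", LOG_TOPIC),
    ("enable_discovery_protection", "FALSE"),
    ("enable_read_access_control", "TRUE"),
    ("enable_write_access_control", "FALSE"),
    ("metadata_protection_kind", "SIGN"),
    ("data_protection_kind", "ENCRYPT"),
]


def add_log_topic_rule(topic_access_rules):
    """Insert the log topic rule first, unless one is already present."""
    for rule in topic_access_rules.findall("topic_rule"):
        expression = rule.findtext("topic_expression", default="").strip()
        if expression == LOG_TOPIC:
            return rule  # already templated, leave the document untouched

    rule = ET.Element("topic_rule")
    for tag, value in LOG_TOPIC_SETTINGS:
        ET.SubElement(rule, tag).text = value
    topic_access_rules.insert(0, rule)
    return rule


# Hypothetical fragment of an existing governance file with a catch-all rule.
rules = ET.fromstring(
    "<topic_access_rules>"
    "<topic_rule><topic_expression>*</topic_expression></topic_rule>"
    "</topic_access_rules>"
)
add_log_topic_rule(rules)
print(ET.tostring(rules, encoding="unicode"))
```

Calling the helper a second time returns the existing rule instead of adding a duplicate, so it can be run safely on every keystore regeneration.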
Is there any overlap between ROS 2 logging and DDS logging?
Actually not that much overlap, separation between application layer and transport layer is good to have.
Auto generation interface for Access Control Profiles
opened 11:21AM - 01 May 19 UTC
enhancement
Formulating access control profiles for large or complex ROS applications that satisfy [*principles of least privilege*](https://en.wikipedia.org/wiki/Principle_of_least_privilege) is at present a manual endeavor for robotics. Given the tedium of such tasks and the rigor they demand to be fully effective, instances of human error become both more probable and problematic: accidental over-provisioning of permissions presents issues in system security, while under-provisioning may similarly affect system stability.
To ease the development of complete and correct access control, automated tooling for profile generation must be developed. Such tooling could consume system measurements such as security event logs or discovery traffic to assist in accurately profiling all permissions required. Subsequently, the acquired permission model could then be exported into profile formats used in SROS2 keystore tooling; e.g. a CLI exposing an interactive session for amending work-in-progress profiles by prompting the user about discrepancies between the existing xml policy file vs. the acquired model.
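The discrepancy check at the heart of such a tool can be sketched in a few lines. This assumes the acquired model is simply a set of `(access, topic)` pairs for one node and that the profile follows the policy markup shown later in this ticket; the function names are hypothetical, not part of sros2.

```python
import xml.etree.ElementTree as ET

# Hypothetical existing profile for one node.
PROFILE = """
<profile ns="/" node="talker">
  <topics publish="ALLOW">
    <topic>chatter</topic>
    <topic>rosout</topic>
  </topics>
</profile>
"""

# Hypothetical model acquired from logs or discovery traffic.
OBSERVED = {("publish", "chatter"), ("subscribe", "clock")}


def declared_permissions(profile):
    """Collect the (access, topic) pairs that a <profile> element allows."""
    allowed = set()
    for topics in profile.findall("topics"):
        for access in ("publish", "subscribe"):
            if topics.get(access) == "ALLOW":
                for topic in topics.findall("topic"):
                    allowed.add((access, topic.text.strip()))
    return allowed


def discrepancies(profile, observed):
    """Split differences into missing and unused permissions."""
    declared = declared_permissions(profile)
    missing = observed - declared  # observed but not yet allowed
    unused = declared - observed   # allowed but never observed
    return missing, unused


missing, unused = discrepancies(ET.fromstring(PROFILE), OBSERVED)
print("prompt to allow:", sorted(missing))
print("prompt to review:", sorted(unused))
```

The `missing` set corresponds to under-provisioning and the `unused` set to possible over-provisioning; an interactive session would prompt the user about each entry rather than applying them automatically.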
---
## Background
> SROS1 provides a varying degree of run-time modes, including audit, enforce and complain, again borrowing feature designs from AppArmor. This provides developers a method to auto generate, or amend profiles through granular logging of access events and violation attempts.
>
> “SROS1: Using and Developing Secure ROS1 Systems”
> https://doi.org/10.1007/978-3-319-91590-6_11
Auto generation of profiles was used in SROS1 to simplify the bootstrapping process for enabling fine-grained API access control over ROS1. As ROS1 was centralized with respect to the master node, roscore was extended to support several run-time modes; specifically an audit mode that could amend any work-in-progress profile to include missing permissions for API accesses observed during training time. This profile was then used in conjunction with a keyserver process that provisioned the certificates and tokens in the keystore accordingly.
> Security of robotics systems, as well as of the related middleware infrastructures, is a critical issue for industrial and domestic IoT, and it needs to be continuously assessed throughout the whole development lifecycle... In this work, we introduce a framework for procedural provisioning access control policies for robotic software, as well as for verifying the compliance of generated transport artifacts and decision point implementations.
>
> “Procedurally Provisioned Access Control for Robotic Systems”
> https://doi.org/10.1109/IROS.2018.8594462
In later work, some of this bootstrapping was reproduced for ROS2 to assist in the automation for the verification and testing of security artifacts used for secure transports. Specifically, given decentralized networking of ROS2 with DDS, observed discovery traffic during training time was substituted to train the model of necessary API permission.
Both approaches made efforts to streamline and fortify access control development via greater automation. The future works of both also detail areas of improvement in regards to developer interfaces, citing AppArmor `aa-genprof` CLI as a particularly useful example tool pattern to adopt for ROS policy development.
The `aa-genprof` utility is a CLI that provides users with an interactive session to create and/or amend application security profiles. By running an application in audit mode, `aa-genprof` subsequently prompts the user on each new and unaccounted security issue, such as missing permissions or policy imports, by displaying the observed security violation alongside the appropriate excerpt in the present profile, together with several choices of applicable amendments to resolve or ignore the violation. I’d recommend trying out the utility yourself to get a better feel for the profile development workflow.
http://manpages.ubuntu.com/manpages/bionic/man8/aa-genprof.8.html
Similarly, the `aa-logprof` utility provides an equivalent and consistent interface for using event measurements from log files rather than a running audit mode process:
http://manpages.ubuntu.com/manpages/bionic/man8/aa-logprof.8.html
#### Screen capture of aa-logprof
<details>
```
$ sudo aa-logprof
Reading log entries from /var/log/syslog.
Updating AppArmor profiles in /etc/apparmor.d.
Complain-mode changes:
Profile: ros/rosmaster
Access mode: receive
Signal: int
Peer: ros/roslaunch
[1 - signal receive set=int peer=ros/roslaunch,]
(A)llow / [(D)eny] / (I)gnore / Audi(t) / Abo(r)t / (F)inish
Adding signal receive set=int peer=ros/roslaunch, to profile.
...
= Changed Local Profiles =
The following local profiles were changed. Would you like to save them?
[1 - ros/rosout]
2 - ros/talker_listener_py
3 - ros/rosmaster
(S)ave Changes / Save Selec(t)ed Profile
[(V)iew Changes] View Changes b/w / (C)lean profiles / Abo(r)t
Writing updated profile for ros/rosmaster.
Writing updated profile for ros/rosout.
Writing updated profile for ros/talker_listener_py.
```
</details>
## Progress
With the addition of the new policy markup format in #72, it is now simple to write composable sub-profiles. Using the XInclude schema in XML, profiles and permissions may be broken into separate files for reuse or versioning and subsequently imported as needed. This allows for sharing of common permission primitives across different node profiles, reducing the chances of duplicated errors or divergent policy behaviors.
For automated tools to appropriately amend or constructively suggest changes to policies under audit, the underlying structure of the composed profiles must be conveyed. Parsers supporting XInclude can reconstruct the entire DOM of the policy, along with annotations of expanded includes via [`xml:base`](https://www.w3.org/TR/2001/REC-xmlbase-20010627/#syntax) denoting the relative URI imported. Thus for instances where prompted feedback from the user necessitates alterations to a nested permission or profile, the tool may trace back the URI to determine the files to modify.
#### Original policy file
``` xml
<?xml version="1.0" encoding="UTF-8"?>
<policy version="0.1.0"
xmlns:xi="http://www.w3.org/2001/XInclude">
<profiles>
<profile ns="/" node="talker">
<xi:include href="common/node.xml"
xpointer="xpointer(/profile/*)"/>
<topics publish="ALLOW" >
<topic>chatter</topic>
</topics>
</profile>
</profiles>
</policy>
```
#### Expanded policy DOM
``` xml
<policy version="0.1.0">
<profiles>
<profile node="talker" ns="/">
<services reply="ALLOW" xml:base="common/node/parameters.xml">
<service>~describe_parameters</service>
<service>~get_parameter_types</service>
<service>~get_parameters</service>
<service>~list_parameters</service>
<service>~set_parameters</service>
<service>~set_parameters_atomically</service>
</services>
<topics subscribe="ALLOW" xml:base="common/node/parameters.xml">
<topic>parameter_events</topic>
</topics>
<topics publish="ALLOW" xml:base="common/node/parameters.xml">
<topic>parameter_events</topic>
</topics>
<topics publish="ALLOW" xml:base="common/node/log.xml">
<topic>rosout</topic>
</topics>
<topics subscribe="ALLOW" xml:base="common/node/time.xml">
<topic>clock</topic>
</topics>
<topics publish="ALLOW">
<topic>chatter</topic>
</topics>
</profile>
</profiles>
</policy>
```
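Given an expanded DOM like the one above, tracing a permission back to the file that declares it could look roughly like this. The helper `file_to_edit` and the `policy.xml` file name are hypothetical; the sketch only assumes the parser has already annotated included elements with `xml:base`, and it uses a shortened copy of the expanded policy.

```python
import xml.etree.ElementTree as ET

# ElementTree maps the reserved "xml:" prefix to this namespace URI.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Shortened copy of the expanded policy DOM, for illustration.
EXPANDED = """
<policy version="0.1.0">
  <profiles>
    <profile node="talker" ns="/">
      <topics publish="ALLOW" xml:base="common/node/log.xml">
        <topic>rosout</topic>
      </topics>
      <topics publish="ALLOW">
        <topic>chatter</topic>
      </topics>
    </profile>
  </profiles>
</policy>
"""


def file_to_edit(policy, node, access, topic, top_level_file):
    """Return the file declaring `topic` with `access` for `node`, or None."""
    for profile in policy.iter("profile"):
        if profile.get("node") != node:
            continue
        for topics in profile.findall("topics"):
            if topics.get(access) != "ALLOW":
                continue
            names = [t.text.strip() for t in topics.findall("topic") if t.text]
            if topic in names:
                # No xml:base means the element came from the top-level file.
                return topics.get(XML_BASE, top_level_file)
    return None


policy = ET.fromstring(EXPANDED)
print(file_to_edit(policy, "talker", "publish", "rosout", "policy.xml"))
print(file_to_edit(policy, "talker", "publish", "chatter", "policy.xml"))
```

Here the `rosout` permission resolves to `common/node/log.xml`, while `chatter` falls back to the top-level file, so a tool would know that changing the former affects every profile that includes that sub-profile.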
For the interactive interface and generation API, we could probably start by taking a look at the AppArmor code base for the same respective utilities, as they are similarly written in python. However, given the custom markup for the AppArmor profile language, the library design exhibits a number of quirks to accommodate the less structured format. Still, I’d like to replicate the CLI frontend experience, or at least abstract the API to support equivalent GUI tools as well.
https://gitlab.com/apparmor/apparmor/tree/master/utils/apparmor
#### Example CLI interface
<details>
```
$ ros2 security logprof
Reading log entries from /home/user/.ros/log/
Updating SROS profiles in /home/user/.ros/sros.d/
Complain-mode changes:
Profile: ros/wheatley
Access mode: publish
Topic: /chatter/foo
[1 - topic /chatter/foo p,]
(A)llow / [(D)eny] / (I)gnore / Audi(t) / Abo(r)t / (F)inish
Adding topic to profile:
<topics publish="ALLOW">
<topic>/chatter/foo</topic>
</topics>
...
= Changed Local Profiles =
The following local profiles were changed. Would you like to save them?
[1 - ros/wheatley]
2 - ros/listener
3 - ros/navigation2
(S)ave Changes / Save Selec(t)ed Profile
[(V)iew Changes] View Changes b/w / (C)lean profiles / Abo(r)t
Writing updated profile for ros/listener.
Writing updated profile for ros/navigation2.
Writing updated profile for ros/wheatley.
```
</details>
Finally, with respect to measurements for permission modeling, their acquisition has been proposed in #110 via integration with the DDS Security builtin logging plugin. However, supporting additional measurement sources as alternatives to log events, such as the DDS discovery data mentioned above, may prove much quicker to implement. For example, in the Procedurally Provisioned Access Control publication above, this was done simply via an XML transform from the RTI Connext DDS discovery export format to an SROS2 policy file.
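As a sketch of the discovery-driven route, the snippet below turns a flat list of observed endpoints into a policy document in the markup shown earlier. The `(node, access, topic)` tuple format is an assumption made for the example; it stands in for whatever a vendor's discovery export actually provides and is not the RTI format used in the publication.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def policy_from_observations(observations):
    """Build a <policy> element from (node, access, topic) tuples."""
    # node -> access ("publish" / "subscribe") -> set of topic names
    grouped = defaultdict(lambda: defaultdict(set))
    for node, access, topic in observations:
        grouped[node][access].add(topic)

    policy = ET.Element("policy", version="0.1.0")
    profiles = ET.SubElement(policy, "profiles")
    for node in sorted(grouped):
        profile = ET.SubElement(profiles, "profile", ns="/", node=node)
        for access in sorted(grouped[node]):
            topics = ET.SubElement(profile, "topics", {access: "ALLOW"})
            for topic in sorted(grouped[node][access]):
                ET.SubElement(topics, "topic").text = topic
    return policy


# Hypothetical observations scraped during a training run.
observed = [
    ("talker", "publish", "chatter"),
    ("listener", "subscribe", "chatter"),
    ("talker", "publish", "rosout"),
]
generated = policy_from_observations(observed)
print(ET.tostring(generated, encoding="unicode"))
```

Sorting nodes and topics keeps the generated file stable between runs, which makes diffs against a previously saved policy easier to review.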
Having a tool similar to aa-genprof / aa-logprof from the AppArmor world:
The application is audited first and a “template” config file is generated by the tool, asking the user to confirm whether each rule is OK. It can also serve as a “linter” / suggest improvements.
We could implement the same for DDS ACL policies.
Could be implemented either using the DDS logging modules or by scraping DDS discovery information.
There are also some ideas about allowing ROS 2 nodes to declare IDLs containing the list of topics they publish/subscribe to, actions, services, parameters, etc.
Thomas: this would be interesting as it would allow us to run linting / static analysis on applications (are all IDLs compatible?), add additional properties to topics like rate limitation, and automate e2e testing up to a certain point (if the IDL declares that a node listens to /tf, a test could be generated to check this is actually the case).