Classic version for the grafana-agent snap

Hello,

I would like to request permission to create a track for the grafana-agent snap which uses classic confinement. Classic confinement is needed because the strictly confined snap causes issues with systems that have auditing enabled. The audit log ends up filling the disk too quickly.

See:
https://github.com/canonical/grafana-agent-operator/issues/52
https://github.com/canonical/grafana-agent-operator/issues/83

+1. Classic mode is needed for multiple reasons (such as the dynamic, probing nature of the grafana agent), but especially for CIS hardening and similar setups, where strict confinement is causing issues.

As per the Process for reviewing classic confinement snaps, can you please outline which category of supported application type the grafana-agent snap fits into? Note that the page explicitly lists “difficulty making strict confinement work” as an unsupported use-case, so the current reasoning provided about logging is not really sufficient. Perhaps in this case you should modify grafana-agent to be snap-aware and not try to access these logs.

Admittedly, it does not fall into any of those categories.

Grafana-agent needs to be able to read the logs. It is what it fundamentally does. Details of the issue can be found here. We cannot figure out a way to stop the audit.log messages in a strictly confined snap. If you have any idea how we can do that, it would of course be preferred.

well, the log-observe interface actually gives you full read access to /var/log and all its sub-directories, see:

this seems to be more related to trying to read logs across container boundaries than to the snap itself …

the second issue above actually points to missing dac_read_search capabilities, perhaps the security team could assess if adding this capability to the log-observe interface might be possible, it currently only allows dac_override

(I’m not part of the grafana-agent snap/charm development team)

Another challenge with the grafana-agent snap is that many read operations are blocked because they do not fall under any existing interface. The node-exporter piece of the grafana-agent pokes /proc and /sys heavily and tries to read the status of systemd units, as is the nature of a Linux system monitoring agent.

Oct 26 12:11:57 up-snail kernel: audit: type=1400 audit(1698322317.541:1674): apparmor="DENIED" operation="open" profile="snap.grafana-agent.grafana-agent" name="/proc/spl/kstat/zfs/" pid=104040 comm="agent" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
Oct 26 12:11:57 up-snail kernel: audit: type=1400 audit(1698322317.541:1675): apparmor="DENIED" operation="open" profile="snap.grafana-agent.grafana-agent" name="/sys/fs/btrfs/" pid=104040 comm="agent" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
Oct 26 12:11:57 up-snail kernel: audit: type=1107 audit(1698322317.545:1676): pid=1053 uid=102 auid=4294967295 ses=4294967295 subj=unconfined msg='apparmor="DENIED" operation="dbus_method_call"  bus="system" path="/org/freedesktop/login1" interface="org.freedesktop.login1.Manager" member="ListSeats" mask="send" name="org.freedesktop.login1" pid=104040 label="snap.grafana-agent.grafana-agent" peer_pid=1064 peer_label="unconfined"
Oct 26 12:11:57 up-snail kernel: audit: type=1107 audit(1698322317.549:1677): pid=1053 uid=102 auid=4294967295 ses=4294967295 subj=unconfined msg='apparmor="DENIED" operation="dbus_method_call"  bus="system" path="/org/freedesktop/systemd1" interface="org.freedesktop.DBus.Properties" member="Get" mask="send" name="org.freedesktop.systemd1" pid=104040 label="snap.grafana-agent.grafana-agent" peer_pid=1 peer_label="unconfined"
Oct 26 12:11:57 up-snail kernel: audit: type=1107 audit(1698322317.549:1678): pid=1053 uid=102 auid=4294967295 ses=4294967295 subj=unconfined msg='apparmor="DENIED" operation="dbus_method_call"  bus="system" path="/org/freedesktop/systemd1" interface="org.freedesktop.systemd1.Manager" member="ListUnits" mask="send" name="org.freedesktop.systemd1" pid=104040 label="snap.grafana-agent.grafana-agent" peer_pid=1 peer_label="unconfined"
Oct 26 12:11:57 up-snail kernel: audit: type=1400 audit(1698322317.593:1679): apparmor="DENIED" operation="open" profile="snap.grafana-agent.grafana-agent" name="/proc/sys/kernel/threads-max" pid=104040 comm="agent" requested_mask="r" denied_mask="r" fsuid=0 ouid=0

As a result, the grafana-agent snap cannot provide the metrics needed to monitor Linux systems, although there is no such problem with the prometheus-node-exporter package in universe.

https://github.com/canonical/grafana-agent-operator/issues/23
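For anyone triaging denials like the ones quoted above, here is a minimal sketch (not from the grafana-agent project) that groups AppArmor "DENIED" audit lines by operation and target, so it is easier to see which accesses are not covered by a connected interface. The names `parse_denial`, `summarize` and `SAMPLE` are made up for illustration, and the parser assumes the `key="value"` layout seen in the log excerpt; other audit formats may need a different pattern.

```python
import re
from collections import Counter

# Matches quoted key="value" pairs such as apparmor="DENIED" or name="/sys/fs/btrfs/".
# Unquoted fields (pid=104040, fsuid=0) are deliberately ignored.
PAIR = re.compile(r'(\w+)="([^"]*)"')

# Two lines copied from the log excerpt in this thread.
SAMPLE = """\
Oct 26 12:11:57 up-snail kernel: audit: type=1400 audit(1698322317.541:1675): apparmor="DENIED" operation="open" profile="snap.grafana-agent.grafana-agent" name="/sys/fs/btrfs/" pid=104040 comm="agent" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
Oct 26 12:11:57 up-snail kernel: audit: type=1107 audit(1698322317.549:1678): pid=1053 uid=102 auid=4294967295 ses=4294967295 subj=unconfined msg='apparmor="DENIED" operation="dbus_method_call"  bus="system" path="/org/freedesktop/systemd1" interface="org.freedesktop.systemd1.Manager" member="ListUnits" mask="send" name="org.freedesktop.systemd1" pid=104040 label="snap.grafana-agent.grafana-agent" peer_pid=1 peer_label="unconfined"
"""


def parse_denial(line):
    """Return the quoted fields of an AppArmor denial line, or None if it is not one."""
    fields = dict(PAIR.findall(line))
    if fields.get("apparmor") != "DENIED":
        return None
    return fields


def summarize(lines):
    """Count denials keyed by (operation, target)."""
    counts = Counter()
    for line in lines:
        fields = parse_denial(line)
        if fields is None:
            continue
        operation = fields.get("operation", "?")
        if operation == "dbus_method_call":
            # For D-Bus denials the interesting target is interface.member.
            target = f'{fields.get("interface", "?")}.{fields.get("member", "?")}'
        else:
            # For file denials the target is the path in name="...".
            target = fields.get("name", "?")
        counts[(operation, target)] += 1
    return counts


if __name__ == "__main__":
    for (operation, target), count in summarize(SAMPLE.splitlines()).most_common():
        print(f"{count:4d}  {operation:18s} {target}")
```

Feeding it the kernel log lines for the snap (however they are collected on the affected machine) gives a short list of denied paths and D-Bus calls to compare against what the snap's connected interfaces are expected to allow.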

We had discussed using a deb for this but decided a snap would be more “proper”. Perhaps we need to look into making a deb.

My understanding is that system-observe is supposed to allow access to /proc and /sys but it does not seem to be working correctly.

well, some bits might only be covered by hardware-observe … in general it would make sense to ask users that hit denials to run snappy-debug alongside the snap to see if there are interfaces missing… we also should look if interfaces are lacking features and we might need to extend them (though i’m not sure we’ll find an easy solution for reading systemd units in the short term)

Maybe so, but something is different, as it’s not being tripped by auditd when using classic mode, but is when using confined mode.

FWIW this falls in a very similar camp to parca-agent, for which the discussion can be found here. I think this probably should qualify for classic, and it’s likely that other low-level observability collectors will too?

Edit: perhaps we need to consider having “Observability agents” as a category? They’re in a similar camp to public cloud agents, I think.

I agree with @jnsgruk. This is a telemetry agent that requires a wide array of different permissions - and as @nobuto_m is pointing out, many of them don’t have any interfaces currently.

@alexmurray, how do we continue from here?

So I think it is not too much of a stretch to consider grafana-agent as fitting within the existing category of debug tools within the classic confinement process. As such, it requires unrestricted access to the host system directly. Thus, the requirements for classic confinement for grafana-agent are understood. I have vetted the publisher (a Canonical employee) - this is now live.

Thanks for getting this approved!

@alexmurray Can you also create a 0.40-classic track?

@dstathis unfortunately @reviewers do not have these permissions - only folks from the snap store team. @odysseus-k @lofidevops @verterok can you please assist with the track creation?

I need to move forward with this. Can we create the track @odysseus-k @lofidevops @verterok?

Hi,

The 0.40-classic track has now been created for the grafana-agent snap.

Thanks,

Odysseus