A desktop notifications client API for snapd

jamesh · June 18, 2021, 11:48am

Here’s a dump of my thoughts on a notification client API for internal use within snapd. This is mostly for the benefit of @pstolowski who is working on snapd’s refresh awareness code.

Before the current github.com/snapcore/snapd/desktop/notification code from @zyga, I had written another prototype package that tries to provide a single API that can talk to both the FDO and GTK notification APIs:

At present, the GTK API is only supported by gnome-shell, and GNOME also supports the older FDO API. Still, I think it is worth using the GTK API when it is available:

1. The default desktop on most major distros is based on GNOME, so it represents a large chunk of our desktop user base.
2. The GTK API uses client provided IDs to identify individual notifications. So it is possible to post a notification, exit, and have a new instance of the app withdraw or replace that old notification.
3. When the user clicks an action in a notification, only the associated application gets notified rather than every application using desktop notifications. If the associated application is not currently running, it will be started via D-Bus service activation.

I think (2) is particularly important given some of the feedback we’ve received about the current state of refresh awareness: if we’re filling users’ notification trays with repeated notifications about the same snap, then they will block snapd from sending further notifications (and they’d be right to). If we use the snap instance name to build our IDs, then each new notification will automatically replace the previous one for that snap.

Minimal API for Notifications

If we temporarily ignore notifications with actions, I think we could handle this through three methods that would be implemented for both the GTK and FDO notifications API:

SendNotification(id string, notification Notification) error to send or update desktop notifications.
WithdrawNotification(id string) error to withdraw an existing notification using the ID passed to SendNotification.
IdleDuration() time.Duration to indicate how long it is since we were “busy”, in the same sense as the existing idleTracker type in the session agent.

The Notification type would be a struct describing the notification: title, body, icon, etc. The definition I used in my prototype had the benefit that go-dbus serialised it to the format the GTK notification API expects.
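
To make the shape of this concrete, here is a rough sketch of the three methods as a Go interface. The interface name NotificationBackend, the fields of Notification, the refreshNotificationID helper and the in-memory backend are all made up for illustration; they are not the definitions from the prototype.

```go
package main

import (
	"errors"
	"time"
)

// Notification describes one desktop notification. The fields here are
// illustrative; the real struct may carry more (priority, actions, ...).
type Notification struct {
	Title string
	Body  string
	Icon  string
}

// NotificationBackend is a hypothetical name for the interface that both
// the GTK and the FDO implementation would satisfy.
type NotificationBackend interface {
	SendNotification(id string, notification Notification) error
	WithdrawNotification(id string) error
	IdleDuration() time.Duration
}

// refreshNotificationID derives a stable ID from a snap instance name, so a
// later notification about the same snap replaces the earlier one.
// The "refresh-" prefix is an assumed naming scheme.
func refreshNotificationID(instanceName string) string {
	return "refresh-" + instanceName
}

// memoryBackend is a toy in-memory implementation, only meant to show the
// replace-by-ID semantics without talking to any notification server.
type memoryBackend struct {
	started time.Time
	active  map[string]Notification
}

func newMemoryBackend() *memoryBackend {
	return &memoryBackend{started: time.Now(), active: make(map[string]Notification)}
}

func (m *memoryBackend) SendNotification(id string, n Notification) error {
	if id == "" {
		return errors.New("notification ID must not be empty")
	}
	m.active[id] = n // sending with an existing ID replaces the old entry
	return nil
}

func (m *memoryBackend) WithdrawNotification(id string) error {
	delete(m.active, id)
	return nil
}

func (m *memoryBackend) IdleDuration() time.Duration {
	return time.Since(m.started)
}

// Compile-time check that the toy backend satisfies the interface.
var _ NotificationBackend = (*memoryBackend)(nil)
```

Sending twice with refreshNotificationID("firefox") leaves a single entry in the toy backend, which is the behaviour we want from the real ones.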

GTK implementation

The first two APIs map directly to the equivalent D-Bus calls. As the GTK notifications API is stateless on the client side, IdleDuration can just return the time since we first started using the notification API.
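
As a sketch of how thin the GTK side could be: the methodCaller interface below is a hypothetical stand-in for a D-Bus object proxy (with go-dbus this would be the object for org.gtk.Notifications), and recordingCaller is scaffolding for trying it out. The AddNotification and RemoveNotification method names and the "title"/"body" keys come from the GTK notification API; everything else is an assumption, and a real implementation would wrap the dictionary values in variants.

```go
package main

import "time"

// methodCaller is a hypothetical stand-in for a D-Bus object proxy.
type methodCaller interface {
	Call(method string, args ...interface{}) error
}

// gtkBackend keeps no per-notification state: the server tracks
// notifications by (application ID, client-provided ID).
type gtkBackend struct {
	obj     methodCaller
	appID   string
	started time.Time
}

func newGtkBackend(obj methodCaller, appID string) *gtkBackend {
	return &gtkBackend{obj: obj, appID: appID, started: time.Now()}
}

// SendNotification maps directly onto AddNotification(app_id, id, a{sv}).
func (b *gtkBackend) SendNotification(id, title, body string) error {
	notification := map[string]interface{}{"title": title, "body": body}
	return b.obj.Call("org.gtk.Notifications.AddNotification", b.appID, id, notification)
}

// WithdrawNotification maps directly onto RemoveNotification(app_id, id).
func (b *gtkBackend) WithdrawNotification(id string) error {
	return b.obj.Call("org.gtk.Notifications.RemoveNotification", b.appID, id)
}

// IdleDuration reports the time since the backend was created, since there
// is nothing on the client side that needs to be kept alive.
func (b *gtkBackend) IdleDuration() time.Duration {
	return time.Since(b.started)
}

// recordingCaller is scaffolding: it records method names instead of
// sending anything over the bus.
type recordingCaller struct {
	methods []string
}

func (r *recordingCaller) Call(method string, args ...interface{}) error {
	r.methods = append(r.methods, method)
	return nil
}
```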

FDO implementation

We need to keep an internal mapping of our own notification IDs to the ones returned by the server. If we try to send a notification with an ID that has already been used, we need to pass along the old server ID to have it replace the old notification, and then record the new notification ID in our mapping.

To withdraw notifications, we again need to use the internal mapping to decide which server-generated ID to call CloseNotification on.

In addition to this, we need to have a goroutine watching for NotificationClosed signals from the notification server so we can prune our mapping table.

The IdleDuration function should return 0 if our ID mapping table is not empty. If it is empty, it should return the amount of time since the last entry in the table was removed. This will make sure the session agent remains active for as long as any of its notifications remain available.
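
The bookkeeping described above could look roughly like the following. The fdoServer interface is a simplified, hypothetical stand-in for the FDO Notify and CloseNotification calls (the real Notify takes more arguments), notificationClosed is what the signal-watching goroutine would call, and fakeServer and fakeClock exist only so the logic can be exercised without a bus.

```go
package main

import (
	"sync"
	"time"
)

// fdoServer is a simplified stand-in for the notification server.
type fdoServer interface {
	// Notify posts a notification, replacing replacesID when it is non-zero,
	// and returns the server-generated ID.
	Notify(replacesID uint32, summary, body string) (uint32, error)
	CloseNotification(serverID uint32) error
}

type fdoBackend struct {
	mu         sync.Mutex
	server     fdoServer
	now        func() time.Time
	serverIDs  map[string]uint32 // our ID -> server-generated ID
	lastRemove time.Time         // when the mapping last lost an entry
}

func newFdoBackend(server fdoServer, now func() time.Time) *fdoBackend {
	return &fdoBackend{server: server, now: now, serverIDs: map[string]uint32{}, lastRemove: now()}
}

func (b *fdoBackend) SendNotification(id, summary, body string) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	// A missing entry yields 0, which means "do not replace anything".
	newID, err := b.server.Notify(b.serverIDs[id], summary, body)
	if err != nil {
		return err
	}
	b.serverIDs[id] = newID
	return nil
}

func (b *fdoBackend) WithdrawNotification(id string) error {
	b.mu.Lock()
	serverID, ok := b.serverIDs[id]
	b.mu.Unlock()
	if !ok {
		return nil
	}
	// The mapping is pruned when NotificationClosed arrives.
	return b.server.CloseNotification(serverID)
}

// notificationClosed prunes the mapping for a closed server ID.
func (b *fdoBackend) notificationClosed(serverID uint32) {
	b.mu.Lock()
	defer b.mu.Unlock()
	for id, sid := range b.serverIDs {
		if sid == serverID {
			delete(b.serverIDs, id)
			b.lastRemove = b.now()
		}
	}
}

// IdleDuration is 0 while any notification is outstanding.
func (b *fdoBackend) IdleDuration() time.Duration {
	b.mu.Lock()
	defer b.mu.Unlock()
	if len(b.serverIDs) > 0 {
		return 0
	}
	return b.now().Sub(b.lastRemove)
}

// fakeServer hands out increasing IDs and remembers the last replaces ID.
type fakeServer struct {
	next         uint32
	lastReplaces uint32
}

func (s *fakeServer) Notify(replacesID uint32, summary, body string) (uint32, error) {
	s.lastReplaces = replacesID
	s.next++
	return s.next, nil
}

func (s *fakeServer) CloseNotification(serverID uint32) error { return nil }

// fakeClock is a manually advanced clock.
type fakeClock struct{ t time.Time }

func (c *fakeClock) Now() time.Time          { return c.t }
func (c *fakeClock) Advance(d time.Duration) { c.t = c.t.Add(d) }
```

Taking the clock as a function is just a convenience for testing; the real code could use time.Now directly.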

Handling Actions

Notification actions are handled fairly differently between the two APIs, but I think we could implement something fairly sane.

The GTK API builds on top of the D-Bus activation support in the desktop entry specification. We’ve got the building blocks of this already implemented for the session agent: it connects to the session bus when starting, and we install a desktop file matching the D-Bus name. All that’s needed is for the session agent to export the appropriate API on its D-Bus connection.

On the FDO side, the notification server broadcasts an ActionInvoked signal. The previously described IdleDuration API should make sure the session agent remains active long enough to receive any relevant signals. We can use the ID mapping to determine whether the action corresponds to a notification we created. One possibility would be to simply call the ActivateAction method on whatever value we exported on the bus for the GTK API, so that the same code can try to process the action.
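
A minimal sketch of that filtering step, assuming we keep a reverse mapping from server IDs to our own IDs. The actionRouter type and the actionHandler callback are hypothetical names; the callback stands in for whatever code ends up behind ActivateAction on the GTK side.

```go
package main

// actionHandler is the shared code path that processes an action, whichever
// API delivered it.
type actionHandler func(notificationID, action string)

// actionRouter filters broadcast ActionInvoked signals down to the ones that
// belong to notifications we created.
type actionRouter struct {
	ourIDs map[uint32]string // server-generated ID -> our ID
	handle actionHandler
}

// actionInvoked is called for every ActionInvoked signal seen on the bus.
// It reports whether the signal was for one of our notifications.
func (r *actionRouter) actionInvoked(serverID uint32, actionKey string) bool {
	id, ok := r.ourIDs[serverID]
	if !ok {
		// Some other application's notification: ignore it.
		return false
	}
	r.handle(id, actionKey)
	return true
}
```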

pstolowski · June 21, 2021, 12:08pm

Thank you for summarizing and sharing this, I think this is clear and straightforward (maybe except IdleDuration, but I’m sure it will become clear once I dig into the implementation).

jamesh · June 21, 2021, 12:25pm

The idleDuration thing is used by a goroutine in the session agent to implement the exit on idle behaviour (which we want, both to minimise resource consumption and make sure we pick up new versions of the session agent when snapd is upgraded).

In essence, the goroutine does:

1. Sleep for the idle timeout (30 seconds).
2. Ask the server how long it has been idle. If that is greater than or equal to the idle timeout, then initiate shutdown of the server.
3. If it has been idle for less time, sleep for the idle timeout minus the duration we’ve already been idle and go to step (2).

This avoids the synchronisation problems that would come from trying to actively signal the idle checker goroutine when new activity occurs. And by varying how long the goroutine sleeps, it can still reliably shut down after a fixed amount of inactivity.
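
In code, the loop amounts to something like the sketch below. It is not the session agent's actual implementation: nextSleep and runIdleChecker are names chosen for this example, and the idleDuration and shutdown callbacks are assumptions about how the pieces would be wired together.

```go
package main

import "time"

// nextSleep decides what the checker does after waking up: either shut down,
// or sleep for whatever is left of the timeout.
func nextSleep(idle, timeout time.Duration) (sleep time.Duration, shutdown bool) {
	if idle >= timeout {
		return 0, true
	}
	return timeout - idle, false
}

// runIdleChecker blocks until the server has been idle for the full timeout,
// then calls shutdown. It is meant to run in its own goroutine.
func runIdleChecker(timeout time.Duration, idleDuration func() time.Duration, shutdown func()) {
	wait := timeout
	for {
		time.Sleep(wait)
		next, stop := nextSleep(idleDuration(), timeout)
		if stop {
			shutdown()
			return
		}
		wait = next
	}
}
```

With a 30 second timeout, waking up to find the server idle for 10 seconds means sleeping another 20 seconds before checking again.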

Currently it is only considering HTTP requests to decide whether the server is idle, but it will probably need to check a few more sources in future:

in-flight D-Bus method calls (either direction)
active desktop notifications (for the FDO case)

Having the goroutine consider extra idle sources is a fairly simple extension: just ask each how long they’ve been idle and use the minimum value.
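
A sketch of that extension, assuming each source exposes the same IdleDuration method. IdleSource, combinedIdle and the fixedIdle helper are illustrative names, not existing snapd types.

```go
package main

import "time"

// IdleSource is anything that can report how long it has been idle, for
// example the HTTP server or a notification backend.
type IdleSource interface {
	IdleDuration() time.Duration
}

// fixedIdle is a trivial source that always reports the same duration; it is
// only here to make the example easy to try.
type fixedIdle time.Duration

func (f fixedIdle) IdleDuration() time.Duration { return time.Duration(f) }

// combinedIdle returns the shortest idle duration of all sources: the agent
// is only as idle as its most recently active component.
func combinedIdle(first IdleSource, rest ...IdleSource) time.Duration {
	shortest := first.IdleDuration()
	for _, src := range rest {
		if d := src.IdleDuration(); d < shortest {
			shortest = d
		}
	}
	return shortest
}
```

A backend that reports 0 while it has notifications outstanding therefore keeps the whole agent alive, whatever the other sources say.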

jamesh · July 6, 2021, 3:01pm

@pstolowski was asking about how icons are represented in the GTK notification API. The main non-code documentation of the API is here:

https://wiki.gnome.org/Projects/GLib/GNotification

… which unhelpfully just says it is a “serialized GIcon”. In the GLib source, GIcon is an interface with multiple implementations, with each providing a serialize() method that returns a (sv) struct. The first member of the struct is a string identifying the type, and the second is a variant whose interpretation is dependent on the particular implementation.

In terms of go-dbus, this could be represented as:

type Icon struct {
    Type  string
    Value dbus.Variant
}

An icon referencing an image file on disk could be created as Icon{Type: "file", Value: dbus.MakeVariant("/path/to/image.png")}.

Other possibly useful cases include:

Image data stored in memory: Icon{Type: "bytes", Value: dbus.MakeVariant(byteArray)}
A named icon from the current icon theme: Icon{Type: "themed", Value: dbus.MakeVariant(stringArray)}, where stringArray is a list of names to use in priority order.

Like other values in the a{sv} notification dictionary, the icon struct will need to be wrapped in a variant.