Disabling automatic refresh for snap from store

Another good reason to let system administrators do updates when they need to.
I’ve got an lxc cluster with lxd installed from snap, because it’s easy and the official documentation says it’s a good way to get the latest stable version.
Today lxd from snap refreshed at 11.44am and all my lxc containers crashed. On a production server. When I really didn’t need it to happen.
Now whose fault is that? snap’s, for not letting me update only when I say I need to? lxd’s, for releasing a poorly tested update? Mine, for trusting snap in a production environment?
Sorry for this rant, but it’s ridiculous that you force updates on people who are perfectly capable of doing their own testing and then scheduling the update when it suits them.

4 Likes

As a professional system administrator you can solve this very easily by setting the update schedule to a date after your next regular maintenance window.

Then you manually run snap refresh lxd during your regular announced maintenance and make sure to set the schedule to a date after your next maintenance window again.

snapd allows very flexible scheduling (and delays for up to 60 days) to integrate it into your regular scheduled maintenance.

indeed, if you do not use or set up this feature it will simply assume that you did not want to delay the update and do its duty to not leave you with security holes or unfixed bugs on the machine …
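
For anyone who wants to script this, here is a minimal Python sketch of the hold-until-after-maintenance idea. It assumes the refresh.hold system option, which takes an RFC 3339 timestamp. The 60-day cap is the figure quoted above and may differ between snapd versions, and `hold_command` and `margin_days` are names made up for this example. It only builds the command; it does not run it.

```python
from datetime import datetime, timedelta, timezone

# Limit quoted in this thread; an assumption, check your snapd version.
MAX_HOLD_DAYS = 60


def hold_command(next_maintenance, now, margin_days=1):
    """Build a `snap set` command holding refreshes until after maintenance.

    Both datetimes must be timezone-aware so the timestamp carries an offset.
    """
    if next_maintenance.tzinfo is None or now.tzinfo is None:
        raise ValueError("use timezone-aware datetimes")
    hold_until = next_maintenance + timedelta(days=margin_days)
    if hold_until - now > timedelta(days=MAX_HOLD_DAYS):
        raise ValueError("hold is longer than the assumed limit")
    stamp = hold_until.isoformat(timespec="seconds")
    return ["snap", "set", "system", f"refresh.hold={stamp}"]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    window = now + timedelta(days=14)  # hypothetical next maintenance window
    print(" ".join(hold_command(window, now)))
```

During the maintenance window itself you would run snap refresh lxd by hand and then set a new hold for the following window.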

1 Like

As a professional system administrator of a small company I sometimes don’t have time to update servers every 60 days. Also, as far as I understand from reading other threads, these updates can be postponed only a certain number of times; after that they get forcibly applied.
I appreciate the effort to make the internet more secure, but you can’t shove security down other people’s throats.
I’m happy to face the consequences of my lack of security, since it depends entirely on my actions. But I wouldn’t want any more complaints from clients about services stopping unexpectedly.
That said, I’ll stop asking/spamming. I understand there’s no agreement to be reached here.
Thanks for your reply anyway. I’ll try to do my job better from now on.

1 Like

i didn’t mean to attack you, really :slight_smile:

i guess it is we who are doing a really bad job of advertising how to properly deal with snaps in such use cases (else this thread would probably not exist at all).

IMHO (and note that i was not involved in the decisions) the current behaviour is a good compromise to give you enough control while still making sure your install can not become harmful, because even though you are:

…the rest of us are probably not that happy if your lxd cluster becomes part of the next botnet that DOSes our webservers, spreads the encryption trojan that makes us lose all our data if we don’t pay some blackmailer etc :wink:

The internet is the biggest community project of mankind, each of us has a responsibility, most of us do not care though …

The behaviour of snaps is a little like the friendly policeman who regularly nudges you about that wide open weapon cabinet next to your wide open garage door, so that your neighbor doesn’t get shot with one of your guns by a thief who dropped by your street.

It is annoying, no question, but you have control and it should be our job to teach people about how to exercise this control so it does not catch you by surprise and you can actually plan with it …

I am genuinely curious if someone more familiar with security research than I am could comment on this. It seems to be taken as a foregone conclusion that automatic updates result in better security. Of course, most of us have seen widely publicized stories of servers being attacked using known vulnerabilities that should have been patched months prior but for human error. Are there no cases of servers being attacked using vulnerabilities that were introduced through automatic updates? How do we know the latter case is less probable, or is that a hypothesis that snap aims to test? The pessimist in me assumes that software changes bring new bugs. I appreciate the contributions on all sides of this dialog.

1 Like

@lance. Yes, there is a trade-off. The bleeding edge introduces bugs, which is the advantage of the Debian (etc.) approach of back-porting security fixes into older, better-vetted versions. However, as some will point out, Debian, RHEL, etc. usually only back-port significant security fixes, and minor ones remain unpatched. Put another way, the answer is complex, but you are correct.

To some degree automatic updates offer a false sense of security. It is true that automatic updates increase security for users who never update. However, it is also true that automatic updates may decrease security for those who are more security minded. This whole thread raises some use cases. My problem is that I posted such a use case and it was completely ignored. I decided to give it another shot and started a new thread with a reasonable solution: again no interest. (See Hook to run scripts before and after refresh) Had someone, anyone, shown an interest, I might believe that automatic updates in snaps are totally about security. Given the total lack of interest in cases where automatic updates harm security, I don’t buy the story. Sorry devs, but you need to listen a bit better or, at least, fake some interest.

2 Likes

@niemeyer since, with the above quote, you effectively promised to address cases where the status quo is not working well, could you have a look at tony’s suggestions? If the snappy team is not able to address a lot of use-cases with the status quo (either because the status quo can’t address them or because the snappy team doesn’t have enough time) then I guess it’s time to introduce the global off switch? :slight_smile:

@Ads20000 Thanks for pushing.

In my view, the devs have things backwards. Some day, snapd (and deps) will be heavily vetted. The chance of a breakout will be slim; signature handling, etc. will be solid. Tampering detection will be trustworthy. The snap store will be trustworthy. When that happens, the need for a global off switch will be much, much, much less. However, that day is not today. A global off switch is needed in the beginning, when snapd (etc.) can’t/shouldn’t be trusted and should be monitored.

Listen devs, if I didn’t think snaps have potential, I (and others) wouldn’t waste my (our) time posting. We are trying to help. Meet us halfway.

2 Likes

That’s exactly what we’ve been doing for years now. The first releases of snapd could not block refreshes at all, period. Nowadays there are several mechanisms that allow postponing them in specific circumstances, from metered connections, to explicit holding, to delayed results at boot, health checks are coming, etc. We take this seriously and have been demonstrating that with actual development time.

On the other hand, I hope you can also meet us halfway, and realize that some of your assumptions might not be entirely valid. For more than a decade we’ve been responsible for a system that depends on people updating their software manually. We have a reasonable understanding of many of the issues involved, including the fact that once such a switch exists, the dynamic of the whole ecosystem changes, and it’s hard or impossible to go back to automated updates.

As for the topic you raised, these hooks already exist today, in your local snapd. They are called pre-refresh and post-refresh, and that’s documented.

I do apologize for not having read and responded to your message in the forum in a timely manner, though. For reasons which are both personal and professional I’ve got a backlog in the forum that I still need to go through, and it was unfortunate that nobody else replied to your topic in time.

If you want to know more about many of the additional upcoming features, some of them related to automatic refreshes, we’ve just finished a sprint yesterday, and you can read the full notes here in the forum.

2 Likes

i bet there are … but i also bet there are an order of magnitude more attacks that use well known and unfixed vulnerabilities …

typically a security update closes something that is more or less widely known, and while a new feature that comes with an update will surely introduce new unknown bugs, they are exactly that … unknown, and will hopefully be fixed with the next security update after they have moved into the known state. automatic updates can not protect you from newly introduced bugs, but they can keep the window of being vulnerable to known security issues very small.

1 Like

@niemeyer Thanks for the response. I responded in the thread that I started and linked above (Hook to run scripts before and after refresh). To summarize for those reading here: I am asking for something different.

Here, I want to say thanks for the sprint info. It looks good. I want to strongly encourage the “Prevent refreshes while applications are in use” item. For long-running jobs, think research and scientific computing, an errant automatic update could cause the loss of days of compute. I also mentioned this in one of my earlier posts (months ago).

1 Like

I agree, and that makes sense in the traditional package management ecosystem. @tony also brought up Debian and other distribution maintainers backporting security fixes, which is a hugely valuable service. However, it seems to me that snapd is forcing automatic updates not just for critical security fixes but for features and (sometimes) breaking changes, and that is surprising users as is evidenced by the posts here. What channel should users subscribe to if they want critical security fixes but do not need the latest features? Stable is not currently providing that functionality despite what the name may imply.

I think tracks may have been intended to fill that role, but I do not think many projects are using them consistently right now. Perhaps tracks just need a little more support in order to be widely adopted, e.g. allowing snap maintainers to create and manage tracks through snapcraft or snapcraft.io.

Do we know if the update that crashed @Syco’s containers was a security patch, or was it just an update for update’s sake? I think the answer itself is not so important, but the possibility of non-critical patches causing service disruption is problematic in my opinion.

1 Like

This problem stems from the fact that there is very high mental overhead in implementing snap-specific custom solutions instead of simply enabling/disabling updates according to a custom process that fits some uncommon user’s needs. (I am assuming here that a common user will be well served by the existing defaults.) Why would I, as a snap user, need to spend time learning a new update paradigm in order to use snaps?

Such a switch can be implemented by the user, as has already been mentioned here. Speaking for myself, I am still posting here because I appreciate the work being done by the snap team. I want a solution that comes with the platform instead of fighting the platform in order to have my computer meet my needs. As a rule of thumb, whenever you have to fight the platform to make it work for you, you should be looking for an alternative.

People not upgrading their software is a social problem, not a technical one; you cannot solve a social problem with technical solutions.

4 Likes

If changes break the previous data format (a particular type of breaking change that may make it impossible to use older files with the snap - quite a serious issue!) then, when epochs are implemented, the application won’t be automatically upgraded to it, at least, that’s how I understand epochs work? @niemeyer can confirm. I don’t see why ‘breaking’ changes in terms of API or UI changes need to be held back (though they can be disruptive if someone is doing a mission critical task, they launch the app, and it takes them longer to navigate the UI/API, does the snappy team recognise this as a potential issue? Maybe the application author is allowed to specify a new epoch for this kind of breaking change too?)

This is risky because snap maintainers may not be using them correctly and people expect a consistent experience when using tracks… I think this is an element of control which snappy may be justifiably reluctant to give to developers, can you elaborate on why handing over control of these is justified? I mean, it should be possible perhaps for a track command in snapcraft to automatically create a new thread on the forum requesting a new track? You’re right that tracks help to resolve this problem, if a user specifically requests to avoid breaking changes, then assuming the tracks are used correctly and the project uses Semantic Versioning, tracking e.g. the 2.x track (probably just called ‘2’) will not automatically introduce breaking changes and you’d be able to manually choose when you want to switch to a more recent track.
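
To make the track idea concrete: a channel is written as track/risk, so following a track comes down to passing that string to --channel. Here is a small Python sketch, where `channel_command` is a helper name invented for this example, and both the snap name and the track ‘2’ are the hypothetical ones from the paragraph above, not a track any particular snap is known to publish.

```python
# Risk levels a channel can name, from most to least conservative.
RISKS = ("stable", "candidate", "beta", "edge")


def channel_command(snap, track, risk="stable"):
    """Build a `snap refresh` command that follows <track>/<risk>."""
    if risk not in RISKS:
        raise ValueError(f"unknown risk level: {risk}")
    if not track or "/" in track:
        raise ValueError("track must be a single non-empty name")
    return ["snap", "refresh", snap, f"--channel={track}/{risk}"]


if __name__ == "__main__":
    # 'some-snap' and track '2' are placeholders for illustration only.
    print(" ".join(channel_command("some-snap", "2")))
```

Staying on such a track only avoids breaking changes if the publisher actually keeps them out of that track, which is exactly the consistency concern raised above.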

Yes, the snappy team say that devs should be running automated tests etc. to ensure that updates don’t cause these problems, but in case they do occur, those using snappy in mission-critical situations should be using the refresh timer to ensure that updates happen at a time they expect. Updates can be delayed for up to a month (I think?) by using that method. Sysadmins are effectively not permitted by snappy to delay their updates for longer, because snappy reckons that to do so is to expose their users to possible security vulnerabilities etc. (you can say that ‘well, they should only get automatic security updates’, but often security/bugfix updates are not applied to old releases and a minor or major update is the only way to get the security update), and snappy believes that it has a role to protect its users, even against sysadmins’ wishes. I’d say that snappy’s forced updates are much better than Windows updates, from a user’s perspective, because, like most other GNU/Linux updates, they mostly don’t have to be applied when restarting the system, so you can actually use your computer whilst the updates are being installed!
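
The refresh timer mentioned here is the refresh.timer system option. As far as I know it accepts a weekday, optionally followed by a week-of-month number, plus a time window, e.g. fri5,23:00-01:00 for the last Friday of the month. Treat that syntax as an assumption and check the snapd documentation for your version. Here is a rough Python sketch (`timer_command` is a made-up helper) that builds the setting without running anything.

```python
import re

WEEKDAYS = ("mon", "tue", "wed", "thu", "fri", "sat", "sun")
_TIME = re.compile(r"^([01]\d|2[0-3]):[0-5]\d$")  # 24-hour HH:MM


def timer_command(weekday, start, end, week=None):
    """Build a `snap set` command restricting refreshes to one weekly window.

    `week` optionally picks the n-th such weekday of the month (1 to 5).
    """
    if weekday not in WEEKDAYS:
        raise ValueError(f"unknown weekday: {weekday}")
    if week is not None and week not in (1, 2, 3, 4, 5):
        raise ValueError("week must be between 1 and 5")
    for value in (start, end):
        if not _TIME.match(value):
            raise ValueError(f"not an HH:MM time: {value}")
    day = weekday if week is None else f"{weekday}{week}"
    return ["snap", "set", "system", f"refresh.timer={day},{start}-{end}"]


if __name__ == "__main__":
    print(" ".join(timer_command("fri", "23:00", "01:00", week=5)))
```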

Why would I, as a GNOME (albeit the Ubuntu modified session) user, need to spend time to learn a (somewhat) new desktop paradigm in order to use my desktop? Well, I do, because the devs chose to change the desktop paradigm and I’m fine with that! If I absolutely despise it, I can get on GNOME’s GitLab and IRC and fight for change (as you can do for snappy in this thread), if I want more leverage I could contribute to the project elsewhere and hope that, by meritocracy, I would get more of a say on this issue, or if I absolutely despise the paradigm change, I can use a different desktop environment, or package manager.

The snappy team reckons that a paradigm change is needed when it comes to updating software, can you prove to them that it is not? And since this (presumably, like Ubuntu) is a meritocracy, not a democracy, they’re not obliged to listen, though I think the team have done a sterling job of at least replying to criticism in a good-natured way, despite their workload :smiley:

Yes, if you absolutely despise snappy’s forced refreshes you should switch to Flatpak, AppImage, Nix, or traditional packaging :stuck_out_tongue:

That’s a neat quote, but is it true? Can you give an argument in support of this statement, or are you intending it to be a tautology (because it doesn’t seem to be, I’m not convinced those two things are mutually exclusive)? I guess it’s on the snappy team (primarily @niemeyer since this strategy is his idea), and supporters of the current approach of the team, to find an example where a technical solution has indeed solved a social problem. Perhaps undesirable work can be considered a social problem which automation could potentially resolve? So one can see how technology can in fact solve social problems, and your statement is not true? We only have to find one example to prove that your statement is false, and I think my example works!

Also, what will create change here, as ever, is actual use cases (like the LXD one - preferably with logs) that show why the current solution isn’t working, and the minimal possible changes to fix the use cases, short of introducing an off switch, if possible. If an off switch is the minimal solution, then it needs to be demonstrated why that is the minimal solution, why other apparent minimal solutions don’t work.

1 Like

And what if the next revision has the same bug? And what if that bug causes a kernel panic, so reverting might not be as simple as SSHing into the host? This is exactly the case for me with LXD currently: revision 8774 used to work, but later ones end up with a kernel panic once lxd is started. I reverted to 8774, but I’m afraid the next revision will end up with the same pain.
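
For reference, going back to a known-good revision is done with snap revert and its --revision option. Below is a minimal Python sketch that builds that command for the revision mentioned above. `revert_command` is a name made up for this example; it only assembles the command and does nothing about the kernel panic itself.

```python
def revert_command(snap, revision):
    """Build a `snap revert` command pinned to a specific local revision."""
    if int(revision) <= 0:
        raise ValueError("revision must be a positive number")
    return ["snap", "revert", snap, f"--revision={int(revision)}"]


if __name__ == "__main__":
    # 8774 is the last revision reported as working in the post above.
    print(" ".join(revert_command("lxd", 8774)))
```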

3 Likes

And an hour ago an automatic update again crashed my host with a kernel panic :frowning:

1 Like

Can you open a topic and provide the output of snap version, what distro you use, and anything specific about your LXD setup? If it’s going down with a kernel panic then it’d be great to debug that further.

2 Likes

The snap team is smart enough to understand how valid that quote is and how valid it is not. The snap team is in control of the technical solutions and that’s why they are using them. Social solutions would be much harder for them to implement in order to achieve the same result. One could argue that the reason it would be much harder to push users, through social means, to upgrade so forcefully is that it is not the correct approach.

And I am arguing that their paradigm change is misplaced. They can speak with UX designers about the problems of having two separate methods of updating software within one installation. I’d be surprised if a professional UX designer would argue that installing skype via deb from Microsoft’s repo or via snap from the snap store should matter in how skype updates for the end user. I’m pretty sure Canonical employs people with professional UX experience. If I were to guess the reason they think, as you say, that a paradigm change is needed, it is that the UX has been designed with IoT deployment use cases in mind instead of Linux desktop use cases.

What makes you think that I haven’t?

Just like you suggested that I get involved on GitLab and IRC if I don’t like something about GNOME, I am involved here because I don’t like something about snaps but I do like the overall technology and appreciate the effort put into it. I hope that as the platform matures, the developers will care more about allowing desktop users to have as much control over snaps and their updates as they have over which kernel they run and when they update it.

1 Like

On the initial question of a developer’s own snaps, the Chrome store implements a setting “max deploy percentage”, which, if set to zero, means that no extension gets updated automatically.

Providing an option like this for snap publishers could help. They could set it to 100% for critical security updates and quite low for risky feature updates, in order to prevent all devices from breaking at once. For manual refreshes, an option (enabled by default?) could choose between updating to the latest release in the channel anyway and sticking to the normal rules.
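
To illustrate what a “max deploy percentage” could mean in practice, here is a hypothetical Python sketch of a percentage-based rollout. It is not how the Chrome store or the snap store actually works. `rollout_bucket` and `should_auto_update` are invented names, and hashing the device ID into one of 100 buckets is just one simple way to get a stable, evenly spread selection.

```python
import hashlib


def rollout_bucket(device_id, release):
    """Map a device to a stable bucket in 0..99 for a given release."""
    digest = hashlib.sha256(f"{release}:{device_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def should_auto_update(device_id, release, percentage, manual=False):
    """Decide whether a device takes this release.

    percentage=0 means nobody updates automatically, 100 means everybody.
    A manual refresh bypasses the percentage in this sketch.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if manual:
        return True
    return rollout_bucket(device_id, release) < percentage


if __name__ == "__main__":
    devices = [f"device-{i}" for i in range(1000)]  # made-up device IDs
    picked = sum(should_auto_update(d, "release-42", 10) for d in devices)
    print(f"{picked} of {len(devices)} devices would update at 10%")
```

Because the bucket depends only on the device and the release, raising the percentage later only adds devices; nobody who already updated drops out of the selected set.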

(Kernel and Core snaps causing unexpected reboots are a different issue. I do know that validation assertions are used for some devices to control which version gets rolled out… It will actually downgrade if you manually install a newer version.)