Ensuring proper termination of services on errors

@mborzecki I suspect we might have an issue with the logic merged in PR #4594, which was potentially there before as well: we need to ensure the proper termination of services in the case of failure.

In other words, we have tasks that start up services in our change system, and that task can failure and cause the rest of the flow to be undone. On such errors, we do not call the undo of the service to stop it, by design, because the termination of the task itself was clean even if it was in error. In those cases, we actually want to undo just what was really done, and only if the task reaches its final intended state is that we ever run the undo node on reversals.

So, for the implemented logic, we need to make sure that this is behind handled properly now that we start multiple services at once.

Also, taking this chance, we may have a related issue on mounting operations, as reported here yesterday. We need to ensure that if the mounting operation error out for whatever reason, all of the modifications are undone, including any mount units that were created. This is already done correctly in the undo path, but we need to handle it as well in the error path, same as the services case presented above.

@mborzecki Can you please have a look at this latter issue as well after you verify the first one related to your PR that was just merged?

Thank you!

It’s not perfectly clear in the diff, but we go through the list of all apps in the snap and set up a deferred call to stop each one if the whole wrappers.StartServices() errors out. See the relevant code here: https://github.com/snapcore/snapd/pull/4594/files#diff-08088787360fb3ca74a0aed4c6103b22R119

I might have missed some corner case though, so I’ll be sure to go through the whole path again on Monday.

Thanks for double checking. Can you please make sure this is the case for mounts as well?

Thanks again.

Right, so the sockets were not getting disabled once they got enabled and the corresponding start failed. THe PR addressing this is https://github.com/snapcore/snapd/pull/4607.

I’ll look into the mounts next.

Another propsed branch, snap setup & mounts this time: https://github.com/snapcore/snapd/pull/4615 Turns out that if the mount unit fails for whatever reason, we will leave some files behind.