--use-lxd doesn't work with non-local remote

abitrolly · May 27, 2020, 7:41am

Need help with unmocking FakeContainer to fix this bug.

Suppose local is not the default remote.

✗ lxc remote list
+----------------+------------------------------------------+---------------+-------------+--------+--------+
|      NAME      |                   URL                    |   PROTOCOL    |  AUTH TYPE  | PUBLIC | STATIC |
...
| local          | unix://                                  | lxd           | file access | NO     | YES    |

Then snapcraft --use-lxd will fail.

✗ snapcraft --use-lxd
Launching a container.
Waiting for container to be ready
Error: not found
An error occurred when trying to execute 'mv /var/tmp/L3Jvb3QvLmJhc2hyYw== /root/.bashrc' with 'LXD': returned exit code 1.

The problem is that snapcraft uses pylxd for everything except executing commands in a container. For execution it relies on the lxc command-line tool, which reads the user's config and therefore talks to whatever remote is set as the default. Personally, I would like snapcraft to use my configured remote, but as a quick fix I ported the offending piece to pylxd.
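A minimal sketch of what running a command through pylxd instead of `lxc exec` could look like. It assumes a container object that exposes pylxd's `execute()` method, which returns a result with `exit_code`, `stdout` and `stderr`. The helper name `run_in_container` and the error class are hypothetical and are not taken from snapcraft or from the PR.

```python
from typing import Sequence


class ContainerRunError(Exception):
    """Hypothetical error raised when a command fails inside the container."""


def run_in_container(container, command: Sequence[str]) -> str:
    """Run `command` through the container's execute() and return its stdout.

    `container` is assumed to behave like a pylxd container object: its
    execute() takes a list of arguments and returns a result exposing
    exit_code, stdout and stderr.
    """
    result = container.execute(list(command))
    if result.exit_code != 0:
        joined = " ".join(command)
        raise ContainerRunError(
            f"'{joined}' returned exit code {result.exit_code}: {result.stderr}"
        )
    return result.stdout
```

With a real pylxd client, the container would come from something like `pylxd.Client().containers.get(name)`, so the command goes to the same LXD endpoint as every other pylxd call, regardless of which remote the lxc command-line tool has as its default.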

I haven’t tested it locally yet, because I hoped the CI tests would pass. They didn’t, and the reason is that they use a FakeContainer object instead of a real container. FakeContainer doesn’t have an execute() method, and I am not sure that reimplementing pylxd methods for mocking is the right way.
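One option that avoids reimplementing pylxd is to give the fake only the single method the code under test calls, and have it record what it was asked to run. The sketch below is hypothetical: this `FakeContainer` is not snapcraft's actual test fixture, and `ExecResult` merely mirrors the `(exit_code, stdout, stderr)` shape that pylxd's `execute()` returns.

```python
from collections import namedtuple

# Mirrors the shape of the result returned by pylxd's execute().
ExecResult = namedtuple("ExecResult", ["exit_code", "stdout", "stderr"])


class FakeContainer:
    """Hypothetical test double: records commands instead of running them."""

    def __init__(self, name, exit_code=0):
        self.name = name
        self.exit_code = exit_code
        self.executed = []  # every command list passed to execute()

    def execute(self, commands, environment=None):
        # Store a copy so later mutation by the caller cannot alter the record.
        self.executed.append(list(commands))
        return ExecResult(self.exit_code, "", "")


# A test can then assert on what the code under test asked the container to do.
container = FakeContainer("snapcraft-myip")
result = container.execute(["mv", "/var/tmp/src", "/root/.bashrc"])
assert result.exit_code == 0
assert container.executed == [["mv", "/var/tmp/src", "/root/.bashrc"]]
```

Recording the commands also makes it easy to catch the case where the code under test ignores the command it was given.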

sergiusens · May 27, 2020, 2:22pm

Mounts will also fail if the remote is not local, so getting rid of just our use of “lxc exec” will not be enough.

abitrolly · May 27, 2020, 5:13pm

I know that the mount hack doesn’t work - https://github.com/yakshaveinc/linux/issues/32 - but there is no company to sponsor a solution with transparent network access.

jamesh · May 28, 2020, 1:59am

Does this properly describe your problem?

You’ve edited ~/.config/lxc/config.yml to set default-remote to something other than local
You are happy to have Snapcraft builds run locally.
Snapcraft’s pylxd calls successfully talk to the local LXD service …
… but when it shells out to lxc exec instancename it ends up talking to a remote LXD service

One way to solve this would be to continue shelling out to lxc exec but be explicit about which remote we want to talk to. Calling lxc exec local:instancename would probably do the trick.
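A sketch of that idea, assuming the command is assembled as an argument list before being handed to `subprocess`. The helper name `lxc_exec_command` is hypothetical; the `remote:instance` form and the `--` separator are the lxc command-line syntax already used in this thread.

```python
from typing import List, Sequence


def lxc_exec_command(
    instance: str, command: Sequence[str], remote: str = "local"
) -> List[str]:
    """Build an `lxc exec` argument list that names the remote explicitly.

    Prefixing the instance with "<remote>:" means the call no longer depends
    on which remote the user's lxc config has as its default.
    """
    return ["lxc", "exec", f"{remote}:{instance}", "--", *command]
```

It could then be run with, for example, `subprocess.check_call(lxc_exec_command("snapcraft-myip", ["ls", "/root"]))`.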

abitrolly · May 28, 2020, 4:14pm

> You’ve edited ~/.config/lxc/config.yml to set default-remote to something other than local

I used lxc remote switch <remote>.

> You are happy to have Snapcraft builds run locally.

Not really. My laptop has only 8 GB of memory, and I’d prefer to offload all expensive build tasks to a dedicated Linux station with LXD, containers and lots of RAM.

> Snapcraft’s pylxd calls successfully talk to the local LXD service …

Yes.

✗ lxc list local:
WARNING: cgroup v2 is not fully supported yet, proceeding with partial confinement
+----------------+---------+------+------+-----------+-----------+
|      NAME      |  STATE  | IPV4 | IPV6 |   TYPE    | SNAPSHOTS |
+----------------+---------+------+------+-----------+-----------+
| snapcraft-myip | STOPPED |      |      | CONTAINER | 0         |
+----------------+---------+------+------+-----------+-----------+

> … but when it shells out to lxc exec instancename it ends up talking to a remote LXD service

Exactly.

> One way to solve this would be to continue shelling out to lxc exec but be explicit about which remote we want to talk to. Calling lxc exec local:instancename would probably do the trick.

And, to avoid surprises, make sure that no other parameters from the user’s lxc config are used. But I think that just rewriting the tests for Pull Request #3147 in canonical/snapcraft (Use `pylxd` instead of `lxc exec`) should do the trick.

abitrolly · May 28, 2020, 4:24pm

And the reason to use a remote LXD is that the local LXD on Fedora still doesn’t work out of the box.

✗ lxc remote switch local
WARNING: cgroup v2 is not fully supported yet, proceeding with partial confinement
✗ lxc launch ubuntu:18.04 xxx
WARNING: cgroup v2 is not fully supported yet, proceeding with partial confinement
Creating xxx
Starting xxx
✗ lxc exec xxx -- bash
WARNING: cgroup v2 is not fully supported yet, proceeding with partial confinement
root@xxx:~# ping google.com
ping: google.com: Temporary failure in name resolution

jamesh · May 29, 2020, 1:36am

I don’t think it is just the tests that are a problem with your PR. It doesn’t actually seem to use the command passed in to the _run() method, so I’m not sure how it could actually work.

abitrolly · May 29, 2020, 6:46am

Fixed. What is the easiest way to run this custom snapcraft?