In the conjure-up world we span multiple substrates to install big software. One of the most popular substrates is our Localhost(LXD) provider. It is also one of the main sources of the bug reports we get, due to LXD configuration and installation issues. This is further complicated because on Ubuntu Xenial 16.04 you can have multiple LXDs running at the same time via
apt install or
A lot of our issues come in the form of the following:
- A user has previously set up LXD and then attempted to run conjure-up against it. One problem we ran into here is with custom ZFS storage options.
- A user has previously installed the initial Debian package of conjure-up, which set up some custom LXD bridges for use with our single-machine OpenStack deployments. This leads to the next point.
- A user has upgraded from LXD 2.0.9 to the PPA version of 2.13 (at the time of this writing), has several bridges, and runs into the migration from using /etc/default/lxd-bridge to managing that within LXD itself.
- A user installs conjure-up and can do all things LXD (lxc list; lxc launch ubuntu:16.04 u1), but when it comes to Juju it fails to connect to the default 8443 port. In one such case someone had used lxc-gui to configure their environment.
- A user enables IPv6 on LXD.
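For the 8443 case, it can help to separate "the lxc client works" from "the LXD port is reachable over the network", since a local lxc command succeeding does not by itself show that the port is open. The following is a minimal sketch of such a probe, not something conjure-up or Juju ships; the function name is hypothetical, and the only thing taken from above is the default port number:

```python
import socket


def port_reachable(host, port=8443, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it only checks that something accepts connections,
    not that the listener is actually LXD or that TLS would succeed.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling port_reachable with the address of the LXD bridge would show whether anything is listening there at all, which narrows the problem before digging into Juju itself.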
Some of these items we can address in the LXD profiles themselves. This is our current approach: conjure-up spells can perform LXD profile edits prior to deploying big software, for example mapping a custom default storage pool for use with conjure-up.
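To make the storage pool example concrete, here is a rough sketch of what such a profile edit amounts to. It assumes the profile is available as a dictionary in the shape LXD uses for profiles (a "devices" map whose root disk entry carries a "pool" key); the function name and the pool name "conjure-up-pool" are made up for illustration and are not conjure-up's actual code:

```python
import copy


def with_root_pool(profile, pool):
    """Return a copy of an LXD profile dict whose root disk uses `pool`.

    Hypothetical helper; the input profile is left untouched.
    """
    updated = copy.deepcopy(profile)
    # A profile with no devices may carry None or no key at all here.
    devices = updated.get("devices") or {}
    updated["devices"] = devices
    # Create a root disk device if the profile does not define one yet.
    root = devices.setdefault("root", {"path": "/", "type": "disk"})
    root["pool"] = pool
    return updated


default_profile = {
    "name": "default",
    "config": {},
    "devices": {
        "eth0": {"name": "eth0", "nictype": "bridged",
                 "parent": "lxdbr0", "type": "nic"},
    },
}

edited = with_root_pool(default_profile, "conjure-up-pool")
```

The edited dictionary could then be serialized to YAML and passed to lxc profile edit on standard input. Returning a copy rather than mutating in place makes it easy to compare the profile before and after the change.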
In conjure-up we also do some checking to validate an LXD environment:
- Install the latest LXD from the snap store or PPA during conjure-up installation to make sure we have the latest LXD.
- Use lxc network to create a custom network bridge for use with single-machine OpenStack.
- Verify that a lxdbr0 default bridge exists.
- Verify that IPv6 is not enabled.
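The last two checks can be sketched as a small function over network configuration. This is an illustration under stated assumptions, not conjure-up's implementation: it assumes the caller has already collected each network's config keys (for example from lxc network show) into a dictionary, and it treats an ipv6.address that is unset or set to "none" as "IPv6 disabled". The function names are hypothetical.

```python
def ipv6_enabled(network_config):
    """Return True if the network config carries an IPv6 address setting.

    Assumption: an unset key or the literal value "none" means IPv6 is off.
    """
    value = (network_config.get("ipv6.address") or "none").strip().lower()
    return value != "none"


def check_lxd_networks(networks, bridge="lxdbr0"):
    """Return a list of human-readable problems; an empty list means OK.

    `networks` maps network name -> config dict.
    """
    problems = []
    config = networks.get(bridge)
    if config is None:
        problems.append(f"bridge {bridge} does not exist")
    elif ipv6_enabled(config):
        problems.append(f"IPv6 is enabled on {bridge}")
    return problems
```

Returning a list of problems rather than raising on the first failure lets the installer report everything that is wrong with the environment in one pass.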
Even with the above checks we still continuously hit issues revolving around LXD. Additionally, conjure-up makes modifications to your existing LXD environment that may or may not be acceptable to users who rely on LXD for things outside of conjure-up. Even if these changes were reversible, we would be unable to reverse them because snapd has no concept of uninstall hooks.
One solution we keep coming back to is to bundle LXD within conjure-up and isolate this environment from the rest of the system. Some of the pros of this approach are:
- We manage LXD and can control the known versions that will work with Juju. At times there have been API changes on either side that affected the other.
- Any additions required for LXD to work with conjure-up can be done outside of the user's normal LXD installation. We realize we can't get away with making no modifications on the host system if we are to provide an out-of-the-box experience, but this would at least isolate those changes.
- On any conjure-up snap upgrade we can verify that LXD is configured properly and make better assumptions about what conjure-up can expect when doing Localhost(LXD) deployments.
- Repeatable and known good releases with our testing harness.
Some of the cons are:
- More maintenance burden
- Upgrading existing deployments would mean we'd have to attempt to migrate those over to the bundled LXD or socialize the fact that you would have to redeploy. This would be a one-time thing.
- Containers would only be visible via
- Probably a lot of other things...
Also, I think this problem would still surface regardless of whether we had interfaces to LXD or snap dependency resolution.
I wanted to get this out there to get people's feedback and any alternative suggestions for solving this problem. The idea of packaging everything we need inside a single snap is appealing; however, we are definitely open to other avenues.