Hi,
I hit this when the libvirt service (part of microstack) tries to spawn a dnsmasq
process for a network, and again when dnsmasq calls libvirt_leaseshelper.
$ cat > default.xml <<EOF
<network>
<name>default</name>
<forward mode='nat'>
<nat>
<port start='1024' end='65535'/>
</nat>
</forward>
<bridge name='virbr0' stp='on' delay='0'/>
<ip address='192.168.123.1' netmask='255.255.255.0'>
<dhcp>
<range start='192.168.123.2' end='192.168.123.254'/>
</dhcp>
</ip>
</network>
EOF
$ sudo uvtool-checkbox.virsh --connect qemu:///system net-define default.xml
ubuntu@focal-snaptest:~$ sudo microstack.virsh --connect qemu:///system net-start default
error: Failed to start network default
error: internal error: Child process (VIR_BRIDGE_NAME=virbr0 /snap/microstack/current/usr/sbin/dnsmasq --conf-file=/var/snap/microstack/common/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/snap/microstack/current/usr/libexec/libvirt_leaseshelper) unexpected exit status 127: /snap/microstack/current/usr/sbin/dnsmasq: error while loading shared libraries: libidn.so.11: cannot open shared object file: No such file or directory
There is a workaround in the yaml that is supposed to get the libraries right for spawned subprocesses:
=> https://opendev.org/x/microstack/src/branch/master/snapcraft.yaml#L1243
build-environment:
# Libraries under /snap/$SNAPCRAFT_PROJECT_NAME/current/usr/lib/x86_64-linux-gnu are not added to the
# runpath by default. This is OK for parent processes which get LD_LIBRARY_PATH set properly but not
# for the child processes they spawn since the environment variables are not passed down to children by default after execve(2).
# `readelf -d /snap/microstack/current/usr/libexec/virt-aa-helper` should return something like:
# (RUNPATH) Library runpath: [/snap/microstack/current/usr/lib:/snap/microstack/current/usr/lib/x86_64-linux-gnu:...]
- LDFLAGS: '$LDFLAGS -Wl,-rpath=/snap/$SNAPCRAFT_PROJECT_NAME/current/usr/lib -Wl,-rpath=/snap/$SNAPCRAFT_PROJECT_NAME/current/usr/lib/x86_64-linux-gnu -Wl,-rpath=/snap/$SNAPCRAFT_PROJECT_NAME/current/lib -Wl,-rpath=/lib/x86_64-linux-gnu -Wl,-rpath=/lib/'
And it indeed carries an extended runpath into the binaries that are built with it.
$ readelf -d /snap/microstack/current/usr/libexec/virt-login-shell-helper | grep runpath
0x000000000000001d (RUNPATH) Library runpath: [/snap/microstack/current/usr/lib:/snap/microstack/current/usr/lib/x86_64-linux-gnu:/snap/microstack/current/lib:/lib/x86_64-linux-gnu:/lib/]
But on the one hand it isn’t working even with this in place, and on the other hand there are other binaries, e.g. dnsmasq,
that come from a .deb. I don’t want to rebuild
all binaries that my service might call and infuse them with rpaths.
(And as I mentioned, it doesn’t always work with runpath anyway.)
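One possible reason a binary with a runpath can still fail, offered here as an assumption and not something confirmed above: per ld.so(8), DT_RUNPATH directories are searched only for an object's direct DT_NEEDED dependencies, not for the dependencies of those libraries. A quick check is whether the missing library is a direct dependency at all. The sketch below wraps `readelf -d`; the helper names (needed_from_readelf, direct_deps, is_direct_dep) are hypothetical:

```shell
# Extract library names from the (NEEDED) lines of `readelf -d` output on stdin.
needed_from_readelf() {
    sed -n 's/.*(NEEDED).*\[\(.*\)]$/\1/p'
}

# direct_deps FILE: print the direct (DT_NEEDED) dependencies of FILE.
direct_deps() {
    readelf -d "$1" | needed_from_readelf
}

# is_direct_dep FILE LIB: succeed if LIB is a direct dependency of FILE.
is_direct_dep() {
    direct_deps "$1" | grep -Fxq "$2"
}
```

For example, `is_direct_dep /snap/microstack/current/usr/libexec/libvirt_leaseshelper libicuuc.so.66 || echo "transitive or absent"`. If the library turns out to be transitive, the runpath of the helper alone would not be expected to cover it.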
The call chain is like
libvirtd.service
-> dnsmasq
-> /snap/microstack/current/usr/libexec/libvirt_leaseshelper
To be clear, the lib is there in the snap
ubuntu@focal-snaptest:~$ sudo find /snap/microstack/ -name '*libidn.so.11*'
/snap/microstack/233/lib/x86_64-linux-gnu/libidn.so.11
/snap/microstack/233/lib/x86_64-linux-gnu/libidn.so.11.6.16
I found this to be similar to what happens when calling things through sudo.
For comparison, the real /usr/bin/sudo causes the same kind of failure:
ubuntu@focal-snaptest:/home/ubuntu$ /usr/bin/sudo /snap/microstack/current/usr/libexec/libvirt_leaseshelper
/snap/microstack/current/usr/libexec/libvirt_leaseshelper: error while loading shared libraries: libicuuc.so.66: cannot open shared object file: No such file or directory
And I found that microstack has a wrapper for that as well, which keeps the real sudo from being used
in order to avoid exactly this
=> https://opendev.org/x/microstack/src/branch/master/snap-overlay/bin/sudo
With that mapped to bin it works for sudo:
ubuntu@focal-snaptest:/home/ubuntu$ /snap/microstack/233/bin/sudo /snap/microstack/current/usr/libexec/libvirt_leaseshelper
/snap/microstack/current/usr/libexec/libvirt_leaseshelper: try --help for more details
So whether it fails depends on calling it through the real or the fake sudo.
How could I avoid the same problem for the call path that I actually have?
libvirt -spawns-> dnsmasq -calls-> libvirt_leaseshelper
The lack of LD_LIBRARY_PATH is exactly what makes those two differ, even for a binary that has the runpath set.
ubuntu@focal-snaptest:/home/ubuntu$ sudo env | grep LD_LIBRARY_PATH
LD_LIBRARY_PATH=/var/lib/snapd/lib/gl:/var/lib/snapd/lib/gl32:/var/lib/snapd/void:/snap/microstack/233/lib:/snap/microstack/233/usr/lib:/snap/microstack/233/lib/x86_64-linux-gnu:/snap/microstack/233/usr/lib/x86_64-linux-gnu
ubuntu@focal-snaptest:/home/ubuntu$ /usr/bin/sudo env | grep LD_LIBRARY_PATH
<nothing>
And indeed, if I pass the path manually along with sudo then it works.
ubuntu@focal-snaptest:/home/ubuntu$ /usr/bin/sudo LD_LIBRARY_PATH=/var/lib/snapd/lib/gl:/var/lib/snapd/lib/gl32:/var/lib/snapd/void:/snap/microstack/233/lib:/snap/microstack/233/usr/lib:/snap/microstack/233/lib/x86_64-linux-gnu:/snap/microstack/233/usr/lib/x86_64-linux-gnu /snap/microstack/current/usr/libexec/libvirt_leaseshelper
/snap/microstack/current/usr/libexec/libvirt_leaseshelper: try --help for more details
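The difference between the two sudo variants can be reproduced without sudo or a snap. A minimal sketch, assuming only a POSIX shell and `env`; the helper names and the /opt/demo/lib path are made up for illustration, and `env -i` merely stands in for an environment reset such as sudo's default env_reset:

```shell
# Command run in the child: print the LD_LIBRARY_PATH it sees, or "<unset>".
print_ldpath='echo "${LD_LIBRARY_PATH:-<unset>}"'

# ldpath_inherited VALUE: child started with the variable passed through.
ldpath_inherited() {
    LD_LIBRARY_PATH="$1" /bin/sh -c "$print_ldpath"
}

# ldpath_after_reset VALUE: child started behind an environment reset.
ldpath_after_reset() {
    LD_LIBRARY_PATH="$1" env -i /bin/sh -c "$print_ldpath"
}

echo "inherited:   $(ldpath_inherited /opt/demo/lib)"
echo "after reset: $(ldpath_after_reset /opt/demo/lib)"
```

The first line reports the path that was passed in, the second reports `<unset>`. Any process in the chain that resets the environment has the same effect on everything it spawns afterwards.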
After understanding the above, I have for the time being worked around it with a wrapper like
LD_LIBRARY_PATH="%%LDPATH%%" exec "$0.orig" "$@"
In the overlay part, at override-build, I set the placeholder with:
sed -i -e 's?%%LDPATH%%?/snap/$SNAPCRAFT_PROJECT_NAME/current/usr/lib:/snap/$SNAPCRAFT_PROJECT_NAME/current/usr/lib/x86_64-linux-gnu:/snap/$SNAPCRAFT_PROJECT_NAME/current/lib:/lib/x86_64-linux-gnu:/lib:\$LD_LIBRARY_PATH?' bin/ldpathwrapper
Then I organize each affected binary to a .orig name
usr/sbin/dnsmasq: usr/sbin/dnsmasq.orig
And link each of those to the wrapper in override-build
ln --force --relative --symbolic bin/ldpathwrapper "${bin}";
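Put together, the wrapper approach above might look like the following sketch. It is not the exact microstack code: write_wrapper and wrap_binary are hypothetical helper names, the wrapper is generated from a here-document instead of the sed placeholder, and `mv` stands in for the organize step.

```shell
# write_wrapper DEST LDPATH: generate the wrapper script at DEST.
write_wrapper() {
    cat > "$1" <<EOF
#!/bin/sh
# Prepend the given library directories, then run the real binary,
# which is expected to sit next to the invoked name as <name>.orig.
LD_LIBRARY_PATH="$2\${LD_LIBRARY_PATH:+:\$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH
exec "\$0.orig" "\$@"
EOF
    chmod +x "$1"
}

# wrap_binary WRAPPER BIN: keep the real binary as BIN.orig and make BIN
# a relative symlink to WRAPPER (same ln call as in override-build above).
wrap_binary() {
    mv "$2" "$2.orig"
    ln --force --relative --symbolic "$1" "$2"
}
```

In override-build this could be called as `write_wrapper bin/ldpathwrapper "<ldpath>"` followed by `wrap_binary bin/ldpathwrapper usr/sbin/dnsmasq`. The `${LD_LIBRARY_PATH:+:...}` form avoids a trailing empty entry, which the dynamic loader would treat as the current directory.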
It isn’t perfect yet, and my understanding of this might still be incomplete.
But, slightly frustrated, I’m almost considering applying this to ALL binaries in my snap, as it is
painful to encounter them one by one.
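Rather than hitting them one by one, the affected binaries could be listed up front by asking ldd what fails to resolve once LD_LIBRARY_PATH is gone. A rough sketch, assuming `env -u` and `ldd` are available; list_unresolved is a hypothetical helper name, and ldd should only be pointed at trusted binaries since it may execute them:

```shell
# list_unresolved DIR: for every executable regular file under DIR, print the
# file name followed by the "not found" lines ldd reports when
# LD_LIBRARY_PATH is unset. Files that resolve completely print nothing.
list_unresolved() {
    find "$1" -type f -perm -u=x | while IFS= read -r bin; do
        missing=$(env -u LD_LIBRARY_PATH ldd "$bin" 2>/dev/null | grep 'not found' || true)
        if [ -n "$missing" ]; then
            printf '%s\n%s\n' "$bin" "$missing"
        fi
    done
}
```

Running `list_unresolved /snap/microstack/current/` would then give one candidate list for wrapping. Because ldd honours a binary's own runpath, binaries that already resolve through their runpath should not show up.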
After the above debugging I have searched the forum and found
- Effects of dynamic library caching on Snap startup performance - about caching, not really my case
- LD_LIBRARY_PATH for custom libraries in snap - mine is not about custom libs
- LD_LIBRARY_PATH in classic snap - reasonable, as it mentions that classic will use rpath
- Several libraries from stage-packages missing despite proper LD_LIBRARY_PATH - also not my case; direct calls work, only call chains are my problem
- Not exporting `LD_LIBRARY_PATH` in desktop helpers when using classic confinement - this is about calling programs outside the snap
To me they all seem related, but none answers the puzzle I’m at which is:
"how to ensure globally (not case by case) that call chains will not miss libs
due to having LD_LIBRARY_PATH stripped"?
I’m expecting that I’m just not seeing the whole picture. There must be a better way than
rpath tweaks and per-binary wrappers (which is what microstack does). Something global that just
makes spawned subprocesses consider the extra paths correctly?
P.S. Worst case there is nothing to fix this better, but then at least this discussion
will serve others hitting the same issue as a document that can be found with search-fu.