I was investigating an issue with the snap create-key command: it gets stuck and the test fails with a timeout. For example, this test failed: https://travis-ci.org/snapcore/snapd/builds/241031225
I could reproduce it sporadically and found in the logs that the kernel reports a segfault. I am still investigating why this segfault happens.
I tried to create a key manually and it gets stuck. These are the processes running on the machine:
root 18725 18554 0 03:48 pts/1 00:00:00 snap create-key testcachio
root 18736 18725 0 03:48 pts/1 00:00:00 /usr/bin/gpg --homedir /root/.snap/gnupg -q --no-auto-check-trustdb --batch --gen-key
Then I set up rngd and called gpg, and I could create a key. After that the snap create-key command started working again: https://paste.ubuntu.com/24838922/ . This seems to be a problem with how random data generation is set up in the tests.
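When reproducing this by hand it can help to tell a real hang from a slow run without watching ps. Below is a minimal sketch in Python that runs a command with a time limit. The helper name finishes_within and the limits are illustrative assumptions, not part of the snapd test suite. Note that snap create-key normally asks for a passphrase, so a real test would still need to drive the prompt.

```python
import subprocess
import sys


def finishes_within(cmd, timeout_s):
    """Return True if cmd exits within timeout_s seconds, False otherwise.

    On timeout, subprocess.run kills the direct child process before
    raising TimeoutExpired; grandchildren (such as a gpg process started
    by the command) may need separate cleanup.
    """
    try:
        subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return False
    return True


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python3 check_hang.py <command> [args...]
    # The 120 second limit is an arbitrary choice for illustration.
    done = finishes_within(sys.argv[1:], 120)
    print("finished" if done else "stuck (timed out)")
```

The return value only says whether the command finished in time; it does not look at the exit status.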
Hi Sergio, thanks for looking into this! There are some fixes in place to mitigate the lack of entropy you mentioned (see Snap create-key timeouts), and we also use pollinate (https://github.com/dustinkirkland/pollinate) to seed the generator. With these in place the timeouts are much less frequent, but we are still being hit by them.
The available entropy is still too low, despite all our efforts. Another interesting data point: after pointing the generator to /dev/urandom and seeding it, the timeout seems to happen only on ubuntu-16.04-32. Your log confirms this, but it would be good to try reproducing the problem on other systems to be extra sure. @pedronis suggested excluding this system from testing until we understand the root cause of the problem.
Do you have access to the kernel cmdline? If so, putting “rng_core.default_quality=700” on there should help a lot (it will force the in-kernel RNG to properly push the entropy up).
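To confirm whether a test machine actually booted with that parameter, one option is to look at /proc/cmdline. This is a small sketch in Python; the helper name cmdline_param is made up for illustration, and it only handles simple name=value tokens without quoting.

```python
import os


def cmdline_param(cmdline, name):
    """Return the value of a name=value token on a kernel command line.

    Returns None if the parameter is absent or has no "=value" part.
    """
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name and sep:
            return value
    return None


if __name__ == "__main__" and os.path.exists("/proc/cmdline"):
    with open("/proc/cmdline") as f:
        # Prints the configured value, or None if the parameter is not set.
        print(cmdline_param(f.read(), "rng_core.default_quality"))
```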
@ogra, do you know whether the segfault that appears in the logs could be preventing entropy from being generated correctly, and whether that is why we are seeing what @fgimenez pointed out?
kernel.random.entropy_avail = 32
I’ll see if I can add "rng_core.default_quality=700"; I am not sure it is possible.
This is weird because the problem appears in only 5% of the executions.
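Since the failure is intermittent, it may help to log the entropy pool level right before each create-key run. The sketch below, in Python, parses a sysctl-style line such as the kernel.random.entropy_avail = 32 reading quoted above and can also read the value directly from /proc. The function names and the threshold of 200 are assumptions chosen for illustration, not values from the snapd tests.

```python
import os

ENTROPY_PATH = "/proc/sys/kernel/random/entropy_avail"


def parse_sysctl_line(line):
    """Split a 'key = value' sysctl line into (key, integer value)."""
    key, _, value = line.partition("=")
    return key.strip(), int(value.strip())


def read_entropy_avail(path=ENTROPY_PATH):
    """Read the kernel's current entropy estimate (in bits) from path."""
    with open(path) as f:
        return int(f.read().strip())


def is_low(entropy_bits, threshold=200):
    """Return True if the estimate is below threshold.

    The default threshold is an arbitrary illustrative value.
    """
    return entropy_bits < threshold


if __name__ == "__main__" and os.path.exists(ENTROPY_PATH):
    bits = read_entropy_avail()
    print(f"entropy_avail={bits} low={is_low(bits)}")
```

Logging this value next to each test run would show whether the timeouts line up with a drained pool.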
Yes, that would be a possible explanation (and it is also one of the reasons why we do not ship userspace rng tools in Ubuntu Core but instead force the in-kernel random number generator mentioned above to provide proper entropy).
@cachio As mentioned there, I’m merging it as it’s a step forward, but can’t we do this by default in the project prepare for all cases? There’s no reason for us to want real entropy for anything generated in those tests, as no artifacts are used. We can only ever get blocked by the usual semantics.