Issues with the Fedora mirror network

The snapd CI system is seeing a large number of failures caused by the inability to set up our Fedora systems for testing. Fedora has been switched to manual runs for now to unblock landing patches.

After talking to the Linode operators and experimenting on our F26 images, I confirmed the following:

  • There is no special mirror for the Fedora archive available on Linode
  • We are using vanilla configuration for DNF/YUM
  • The default automatic redirector used by Fedora has no registered mirror that would cover Linode

A run of dnf -v clean all && dnf -v install -y mc && dnf -v remove -y mc failed quickly with the following error (full log: https://pastebin.ubuntu.com/26365704/):

Cannot download 'https://mirrors.fedoraproject.org/metalink?repo=updates-released-f26&arch=x86_64': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried.
Error: Failed to synchronize cache for repo 'updates'

Note that the first line of that error is not normally shown (it’s enabled by -v), but the second line is visible in all failed spread runs we’ve observed lately.

Linode operator sjacobs suggested that we could operate our own mirror inside Linode for complete control.

Further investigation is ongoing.

I attempted to switch to IPv4-only mode by using dnf install -y -4, but it failed in the same way.

@Conan_Kudo suggested inspecting /var/log/dnf.librepo.log, and I collected a fragment of it here: https://pastebin.ubuntu.com/26365906/

I tried to force the main Fedora archive, but apparently that itself redirects to other places. At this time I’m giving up.

@Conan_Kudo suggested using dl.fedoraproject.org instead of the download. host, and this worked.
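
For anyone wanting to reproduce the switch, here is a minimal sketch of how a repo file could be pointed at dl.fedoraproject.org. It is not necessarily what was done by hand here or what any later change does. It assumes the stock repo file layout, where the metalink= line is active and a commented-out #baseurl= line points at download.fedoraproject.org; the helper name use_main_archive is made up for illustration.

```shell
#!/bin/sh
# use_main_archive is a hypothetical helper. It assumes the stock repo file
# layout: an active "metalink=" line plus a commented-out
# "#baseurl=http://download.fedoraproject.org/..." line.
use_main_archive() {
    for repo_file in "$@"; do
        tmp_file=$(mktemp) || return 1
        # Disable the mirror redirector and enable the baseurl entry,
        # pointing it at dl.fedoraproject.org instead of download.
        sed -e 's|^metalink=|#metalink=|' \
            -e 's|^#\(baseurl=[a-z]*://\)download\.fedoraproject\.org/|\1dl.fedoraproject.org/|' \
            "$repo_file" > "$tmp_file" &&
            cat "$tmp_file" > "$repo_file"
        status=$?
        rm -f "$tmp_file"
        [ "$status" -eq 0 ] || return 1
    done
}
```

It would be called as root with the repo files to rewrite, for example use_main_archive /etc/yum.repos.d/fedora.repo /etc/yum.repos.d/fedora-updates.repo (assuming those are the files in use), followed by dnf clean all so the old metadata is dropped.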

I ran a loop for a few hours that continuously purged the whole cache, then installed and removed a package, and it did not fail once (this was without the IPv4-only limitation). I will prepare a PR that switches over to the main archive, as the mirror network does not seem to be reliable in general.
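
The loop described above could look roughly like the sketch below. This is a reconstruction, not the exact script that was run: the names stress_loop, DNF, PKG and MAX_RUNS are hypothetical, and the iteration cap is an assumption, since the original run was simply left going for a few hours. The dnf commands themselves are the ones from the earlier experiment.

```shell
#!/bin/sh
# Sketch of a stress loop; stress_loop, DNF, PKG and MAX_RUNS are
# hypothetical names, and the iteration cap is an assumption.
DNF="${DNF:-dnf}"            # override with e.g. DNF=true for a dry run
PKG="${PKG:-mc}"             # same package as in the earlier experiment
MAX_RUNS="${MAX_RUNS:-1000}"

stress_loop() {
    i=1
    while [ "$i" -le "$MAX_RUNS" ]; do
        # Purge every cache, then install and remove the package again.
        if ! { "$DNF" -v clean all &&
               "$DNF" -v install -y "$PKG" &&
               "$DNF" -v remove -y "$PKG"; }; then
            echo "failed on iteration $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "completed $MAX_RUNS iterations without failure"
}
```

Calling stress_loop as root starts the run. It stops at the first failing dnf step and reports the iteration, which makes it easy to tell an occasional mirror failure from a consistent one.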

I proposed a PR with those changes: https://github.com/snapcore/snapd/pull/4478

The PR has landed. I will monitor the situation!

And immediately afterwards a PR failed with the same error. Maybe it is a network issue. I’ll keep investigating and disable the system again if necessary.