Applicable to:
- SolusVM 2
Symptoms
After updating Docker packages to version 20 on Ubuntu 20/CentOS 8 CR nodes, the SolusVM 2 stack cannot start properly. Errors like the following appear in the output of tail -n 40 /var/log/syslog (Ubuntu) or tail -n 40 /var/log/messages (CentOS):
Dec 10 13:48:35 server.solus.io dockerd[564826]: time="2020-12-10T13:48:35.350129897Z" level=error msg="fatal task error" error="container ingress-sbox: endpoint create on GW Network failed: failed to create endpoint gateway_ingress-sbox on network docker_gwbridge
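To confirm that a node is affected, the logs can be searched for the error signature instead of being read by eye. The following is a minimal sketch: the function name find_gwbridge_errors is hypothetical, and the search string is taken from the sample message above.

```shell
# Hypothetical helper: print log lines that match the error signature.
# Takes one or more log file paths; unreadable or missing files are skipped.
find_gwbridge_errors() {
    pattern='endpoint create on GW Network failed'
    for log in "$@"; do
        [ -r "$log" ] || continue
        # -F treats the pattern as a fixed string; "|| true" keeps a
        # no-match result from being reported as a failure.
        grep -F -- "$pattern" "$log" || true
    done
    return 0
}

# Ubuntu logs to /var/log/syslog, CentOS to /var/log/messages.
find_gwbridge_errors /var/log/syslog /var/log/messages | tail -n 5
```

No output means the signature was not found in either file; the messages may have been rotated out, so an empty result does not rule the issue out.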
Cause
The issue has been reported to the Docker developers as #41775.
Resolution
Workaround steps for Ubuntu 20:
- Connect to the affected SolusVM 2 CR node via SSH.
- Stop the solus stack:
# docker stack rm solus
- Remove the swarm:
# docker swarm leave --force
- Remove the docker-ce* and containerd.io packages:
# apt purge docker-ce
# apt purge docker-ce-cli
# apt purge containerd.io
- Remove the Docker-related rules from firewalld:
# firewall-cmd --zone=trusted --remove-interface=docker_gwbridge
# firewall-cmd --delete-zone=docker --permanent
- Create a backup of the network files:
# mkdir docker_files_bkp
# cp -av /var/lib/docker/network/files/* docker_files_bkp/
and remove them:
# rm -rf /var/lib/docker/network/files/
- Remove the docker0 and docker_gwbridge interfaces:
# ip link del docker0
# ip link del docker_gwbridge
- Install Docker in accordance with the Docker documentation:
# apt-get install docker-ce docker-ce-cli containerd.io
- Initialize the swarm:
# docker swarm init --advertise-addr 127.0.0.1 --listen-addr 127.0.0.1:2377
- Start the solus stack:
# docker stack deploy --with-registry-auth -c /usr/local/solus/config/stack.yml solus
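The backup step in the list above writes to a directory relative to the current working directory, so a second run from the same place would mix old and new files. The following is a minimal sketch of a variant that copies into a timestamped directory. The function name backup_network_files and the destination layout are assumptions for illustration, not part of SolusVM or Docker.

```shell
# Hypothetical helper: copy the contents of a directory into a new
# timestamped backup directory and print the path of that backup.
backup_network_files() {
    src=$1
    dest_root=$2
    dest="$dest_root/docker_files_bkp_$(date +%Y%m%d%H%M%S)"
    mkdir -p "$dest" || return 1
    # "$src"/. copies the directory contents (hidden files included),
    # and -a preserves ownership, modes and timestamps.
    cp -a "$src"/. "$dest"/ || return 1
    printf '%s\n' "$dest"
}

# Example call, to be run as root before the rm -rf step:
# backup_network_files /var/lib/docker/network/files /root
```

The printed path can be noted down in case the original network files have to be restored.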
Workaround steps for CentOS 8:
- Connect to the affected SolusVM 2 CR node via SSH.
- Stop the solus stack:
# docker stack rm solus
- Remove the swarm:
# docker swarm leave --force
- Remove the docker-ce* and containerd.io packages:
# yum remove docker-ce
# yum remove docker-ce-cli
# yum remove containerd.io
- Remove the Docker-related rules from firewalld:
# firewall-cmd --zone=trusted --remove-interface=docker_gwbridge
# firewall-cmd --delete-zone=docker --permanent
- Create a backup of the network files:
# mkdir docker_files_bkp
# cp -av /var/lib/docker/network/files/* docker_files_bkp/
and remove them:
# rm -rf /var/lib/docker/network/files/
- Remove the docker0 and docker_gwbridge interfaces:
# ip link del docker0
# ip link del docker_gwbridge
- Install Docker in accordance with the Docker documentation:
# yum install docker-ce docker-ce-cli containerd.io
- Initialize the swarm:
# docker swarm init --advertise-addr 127.0.0.1 --listen-addr 127.0.0.1:2377
- Start the solus stack:
# docker stack deploy --with-registry-auth -c /usr/local/solus/config/stack.yml solus
- Make the runtime firewalld zones permanent:
# firewall-cmd --runtime-to-permanent
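After the last step, it can be worth confirming that the swarm came up before looking at individual services. The following is a minimal sketch that reads the swarm state from docker info. The function name swarm_is_active is hypothetical, and the check assumes the Docker CLI is on the PATH.

```shell
# Hypothetical helper: succeed only when the local node reports an
# active swarm. Uses the Go-template output of "docker info".
swarm_is_active() {
    state=$(docker info --format '{{.Swarm.LocalNodeState}}' 2>/dev/null) || return 1
    [ "$state" = "active" ]
}

# Example call:
# swarm_is_active && echo "swarm is active" || echo "swarm is not active"
```

If the swarm is active, docker stack services solus lists the services of the redeployed stack together with their replica counts.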