The case of the privileged systemd container and the hardware clock

Today we debugged yet another issue that eventually turned out to be a time synchronization problem across multiple machines.

Let us first look at the case we solved today:
The setup was a 3-node, 1-master OpenShift deployment. Whenever we started a privileged systemd container on a node, that node would enter the NotReady state. The nodes were VMs hosted on a VMware ESXi server, and the peculiarity was that the issue did not show up on a similar OpenShift setup hosted on a libvirt hypervisor.

To truly understand the issue, you need to know about the hardware clock and the system clock. The hardware clock is maintained on a chip and powered by a battery while the system is shut down. The system clock is maintained by the OS in memory. When the computer boots, the OS reads the time from the hardware clock and uses it to set the system clock. To mitigate clock skew, the system clock is then corrected at regular intervals by ntp, chrony, or a similar service. The hardware clock has no timezone attribute, so it is the admin's duty to tell the OS how to interpret it (as UTC or as local time).
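To make the "no timezone attribute" point concrete, here is a minimal sketch of how the same hardware clock reading maps to two different instants depending on what the OS is told. The function name `interpret_rtc`, the sample reading, and the +05:30 offset are all made up for illustration; they are not taken from the setup described here.

```python
from datetime import datetime, timedelta, timezone


def interpret_rtc(rtc_reading: datetime, rtc_is_utc: bool,
                  local_offset: timedelta) -> datetime:
    """Return the UTC instant that a naive hardware clock reading stands for.

    rtc_reading  -- naive datetime, exactly as the chip reports it
    rtc_is_utc   -- what the admin told the OS about the hardware clock
    local_offset -- the machine's UTC offset (hypothetical value below)
    """
    if rtc_is_utc:
        # The reading is already UTC; just attach the timezone.
        return rtc_reading.replace(tzinfo=timezone.utc)
    # The reading is local wall-clock time; convert it to UTC.
    local_tz = timezone(local_offset)
    return rtc_reading.replace(tzinfo=local_tz).astimezone(timezone.utc)


# Hypothetical reading and offset, chosen only for the example.
reading = datetime(2018, 3, 1, 12, 0, 0)
offset = timedelta(hours=5, minutes=30)

as_utc = interpret_rtc(reading, True, offset)
as_local = interpret_rtc(reading, False, offset)

print(as_utc.isoformat())    # 2018-03-01T12:00:00+00:00
print(as_local.isoformat())  # 2018-03-01T06:30:00+00:00
print(as_utc - as_local)     # 5:30:00
```

The two interpretations differ by exactly the UTC offset, which is why a wrong setting here tends to show up as a clock that is off by a round number of hours.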

As you might have already figured, this hypervisor had the wrong date/time set in its hardware clock. It was also configured to hand that time to the VM as the VM's hardware clock time on boot. Once the VM was up, it corrected its system clock using NTP. However, when the privileged systemd container started, it picked up the wrong date/time from the hardware clock and set the system clock again. This made most of the services on the node behave strangely, and the node went into the NotReady state.
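A quick way to spot this kind of mismatch on a Linux node is to compare the hardware clock with the system clock. The sketch below is not what we ran during debugging; it assumes the relevant device is `rtc0`, that the kernel exposes it under `/sys/class/rtc/rtc0/since_epoch`, and that a 60 second threshold is a sensible cut-off, which is an arbitrary choice.

```python
import time
from pathlib import Path
from typing import Optional

# Assumption: rtc0 is the hardware clock we care about.
RTC_EPOCH_PATH = Path("/sys/class/rtc/rtc0/since_epoch")


def read_rtc_epoch(path: Path = RTC_EPOCH_PATH) -> Optional[int]:
    """Read the hardware clock as seconds since the epoch, or None if unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # No RTC exposed here (common inside unprivileged containers).
        return None


def clock_skew(rtc_epoch: float, system_epoch: float) -> float:
    """Positive result means the hardware clock is ahead of the system clock."""
    return rtc_epoch - system_epoch


def is_suspicious(skew: float, threshold: float = 60.0) -> bool:
    """Flag a skew larger than the (arbitrary) threshold, in either direction."""
    return abs(skew) > threshold


if __name__ == "__main__":
    rtc = read_rtc_epoch()
    if rtc is None:
        print("hardware clock not readable on this machine")
    else:
        skew = clock_skew(rtc, time.time())
        verdict = "suspicious" if is_suspicious(skew) else "ok"
        print(f"hardware clock minus system clock: {skew:.0f}s ({verdict})")
```

Note that the kernel reports this value as if the hardware clock held UTC, so a skew equal to a whole timezone offset points at an interpretation problem rather than at a clock that is actually wrong.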

You can avoid many errors like this if you take care to have a time sync service running on your machines.

This puts "time sync" in third position on the list of things to check when debugging a setup:

1. firewall
2. selinux
3. time sync




