5. Accurate Global Time Synchronization
Keeping accurate time on all your systems is as important as having a solid network security strategy (which takes much more than simple firewall boxes). It is one of the primary components of system administration based on good practices, which leads to organization and security. Accurate time is a must especially when you administer distributed applications, web services, or a distributed security monitoring tool.
5.1. NTP: The Network Time Protocol
We won't discuss the protocol itself here, but rather how this wonderful invention, combined with the pervasiveness of the Internet, can be useful to us. You can find more about it at www.ntp.org.
Once your system is properly set up, NTP will keep its time accurate by making very small adjustments, so that running applications are not affected.
People can get exact time using hardware based on the oscillation frequency of atoms (atomic clocks). There is also a method based on GPS (Global Positioning System). The first is more accurate, but the second is pretty good too. Atomic clocks require very special and expensive equipment, but their maintainers (usually universities and research labs) connect them to computers that run an NTP daemon. Some of those computers are connected to the Internet, which finally lets us access them for free. This is how we'll synchronize our systems.
5.2. Building a Simple Time Synchronization Architecture
You will need:
A direct or indirect (through a firewall) connection to the Internet.
Some NTP servers to synchronize with. You can use the public server pool.ntp.org, or choose some from the list of public stratum 2 time servers on the NTP website. If you don't have Internet access, your WAN administrator (must be a clever guy) can provide you with some internal addresses.
The NTP package installed on all systems you want to synchronize. You can find RPMs on your favorite Linux distribution CD, or search on rpmfind.net.
Here is an example of a good architecture:
If you have several machines to synchronize, do not make them all access the remote NTP servers you chose. Only 2 machines in your server farm should access the remote NTP servers, and the other machines will sync with these 2. We will call them the Relay Servers.
Your Relay Servers can be any machines already available in your network. NTP consumes little memory and CPU, so you don't need a dedicated machine for it.
It is a good idea to create hostname aliases for your local Relay Servers, like ntp1.my.com and ntp2.my.com, and use only these names when configuring the client machines. This way you can move the NTP functionality to a new Relay Server (with a different IP and hostname) without having to reconfigure the clients. Ask your DNS administrator to create such aliases.
5.3. NTP Configurations
- For Your Relay Servers
Edit /etc/ntp.conf and add the remote servers you chose:
Example 5. Relay machines' /etc/ntp.conf
.
.
server otherntp.server.org    # A stratum 1 server at server.org
server ntp.research.gov       # A stratum 2 server at research.gov
.
.
Again, you can use the public server pool.ntp.org, or get a list of public stratum 2 time servers from the NTP website.
- For Your Clients
Edit /etc/ntp.conf and add your Relay Servers with a standard name:
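A minimal sketch of such a client configuration, assuming the aliases ntp1.my.com and ntp2.my.com suggested in section 5.2. The function name add_relay_servers and the scratch file name are hypothetical, chosen only for illustration; the function appends to whatever file you pass it, so nothing touches the real /etc/ntp.conf unless you ask for it.

```shell
# Sketch: append the two Relay Server aliases to a client's ntp.conf.
# With no argument it writes to a scratch file (hypothetical name) in the
# current directory; pass /etc/ntp.conf, as root, to edit the real file.
add_relay_servers() {
    conf="${1:-./ntp.conf.example}"
    cat >> "$conf" <<'EOF'
server ntp1.my.com    # first local Relay Server (alias from section 5.2)
server ntp2.my.com    # second local Relay Server (alias from section 5.2)
EOF
}
```

Calling add_relay_servers /etc/ntp.conf as root would append both lines to the real configuration; review the file afterwards, since the sketch does not check for duplicates.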
If your machine's clock is too far from the time of the NTP servers, NTP will not work (by default, ntpd gives up when the offset exceeds 1000 seconds, about 17 minutes). So you must do a first full sync, and I recommend doing it outside production hours. You need to do it only once, during the initial NTP setup:
Example 7. First sync
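One possible way to do this first sync, sketched under two assumptions: the ntpdate utility is installed, and ntpd is controlled through the same service command used in this section. The function name first_sync and the default alias ntp1.my.com are illustrative only; substitute one of your own servers.

```shell
# Sketch of a one-time clock step. Run as root on the machine to be set.
first_sync() {
    relay="${1:-ntp1.my.com}"   # assumed alias from section 5.2
    service ntpd stop            # ntpdate cannot use the NTP port while ntpd holds it
    ntpdate "$relay"             # set the local clock from the chosen server
}
```

After something like first_sync ntp1.my.com has run, the daemon still has to be started again, which is what the next step does.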
The last step is to start or restart the NTP daemon on each machine:
bash# service ntpd restart
5.4. Watching Your Box Synchronizing
Now you have everything set up. NTP will gently keep your machine's time synchronized. You can watch this process using the NTP query program (the ntpq command):
Example 8. A time synchronization status
bash# ntpq -p
     remote           refid      st t when poll reach    delay   offset  jitter
==============================================================================
-jj.cs.umb.edu   gandalf.sigmaso  3 u   95 1024   377   31.681  -18.549   1.572
 milo.mcs.anl.go ntp0.mcs.anl.go  2 u  818 1024   125   41.993  -15.264   1.392
-mailer1.psc.edu ntp1.usno.navy.  2 u  972 1024   377   38.206   19.589  28.028
-dr-zaius.cs.wis ben.cs.wisc.edu  2 u  502 1024   357   55.098    3.979   0.333
+taylor.cs.wisc. ben.cs.wisc.edu  2 u  454 1024   347   54.127    3.379   0.047
-ntp0.cis.strath harris.cc.strat  3 u  507 1024   377  115.274   -5.025   1.642
*clock.via.net   .GPS.            1 u  426 1024   377  107.424   -3.018   2.534
 ntp1.conectiv.c 0.0.0.0         16 u    - 1024     0    0.000    0.000 4000.00
+bonehed.lcs.mit .GPS.            1 u  984 1024   377   25.126    0.131  30.939
-world.std.com   204.34.198.40    2 u  119 1024   377   24.229   -6.884   0.421
The meaning of each column
- remote
Is the name of the remote NTP server. If you use the -n switch, you will see the IP addresses of these servers instead of their hostnames.
- refid
Indicates where each server is getting its time right now. It can be a server hostname or something like .GPS., indicating a Global Positioning System source.
- st
The stratum is a number from 1 to 16 that indicates how far the remote server is from a reference clock. 1 is the most accurate; 16 means the server is unreachable or not synchronized. Your stratum will be equal to that of the remote server you synchronize with, plus 1. Never connect to a Stratum 1 server; use Stratum 2 servers! Stratum 2 servers are accurate enough for our purposes, and this policy reduces the traffic to the Stratum 1 servers.
- poll
The polling interval (in seconds) between time requests. The value will range between the minimum and maximum allowed polling values. Initially the value will be smaller to allow synchronization to occur quickly. After the clocks are 'in sync' the polling value will increase to reduce network traffic and load on popular time servers.
- reach
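This is an octal representation of an array of 8 bits, representing the last 8 times the local machine tried to reach the server. A bit is set if the remote server was reached on that attempt, and the rightmost bit is the most recent attempt. For example, 377 means all of the last 8 attempts succeeded.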
- delay
The round-trip time (in milliseconds) needed to receive a response to a "what time is it" request.
- offset
The most important value. The time difference (in milliseconds) between the local machine and the remote server. As synchronization progresses, the offset shrinks toward zero, indicating that the local machine's time is getting more accurate.
- jitter
Dispersion, also called Jitter, is a measure of the statistical variance of the offset across several successive request/response pairs. Lower dispersion values are preferred over higher dispersion values. Lower dispersions allow more accurate time synchronization.
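The reach column is the least obvious of these, so here is a small sketch that expands its octal value into the 8 bits it stands for. The function name reach_bits is hypothetical; it relies only on standard shell arithmetic. The leftmost digit printed is the oldest attempt and the rightmost is the most recent.

```shell
# Sketch: expand an octal reach value from "ntpq -p" into 8 bits.
# 1 = the server answered that poll, 0 = it did not.
reach_bits() {
    value=$((0$1))      # the leading 0 makes the shell read the value as octal
    bits=""
    i=0
    while [ "$i" -lt 8 ]; do
        bits="$((value & 1))$bits"   # prepend the lowest bit
        value=$((value >> 1))
        i=$((i + 1))
    done
    echo "$bits"
}

reach_bits 377   # prints 11111111: the last 8 polls all succeeded
reach_bits 357   # prints 11101111: one poll, five attempts ago, was missed
reach_bits 125   # prints 01010101: every other poll was missed
```

The three sample values are taken from the reach column of Example 8.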
The meaning of the signs before server hostname
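- -
Means the local NTP service doesn't like this server very much: it was discarded as an outlier
- +
Means the local NTP service likes this server: it is a candidate whose time is taken into account
- x
Marks a bad host (a falseticker, whose time disagrees with the other servers)
- *
Indicates the current favorite: the server the local clock is currently synchronized to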
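As an illustration of reading these signs in a script, the following sketch filters ntpq -p output and prints the name and offset of the current favorite. The function name selected_peer is hypothetical, and the sketch assumes the ten-column layout shown in Example 8, with two header lines before the server list.

```shell
# Sketch: read "ntpq -p" output on standard input and print the remote
# name and offset (milliseconds) of the line marked with '*'.
selected_peer() {
    awk 'NR > 2 && substr($0, 1, 1) == "*" { print substr($1, 2), $9 }'
}
```

A typical use would be ntpq -p | selected_peer. With the listing from Example 8 this would print clock.via.net -3.018; if no server has been selected yet, it prints nothing.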
5.5. Configure to Automatically Run NTP at Boot
You will probably want NTP to keep running all the time, even after you reboot your machine. On each machine, do the following:
bash# chkconfig --level 2345 ntpd on
This will ensure autostart.
If your machine stays up for a long time (months, years) without rebooting, you'll find a big discrepancy between the inaccurate hardware clock and the (now very accurate) system time. Modern Linux distributions copy the OS time to the hardware clock every time the system is shut down, using a mechanism similar to the setclock command. This way, at the next OS boot, you'll get a date and time almost as accurate as they were when you shut down the machine.