HOWTO: Multi Disk System Tuning

13. Maintenance

It is the duty of the system manager to keep an eye on the drives and partitions. Should any of the partitions overflow, the system is likely to stop working properly, no matter how much space is available on other partitions, until space is reclaimed.

Partitions and disks are easily monitored using df, and this should be done frequently, perhaps using a cron job or some other general system management tool.
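As a minimal sketch of such a check, the script below filters the output of df -P and reports every file system at or above a usage threshold. The 90 percent threshold and the function name report_full are assumptions chosen for illustration, not part of any standard tool.

```shell
#!/bin/sh
# Report mounted file systems whose usage is at or above a threshold.
# THRESHOLD is a hypothetical setting; 90 percent is only an example value.
THRESHOLD=${THRESHOLD:-90}

# report_full LIMIT: read "df -P" output on stdin and print one line for
# every file system whose capacity column is at or above LIMIT percent.
report_full() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit) print $6 " is " $5 " full"
    }'
}

df -P | report_full "$THRESHOLD"
```

When a script like this is run from cron, any output is typically mailed to the owner of the crontab, so a quiet run means no partition has crossed the threshold.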

Do not forget the swap partitions; these are best monitored using one of the memory statistics programs such as free, procinfo or top.
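For unattended monitoring, the same swap figures can be read directly. The sketch below assumes a current Linux kernel, where /proc/meminfo contains SwapTotal and SwapFree lines in kB; the function name swap_used_percent is made up for this example.

```shell
#!/bin/sh
# swap_used_percent: read /proc/meminfo style text on stdin and print the
# percentage of swap space in use as a whole number (0 if there is no swap).
swap_used_percent() {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { free = $2 }
         END {
             if (total > 0) printf "%d\n", (total - free) * 100 / total
             else print 0
         }'
}

# Only query the live system where /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    swap_used_percent < /proc/meminfo
fi
```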

Drive usage monitoring is more difficult, but it is important for the sake of performance to avoid contention, that is, placing too much demand on a single drive while others are available and idle.

It is important when installing software packages to have a clear idea of where the various files go. As previously mentioned, GCC keeps binaries in a library directory, and there are other programs whose layout is, for historical reasons, hard to figure out; X11, for instance, has an unusually complex structure.

When your system is about to fill up, it is time to check and prune old logging messages as well as hunt down core files. Proper use of ulimit in global shell settings can help save you from having core files littered around the system.
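A minimal sketch of such a setting follows. It assumes a Bourne-style shell whose ulimit built-in accepts the -c option (bash does), and that /etc/profile is the global start-up file on your system.

```shell
# Fragment for a global shell start-up file such as /etc/profile
# (the file name is an assumption; distributions differ).
# Set the soft limit on core file size to zero so no core files are written.
ulimit -S -c 0
```

Because only the soft limit is lowered, a user who needs a core file for debugging can still raise it again, up to the hard limit, with ulimit -c in their own shell.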

13.1 Backup

The observant reader might have noticed a few hints about the usefulness of making backups. Horror stories are legion about accidents and what happened to the person responsible when the backup turned out to be non-functional or even non-existent. You might find it simpler to invest in proper backups than a second, secret identity.

There are many options and also a mini-HOWTO (Backup-With-MSDOS) detailing what you need to know. In addition to the DOS specifics it also contains general information and further leads.

In addition to making these backups you should also make sure you can restore the data. Not all systems verify that the data written is correct and many administrators have started restoring the system after an accident happy in the belief that everything is working, only to discover to their horror that the backups were useless. Be careful.

There are both free and commercial backup systems available for Linux. One commercial example is the disk image level backup system from QuickStart offering a full function 30 day Linux demo available online.

13.2 Defragmentation

The need for defragmentation depends very much on the file system design; some file systems suffer fast and nearly debilitating fragmentation. Fortunately for us, ext2fs does not belong to this group, and therefore there has been very little talk about defragmentation tools. Such a tool does in fact exist but is hardly ever needed.

If for some reason you feel this is necessary, the quick and easy solution is to do a backup and a restore. If only a small area is affected, for instance the home directories, you could tar it over to a temporary area on another partition, verify the archive, delete the original and then untar it back again.
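The tar round trip described above can be sketched as a small shell function. The name repack and both of its arguments are placeholders; the sketch assumes the directory is an ordinary directory rather than a mount point, and that the archive path is absolute and lies on another partition with enough free space.

```shell
# repack DIR ARCHIVE: rewrite the files under DIR by way of a tar archive.
# DIR and ARCHIVE are placeholders; ARCHIVE must be an absolute path.
repack() {
    dir=$1
    archive=$2
    # Pack the contents of DIR into the archive.
    (cd "$dir" && tar cf "$archive" .) || return 1
    # Basic verification: the whole archive must be readable. GNU tar can
    # also compare it against the disk with its d (--compare) operation.
    tar tf "$archive" > /dev/null || return 1
    # Only after the check passes: delete the original and unpack again.
    rm -rf "$dir" || return 1
    mkdir "$dir" || return 1
    (cd "$dir" && tar xpf "$archive")
}
```

Run it only when nobody is using the directory, and keep a separate backup until you have checked the result; a listing check like the one above cannot catch every kind of damage.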

13.3 Deletions

Quite often disk space shortages can be remedied simply by deleting unnecessary files that accumulate around the system. Programs that terminate abnormally can leave all kinds of mess lying around in the oddest places. Normally a core dump results after such an incident, and unless you are going to debug it you can simply delete it. Core files can turn up anywhere, so you are advised to do a global search for them now and then. The locate command is useful for this.
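Where the locate database is stale or missing, find can do the same search directly. The following is a minimal sketch; the function name list_cores is made up for this example, and it only lists candidates without deleting anything.

```shell
# list_cores DIR: list regular files named "core" under DIR without
# crossing into other file systems. Nothing is deleted.
list_cores() {
    find "$1" -xdev -type f -name core -print
}

# Example call (can take a while on a large file system):
#   list_cores /
```

Not every file called core is a core dump, so it is worth checking the candidates before removing them.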

Unexpected termination can also leave all sorts of temporary files behind in places like /tmp or /var/tmp, files that are automatically removed when the program ends normally. Rebooting cleans up some of these areas but not necessarily all, and if you have a long uptime you could end up with a lot of old junk. If space is short you have to delete with care; make sure the file is not in active use first. Utilities like file can often tell you what kind of file you are looking at.
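One cautious way to find candidates is to list temporary files that have not been modified for some time. The sketch below does only that; the function name stale_files and the 7 day figure in the example call are assumptions for illustration.

```shell
# stale_files DIR DAYS: list regular files under DIR that were last
# modified more than DAYS days ago. Nothing is deleted; the list is only
# a starting point for a manual check.
stale_files() {
    find "$1" -xdev -type f -mtime +"$2" -print
}

# Example call:
#   stale_files /tmp 7
```

An old modification time does not prove that a file is unused. Where it is installed, fuser can show whether any process still has a given file open.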

Many things are logged when the system is running, mostly to files in the /var/log area. In particular the file /var/log/messages tends to grow until deleted. It is a good idea to keep a small archive of old log files around for comparison should the system start to behave oddly.

If the mail or news system is not working properly you could have excessive growth in their spool areas, /var/spool/mail and /var/spool/news respectively. Beware of the overview files, as these have a leading dot which makes them invisible to ls -l; it is always better to use ls -Al, which will reveal them.

User space overflow is a particularly tricky topic. Wars have been waged between system administrators and users. Tact, diplomacy and a generous budget for new drives are what is needed. Make use of the message-of-the-day feature, the information displayed during login from the /etc/motd file, to tell users when space is short. Setting the default shell settings to prevent core files being dumped can save you a lot of work too.

Certain kinds of people try to hide files around the system, usually trying to take advantage of the fact that files with a leading dot in the name are invisible to the ls command. One common example is a file named ... which normally is either not seen or, when using ls -al, disappears in the noise of normal entries like . and .. that are in every directory. There is however a countermeasure to this: use ls -Al, which suppresses . and .. but shows all other dot-files.
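The ls -A behaviour can be put to work in a small helper that prints only the dot-files of a directory. This is a minimal sketch; the function name hidden_entries is made up for this example.

```shell
# hidden_entries DIR: print the names in DIR that start with a dot.
# ls -A already leaves out "." and "..", so a file called "..." stands out.
hidden_entries() {
    ls -A "$1" | grep '^\.'
}

# Example call:
#   hidden_entries /var/spool/news
```

Most dot-files are perfectly legitimate, so the output is a list of things to look at rather than a list of offenders.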

13.4 Upgrades

No matter how large your drives, the time will come when you find you need more. As technology progresses you can get ever more for your money. At the time of writing, it appears that 6.4 GB drives give you the most bang for your buck.

Note that with IDE drives you might have to remove an old drive, as the maximum number supported on your motherboard is normally only 2 or sometimes 4. With SCSI you can have up to 7 for narrow (8-bit) SCSI or up to 15 for wide (16-bit) SCSI, per channel. Some host adapters can support more than a single channel and in any case you can have more than one host adapter per system. My personal recommendation is that you will most likely be better off with SCSI in the long run.

The question then is where to put this new drive. In many cases the reason for expansion is that you want a larger spool area, and in that case the fast, simple solution is to mount the drive somewhere under /var/spool. On the other hand, newer drives are likely to be faster than older ones, so in the long run you might find it worth your time to do a full reorganization, possibly using your old design sheets.

If the upgrade is forced by running out of space in partitions used for things like /usr or /var the upgrade is a little more involved. You might consider the option of a full re-installation from your favourite (and hopefully upgraded) distribution. In this case you will have to be careful not to overwrite your essential setups. Usually these things are in the /etc directory. Proceed with care, fresh backups and working rescue disks. The other possibility is to simply copy the old directory over to the new directory which is mounted on a temporary mount point, edit your /etc/fstab file, reboot with your new partition in place and check that it works. Should it fail you can reboot with your rescue disk, re-edit /etc/fstab and try again.

Until volume management becomes available to Linux this is both complicated and dangerous. Do not be too surprised if you discover you need to restore your system from a backup.

The Tips-HOWTO gives the following example on how to move an entire directory structure across:

(cd /source/directory; tar cf - . ) | (cd /dest/directory; tar xvfp -)

While this approach to moving directory trees is portable among many Unix systems, it is inconvenient to remember. Also, it fails for deeply nested directory trees when pathnames become too long for tar to handle (GNU tar has special provisions to deal with long pathnames).

If you have access to GNU cp (which is always the case on Linux systems), you could as well use

cp -av /source/directory /dest/directory

GNU cp knows specifically about symbolic links, hard links, FIFOs and device files and will copy them correctly.

Remember that it might not be a good idea to try to transfer /dev or /proc.

There is also a Hard Disk Upgrade mini-HOWTO that gives you a step by step guide on migrating an entire Linux system, including LILO, from one hard disk to another.

13.5 Recovery

System crashes come in many and entertaining flavours, and partition table corruption always guarantees plenty of excitement. A recent and undoubtedly useful tool for those of us who are happy with the normal level of excitement is gpart, which means "Guess PC-Type hard disk partitions".

In addition there are some partition utilities available under DOS.

13.6 Rescue Disk

Upgrades of kernel and hardware are not uncommon in the Linux world, and it is therefore important that you prepare an updated rescue disk, especially when you use special drivers to access your hardware. Rescue disks can be obtained from the net or from your distribution, or you can put one together yourself. Do make sure the boot and root parameters are set so the kernel will know where to find your system.

If you don't have a recovery floppy, you can use the GRUB boot loader to load a Linux kernel from somewhere on disk and pass it the necessary arguments.
