Linux and High I/O Wait
When you look at the CPU activity of your computer, one of the parameters is iowait. This value shows how much time your CPU spends waiting for I/O operations to complete. These include disk read/write operations, network, IPC, etc. Is this behavior a problem and, if so, what causes it and how do you fix it? On one of the popular Unix-related forums one "genius" wrote:
The iowait "problem" is funny. It's like when people complain that Linux is "using all my memory". Yeah, no shit. You should be upset if you are copying files and your computer is /not/ in 100% iowait.
In reality, 100% iowait indicates that there is a problem, and in many cases a big problem that may even lead to data loss. Essentially, there is a bottleneck somewhere in the system. Maybe one of your disks is getting ready to die; or, perhaps, the NIC firmware is having problems with the latest kernel upgrade you installed. The troubleshooting process starts with the potentially most serious possibility: a bad disk.
Take a quick look at /var/log/messages, /var/log/dmesg, /var/log/boot.log and any other system log files. You are looking for disk I/O errors, failed read/write operations, bad sectors: anything that indicates a hardware problem with a disk. If you don't find anything, look for IRQ and disk controller errors. Also look for memory errors and kernel panics. The three most likely culprits of high iowait are: bad disk, faulty memory and network problems.
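A minimal sketch of that log search, assuming the kernel messages are plain text. The function name scan_disk_errors and the list of phrases are illustrative assumptions, not a complete catalogue of what a failing disk can log:

```shell
# Filter log lines on stdin, keeping those that mention common
# disk-related error phrases. The pattern list is an assumption;
# extend it with whatever your controller driver actually logs.
scan_disk_errors() {
    grep -Ei 'i/o error|bad sector|medium error|(read|write) error|dma_intr'
}
```

For example, "dmesg | scan_disk_errors" prints only the suspicious kernel messages, and the same filter can be fed any of the system log files.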
If you still see nothing relevant, it is time to test your system. If possible, kick all the users off the box, shut down the Web server, database and any other user application. Log in via command line and stop XDM.
Open three shell windows: run "top" in one, "iostat -x 1" in the other and "find /etc -type f -print" in the third. Make sure you can see all three windows at the same time. This is a simple test that should create some I/O activity on the system disk. Repeat this process for other disks. If you see iowait hovering near 100%, chances are you have a problem, but we don't yet know what it is. However, now we do know that the network is probably not the cause.
deathstar:/ # iostat -x 1
Linux 2.6.5-7.201-default (deathstar) 12/20/08
avg-cpu: %user %nice %sys %iowait %idle
2.83 0.42 1.45 9.11 86.20
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
hda 40.63 66.34 27.45 6.04 936.50 581.23 468.25 290.61 45.32 2.42 72.16 2.22 7.42
hdc 0.01 0.00 0.01 0.00 0.03 0.00 0.02 0.00 4.02 0.00 1.17 1.17 0.00
sda 0.09 2.32 4.15 1.33 71.56 29.23 35.78 14.62 18.37 0.65 118.49 6.39 3.51
sdb 3.47 0.00 1.90 0.00 15.32 0.01 7.66 0.01 8.08 0.74 391.31 5.68 1.08
fd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 2.00 0.00 45.00 45.00 0.00
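When iostat is left running for a while, it helps to pick out the slow devices automatically. The sketch below assumes the report has a column literally named "await", as in the output above; newer sysstat releases may split it into separate read and write columns. The function name slow_devices is hypothetical.

```shell
# Print devices whose "await" value (milliseconds) exceeds a limit.
# Reads `iostat -x` output on stdin. The column is located by its header
# name because its position differs between sysstat versions.
slow_devices() {
    awk -v limit="${1:-100}" '
        $1 ~ /^Device/ {
            col = 0
            for (i = 1; i <= NF; i++) if ($i == "await") col = i
            next
        }
        # Data rows: skip anything shorter than the header (avg-cpu lines).
        col && NF >= col && $col + 0 > limit { print $1, $col }
    '
}
```

Running "iostat -x 1 5 | slow_devices 100" would list every device that took longer than 100 ms on average to serve a request during any of the five samples.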
deathstar:/ # top
top - 21:28:28 up 1:22, 2 users, load average: 0.09, 0.14, 0.16
Tasks: 77 total, 1 running, 76 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.8% us, 1.3% sy, 0.4% ni, 86.2% id, 9.1% wa, 0.1% hi, 0.0% si
Mem: 508644k total, 503612k used, 5032k free, 34052k buffers
Swap: 1020088k total, 458980k used, 561108k free, 16012k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 16 0 640 56 28 S 0.0 0.0 0:05.14 init
2 root 34 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
3 root 5 -10 0 0 0 S 0.0 0.0 0:00.09 events/0
4 root 5 -10 0 0 0 S 0.0 0.0 0:00.00 khelper
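The iowait figure that top and iostat report comes from the kernel's CPU counters. As a rough sketch of that arithmetic, the helper below takes two samples of the aggregate "cpu" line from /proc/stat and returns the share of elapsed CPU time spent in iowait. The name iowait_percent is hypothetical, and the code assumes the usual field order (user, nice, system, idle, iowait, irq, softirq, steal).

```shell
# Compute the iowait percentage between two samples of the aggregate
# "cpu" line from /proc/stat.
#   $1 = earlier "cpu ..." line, $2 = later "cpu ..." line
iowait_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Fields 2..9 are user nice system idle iowait irq softirq steal.
            for (i = 2; i <= NF && i <= 9; i++) total += $i
            t[NR] = total
            w[NR] = $6          # iowait is the fifth counter after the label
        }
        END {
            dt = t[2] - t[1]
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (w[2] - w[1]) / dt
        }
    '
}
```

One way to use it: store "grep '^cpu ' /proc/stat" in a variable, sleep for a second, store it again, and pass both strings to iowait_percent.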
Next step, let's stress out your CPU but not the disks. The command below will try to create an endless zip file in /dev/null. This generates no disk activity, but a lot of CPU load. Continue running "top" and "iostat -x 1" in the other two windows.
cat /dev/zero | bzip2 -c > /dev/null
If you see high CPU load but low iowait, we can rule out CPU problems, IRQ conflicts, and faulty memory. Just to be on the safe side, let's test memory anyway:
deathstar:/ # free
total used free shared buffers cached
Mem: 508644 503504 5140 0 37036 48968
-/+ buffers/cache: 417500 91144
Swap: 1020088 516196 503892
This server has 508644 KB of RAM. Use the corresponding value for the following test:
deathstar:/ # dd if=/dev/hda2 bs=508644 of=/backups/memtest count=1050
1050+0 records in
1050+0 records out
deathstar:/ # md5sum /backups/memtest ; md5sum /backups/memtest ; md5sum /backups/memtest
04762ff36b2231aac75754ab9c1a564a /backups/memtest
04762ff36b2231aac75754ab9c1a564a /backups/memtest
04762ff36b2231aac75754ab9c1a564a /backups/memtest
The three MD5 values above should be identical. If they are not, your system has a faulty RAM chip.
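Comparing the digests by eye is fine for three runs. For a longer soak test, something like the sketch below could do the comparison for you. The function name checksums_agree is hypothetical, and md5sum is assumed to be installed, as in the session above.

```shell
# Hash the same file several times and report whether all digests agree.
#   $1 = file to hash, $2 = number of runs (default 3)
# Returns 0 when every run produced the same MD5, non-zero otherwise.
checksums_agree() {
    file=$1
    runs=${2:-3}
    i=0
    distinct=$(
        while [ "$i" -lt "$runs" ]; do
            md5sum "$file" | awk '{ print $1 }'
            i=$((i + 1))
        done | sort -u | wc -l | tr -d ' '
    )
    # Exactly one distinct digest means every run matched.
    [ "$distinct" -eq 1 ]
}
```

For example, "checksums_agree /backups/memtest 10 || echo MISMATCH" hashes the test file ten times and complains only if any two runs disagree.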
When you have eliminated hardware problems as possible causes of high iowait, the next step is to check firmware and drivers. You are particularly interested in disk controller firmware: erratic behavior and no error messages are the signs of a firmware problem. Try really hard to remember if you made any system changes lately, especially something that required a reboot, like a kernel upgrade, for example. If this is the case, roll back the upgrade or look for updated firmware. You should grab a copy of Sysinfo (free 30-day trial) to help you identify makes and models of your disks, controllers, etc.
While your disks and controllers may be fine, you may have a problem with a filesystem. Even if you see high iowait while accessing any filesystem, you should still check out the partition where /var is mounted and swap: if there is a problem, it will manifest itself regardless of what your system is doing. But here you will run into a little problem: fsck will not scan a mounted partition and you cannot unmount /var. Let's say these are your partitions:
deathstar:/ # more /etc/fstab
/dev/hda2 / reiserfs acl,user_xattr 1 1
/dev/hda1 swap swap pri=42 0 0
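In this table /var has no entry of its own, so it lives on whatever filesystem is mounted at the closest parent directory. A small sketch of that lookup, assuming the standard fstab layout (device in the first column, mount point in the second); the function name device_for_path is hypothetical:

```shell
# Find the device that holds a given path by picking the fstab entry
# with the longest matching mount point. Reads fstab text on stdin.
#   $1 = absolute path, e.g. /var
device_for_path() {
    awk -v path="$1" '
        /^[[:space:]]*#/ || NF < 2 { next }   # comments and short lines
        $2 !~ /^\// { next }                  # swap and other non-path entries
        {
            mp = $2
            if (mp == "/" || path == mp || index(path, mp "/") == 1) {
                if (length(mp) > best_len) { best_len = length(mp); best = $1 }
            }
        }
        END { if (best != "") print best; else exit 1 }
    '
}
```

With the table above, "device_for_path /var < /etc/fstab" prints /dev/hda2.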
You need to fsck /dev/hda2 because this is where your /var is mounted. Download KNOPPIX or Ubuntu LiveCD, boot from CD (without installing) and run "fsck /dev/hda2" from there. If everything looks clean, shut down your system, take the CD out and boot normally. The next step is to check out swap. If you just run fsck on the swap partition, it will fail:
deathstar:/ # fsck /dev/hda1
fsck 1.34 (25-Jul-2003)
fsck: fsck.swap: not found
fsck: Error 2 while executing fsck.swap for /dev/hda1
You need to disable swap on /dev/hda1 before you can scan it. Before you do this, you need to provide another swap area: this system is actively using swap and cannot run without any swap space. So, to add swap on the fly, create a swap file (1 GB in this case):
deathstar:/ # dd if=/dev/zero of=/swapfile bs=1024 count=1048576
1048576+0 records in
1048576+0 records out
deathstar:/ # chmod 600 /swapfile
deathstar:/ # ls -lash /swapfile
1.1G -rw------- 1 root root 1.0G Dec 20 22:48 /swapfile
Now you can set up and activate the new swap file:
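The count value passed to dd is simply the desired file size divided by the block size. If you want a different size, a tiny helper like this one (hypothetical name swap_block_count) does the arithmetic:

```shell
# Number of dd blocks needed for a swap file of a given size.
#   $1 = desired size in MB, $2 = dd block size in bytes (default 1024)
swap_block_count() {
    echo $(( $1 * 1024 * 1024 / ${2:-1024} ))
}
```

A 1 GB file with bs=1024 gives "swap_block_count 1024", which is the 1048576 used above.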
deathstar:/ # mkswap /swapfile
Setting up swapspace version 1, size = 1073737 kB
deathstar:/ # free
total used free shared buffers cached
Mem: 508644 500996 7648 0 38912 147332
-/+ buffers/cache: 314752 193892
Swap: 1020088 521784 498304
deathstar:/ # swapon /swapfile
deathstar:/ # free
total used free shared buffers cached
Mem: 508644 502232 6412 0 39400 147392
-/+ buffers/cache: 315440 193204
Swap: 2068656 521784 1546872
Now we need to deactivate the original swap partition. This operation might take a couple minutes to complete:
deathstar:/ # swapoff /dev/hda1
deathstar:/ # free
total used free shared buffers cached
Mem: 508644 501624 7020 0 31712 10416
-/+ buffers/cache: 459496 49148
Swap: 1048568 167032 881536
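Before running swapoff on a busy box, it is worth confirming that the pages held on that partition have somewhere to go. The sketch below reads text in the /proc/swaps format (Filename, Type, Size, Used, Priority) and checks whether the remaining swap areas have enough spare room. It is deliberately conservative, since it ignores free RAM that could also absorb pages. The function name can_swapoff is hypothetical.

```shell
# Decide whether the pages held by one swap area fit into the unused
# space of the remaining swap areas. Reads /proc/swaps text on stdin.
#   $1 = swap area to be disabled, e.g. /dev/hda1
# Exit status: 0 = fits, 1 = does not fit, 2 = area not listed.
can_swapoff() {
    awk -v target="$1" '
        NR == 1 { next }                          # header line
        $1 == target { used = $4; found = 1; next }
        { spare += $3 - $4 }                      # size minus used, in KB
        END {
            if (!found) exit 2
            if (used > spare) exit 1
        }
    '
}
```

For example: "can_swapoff /dev/hda1 < /proc/swaps && swapoff /dev/hda1".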
The next step is to create a standard filesystem on the old swap partition, so that fsck has something to scan:
deathstar:/ # mke2fs -c /dev/hda1
mke2fs 1.34 (25-Jul-2003)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
127744 inodes, 255024 blocks
12751 blocks (5.00%) reserved for the super user
First data block=0
8 block groups
32768 blocks per group, 32768 fragments per group
15968 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376
Checking for bad blocks (read-only test): done
Writing inode tables: done
Writing superblocks and filesystem accounting information: done
The mke2fs -c run above already scanned the partition for bad blocks, so if you saw no errors you can now re-activate your primary swap space and remove the temporary swap file you created:
deathstar:/ # mkswap /dev/hda1
Setting up swapspace version 1, size = 1044574 kB
deathstar:/ # swapon /dev/hda1
deathstar:/ # swapoff /swapfile
deathstar:/ # rm /swapfile
deathstar:/ # free
total used free shared buffers cached
Mem: 508644 503172 5472 0 33668 9256
-/+ buffers/cache: 460248 48396
Swap: 1020088 156300 863788
Another command commonly used for analyzing system bottlenecks is vmstat. The following example runs vmstat five times at 2-second intervals:
deathstar:~ # vmstat -S M 2 5
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 15 174 70 58 0 0 189 50 5 6 1 3 94 1
0 0 15 174 70 58 0 0 0 0 1005 35 4 0 96 0
0 1 15 174 70 58 0 0 0 258 1515 45 0 6 88 7
0 0 15 173 71 58 0 0 0 194 1083 24 0 1 83 16
0 0 15 173 71 58 0 0 0 0 1003 19 0 0 100 0
Explanation of vmstat columns:
(a) procs: the process-related fields are:
* r: The number of processes waiting for run time.
* b: The number of processes in uninterruptible sleep.
(b) memory: the memory-related fields are:
* swpd: the amount of virtual memory used.
* free: the amount of idle memory.
* buff: the amount of memory used as buffers.
* cache: the amount of memory used as cache.
(c) swap: the swap-related fields are:
* si: Amount of memory swapped in from disk (/s).
* so: Amount of memory swapped to disk (/s).
(d) io: the I/O-related fields are:
* bi: Blocks received from a block device (blocks/s).
* bo: Blocks sent to a block device (blocks/s).
(e) system: the system-related fields are:
* in: The number of interrupts per second, including the clock.
* cs: The number of context switches per second.
(f) cpu: the CPU-related fields are:
These are percentages of total CPU time.
* us: Time spent running non-kernel code. (user time, including nice time)
* sy: Time spent running kernel code. (system time)
* id: Time spent idle. Prior to Linux 2.5.41, this includes IO-wait time.
* wa: Time spent waiting for IO. Prior to Linux 2.5.41, shown as zero.
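To watch for trouble over a longer period, the wa column can be filtered automatically. The sketch below assumes vmstat prints the column-name row shown above (starting with "r b" and including "si", "so" and "wa") and looks the columns up by name. The function name busy_samples is hypothetical.

```shell
# Print vmstat samples whose iowait ("wa") value exceeds a limit, along
# with the swap-in and swap-out rates from the same sample.
#   $1 = wa threshold in percent (default 5); vmstat output on stdin
busy_samples() {
    awk -v limit="${1:-5}" '
        $1 == "r" && $2 == "b" {              # column-name header row
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        ("wa" in col) && $1 ~ /^[0-9]+$/ && $(col["wa"]) + 0 > limit {
            printf "wa=%s si=%s so=%s\n", $(col["wa"]), $(col["si"]), $(col["so"])
        }
    '
}
```

For example, "vmstat 2 30 | busy_samples 10" reports every two-second sample in the next minute where more than 10% of CPU time went to waiting on I/O. High wa together with non-zero si or so points at swapping rather than at ordinary file I/O.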
If you failed to identify the cause of the iowait problem, you should consider the possibility that there is no problem: perhaps your system is handling extra load and running short on resources. Take a look at the running processes and see what's eating up memory. Perhaps you upgraded an application and now it is using more RAM, which leads to heavy swapping, which leads to high disk activity, which leads to high iowait.
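A quick way to see who is eating memory is to sort the process list by resident set size. A minimal sketch, assuming input lines of the form "PID RSS COMMAND" with RSS in KB; the function name top_rss is hypothetical:

```shell
# Keep the N largest entries from "PID RSS COMMAND" lines on stdin,
# ordered by resident set size (second column), largest first.
#   $1 = how many entries to keep (default 5)
top_rss() {
    sort -k2,2nr | head -n "${1:-5}"
}
```

For example: "ps -e -o pid= -o rss= -o comm= | top_rss 10".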
The solutions are easy:
1. Install more RAM.
2. Move swap to another disk or, even better, move it to another disk on a different controller.
3. Move user applications to another disk/controller and specify default log locations outside of the system disk.