#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2006
    Location
    California
    Posts
    106
    Rep Power
    11

    Find process causing high iowait


    Hi,
    I manage a number of *nix boxes for a web hosting company. Periodically the load average on one of the servers jumps into the 20s or 30s, yet all of the running processes are practically idle. The only thing out of the ordinary, as far as I can tell, is that the iowait is around 90%. I know that's most likely the problem, but I can't figure out how to tell which process is using the disk so much.

    How do you find the source of high iowait on a Linux box?

    Thanks
  2. #2
  3. fork while true;
    Devshed God 1st Plane (5500 - 5999 posts)

    Join Date
    May 2005
    Location
    England, UK
    Posts
    5,538
    Rep Power
    1051
    You find out what's accessing a lot of files per second...

    If you have ionotify support compiled into the kernel, you could write a utility to track accesses, although a quick google doesn't reveal a ready made one.

    That said, I'd check things like the Apache access logs: is it serving more than 40 files a second? That could seriously push up the iowait.
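
    A rough way to check that request rate is to count log lines per second. This is only a sketch: it assumes the access log uses the common/combined format (timestamp in the fourth whitespace-separated field), and `busiest_seconds` is just an illustrative name, not an existing tool.

    ```shell
    # Read an Apache access log on stdin and print "count timestamp" for
    # the busiest seconds, highest count first.
    # Assumption: common/combined log format, where field 4 looks like
    # "[10/Oct/2006:13:55:36". The optional argument is how many rows to show.
    busiest_seconds() {
        awk '{ count[substr($4, 2)]++ }
             END { for (t in count) print count[t], t }' |
            sort -rn | head -n "${1:-10}"
    }
    ```

    Usage would be something like `busiest_seconds 5 < access_log`, with the path adjusted to wherever your Apache writes its log. If the top rows line up with the iowait spikes, the web server is a likely suspect.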
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2006
    Location
    California
    Posts
    106
    Rep Power
    11

    Probably Exim


    Thanks for the reply, I'll look into ionotify, but I doubt the support for it is compiled into the kernel right now.

    One thing I forgot to mention is that I have a suspicion the problem (at least sometimes) is with exim. A couple of times we have restarted exim when the load average was high like that and the problem went away. A quick look at the logs for that period showed that we were being hit by a large amount of spam.

    I'll definitely stick with using the logs to try to find out what is loading down the box, but I was just hoping there was some way to quickly identify which process is swamping the disk.

    Comments on this post

    • LinuxPenguin agrees : You may have just helped me a great deal with a problem of my own...
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2003
    Posts
    56
    Rep Power
    13
    Here are a few commands you can use to track it down:

    netstat -autpn | grep :80
    netstat -autpn | grep :3306
    netstat -na
    netstat -an|grep :80|sort|more
    netstat -an|grep ESTABLISHED
    netstat -ntu | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -nr | more
    ps ax | awk '$3 ~ /^D/ { print $0 }'
    netstat -an | grep :80 | wc -l
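
    Of the commands above, the `ps ax | awk '$3 ~ /^D/ ...'` line is the one aimed at disk waits: a STAT starting with D means uninterruptible sleep, which is usually a process blocked on I/O, and on Linux those processes count toward the load average even while the CPU is idle. The state is short-lived, so a single snapshot can miss it. A hedged sketch that samples repeatedly and counts who shows up most (the function names are made up for illustration, and it assumes the default `ps ax` column order PID TTY STAT TIME COMMAND):

    ```shell
    # Filter `ps ax` output on stdin down to "pid command" for processes
    # whose STAT column starts with D (uninterruptible sleep).
    d_state() {
        awk '$3 ~ /^D/ { print $1, $5 }'
    }

    # Take N one-second samples (default 10) and print how many samples
    # each D-state process appeared in, most frequent first.
    sample_d_state() {
        samples=${1:-10}
        i=0
        while [ "$i" -lt "$samples" ]; do
            ps ax | d_state
            sleep 1
            i=$((i + 1))
        done | sort | uniq -c | sort -rn
    }
    ```

    Running `sample_d_state 30` during a spike gives a rough ranking; whatever sits in D state for most of the samples is the first thing to look at.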
    www.colorteck.com
    "fast affordable hosting with all the extra's"
  8. #5
  9. fork while true;
    Devshed God 1st Plane (5500 - 5999 posts)

    Join Date
    May 2005
    Location
    England, UK
    Posts
    5,538
    Rep Power
    1051
    None of those target the problem the OP is having.

    We appreciate your enthusiasm, but your post, while indirectly related, is not really answering the question.
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2003
    Posts
    56
    Rep Power
    13
    Hmm, I beg to differ. I just solved this same issue, where one IP had too many connections to Apache and pushed the server load to 20+. So how does this not help him? He can monitor the connections to Apache and try to track down the person or IP responsible. Got a better suggestion? Let me hear it. You can't answer his question directly; you have to locate the issue first. All anyone can do at this point is guess, unless he does some research to find out why the load is so high on his server. Plus, the iowait is going to be high if too many connections to Apache from one IP is the issue. There are numerous things you can check. I just gave him some of them, and they are useful when tracking down high loads on a server.

    I manage 20+ servers, and I also thought it was Exim. The problem was one IP that had over 100 connections to Apache.
  12. #7
  13. fork while true;
    Devshed God 1st Plane (5500 - 5999 posts)

    Join Date
    May 2005
    Location
    England, UK
    Posts
    5,538
    Rep Power
    1051
    However, you're failing to note that Apache's CPU usage isn't jumping, which it would in that case...

    I did give a better suggestion; he just wasn't up for compiling inotify/ionotify into the kernel...

    The iowait has nothing to do with how many IPs the files are being requested from, just how many requests there are, and he says all his software was practically idle...
  14. #8
  15. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2003
    Posts
    56
    Rep Power
    13
    The same thing was happening to us. No abnormal usage on anything, then all of a sudden the load would skyrocket, and we would shut down Exim and it seemed to stop. But Exim was not the issue. When we finally found the IP, we blocked it and have not had any issues since. I would suggest he at least take a look, because there were no abnormal processes on our end either.

    Comments on this post

    • LinuxPenguin agrees : Okay, fair dos
  16. #9
  17. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2006
    Location
    California
    Posts
    106
    Rep Power
    11
    Good discussion. colorteck: Thanks for the tip about the large number of connections. Even if that isn't my problem, it's good to keep things like that in mind.

    Thanks.
  18. #10
  19. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2006
    Location
    California
    Posts
    106
    Rep Power
    11

    Not connections


    Well I've been watching the connections when I get high iowait spikes and they look normal (lower than usual this last time in fact), so I'm still quite at a loss. I know ionotify is probably the way to go, but is that what everyone does?

    What steps does each of you go through when your server is slow, you run top, and you find the load average is 26 and the iowait is 98%?
  20. #11
  21. fork while true;
    Devshed God 1st Plane (5500 - 5999 posts)

    Join Date
    May 2005
    Location
    England, UK
    Posts
    5,538
    Rep Power
    1051
    Like I said, inotify is the way forward (apologies, I incorrectly termed it ionotify earlier).

    A quick google should assist you in using it.

    Since I've never had this problem, I can't say we have a procedure for it, but I'd suggest checking your cPanel (you use it, right?) server load logs for various processes. We had a similar problem when we discovered a DDoS, and it showed in the logs as mentioned. (Hacked phpBB, no surprise there.)
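
    A different technique from inotify, and not something tried in this thread: on kernels built with per-task I/O accounting, each process exposes cumulative disk counters in /proc/<pid>/io (the read_bytes and write_bytes fields). The sketch below ranks processes by those counters. It assumes such a kernel, that /proc/<pid>/comm is available for the process name, and that you run it as root so other users' processes are readable; `io_by_process` is a hypothetical helper name.

    ```shell
    # Print "total_bytes read_bytes write_bytes pid name" for every process
    # whose /proc/<pid>/io is readable, largest total first.
    # Arguments: proc directory (default /proc), rows to show (default 10).
    # Assumption: kernel with per-task I/O accounting enabled.
    io_by_process() {
        proc=${1:-/proc}
        for dir in "$proc"/[0-9]*; do
            [ -r "$dir/io" ] || continue
            name=$(cat "$dir/comm" 2>/dev/null)
            cat "$dir/io" 2>/dev/null |
                awk -v pid="${dir##*/}" -v name="$name" '
                    $1 == "read_bytes:"  { r = $2 }
                    $1 == "write_bytes:" { w = $2 }
                    END { if (NR > 0)
                              printf "%.0f %.0f %.0f %s %s\n", r + w, r, w, pid, name }'
        done | sort -rn | head -n "${2:-10}"
    }
    ```

    The counters are totals since each process started, so a long-running daemon can top the list without being the current culprit. Running it twice a few seconds apart during a spike and comparing the numbers gives a better picture of who is hitting the disk right now.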
