Troubleshooting running systems with lsof

by Jeff on February 14, 2008

Overview

lsof, or "List Open Files", is a favorite in my free software toolbox. It is so versatile that there are few things you can't do with it.

Here are some examples of real-world lsof usage and a few things you might not know it is capable of doing.

What process is holding onto /var/log/messages? lsof is indispensable when tracking down which process is using a file or mount point. Phil Dibowitz pointed out that when troubleshooting hung mounts, there is a difference between lsof /mount/point and lsof /mount/point/.

root@terminated:~# lsof /var/log/messages
COMMAND   PID   USER   FD   TYPE DEVICE   SIZE  NODE NAME
syslogd 18623 syslog   15w   REG    8,6 870043 65289 /var/log/messages

Each process has an "fd" directory under /proc/${PID} containing a symlink for each open file handle. Take a look under /proc/${PID}/fd to verify that syslogd has /var/log/messages open.

root@terminated:~# ls -l /proc/18623/fd | grep messages
l-wx------ 1 root root 64 2008-02-13 23:11 15 -> /var/log/messages
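
The same symlinks can be walked in the other direction to answer "who has this file open?" without lsof. The following is only a rough sketch, assuming a Linux /proc and a readlink that supports -f; the function name who_has is made up for illustration, and lsof itself does far more than this.

```shell
# who_has FILE: print "PID FD" for every process that has FILE open,
# by resolving the symlinks under /proc/<pid>/fd (Linux only).
who_has() {
    target=$(readlink -f "$1") || return 1
    for fd in /proc/[0-9]*/fd/*; do
        # Entries we may not read (other users' processes) fail to resolve.
        dest=$(readlink "$fd" 2>/dev/null) || continue
        if [ "$dest" = "$target" ]; then
            pid=${fd#/proc/}    # strip the leading "/proc/"
            pid=${pid%%/*}      # keep only the PID component
            echo "$pid ${fd##*/}"
        fi
    done
}

# Example (run as root to see other users' processes):
#   who_has /var/log/messages
```

On the system shown above, running this as root should report PID 18623 on descriptor 15, matching the lsof output.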

The top 10 processes with the most open files. Look for unusual processes with lots of open files. Thanks to Nathaniel Mccallum for an improved one-liner. Script available here.

root@terminated:~# lsof +c 15 | awk '{printf("%15s  (%s)\n", $1, $2)}' | sort | uniq -c | sort -rn | head
121          master  (3074)
48           snmpd  (16136)
33            sshd  (13715)
32           showq  (1906)
31            qmgr  (3084)
31          pickup  (2974)
28       syslog-ng  (2741)
26            ntpd  (5081)
25            sshd  (2977)
17           crond  (3085)
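
A small variation on the one-liner, in case it is useful: the sketch below does the counting inside awk and skips lsof's header row, which the pipeline above would otherwise count as a process named COMMAND. The function name top_open is hypothetical, and the output format ("count command (pid)") is my own choice rather than anything lsof provides.

```shell
# top_open [N]: read lsof output on stdin and print the N processes
# (default 10) with the most entries, as "count command (pid)".
top_open() {
    awk 'NR > 1 { count[$1 " (" $2 ")"]++ }    # NR > 1 skips the header row
         END    { for (p in count) print count[p], p }' |
        sort -rn | head -n "${1:-10}"
}

# Example:
#   lsof +c 15 | top_open 10
```

Keep in mind that lsof lists cwd, txt and mem entries as well as numbered descriptors, so these counts are larger than what you would see under /proc/${PID}/fd.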

All TCP connections on port 22. By default, lsof shows only processes owned by the current user, or all processes if run as root. It also accepts service names in place of port numbers, using the same aliases as /etc/services. See services(5) for more information.

jeff@terminated:~$ lsof -i TCP:22
COMMAND   PID USER   FD   TYPE DEVICE SIZE NODE NAME
ssh     14559 jeff    3u  IPv4  44537       TCP terminated:44315->webhost:ssh (ESTABLISHED)
ssh     16637 jeff    3u  IPv4  53098       TCP terminated:37509->sentry.net:ssh (ESTABLISHED)

jeff@terminated:~$ sudo lsof -i TCP:ssh
COMMAND   PID USER   FD   TYPE DEVICE SIZE NODE NAME
sshd     5077 root    3u  IPv6  17429       TCP *:ssh (LISTEN)
ssh     14559 jeff    3u  IPv4  44537       TCP terminated:44315->webhost:ssh (ESTABLISHED)
ssh     16637 jeff    3u  IPv4  53098       TCP terminated:37509->sentry.net:ssh (ESTABLISHED)
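
For the curious, the kernel exposes its TCP socket table as text in /proc/net/tcp and /proc/net/tcp6, with ports in hexadecimal and the state as a hex code (0A is LISTEN). The sketch below is a hedged illustration of reading that table by hand, not a description of how lsof is implemented; the function name listeners_on is made up, and the field positions assume the layout described in proc(5).

```shell
# listeners_on PORT: read /proc/net/tcp-style text on stdin and print the
# socket inode of every entry in LISTEN state (st == 0A) bound to PORT.
listeners_on() {
    hexport=$(printf '%04X' "$1")       # e.g. 22 -> 0016
    awk -v port="$hexport" '{
        split($2, addr, ":")            # local address is HEXIP:HEXPORT
        if (addr[2] == port && $4 == "0A") print $10   # field 10 is the inode
    }'
}

cat /proc/net/tcp /proc/net/tcp6 2>/dev/null | listeners_on 22
```

Each inode printed can be matched against the socket:[inode] symlinks under /proc/${PID}/fd to find the owning process, which is the tedious part lsof -i does for you.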

Show processes with open files that have a link count less than 1, meaning the file was deleted while still open. This is handy for seeing real system problems. An example would be poorly written applications that don't realize when the file being written to is yanked out from underneath them and continue on merrily writing to nothing. Modern versions of sysklogd realize this and will close the file handle until the file returns.

root@terminated:~# lsof +L1
COMMAND   PID  USER   FD   TYPE DEVICE SIZE NLINK  NODE NAME
mysqld  17016 mysql    5u   REG    8,8    0     0 48290 /tmp/ibYlPjD3 (deleted)
mysqld  17016 mysql    6u   REG    8,8    0     0 48291 /tmp/ib8Fo2za (deleted)
mysqld  17016 mysql    7u   REG    8,8    0     0 48292 /tmp/ibXG3Kwh (deleted)
mysqld  17016 mysql    8u   REG    8,8    0     0 48293 /tmp/ibIbs4wo (deleted)
mysqld  17016 mysql   12u   REG    8,8    0     0 48294 /tmp/ibfQqINv (deleted)
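
A deleted-but-open file is still reachable through the process's fd directory, which is handy when you need to see (or rescue) what is being written. Below is a minimal sketch, assuming a Linux /proc; the function name deleted_fds is hypothetical, and it relies on the kernel appending " (deleted)" to the symlink target, as seen in the NAME column above.

```shell
# deleted_fds PID: print "FD TARGET" for each descriptor of PID whose
# file has been unlinked (the symlink target ends in " (deleted)").
deleted_fds() {
    for fd in /proc/"$1"/fd/*; do
        dest=$(readlink "$fd" 2>/dev/null) || continue
        case $dest in
            *' (deleted)') echo "${fd##*/} $dest" ;;
        esac
    done
}

# List unlinked files held open by the current shell (usually none).
deleted_fds "$$"
```

To copy the contents out while the process is still running, read from the descriptor itself, for example cp /proc/17016/fd/5 /tmp/recovered for the first mysqld entry above. The disk space is only released once the last descriptor is closed.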

What files does pid 1 (init) have open?

root@terminated:~# lsof -p 1
COMMAND PID USER   FD   TYPE DEVICE    SIZE    NODE NAME
init      1 root  cwd    DIR    8,3    4096       2 /
init      1 root  rtd    DIR    8,3    4096       2 /
init      1 root  txt    REG    8,3   31216 1114187 /sbin/init
init      1 root  mem    REG    8,3   52400 1392748 /lib/libsepol.so.1
init      1 root  mem    REG    8,3  110984 1392653 /lib/ld-2.3.4.so
init      1 root  mem    REG    8,3 1526108 1394887 /lib/tls/libc-2.3.4.so
init      1 root  mem    REG    8,3   55000 1392711 /lib/libselinux.so.1
init      1 root   10u  FIFO   0,13            1108 /dev/initctl

For the curious, lsof gets this information from the maps and smaps files under /proc/$PID. Linux kernels 2.6.14 and newer have smaps. See proc(5) for more information.

root@terminated:~# awk '/\//{print $NF}' /proc/1/maps | sort -u
/lib/ld-2.3.4.so
/lib/libselinux.so.1
/lib/libsepol.so.1
/lib/tls/libc-2.3.4.so
/sbin/init

As you can see, lsof(8) is a great tool for diagnosing system problems or just seeing what's going on. In my usage, lsof -i has mostly replaced netstat.

{ 1 comment… read it below or add one }

1 Tomasz Chmielewski 05.05.08 at 12:41 pm

Sometimes, it makes sense to add “-n” after lsof to prevent it from resolving host names (which can be slow).
