Keeping track of disk space
Posted by Steve on Fri 20 May 2005 at 07:26
If you don't wish to be surprised when you suddenly run out of disk space on your system(s), it's a good idea to regularly monitor how much free space you have, and which directories and files are taking up the most space.
The basic way to check on your current disk space is via the df command, where df stands for "disk free".
There are several ways you can have the output displayed, from the default:
skx@mystery:~$ df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda1               256445    125412    117351  52% /
tmpfs                   242112         4    242108   1% /dev/shm
/dev/hda9             30526204  19487308   9488216  68% /home
/dev/hda8               366015      8326    338161   3% /tmp
/dev/hda5              4806048    990076   3571836  22% /usr
/dev/hda6              2883672    610948   2126240  23% /var
To make this more readable you can add the '-h', or "--human-readable" option:
skx@mystery:~$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda1             251M  123M  115M  52% /
tmpfs                 237M  4.0K  237M   1% /dev/shm
/dev/hda9              30G   19G  9.1G  68% /home
/dev/hda8             358M  8.2M  331M   3% /tmp
/dev/hda5             4.6G  967M  3.5G  22% /usr
/dev/hda6             2.8G  597M  2.1G  23% /var
Both of these commands will list the same information for each mounted partition, or device.
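If you want to use these numbers in a script rather than read them yourself, the "-P" (POSIX output format) option is worth knowing, since it keeps every filesystem on a single line. The following is only a minimal sketch: the function name df_usage is made up for illustration, and it assumes mount points without spaces.

```shell
#!/bin/sh
# Print "mountpoint percent" for every real device in "df -P" output.
# The df output is read from stdin, so the function can be tried on canned data.
# Assumption: mount points contain no whitespace (awk splits on whitespace).
df_usage() {
    awk 'NR > 1 && $1 ~ /^\// { sub(/%$/, "", $5); print $6, $5 }'
}

# Typical use:
#   df -P | df_usage
```

Lines that do not start with "/" (such as tmpfs) are skipped, and the header line is dropped.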
An alternative to the standard df command is the di package. Once it's installed (via apt-get install di) you can invoke it without any arguments to see a basic overview:
root@sun:~# di
Filesystem         Mount               Megs     Used    Avail %used fs Type
/dev/sda1          /                 8450.3   1381.8   6639.3   21% ext3
tmpfs              /dev/shm           250.8      0.0    250.8    0% tmpfs
/dev/sdb7          /home            34181.3  14406.5  18038.5   47% ext3
di is very customizable. For example, you can make it display sizes in gigabytes instead of megabytes:
root@sun:~# di -g
Filesystem         Mount               Gigs     Used    Avail %used fs Type
/dev/sda1          /                    8.3      1.3      6.5   21% ext3
tmpfs              /dev/shm             0.2      0.0      0.2    0% tmpfs
/dev/sdb7          /home               33.4     14.1     17.6   47% ext3
You can also choose which columns are displayed, either with an option such as "-A" (show all available columns) or with a format string.
A format string is a sequence of codes, each of which is replaced by the corresponding field in the output. You pass it with "-f", followed by the codes for the columns you wish to include, chosen from the list of available ones.
The formats are described in the man page which you can read by running "man di" but as a quick example you can see the mounted filesystems and the percentage used with the following command:
root@sun:~# di -f m2
Mount            %used
/                  17%
/dev/shm            0%
/home              44%
Once you've discovered how much space you have available the next most common thing you'll wish to do is see how much space a given directory or file is using.
The du (or "disk usage") command will show you how much space a file or directory is occupying:
root@sun:~# du --human-readable
12K     ./.gnupg
4.0K    ./tmp
4.0K    ./public_html
8.0K    ./.ssh
4.0K    ./bin
32K     ./.irssi
100K    .
As you can see the "--human-readable", or '-h' flag works here as it did for the "df" command.
To find the biggest files or directories beneath the current one you can use the following command to produce a sorted list:
skx@mystery:~/Programs$ du -ah | sort -rn
940K    ./buildd
936K    ./buildd/wanna-build
900K    ./GNUMP3d/old/website/screenshots
896K    ./GNUMP3d/Webpages/gnump3d/screenshots
896K    ./GNUMP3d/old/software/gnump3d/screenshots
824K    ./GNUMP3d/Program/gnump3d/gnump3d-2.9.4.zip
820K    ./Shellcode-Exploitation-Book
816K    ./Shellcode-Exploitation-Book/Source_Files
780K    ./HTMLPP/htmlpp-4.2a
704K    ./altermime-0.3.4
... etc
Add "| head" to show only the top few lines:
skx@mystery:~$ du -ah | sort -rn | head
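One caveat: "sort -n" compares only the leading number, so human-readable sizes with mixed K/M/G suffixes end up in the wrong order (a 16M entry sorts below a 232K one). A possible workaround, sketched here under the assumption that you feed it "du -k" style output, is to sort the plain kilobyte counts first and make them readable afterwards. The function name top_usage is made up for this example.

```shell
#!/bin/sh
# Sort "du -k" output (read from stdin) numerically, keep the N largest
# entries (default 10), then convert the kilobyte counts to K/M/G.
top_usage() {
    sort -rn | head -n "${1:-10}" | awk '{
        size = $1; unit = "K"
        path = $0
        sub(/^[0-9]+[ \t]+/, "", path)      # strip the size column, keep the path as is
        if (size >= 1048576)   { size /= 1048576; unit = "G" }
        else if (size >= 1024) { size /= 1024;    unit = "M" }
        printf "%.1f%s\t%s\n", size, unit, path
    }'
}

# Typical use:
#   du -ak | top_usage 20
```

Because the sorting happens before the conversion, the order stays correct no matter which units end up being displayed.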
The total size of a directory can be found by using either the '-s' or "--summarize" option:
skx@mystery:~$ du --summarize --human-readable
4.8G    .
Or for each file or directory in the current one:
skx@mystery:~$ du --summarize --human-readable *
13M     Archive
104K    bin
667M    debian
1.2M    debian-administration
1.5G    Images
470M    Mail
55M     Programs
69M     public_html
1.2M    ROMS
18M     sarge
7.1M    Text
252M    tmp
381M    Videos
301M    Web
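Note that the shell's "*" does not match names starting with a dot, so hidden files and directories are missing from a listing like this. One way around that, shown here only as a sketch, is to let find hand the entries to du. The function name usage_by_entry is hypothetical, and the -mindepth/-maxdepth options are GNU/BSD find extensions that are assumed to be available.

```shell
#!/bin/sh
# List the disk usage (in kB) of every entry directly inside a directory,
# hidden ones included, largest first. Defaults to the current directory.
usage_by_entry() {
    find "${1:-.}" -mindepth 1 -maxdepth 1 -exec du -sk {} + | sort -rn
}

# Typical use:
#   usage_by_entry "$HOME" | head
```

Sizes are printed in kilobytes rather than in human-readable form so that the numeric sort stays reliable.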
Thankfully you don't need to type out most of the options to df or du as bash's tab completion will fill them out for you easily.
If you install the logwatch package you'll receive a daily email of all the "interesting" messages from your system, as well as a summary of your available disk space. This can be useful if you have a lot of machines and don't wish to be surprised in the event of a full disk.
There are also a large number of scripts available online which will send you an email when a partition is reaching a state of fullness - which you can schedule to run regularly via cron if you wish.
#!/bin/bash
# Mail a warning for every mounted device that is at least $limit percent full.
system=$(hostname --fqdn)
# -P keeps each filesystem on one line: device, size, used, available, use%, mount point
df -aPh | grep "^/" | sort | while read -r _ _ _ avail percent partition; do
    percent=${percent%\%}    # strip the trailing % sign
    limit=90
    if [ "$partition" = "/cdrom" ]; then
        limit=101            # a CD-ROM is always full, so never warn about it
    fi
    if [ "$percent" -ge "$limit" ]; then
        echo "Free Space warning!" | mail -a "From: spacecsekker" -s "[$system] $partition-$percent%-$avail" xxx@xxx.bme.hu
    fi
done
I use munin myself. I just wish the bloody thing wasn't in perl. Otherwise it's nice.
alias dum='du --max-depth=1'
alias duh='du -h --max-depth=1'
dum signifies max-depth and duh signifies human-readable. "--max-depth=1" makes sure you get the disk usage for the complete directory and not for every individual ((sub)sub)subdirectory. Most of the time that is the info I want.
maurits@mauritsvanrees:~/web$ duh
93K     ./links
24K     ./images
15M     ./preken
1,1M    ./studie
232K    ./weblog
16M     .
The human readable part does not work so well when sorting:
maurits@mauritsvanrees:~/web$ duh | sort -rg
232K    ./weblog
93K     ./links
24K     ./images
16M     .
15M     ./preken
1,1M    ./studie
This sorts by numerical value, but the difference between a number with M or K at the end goes beyond sort. [BTW, I don't see any difference between 'sort -n' and 'sort -g'.] That's where the 'dum' alias comes in handy:
maurits@mauritsvanrees:~/web$ dum | sort -rg
16058   .
14593   ./preken
1081    ./studie
232     ./weblog
93      ./links
24      ./images
Deleted files that a process still holds open (and that therefore still take up disk space) can be found with
lsof | grep deleted
Easiest way to free up the space is to simply restart the process that is holding the filehandle open.
A few other useful things I've found for tracking down large files:
find /home/ -size +100000k
to look for large files, and:
find /home -size +100000k -ls | sort -n -k7,7
to get the same results sorted by size.
Installing quotas can also be useful, as it lets you see instantly which users are using up lots of space, instead of waiting for a recursive du process, which can take ages on large drives. The downside of this method is that you then have to deal with quotas, which can be undesirable (as in a recent accident whereby the vpopmail user accidentally got a quota of about 500MB on a machine with 800+ users...)
It's available as a .deb (mon), but you'll be better off getting a newer version. The mailing list is active, but not high volume, and the primary developers are active participants in it.
nagios is great if you have to impress non-technical types, but mon is great because it stays out of the way until you want to deal with it.
mon is cool.
e.g.
# ls -l /var/log/lastlog
-rw-rw-r-- 1 root utmp 292292 May 23 21:06 /var/log/lastlog
# du -k /var/log/lastlog
12 /var/log/lastlog
ls -s also works:
# ls -s /var/log/lastlog
12 /var/log/lastlog
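The gap between the two numbers above is what you would expect from a sparse file: ls -l reports the file's length, while du and ls -s report the blocks actually allocated. If you want to see the effect for yourself, the following sketch creates a scratch file that is one big hole and compares the two figures. The function name size_report is made up, and it assumes a filesystem that supports sparse files.

```shell
#!/bin/sh
# Print "<apparent size in bytes> <allocated size in kB>" for one file.
# wc -c gives the length that ls -l shows; du -k gives the blocks really in use.
size_report() {
    apparent=$(wc -c < "$1" | tr -d ' ')
    allocated=$(du -k "$1" | awk '{ print $1 }')
    echo "$apparent $allocated"
}

# Demo: seek 1 MiB into an empty scratch file without writing any data.
demo=$(mktemp)
dd if=/dev/null of="$demo" bs=1 seek=1048576 2>/dev/null
size_report "$demo"
rm -f "$demo"
```

On a filesystem with sparse file support the second number should be far smaller than the first would suggest.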
And don't use "du -h |sort -rn", but use "du -k|sort -rn" (sizes in kb). -- WBR,a5b
#!/usr/bin/env bash
# loop over each /dev-backed filesystem listed by df (tmpfs entries do not start with /dev/)
# -P keeps each filesystem on one line: device, blocks, used, available, use%, mount point
# the device name is stored in $dev and the mount point in $fs
df -P | grep '^/dev/' | while read -r dev _ _ _ used fs; do
    used=${used%\%} # the percent of space used is stored in $used, without the % sign (so 100% is handled too)
    if [ "$used" -ge 90 ]; then # if 90% or more of the space is used
        echo "$dev ($fs) is $used% full." | mail -s "$dev is low on free space." root # send a warning email to root
    fi
done
I'm sure it could be improved upon, but it's simple, and it does what I needed it to do.
This accounts for hidden files as well.