Keeping track of disk space

Posted by Steve on Fri 20 May 2005 at 07:26

If you don't wish to be surprised when you suddenly run out of disk space on your system(s), it's a good idea to regularly monitor how much free space you have, and which directories and files are taking up the most space.

The basic way to check on your current disk space is via the df command, where df stands for "disk free".

There are several ways to display the output, starting with the default:

skx@mystery:~$ df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda1               256445    125412    117351  52% /
tmpfs                   242112         4    242108   1% /dev/shm
/dev/hda9             30526204  19487308   9488216  68% /home
/dev/hda8               366015      8326    338161   3% /tmp
/dev/hda5              4806048    990076   3571836  22% /usr
/dev/hda6              2883672    610948   2126240  23% /var

To make this more readable you can add the '-h', or "--human-readable" option:

skx@mystery:~$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda1             251M  123M  115M  52% /
tmpfs                 237M  4.0K  237M   1% /dev/shm
/dev/hda9              30G   19G  9.1G  68% /home
/dev/hda8             358M  8.2M  331M   3% /tmp
/dev/hda5             4.6G  967M  3.5G  22% /usr
/dev/hda6             2.8G  597M  2.1G  23% /var

Both of these commands will list the same information for each mounted partition, or device.
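
df also accepts a file or directory as an argument, in which case it reports only the filesystem holding that path. This is handy in scripts. What follows is a minimal sketch rather than a definitive tool: the function name usage_percent is made up for illustration, and it assumes the device name contains no spaces, since it relies on the column layout of "df -P" (the POSIX output format, which keeps each filesystem on one line).

```shell
#!/bin/sh
# usage_percent is a hypothetical helper name, not part of df.
# Print the Use% figure, without the % sign, for the filesystem holding a path.
usage_percent() {
    # -P selects the POSIX layout: one header line, then one line per filesystem.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

usage_percent /
```

The bare number is easy to compare against a threshold with "[ ... -ge ... ]" in a larger script.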

An alternative to the standard df command is the di package. Once it's installed (via apt-get install di) you can invoke it without any arguments to see a basic overview:

root@sun:~# di
Filesystem      Mount             Megs     Used    Avail %used fs Type
/dev/sda1       /               8450.3   1381.8   6639.3  21%  ext3   
tmpfs           /dev/shm         250.8      0.0    250.8   0%  tmpfs  
/dev/sdb7       /home          34181.3  14406.5  18038.5  47%  ext3   

di is very customizable. For example you can make the display use gigabytes instead of megabytes for its output:

root@sun:~# di -g
Filesystem      Mount            Gigs     Used    Avail %used fs Type
/dev/sda1       /                 8.3      1.3      6.5  21%  ext3   
tmpfs           /dev/shm          0.2      0.0      0.2   0%  tmpfs  
/dev/sdb7       /home            33.4     14.1     17.6  47%  ext3 

You can also specify which columns should be displayed (such as "-A" to show all available columns) with the use of a format string.

A format string basically specifies which fields you wish to see, with codes replaced by the output. To specify the format string you use "-f", then the list of columns you wish to include from the list of available ones.

The formats are described in the man page which you can read by running "man di" but as a quick example you can see the mounted filesystems and the percentage used with the following command:

root@sun:~# di -f m2
Mount           %used
/                17% 
/dev/shm          0% 
/home            44% 

Once you've discovered how much space you have available the next most common thing you'll wish to do is see how much space a given directory or file is using.

The du (or "disk usage") command will show you how much space a file or directory is occupying:

root@sun:~# du --human-readable 
12K     ./.gnupg
4.0K    ./tmp
4.0K    ./public_html
8.0K    ./.ssh
4.0K    ./bin
32K     ./.irssi
100K    .

As you can see the "--human-readable", or '-h' flag works here as it did for the "df" command.

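To find the biggest files or directories beneath the current one you can pipe the output of du through sort. Be aware that "sort -n" compares only the leading number and ignores the K, M or G suffix, so the ordering is only trustworthy when every entry uses the same unit (swap "-h" for "-k" to get plain kilobytes throughout). The following command produces a sorted list: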

skx@mystery:~/Programs$ du -ah | sort -rn
940K    ./buildd
936K    ./buildd/wanna-build
900K    ./GNUMP3d/old/website/screenshots
896K    ./GNUMP3d/Webpages/gnump3d/screenshots
896K    ./GNUMP3d/old/software/gnump3d/screenshots
824K    ./GNUMP3d/Program/gnump3d/gnump3d-2.9.4.zip
820K    ./Shellcode-Exploitation-Book
816K    ./Shellcode-Exploitation-Book/Source_Files
780K    ./HTMLPP/htmlpp-4.2a
704K    ./altermime-0.3.4
.... ...
.... ...
etc

Add "| head" to show only the top few lines:

skx@mystery:~$ du -ah | sort -rn | head
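
Because "sort -n" looks only at the leading digits, a listing produced with "-h" can place a 940K entry above one measured in megabytes. A safer variant, sketched here on the assumption that sizes in kilobytes are acceptable, asks du for plain numbers with "-k" so that the numeric sort is always correct. The function name biggest and its defaults are illustrative only.

```shell
#!/bin/sh
# biggest is a hypothetical helper name.
# List the N largest files and directories beneath DIR, sizes in kilobytes.
biggest() {
    dir=${1:-.}
    count=${2:-10}
    # -a includes files as well as directories; -k prints plain kilobyte
    # counts, so "sort -rn" orders every line correctly.
    du -ak "$dir" | sort -rn | head -n "$count"
}

biggest . 10
```

The first line is normally the directory itself, since its total includes everything beneath it.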

The total size of a directory can be found by using either the '-s' or "--summarize" option:

skx@mystery:~$ du --summarize --human-readable  
4.8G    .

Or for each file or directory in the current one:

skx@mystery:~$ du --summarize --human-readable   *
13M     Archive
104K    bin
667M    debian
1.2M    debian-administration
1.5G    Images
470M    Mail
55M     Programs
69M     public_html
1.2M    ROMS
18M     sarge
7.1M    Text
252M    tmp
381M    Videos
301M    Web

Thankfully you don't need to type out most of the options to df or du as bash's tab completion will fill them out for you easily.

If you install the logwatch package you'll receive a daily email of all the "interesting" messages from your system, as well as a summary of your available disk space. This can be useful if you have a lot of machines and don't wish to be surprised by a full disk.

There are also a large number of scripts available online which will send you an email when a partition is nearly full, and which you can schedule to run regularly via cron if you wish.
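
A check of this kind does not need to be elaborate. As a rough sketch (not any particular published script), the function below reads "df -P" output and prints a line for every filesystem at or above a limit. The name over_limit and the 90% default are assumptions made for the example, and mount points containing spaces are not handled.

```shell
#!/bin/sh
# over_limit is a hypothetical helper name; 90 is an arbitrary default.
# Read "df -P" output on stdin and report filesystems at or above LIMIT percent.
over_limit() {
    limit=${1:-90}
    awk -v limit="$limit" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)            # "97%" becomes "97"
        if (pct + 0 >= limit) print $6 " is " $5 " full"
    }'
}

df -P | over_limit 90
```

If this is saved as an executable script and called from a crontab entry, cron will normally mail any output to the owner of the crontab, so silence means there is nothing to report.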

Posted by Anonymous (152.66.xx.xx) on Fri 20 May 2005 at 09:55
I use this script for email sending.
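#!/bin/bash
# Mail a warning for every mounted device that is at or above its usage limit.
system=$(hostname --fqdn)
# -P keeps each filesystem on one line; fields: 4=available, 5=use%, 6=mount point.
df -aPh | grep "^/" | sort | awk '{print $6, $5, $4}' | while read -r partition percent avail; do
    line="$partition-$percent-$avail"
    percent=${percent%\%}    # strip the trailing % sign

    limit=90

    # A mounted CD-ROM is always 100% full, so never warn about it.
    if [ "$partition" = '/cdrom' ]; then
        limit=101
    fi

    if [ "$percent" -ge "$limit" ]; then
        echo "Free Space warning!" | mail -a "From: spacecsekker" -s "[$system] $line" xxx@xxx.bme.hu
    fi
done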


Posted by Anonymous (130.231.xx.xx) on Fri 20 May 2005 at 10:00
Some monitoring systems allow you to see a long-term graph of disk usage. This may be less useful if your volumes hover around 99% whether they have ample space or not, but it can help in spotting growth trends in server log directories and the like.

I use munin myself. I just wish the bloody thing wasn't in perl. Otherwise it's nice.


Posted by Anonymous (82.69.xx.xx) on Sat 21 May 2005 at 02:11
I use cacti myself. Very useful.


Posted by midget (80.202.xx.xx) on Sat 21 May 2005 at 14:59
I use torsmo (similar to gkrellm), and it works just fine for me :)


Posted by maurits (80.126.xx.xx) on Fri 20 May 2005 at 11:06
I use two aliases for du:
alias dum='du --max-depth=1'
alias duh='du -h --max-depth=1'
dum signifies max-depth and duh signifies human-readable. "--max-depth=1" makes sure you get the disk usage for the complete directory and not for every individual ((sub)sub)subdirectory. Most of the time that is the info I want.
maurits@mauritsvanrees:~/web$ duh
93K     ./links
24K     ./images
15M     ./preken
1,1M    ./studie
232K    ./weblog
16M     .
The human readable part does not work so well when sorting:
maurits@mauritsvanrees:~/web$ duh | sort -rg
232K    ./weblog
93K     ./links
24K     ./images
16M     .
15M     ./preken
1,1M    ./studie
This sorts by numerical value, but the difference between a number with M or K at the end goes beyond sort. [BTW, I don't see any difference between 'sort -n' and 'sort -g'.] That's where the 'dum' alias comes in handy:
maurits@mauritsvanrees:~/web$ dum | sort -rg
16058   .
14593   ./preken
1081    ./studie
232     ./weblog
93      ./links
24      ./images
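
GNU sort later gained a "-h" (--human-numeric-sort) option that understands the K, M and G suffixes printed by "du -h", so the readable output can be sorted directly. The sketch below assumes a reasonably recent GNU coreutils and forces a locale that uses "." as the decimal separator; the sample sizes are adapted from the listing above.

```shell
#!/bin/sh
# Assumes GNU sort with the -h (--human-numeric-sort) option.
# LC_ALL=C makes "." the decimal separator, so 1.1M is parsed as expected.
printf '232K\t./weblog\n93K\t./links\n16M\t.\n1.1M\t./studie\n' |
    LC_ALL=C sort -rh

# With real data the same idea is: du -h --max-depth=1 | sort -rh
```

Here the megabyte entries come out ahead of the kilobyte ones, which plain "sort -rn" or "sort -rg" cannot manage.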


Posted by stevenothing (81.6.xx.xx) on Sat 21 May 2005 at 11:59
May also be worth mentioning about those times when you see a massive disparity between the results of a df and a du. You don't seem to be using up that much space, but you hardly have any free either. This is often down to long running processes keeping their file handles open (in my experience, often caused by logrotate not doing a restart properly).

These can be found with

lsof | grep deleted

Easiest way to free up the space is to simply restart the process that is holding the filehandle open.
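
The effect is easy to reproduce by hand. The sketch below, which assumes Linux with /proc mounted and uses a made-up function name, opens a temporary file, deletes it while it is still open, and prints what the kernel reports for the descriptor.

```shell
#!/bin/sh
# Demonstration only; assumes Linux with /proc mounted.
# show_deleted_but_open is a hypothetical name. The body runs in a subshell
# so that descriptor 3 of the calling shell is left alone.
show_deleted_but_open() (
    tmp=$(mktemp) || exit 1
    exec 3>"$tmp"                # keep the file open on descriptor 3
    rm -f "$tmp"                 # unlink it: du can no longer see it, df still counts it
    readlink /proc/self/fd/3     # the link target now ends in "(deleted)"
    exec 3>&-                    # closing the descriptor lets the space be freed
)

show_deleted_but_open
```

A long-running daemon that keeps writing to a rotated and removed log file is in exactly this state, only with far more data behind the descriptor.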

Few other useful things I've found for tracking down large files:

find /home/ -size +100000k

to look for large files, and:

find /home -size +100000k -ls | sort -n -k7,7

To get the same results sorted.
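
With GNU find the size and path can also be printed directly, which avoids having to count the columns of "-ls". This is only a sketch: it assumes GNU find (for -printf), the function name large_files is invented for the example, and errors such as unreadable directories are silently discarded.

```shell
#!/bin/sh
# Assumes GNU find (for -printf). large_files is a hypothetical helper name.
# Print "size-in-bytes<TAB>path" for files above a threshold, largest first.
large_files() {
    dir=$1
    min=${2:-100000k}    # same syntax as the argument to find's -size
    find "$dir" -type f -size +"$min" -printf '%s\t%p\n' 2>/dev/null |
        sort -rn
}

large_files . 100000k
```

For example, "large_files /home 100000k" covers the same ground as the find commands above, with the biggest file on the first line.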

Installing quotas can also be useful, as it allows you to see what users are using up lots of space instantly, instead of waiting for a recursive du process, which on large drives can take ages. Downside to this method being that you then have to deal with quotas, which can be undesirable (as in a recent accident whereby vpopmail user accidentally got a quota of about 500Mb on a machine with 800+ users...)


Posted by Arthur (168.150.xx.xx) on Sun 22 May 2005 at 09:03
If you really want to go nuts, install mon, and specifically, use its DiskSpace (freespace.monitor) monitor. You can set an alert level, and not worry about things until the available storage is less than some configurable amount.

It's available as a .deb (mon), but you'll be better off getting a newer version. The mailing list is active, but not high volume, and the primary developers are active participants in it.

nagios is great if you have to impress non-technical types, but mon is great because it stays out of the way until you want to deal with it.

mon is cool.


Posted by crispy (84.65.xx.xx) on Mon 23 May 2005 at 20:26
You can also use "du" to help spot "sparse" files.
e.g.

# ls -l /var/log/lastlog
-rw-rw-r-- 1 root utmp 292292 May 23 21:06 /var/log/lastlog

# du -k /var/log/lastlog
12 /var/log/lastlog

ls -s also works:

# ls -s /var/log/lastlog
12 /var/log/lastlog
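
The same comparison can be scripted. The following is a heuristic sketch, assuming GNU du (for --apparent-size) and using an invented function name: it compares the length of a file with the space actually allocated to it. Filesystems that compress data or store tiny files inline can also make the allocated figure smaller, so treat a positive result as a hint rather than proof.

```shell
#!/bin/sh
# Assumes GNU du (for --apparent-size). is_sparse is a hypothetical helper name.
# Succeed when a file occupies fewer kilobytes on disk than its length suggests.
is_sparse() {
    [ -f "$1" ] || return 1
    apparent=$(du -k --apparent-size "$1" | cut -f1)
    allocated=$(du -k "$1" | cut -f1)
    [ "$allocated" -lt "$apparent" ]
}

if is_sparse /var/log/lastlog; then
    echo "/var/log/lastlog is sparse"
fi
```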


Posted by Anonymous (194.85.xx.xx) on Thu 3 Nov 2005 at 04:05
What about traditional quota?
And don't use "du -h |sort -rn", but use "du -k|sort -rn" (sizes in kb). -- WBR,a5b


Posted by VLegacy (69.243.xx.xx) on Tue 1 Aug 2006 at 04:39
I wrote this tiny script to check my disk space and email me if a partition filled up above 90%. I placed it under /etc/cron.hourly/.

#!/usr/bin/env bash

# read one line per /dev device from df; -P keeps each entry on a single line,
# and matching on ^/dev/ also excludes tmpfs entries
# the device, percent used and mount point are stored in $dev, $used and $fs
df -P | awk '$1 ~ /^\/dev\// { print $1, $5, $6 }' | while read -r dev used fs; do
    used=${used%\%} # strip the % sign so the value can be compared as a number, including 100
    if [ "$used" -ge 90 ]; then # if 90% or more of the space is used
        echo "$dev ($fs) is $used% full." | mail -s "$dev is low on free space." root # send a warning email to root
    fi
done

I'm sure it could be improved upon, but it's simple, and it does what I needed it to do.


Posted by Anonymous (216.241.xx.xx) on Tue 8 Apr 2008 at 18:34
find . -maxdepth 1 -exec du -ks {} \; | sort -rn | head -n 15

This accounts for hidden files as well.
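
One side effect of that command is that "." itself is listed too, so the grand total always tops the output. A possible variation, sketched here with GNU find and du assumed and an invented function name, skips the starting directory and hands all the names to a single du process.

```shell
#!/bin/sh
# Assumes GNU find and du. top_entries is a hypothetical helper name.
# Summarise each entry directly inside DIR, hidden ones included, largest first.
top_entries() {
    dir=${1:-.}
    count=${2:-15}
    # -mindepth 1 leaves out DIR itself, so its grand total is not listed;
    # "{} +" passes many names to one du process instead of starting one per entry.
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -ks {} + |
        sort -rn | head -n "$count"
}

top_entries . 15
```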

