Interception of files with tcpdump

Posted by thomasl on Fri 2 Nov 2007 at 10:48

If you're like me, you want to know what's going through your home network. Here is how to use tcpdump, tcpflow and foremost to intercept and extract unencrypted files.

I've never seen a tutorial on this subject before, so I figured I'd try to give a little something back to the GNU/Linux community that has taught me so much over the years.

If you're familiar with the programs mentioned then you can probably stop reading now, as the last sentence just gave you all the information you need. Anyone who isn't familiar with these programs, I urge you to look at the man pages (or search Google) for the three utilities previously mentioned.

Now I urge you to be a responsible administrator and not go off invading your users' or little brother's privacy. However, if you are a parent, then this is a fine way of finding out what kind of images are coming over your child's ethernet cable. All in all, what you do with this information is your own business, and not my problem.

First off, run tcpdump on a computer that can sniff the packets of interest. I can do this on my Linux router named dunmer (after the elves in Elder Scrolls).

$ sudo tcpdump -i eth1 -s0 -w rawdump host picard
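For anyone unsure what the individual flags do, here is a small sketch that spells them out. capture_cmd is a made-up helper name, not something tcpdump ships; it only prints the command so you can review it before running it yourself with sudo.

```shell
#!/bin/sh
# capture_cmd is a hypothetical helper (not part of tcpdump): it prints the
# capture command for a given interface, host and output file.
capture_cmd() {
    iface=$1   # interface that sees the traffic, e.g. eth1
    host=$2    # host whose packets we want, e.g. picard
    out=$3     # file the raw packets are written to
    # -i   listen on this interface
    # -s0  capture whole packets instead of truncating them
    # -w   write raw packets to a file instead of printing them
    # "host NAME" is a filter expression: only traffic to or from NAME
    printf 'tcpdump -i %s -s0 -w %s host %s\n' "$iface" "$out" "$host"
}

capture_cmd eth1 picard rawdump
```

With the values from my setup this prints the same command as above, minus the sudo.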

After browsing the internet for a while on picard (my main desktop), I go back to dunmer and stop the capture. I download the rawdump file to picard and put it in its own directory. (I do this just to make it easier.)

Packets will of course often arrive at the interface out of order or duplicated. There's also the problem of packets from one file transfer arriving intermixed with packets from another. Many other problems also exist to make files harder to find, so I use tcpflow to reassemble the data into ordered streams.

I create a temporary directory and change into it. Then I run the tcpflow command...

$ tcpflow -r ../rawdump
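tcpflow writes one file per direction of each connection and names it after the two endpoints; the man page gives 128.129.130.131.02345-010.011.012.013.45103 as an example. Assuming that naming scheme, here is a rough sketch for seeing who talked to whom. flow_endpoints is a made-up helper, not a tcpflow feature.

```shell
#!/bin/sh
# flow_endpoints: hypothetical helper that splits a tcpflow-style file name
# (SRCIP.SRCPORT-DSTIP.DSTPORT, zero-padded) into its two endpoints.
flow_endpoints() {
    name=${1##*/}          # drop any leading directory
    src=${name%%-*}        # part before the dash: source ip.port
    dst=${name#*-}         # part after the dash: destination ip.port
    # the port is the last dot-separated field, the rest is the address
    printf '%s port %s -> %s port %s\n' \
        "${src%.*}" "${src##*.}" "${dst%.*}" "${dst##*.}"
}

# Example using the file name from the tcpflow man page
flow_endpoints 128.129.130.131.02345-010.011.012.013.45103
```

Inside the temporary directory, something like `for f in *; do flow_endpoints "$f"; done` should list every flow. Any file that does not follow the naming scheme will produce nonsense, so treat the output as a hint only.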

Now the data has been broken down into one file per flow. At this point we could go into every flow file and extract the data from it, but to make it easier I'll just put everything into one file. So I cd to the previous directory and use a for loop for that purpose. Why a for loop? Because it's very possible that there will be "too many arguments" if you do it more directly (with cat temp/*, for instance). EDIT: Use "find" instead of a "for loop", thanks to an Anonymous commenter.

$ for i in temp/*; do cat "$i" >> dump; done
$ find temp/ -type f -exec cat '{}' \; > dump
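Putting the suggestions from the comments together, a more defensive version might look like the sketch below. concat_flows is a name I made up, and -print0, sort -z and xargs -0 are GNU/BSD extensions that some systems lack. The demonstration uses throwaway files rather than real flows.

```shell
#!/bin/sh
# concat_flows DIR OUT: hypothetical helper that joins every regular file
# under DIR into OUT. OUT should live outside DIR so it is never read back.
concat_flows() {
    dir=$1
    out=$2
    # -type f   skip directories, sockets and other non-regular files
    # -print0   separate names with NUL so spaces and newlines are safe
    # sort -z   keep the order stable between runs (GNU sort)
    # xargs -0  batch the names into as few cat invocations as possible
    find "$dir" -type f -print0 | sort -z | xargs -0 cat > "$out"
}

# Small self-contained demonstration with throwaway files
demo=$(mktemp -d)
mkdir "$demo/temp"
printf 'first\n'  > "$demo/temp/flow a"
printf 'second\n' > "$demo/temp/flow b"
concat_flows "$demo/temp" "$demo/dump"
cat "$demo/dump"
```

For the real data that would be `concat_flows temp dump`, run from the directory that contains temp.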

Now that everything is nicely ordered together, just run foremost and wait for it to extract the data. I could have just run foremost on the rawdump file, but that would result in incomplete and corrupted data.
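As a sketch of that last step: the wrapper below is hypothetical (carve is my own name for it), and the list of file types is just an example. The -i, -o and -t options are foremost's input file, output directory and file type list.

```shell
#!/bin/sh
# carve DUMP OUTDIR: hypothetical wrapper around foremost.
carve() {
    dump=$1
    outdir=$2
    if ! command -v foremost >/dev/null 2>&1; then
        echo "foremost is not installed" >&2
        return 127
    fi
    if [ ! -s "$dump" ]; then
        echo "nothing to carve: $dump is missing or empty" >&2
        return 1
    fi
    # -i input file, -o output directory (foremost wants it empty or not
    # yet existing), -t restricts the search to the listed file types
    foremost -t jpg,gif,png,pdf -i "$dump" -o "$outdir"
}

# Usage (not run here): carve dump output
```

foremost sorts whatever it recovers into per-type subdirectories of the output directory and writes an audit.txt there summarising what it found.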

Any suggestions or improvements are welcome!


Posted by Anonymous (151.46.xx.xx) on Fri 2 Nov 2007 at 12:45
I think the for-loop you proposed would be a little better this way:
$ for i in temp/*; do cat $i; done > dump

Anyway, isn't that for-loop as unsafe as "cat *" if the shell expands the wildcard into too many files?
Maybe the safest would be:
$ find temp/ | xargs -n100 cat > dump

Thanks, interesting article!

Posted by Anonymous (194.97.xx.xx) on Fri 2 Nov 2007 at 13:46
If you prefer a more GUI-oriented approach, you can just load the file created by tcpdump into Wireshark (the program formerly known as Ethereal) and use its filters or highlight a packet of interest and select "Follow TCP stream" from the menu.

Regards,
Marc

Posted by thomasl (70.145.xx.xx) on Fri 2 Nov 2007 at 17:54
Thanks for the comment :)
I'll always have more to learn, and agree that my for loop wasn't really that good. I believe this will work as well...

find ./ -exec cat '{}' \; > dump

Posted by Anonymous (151.46.xx.xx) on Sat 3 Nov 2007 at 01:45
Yes, it will work correctly, but with a performance difference.

$ find ./ -exec cat '{}' \; > dump
This command will execute a process cat n times for n files.

$ find ./ | xargs -n100 cat > dump
Instead, this command will execute a cat process roughly n/100 times, as every process will have at most 100 arguments (100 is just an arbitrary number that should ensure the command buffer is not exceeded).
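The difference in process count is easy to see with a toy run. This is only a sketch: count_runs is a hypothetical helper, and the 250 dummy arguments stand in for file names.

```shell
#!/bin/sh
# count_runs N: feed 250 dummy arguments to xargs and count how many times
# the command is started when at most N arguments are passed per run.
count_runs() {
    # each echo invocation prints exactly one line, so lines == processes
    seq 1 250 | xargs -n "$1" echo | wc -l | tr -d ' '
}

echo "one argument per run:  $(count_runs 1) processes"
echo "100 arguments per run: $(count_runs 100) processes"
```

This reports 250 processes in the first case and 3 in the second (100 + 100 + 50 arguments). find can also do the batching itself if you end -exec with + instead of \;.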

Posted by marki (78.141.xx.xx) on Sat 3 Nov 2007 at 13:42
Hi,

xargs works out by itself the maximum number of arguments the shell/system can handle, so there is no need to specify it.
Also, using xargs instead of find -exec can be dangerous and does not always work. Imagine you have filenames with spaces. But the solution is really simple:
find . -print0 | xargs -0 cat
That tells find to output the filenames as null-terminated strings and xargs to expect them that way, so each name is passed to cat as a single argument.
(Unfortunately for me, xargs on HP-UX does not have this option and I need it sometimes at work.)

Posted by Anonymous (190.2.xx.xx) on Thu 8 Nov 2007 at 02:22
With Linux kernel >= 2.6.23 the fixed limit on the number of arguments is removed, so using a newer kernel could make you forget about all of this =)

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b6a2fea39318e43fee84fa7b0b90d68bed92d2ba

Posted by dkg (216.254.xx.xx) on Sun 11 Nov 2007 at 02:15
I think you actually want (from the parent directory):
$ find ./temp -type f -exec cat '{}' \; > dump
Do it from one level up to avoid the risk of dumping the contents of `dump` into itself, and only select real files (-type f) so that you don't try to cat directories, sockets, etc.

I'm happy to see the poster below suggest that argument count limits will soon be a thing of the past for the Linux kernel, which would mean we could get back to a nice simple:

$ cat ./temp/* > dump

Thanks for an interesting article!

Posted by Federico2 (2001:0xx:0xx:0xxx:0xxx:0xxx:xx) on Fri 2 Nov 2007 at 14:18
You could avoid losing information about the hosts involved in each conversation:
$ mkdir /tmp/capture && cd /tmp/capture
$ sudo tcpflow #...options...
$ for f in *; do foremost -i "$f" -o "$f-out" && rm "$f"; done
Foremost seems to be able to extract data as well.

Posted by hypnojazz (87.0.xx.xx) on Sat 3 Nov 2007 at 22:33
For this purpose, use tcpxtract.

It extracts files from network traffic based on file signatures (magic numbers).

It reads tcpdump's capture files too, and supports 26 (more?) file types/formats.

It can easily import foremost's config file.

Good Luck!
Ciao

Posted by Anonymous (57.250.xx.xx) on Mon 5 Nov 2007 at 02:08
Hi,

I would also suggest tcpxtract which works great!
You can also take a look at assniffer (for http sessions): http://www.cockos.com/assniffer/

-- julien
