Weblog entry #136 for Steve

Fighting image spam
Posted by Steve on Sat 4 Nov 2006 at 12:40
Tags: none.

Following a nice suggestion, here is how I'm catching 99% of incoming image spam on my servers:

:0 H
* ^Content-Type:.*(multipart/related)
image-spam

(Procmail, obviously.)

No false positives yet....

 

Comments on this Entry

Posted by ajt (81.6.xx.xx) on Sat 4 Nov 2006 at 13:12
procmail is your friend.

spamassassin is great, but with the volume of spam on the net these days you do need a multi-layered approach.

--
"It's Not Magic, It's Work"
Adam


Posted by Steve (62.30.xx.xx) on Sat 4 Nov 2006 at 19:34

I've had too many bad experiences with spamassassin (albeit a long time ago) to ever trust it again.

Right now I use procmail, which invokes pyzor, razor, and spambayes. Beyond that I have a couple of custom rules for various Steve-specific checks.

(e.g. I added whitelisting support so that some senders never get checked, and I killfile some troublesome senders.)

Steve


Posted by simonw (84.45.xx.xx) on Sat 4 Nov 2006 at 19:03
Sorry, I'm being dim here.

What is the pattern being matched in English?

I'm seeing no difference in the "Content-Type:" header containing "multipart/related" between genuine email in my personal mail archive and spam that made it through the filtering at work in the last month or so. The one exception is that the ham sometimes has 'type="multipart/alternative";' appended to the line, and the spam so far hasn't.

Indeed, I have glanced at the small amount of such spam getting through, and it appears well formed, correctly mimicking genuine email. I did note that it often claims to have been generated by relatively old versions of Thunderbird or Outlook, which I think one could reasonably reject with a "please upgrade" message if one wasn't too concerned about the odd false positive, especially where the claimed version has publicly known security vulnerabilities.

I also see a lot of spam with SMTPSVC in Received headers passing greylisting. I don't think this is more Microsoft vulnerabilities; I think these are forged headers. Specifically, I see Received headers starting "Received: from User" in some spam (not much, about 2% of the spam passing our filters), but I have never seen that string in a genuine email.
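As an illustration only (not part of the original comment), the "Received: from User" check could be sketched in Python roughly as follows. The function name `received_from_user` and the sample headers in the usage note are hypothetical, and the match is deliberately case-sensitive because the comment quotes the literal string.

```python
import re
from email import message_from_string

# Case-sensitive on purpose: the comment quotes the literal text "from User".
# The \b stops this matching hostnames such as "Users.example.net".
FROM_USER = re.compile(r"from User\b")


def received_from_user(raw_message):
    """Return the Received header values that begin with 'from User'."""
    msg = message_from_string(raw_message)
    # get_all() returns every Received header; the default [] covers
    # messages that have none.
    return [
        value
        for value in msg.get_all("Received", [])
        if FROM_USER.match(value.lstrip())
    ]
```

A message whose first Received line reads "Received: from User ([192.0.2.7])" would be returned by this function; one that reads "Received: from mail.example.net" would not.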

Another, more potent pattern would be "(helo=<dotted quad>)" appearing in a Received header. In excess of 10% of the spam getting past our filters has this; exceptionally little genuine email does, since it correctly places the dotted quad in square brackets.
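A rough Python sketch of that "helo" check, offered as an assumption about how one might test for it rather than as the filter actually in use: it looks for a bare dotted quad inside "(helo=...)" and ignores the bracketed form. The name `bare_quad_helos` is hypothetical, and the exact layout of the Received header is assumed from the string quoted above.

```python
import re
from email import message_from_string

# Matches "(helo=192.0.2.1)" but not "(helo=[192.0.2.1])" or "(helo=mail.example.org)".
BARE_QUAD_HELO = re.compile(r"\(helo=(\d{1,3}(?:\.\d{1,3}){3})\)")


def bare_quad_helos(raw_message):
    """Return every unbracketed dotted-quad HELO found in Received headers."""
    msg = message_from_string(raw_message)
    found = []
    for value in msg.get_all("Received", []):
        found.extend(BARE_QUAD_HELO.findall(value))
    return found
```

A non-empty result is only a hint; as noted in the thread, a small amount of genuine mail also shows this pattern.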

I suspect the next change I will make to our spam filtering is "helo" checks. Of course, unlike greylisting, the changes the spammers need to make to pass "helo" checks don't have any long-term cost to the spammer. I may also deploy blacklist checking against other Received headers (spots relaying), and "X-Originating-IP" (allows one to selectively block spam from some vendors who don't police their own servers for abuse very well).



Posted by Steve (62.30.xx.xx) on Sat 4 Nov 2006 at 19:33

This is matching the following line in the header of received messages:

Content-Type: multipart/related;

This has so far only shown up in image spam which I've received, and as I said, no false positives. However, I guess that depends upon the mail that you're liable to receive.

"type" is a mandatory parameter with this content type, so mails missing it are breaking RFC 2387 - or such is my current understanding.

Steve
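To make the distinction discussed above concrete, here is a minimal Python sketch (an illustration, not anything from the original thread) that flags a message only when its Content-Type is multipart/related and the "type" parameter is absent. The function name `missing_related_type` is hypothetical. Because the standard `email` parser unfolds continuation lines, a "type" parameter placed on a following line of the same header still counts as present.

```python
from email import message_from_string


def missing_related_type(raw_message):
    """True if the message is multipart/related but has no 'type' parameter."""
    msg = message_from_string(raw_message)
    if msg.get_content_type() != "multipart/related":
        return False
    # get_param() looks at the Content-Type header by default and
    # returns None when the parameter is not present.
    return msg.get_param("type") is None
```

This is narrower than matching every multipart/related message, so it would presumably catch less spam but also spare genuine mail that sets "type" correctly.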


Posted by simonw (84.45.xx.xx) on Sun 5 Nov 2006 at 09:44
I meant "type" on the same line, as well as "boundary", rather than after a carriage return. The spam never has anything else on the same line, though it may have it in the same header.

But I do have "multipart/related" in genuine email in my archive: about 60 messages, probably about 1% of incoming ham.
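For anyone wanting to repeat that measurement on their own archive, a small Python sketch along these lines might do. It is an assumption about method, not how the figure above was obtained, and `related_share` is a hypothetical name. It takes any iterable of raw message strings, so the messages could come from an mbox, a Maildir, or anything else.

```python
from email import message_from_string


def related_share(raw_messages):
    """Return (related_count, total_count, fraction) for multipart/related mail."""
    total = 0
    related = 0
    for raw in raw_messages:
        total += 1
        if message_from_string(raw).get_content_type() == "multipart/related":
            related += 1
    # Guard against an empty archive so the division cannot fail.
    fraction = related / total if total else 0.0
    return related, total, fraction
```

Run over known ham, the fraction gives a rough idea of how many false positives a blanket multipart/related rule would cause.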

