Weblog entry #136 for Steve
Following a nice suggestion, here is how I'm catching 99% of the image spam arriving at my servers:
:0 H
* ^Content-Type:.*(multipart/related)
image-spam
(Procmail, obviously.)
No false positives yet....
Comments on this Entry
SpamAssassin is great, but with the volume of spam on the net these days you do need a multi-layered approach.
--
"It's Not Magic, It's Work"
Adam
I've had too many bad experiences with SpamAssassin (albeit a long time ago) to ever trust it again.
Right now I use procmail, which invokes pyzor, razor, and spambayes. Beyond that I have a couple of custom rules for various steve-specific checks.
(e.g. I added whitelisting support so that some senders never get checked, and I killfile some troublesome senders.)
What is the pattern being matched in English?
I'm seeing no difference in the "Content-Type:" header with "multipart/related" in it between genuine email in my personal mail archive and spam that managed to make it through the filtering at work in the last month or so. The one exception is that the ham sometimes has 'type="multipart/alternative";' appended to the line, and the spam so far hasn't.
Indeed, I have glanced at the small amount of such spam getting through, and it appears well formed, correctly mimicking genuine email. I did note that it often claims to have been generated by relatively old versions of Thunderbird or Outlook, which I think one could reasonably reject with a "please upgrade" message if one wasn't too concerned about the odd false positive, especially where we know that the version claimed has publicly known security vulnerabilities.
I also see a lot of spam with SMTPSVC in Received headers passing greylisting. I don't think this is more Microsoft vulnerabilities; I think these are forged headers. Specifically, I see Received headers starting "Received: from User" in some spam (not much, about 2% of the spam passing our filters), but I have never had that string in a genuine email.
Another, more potent, pattern would be "(helo=<dotted quad>)" appearing in a Received header. In excess of 10% of the spam getting past our filters has this. Exceptionally little genuine email does; genuine mail instead correctly places the dotted quad in square brackets.
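The two Received-header patterns described above can be tried against stored mail with a short script before turning them into filter rules. This is only a sketch in Python, not the filtering actually in use at either site: the function name `suspicious_received` and the reason strings are made up for illustration, and the regular expressions are one reading of the patterns as described.

```python
import re

# "Received: from User" at the very start of a Received header value.
FROM_USER = re.compile(r"^from User(?:\s|$)")
# "(helo=1.2.3.4)": a bare dotted quad, not wrapped in square brackets.
BARE_HELO = re.compile(r"\(helo=\d{1,3}(?:\.\d{1,3}){3}\)")


def suspicious_received(received_values):
    """Return the reasons (if any) a message's Received headers look forged.

    `received_values` is a list of Received header values, i.e. the text
    after "Received: ", such as email.message.Message.get_all("Received").
    """
    reasons = []
    for value in received_values:
        # Unfold continuation lines so a line break cannot split a pattern.
        flat = " ".join(value.split())
        if FROM_USER.search(flat):
            reasons.append("from User")
        if BARE_HELO.search(flat):
            reasons.append("bare dotted-quad helo")
    return reasons
```

A bracketed literal such as "(helo=[192.0.2.1])" is deliberately not matched, since that is the form described above as appearing in genuine mail.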
I suspect the next change I will make to our spam filtering is "helo" checks. Of course, unlike greylisting, the changes spammers need to make to pass "helo" checks don't carry any long-term cost for them. I may also deploy blacklist checking against other Received headers (which spots relaying) and against "X-Originating-IP" (which allows one to selectively block spam from vendors who don't police their own servers for abuse very well).
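For the blacklist idea, a DNS blacklist is normally queried by reversing the octets of the IPv4 address and looking the result up under the list's zone. A minimal Python sketch follows. The zone `dnsbl.example.org` is a placeholder rather than a real list, the function names are invented here, and the value is assumed to have already been pulled out of an "X-Originating-IP" or Received header.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name used to look up an IPv4 address in a DNSBL zone."""
    # Strip the square brackets some senders put around the address.
    addr = ipaddress.IPv4Address(ip.strip().strip("[]"))
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone="dnsbl.example.org"):
    """Return True if the (placeholder) zone has an A record for this IP.

    Any resolver failure is treated as "not listed", so a DNS outage
    fails open rather than rejecting mail.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False
```

For example, `dnsbl_query_name("[192.0.2.99]", "dnsbl.example.org")` gives `99.2.0.192.dnsbl.example.org`. Whether to fail open or closed on lookup errors is a policy choice; the sketch picks the cautious one.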
This is matching the following line in the header of received messages:
Content-Type: multipart/related;
So far this has only shown up in the image spam I've received and, as I said, with no false positives. However, I guess that depends upon the mail you're liable to receive.
"type" is a mandatory parameter for this content type, so mails missing it are breaking RFC 2387 - or such is my current understanding.
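If the missing "type" parameter is the real distinguishing feature, a narrower check than the bare content type is possible. The following is a hedged Python sketch using the standard `email` package, not the procmail rule in use here; `related_without_type` is a name made up for the example. Since the ham described earlier in this thread only sometimes carries the parameter, this could still produce false positives.

```python
from email import message_from_string


def related_without_type(msg):
    """True if msg is multipart/related but has no "type" parameter."""
    if msg.get_content_type() != "multipart/related":
        return False
    # get_param returns None when the parameter is absent from the header.
    return msg.get_param("type") is None


# A header as seen in the image spam: no "type" parameter at all.
spam = message_from_string(
    'Content-Type: multipart/related; boundary="b"\n\nbody\n'
)
# A header with the parameter, folded onto a continuation line.
ham = message_from_string(
    'Content-Type: multipart/related;\n'
    '\ttype="multipart/alternative"; boundary="b"\n\nbody\n'
)
```

Here `related_without_type(spam)` is true and `related_without_type(ham)` is false.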
But I have "multipart/related" in genuine email in my archive: about 60 messages, probably about 1% of incoming ham.
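A count like this is easy to reproduce on one's own archive before trusting the rule. The sketch below, in Python, counts messages whose top-level content type is multipart/related, which is what a header-only procmail match looks at. The function name `related_share` is invented for the example and the sample messages are fabricated stand-ins, not real mail.

```python
from email import message_from_string


def related_share(messages):
    """Return (related, total) for an iterable of email message objects."""
    related = total = 0
    for msg in messages:
        total += 1
        # Only the top-level Content-Type header is inspected.
        if msg.get_content_type() == "multipart/related":
            related += 1
    return related, total


sample = [
    message_from_string('Content-Type: multipart/related; boundary="b"\n\nx\n'),
    message_from_string("Content-Type: text/plain\n\nx\n"),
]
print(related_share(sample))  # (1, 2)
```

The same function should accept a `mailbox.mbox` object opened on a ham archive, since iterating one yields message objects with the same `get_content_type` method.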