Weblog entry #185 for Steve
Here are my incoming spam stats for yesterday:
Spam Rejecting Plugin                                    Count
--------------------------------------------------------------
dnsbl                                                     6799
hosts_allow                                               1596
greylisting                                                740
check_earlytalker                                          429
check_badrcptto                                            239
require_resolvable_fromhost                                217
check_spamhelo                                              85
virus::clamav                                               69
check_badmailfrom                                           11
--------------------------------------------------------------
Total Mails    : 10610
Total SPAM     : 10185
Total Accepted : 425
Spam Percentage: 95.99%
Of over 10 thousand incoming mails I accepted only 400ish for delivery. The rest were rejected at SMTP time - making my total mail 4.01% non-spam.
Most were rejected because of DNSRBL (6799). Then 1596 were rejected because they were sent by hosts which have been on a DNSRBL in the past five days.
Finally, the reason for this post: of the remaining mail, 740 senders sent a message which was never retried after being greylisted.
Since mail had already been rejected by the DNSRBLs before that test ran, it is unclear how effective greylisting would have been on its own, but it rejected 33% of all the mail that had not already been rejected (740 of 2215).
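The 33% figure can be checked against the table above. This is only a quick sketch, and it assumes the plugins ran in the order the table lists them, so that only the dnsbl and hosts_allow rejections happened before greylisting:

```python
# Figures copied from the stats table above.
total_mails = 10610
rejected_before_greylisting = {"dnsbl": 6799, "hosts_allow": 1596}
greylisted = 740

# Mail that survived the DNSBL-based checks and so reached greylisting.
# Assumption: the plugins run in the order the table lists them.
reached_greylisting = total_mails - sum(rejected_before_greylisting.values())
greylist_share = greylisted / reached_greylisting

print(f"{reached_greylisting} mails reached greylisting")
print(f"greylisting rejected {greylist_share:.1%} of them")
```

With these numbers 2215 mails reached the greylisting test and it rejected about a third of them.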
I guess that means it is worth keeping - however it is clear that greylisting alone does virtually nothing for my system since most of my spam is routed via my @debian.org mail address - which will happily retry until the greylisting succeeds. (via master.debian.org).
Comments on this Entry
I'm assuming you have talked about it in the past, I'm just curious what the exact combination is at the moment?
--
"It's Not Magic, It's Work"
Adam
[ Parent | Reply to this comment ]
[ Send Message | View Steve's Scratchpad | View Weblogs ]
All stats coming from a soon-to-be-launched SMTP proxy service, currently in use by myself and a few local people.
Internally it uses MySQL for saving per-domain configuration data (with a web-based control panel allowing different filtering types to be enabled/disabled on a per-domain basis and logs to be viewed).
At the SMTP side I use a combination of Exim4 + QPSMTPD, with the stock plugins completely replaced by a collection of custom plugins that do the detection/processing and logging.
[ Parent | Reply to this comment ]
I have a domain that only gets spam, with no known forwarding, which makes the maths easier.
Yesterday:
49 spam email delivered.
15400 Rejected connections.
In order (roughly).
34 HELO'ed with our own servers details.
14046 on SPAMHAUS zen list.
1105 Greylisted connections.
161 on ix.dnsbl.manitu.net
54 sender domain did not exist.
So Greylisting is still getting about 80% of the spam email exposed to it (1105 of the 1369 messages that got past the HELO and Zen checks). This is a lot less effective than the 99% I measured for Greylisting when I introduced it, but one needs to account for other changes in the processes used. For example Spamhaus now lists IP address space declared as dead by ISPs, which means it should catch a greater proportion of spambots.
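The arithmetic behind the 80% can be sketched as follows. It assumes the checks ran in exactly the order listed (the list itself only says "roughly"), so the result is indicative rather than exact:

```python
# Figures copied from the list above.
delivered = 49
rejections = [
    ("HELO with our own server details", 34),
    ("Spamhaus Zen", 14046),
    ("greylisting", 1105),
    ("ix.dnsbl.manitu.net", 161),
    ("sender domain did not exist", 54),
]

total = delivered + sum(count for _, count in rejections)


def share_of_exposed(name):
    """Fraction of the mail reaching check `name` that the check rejected.

    Assumption: checks are applied in list order, so each check only sees
    what the earlier ones let through.
    """
    remaining = total
    for check, count in rejections:
        if check == name:
            return count / remaining
        remaining -= count
    raise KeyError(name)


greylist_share = share_of_exposed("greylisting")
overall_reject_rate = 1 - delivered / total

print(f"greylisting rejected {greylist_share:.1%} of the mail exposed to it")
print(f"overall rejection rate: {overall_reject_rate:.1%}")
```

Under that ordering assumption greylisting comes out at just over 80%, and the day's overall rejection rate at roughly 99.7%.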
The domain is slightly unusual in that I accept all email for all addresses. Real domains are best configured without catch-alls, which usually kills another fair chunk.
I've no doubt Greylisting has been supplanted by Spamhaus Zen, as the most effective single measure, due to the advent of spambots that retry. Greylisting is still very effective where no forwarding occurs.
For my personal email I'm using the policyd-weight package, and I believe it substantially improves on the ~99.7% spam kill rate above, with (so far) no discovered false positives, but I've not documented the kill rate (I don't get that much rubbish email sent to my own email address).
The filtering above is still almost entirely devoid of content-based filtering, and so errors are all related to sender behaviour and reputation, and it won't block email that just happens to look like spam, or that talks about spam/abuse issues. On the other hand it doesn't stop spam forwarded by Debian's mail servers either, or outscatter.
Almost all my spam is forwarded from the GNU email servers, who claim they have just finished the roll-out of a new anti-spam system. A close second is a mail server run by a guy who works for Message Labs (but at least he has the excuse it is too much like his day job).
[ Parent | Reply to this comment ]
Does that mean Constant Contact constitute 7% of all spam sources on the Internet? Lies, damned lies, and statistics.
[ Parent | Reply to this comment ]
My most serious recent spam problem has been backscatter, i.e. someone uses my address as the from: in their spam, and I get the bounces (about 5000 in one weekend). Of course this has different characteristics from other spam and it's harder to filter; I'm currently rejecting mail 'from:<>' from IPs in backscatterer.org, but this includes sites like the sourceforge and gnu.org mail servers that do "call out" verification.....
Phil.
[ Parent | Reply to this comment ]
I tested the two I mention extensively, and have never seen a false positive with either (unlike Greylisting). In one place this covers many tens of millions of (mostly reject) decisions, with no complaints.
That said, I like Spamhaus because they have end-user removal, i.e. if a real person is blocked, they can click on a link and say "I'm a real person not a spambot", and so there is no need for people to bug me (postmaster) to get themselves removed.
I think a bigger issue is the big chunks of IP space, controlled by static IP address spammers, that are not in Spamhaus. They can be a tad conservative, which is I guess why they have so few false positives.
Backscatter/outscatter is discussed by Wietse on the Postfix site. If you control the sending hosts for all email for a domain you can try Domain keys (or [spits] SPF), but Wietse explains how you can simply tag outgoing mail, so that you can recognize a lot of junk that isn't a genuine bounce. Blocklists are useless against outscatter, since most of the sources are genuine email servers (or Barracuda boxes [spit]).
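To illustrate the tagging idea in general terms, here is a minimal sketch that signs the envelope sender of outgoing mail and later checks whether a bounce is addressed to a validly signed address. It is not taken from the Postfix documentation; it is closer in spirit to BATV, and the `tag=day=digest=user@domain` layout, the `SECRET` value and both function names are made up for this example:

```python
import hashlib
import hmac

SECRET = b"change-me"  # hypothetical site-wide secret, never published


def tag_sender(local, domain, day):
    """Return a tagged envelope sender, e.g. tag=100=ab12cd34=user@example.org.

    `day` is an integer day number (for instance days since the epoch).
    """
    mac = hmac.new(SECRET, f"{day}:{local}@{domain}".encode(), hashlib.sha256)
    return f"tag={day}={mac.hexdigest()[:8]}={local}@{domain}"


def is_genuine_bounce(recipient, today, max_age_days=7):
    """True if a bounce is addressed to a validly tagged, recent sender."""
    localpart, _, domain = recipient.rpartition("@")
    parts = localpart.split("=", 3)
    if len(parts) != 4 or parts[0] != "tag" or not parts[1].isdigit():
        return False  # not tagged at all, so not a bounce of our own mail
    day, local = int(parts[1]), parts[3]
    if not 0 <= today - day <= max_age_days:
        return False  # tag too old (or dated in the future)
    expected = tag_sender(local, domain, day)
    # Constant-time comparison of the whole tagged address.
    return hmac.compare_digest(expected.encode(), recipient.encode())
```

A bounce sent to the plain, untagged address fails the check, which is how forged-sender backscatter could be told apart from a genuine bounce. This only works if every outgoing message for the domain passes through hosts that apply the tag.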
"Call out" is bad karma, as it shifts the costs of forged senders onto the person whose address was forged, and he probably doesn't have the resources that the spammer does. Bad GNU (I've told them this several times). It also messes with greylisting, since the call out is declined first time, delaying email unnecessarily, but I think that is a minor point, since the host is usually whitelisted soon enough.
[ Parent | Reply to this comment ]
So you mean that you get false positives with greylisting, i.e. genuine messages where the sending MTA doesn't retry? Or do you just mean that the messages are delayed? If the former, I'm worried!
> If you control the sending hosts for all email for a domain
> you can try Domain keys (or [spits] SPF)
Ah, SPF, that's something else that you & Steve don't mention. I have published SPF records, and some of the *clueless* backscatter bounces were sent to me because the SPF had failed! *D'oh*
> Blocklists are useless against outscatter, since most of the sources are
> genuine email servers
Have a look at backscatterer.org if you start to find it a problem. You won't want to treat it like other blacklists though; only for from:<> messages.
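A rough sketch of that policy: a standard DNSBL lookup (reversed IPv4 octets prepended to the list's zone) that is consulted only when the envelope sender is empty. The zone name `ips.backscatterer.org` is my assumption and should be checked against the list's own documentation, and the function names are invented for the example:

```python
import socket


def dnsbl_query_name(ipv4, zone):
    """Build the DNSBL query name: reversed octets followed by the zone."""
    return ".".join(reversed(ipv4.split("."))) + "." + zone


def is_listed(ipv4, zone, resolve=socket.gethostbyname):
    """True if the DNSBL answers with an A record for the client IP."""
    try:
        resolve(dnsbl_query_name(ipv4, zone))
        return True
    except socket.gaierror:
        # No such name (or lookup failure): treat as not listed.
        return False


def reject_as_backscatter(envelope_sender, client_ip,
                          zone="ips.backscatterer.org",  # assumed zone name
                          resolve=socket.gethostbyname):
    """Consult the backscatter list only for null-sender (bounce) mail."""
    if envelope_sender not in ("", "<>"):
        return False  # ordinary mail is never judged against this list
    return is_listed(client_ip, zone, resolve)
```

The `resolve` parameter exists only so the logic can be exercised without touching the network. A lookup failure is treated as "not listed", so a DNS outage lets mail through rather than rejecting it.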
> (or Barracuda boxes [spit]).
Double spit.
> "Call out" is bad karma, as it shifts forged sender costs, onto the
> forged sender, and he probably doesn't have the resources that the
> spammer does.
Agreed, though wasting my bandwidth is orders of magnitude less bad than putting equivalent volumes of spam in my inbox. This leads to a problem with backscatterer.org: they treat sites that do callout the same as sites that send backscatter bounces, and I'd like to be able to be selective.
> Bad GNU (I've told them this several times).
And I've told sourceforge. One problem is that some sourceforge lists, like Debian lists, don't require subscription. So they do callout to keep the spam down. Although it can be useful to let anyone post messages without subscribing, I think the callout issue outweighs it; if you require subscription before posting then there's no need for callout.
However, I did find a solution that lets me post to sourceforge and gnu lists (based on something Google found in the Exim docs I think): my backscatterer blacklist rule is applied after DATA, not after RCPT; the callout verification sees a normal response to the RCPT command and is happy.
[ Parent | Reply to this comment ]
Some mail servers don't retry. Since I'd reject email from these anyway if the server was overwhelmed with spam (or even genuine email), I regard it as acceptable to reject email from such systems.
Some email is retried from servers in a different /24, thus failing Postgrey greylisting. I've not seen one of these in anger, but I did follow up on a report of such on the Postgrey mailing list, so they exist but are extremely rare.
SPF I don't mention as I don't reject on this basis. So much email is forwarded without envelope rewriting that rejecting on SPF fail/softfail would reject more genuine email than spam (after other filters have been applied); as such I regard SPF as broken by design.
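Why a retry from a different /24 fails can be seen in a toy greylist keyed on the client's /24 network plus sender and recipient. This is a simplified sketch of the general technique, not Postgrey's actual code; the class name, the 300 second delay and the IPv4-only key are assumptions made for the example:

```python
import ipaddress


class Greylist:
    """Toy greylist keyed on (client /24, sender, recipient)."""

    def __init__(self, delay=300):
        self.delay = delay    # seconds a new triplet must wait (illustrative)
        self.first_seen = {}  # triplet -> time of first delivery attempt

    @staticmethod
    def key(client_ip, sender, recipient):
        # Collapse the client address to its /24 so that retries from a
        # neighbouring host in the same network still match. IPv4 only.
        net = ipaddress.ip_network(f"{client_ip}/24", strict=False)
        return (str(net.network_address), sender.lower(), recipient.lower())

    def check(self, client_ip, sender, recipient, now):
        """Return "defer" for a new or too-recent triplet, else "accept"."""
        k = self.key(client_ip, sender, recipient)
        first = self.first_seen.setdefault(k, now)
        return "accept" if now - first >= self.delay else "defer"


g = Greylist()
print(g.check("192.0.2.10", "a@example.org", "b@example.net", now=0))
print(g.check("192.0.2.99", "a@example.org", "b@example.net", now=600))
print(g.check("198.51.100.5", "a@example.org", "b@example.net", now=600))
```

The second attempt comes from the same /24 and is accepted; the third comes from a different /24, so it looks like a brand new triplet and is deferred again.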
Note I regard almost all spam filtering as a tactical fix, the real problem (in most cases) is the large number of compromised nodes on the Internet.
[ Parent | Reply to this comment ]
[ Send Message | View Utumno's Scratchpad | View Weblogs ]
You guys talk from a mail administrator standpoint; let me add my $0.02 from a user's standpoint.
I have a shell account at a friend's FreeBSD server and have been receiving all my private mail there for the last 12 years. He provides a simple spamassassin+procmail setup. Here's a snippet from my .procmailrc:

    (...)
    # deal with spam:
    # SPAM_LEVEL <= 4.0 --> good
    # 4.0 < SPAM_LEVEL <= 15.0 --> possible spam
    # 15.0 < SPAM_LEVEL --> /dev/null

    # this bastard just wont remove me from his mailing list
    :0:
    * ^From.*pgr_art*
    /dev/null

    :0fw:spamassassin.lock
    | /usr/local/bin/spamc -U /var/run/spamd

    # delete all mail with SPAM STATUS >= 15
    :0
    * ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*
    /dev/null

    # move all mail above SPAM threshold to the SPAM folder
    :0:
    * ^X-Spam-Status: Yes
    $SPAM

First, I've been testing this for 2 years without the /dev/null option. During those two years, I got 3 or 4 false positives, but all of them scored well below the 15 point threshold. As I receive about 100 spams a day, and cleaning the spam/ folder was a big nuisance, I decided to just send everything above 15 penalty points straight to /dev/null.
I don't keep exact statistics, but looking at my procmail.log reveals that about 80% of the mail goes to /dev/null, the next 15% to spam/, while about 5% goes to Inbox.
Even with that, 1-2 spams/day manage to break into my Inbox.
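The three-way split those last two recipes produce can be sketched as below. It only mirrors the header tests shown in the .procmailrc (fifteen or more leading stars in X-Spam-Level, then X-Spam-Status starting with "Yes"); the function names and the dict of headers are made up for illustration:

```python
def star_count(headers):
    """Count the leading asterisks in X-Spam-Level (0 if the header is absent)."""
    level = headers.get("X-Spam-Level", "")
    return len(level) - len(level.lstrip("*"))


def destination(headers, discard_stars=15):
    """Mirror the recipe order: discard first, then the spam folder, else Inbox."""
    if star_count(headers) >= discard_stars:
        return "/dev/null"
    if headers.get("X-Spam-Status", "").startswith("Yes"):
        return "spam/"
    return "Inbox"


print(destination({"X-Spam-Level": "*" * 16, "X-Spam-Status": "Yes, score=16.2"}))
print(destination({"X-Spam-Level": "*" * 6, "X-Spam-Status": "Yes, score=6.4"}))
print(destination({"X-Spam-Level": "*", "X-Spam-Status": "No, score=1.1"}))
```

The order matters in the same way it does in procmail: the discard test has to come before the spam-folder test, otherwise high-scoring mail would land in spam/ and never reach /dev/null.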
[ Parent | Reply to this comment ]