Why Can't Google Catch Easy Spam?

Google: I don't speak Chinese! When I get e-mail that's entirely Chinese characters, it's spam, OK? What could be easier than that?


Google's Gmail spam filter is legendary and considered one of the best in the business. And it's true, for the most part. I currently use Gmail's spam filtering both for Gmail and for POP3 mail, which I "launder" through Gmail in order to take advantage of the great spam filtering.

One of the amazing features of Gmail spam filtering is its low percentage of "false positives" -- legitimate e-mail that the system wrongly identifies as spam.

Another important fact is that Gmail’s spam filter gets better over time. Its ability to catch spam keeps going up, and false positives go down.

I currently have 4776 messages in my “Spam” folder (can anyone top that?), and get about 10 or so spam messages in my inbox every day. As a ratio, you really can’t beat that.


Gmail's spam filtering really is great, and I shouldn't complain. But, hey, it's my job.

I'm perplexed about Gmail's poor handling of two specific kinds of easy-to-spot spam: 1) Nigerian 419 e-mail, and 2) foreign-language spam.

I get Nigerian scam e-mails in my inbox every day. They somehow evade the Gmail spam filters. Sometimes I identify them using the "Report Spam" button, and then I get an identical one later in the day.

These e-mails seem to me to be trivially easy to identify. They tend to use oddly out-of-style language and unusual formality combined with grammatical errors. Just flag any message sent from Nigeria or surrounding countries that contains any five of the following words or phrases: "pray," "faith," "proposal," "introduce myself," "widow," "fund," "transfer," "expenses," "bank," "urgently," "the late Mr.," "need your assistance," "deposited the sum of," "asking for your kind assistance" -- that sort of thing.
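To make that rule concrete, here is a minimal Python sketch of the "any five phrases" test. It is only an illustration of the idea in this column, not how Gmail actually works. The function names, the `origin_country` argument (a two-letter country code that some upstream step would have to supply) and the `SUSPECT_ORIGINS` set are all assumptions made for the example.

```python
import re

# Words and phrases from the list above, lowercased for matching.
SCAM_PHRASES = [
    "pray", "faith", "proposal", "introduce myself", "widow", "fund",
    "transfer", "expenses", "bank", "urgently", "the late mr.",
    "need your assistance", "deposited the sum of",
    "asking for your kind assistance",
]

# Hypothetical set of country codes: Nigeria plus its land neighbors
# (Benin, Niger, Chad, Cameroon). How the origin is determined is
# outside this sketch.
SUSPECT_ORIGINS = {"NG", "BJ", "NE", "TD", "CM"}


def count_scam_phrases(body):
    """Count how many distinct listed phrases appear in the message body."""
    text = body.lower()
    hits = 0
    for phrase in SCAM_PHRASES:
        # \b anchors the start of the phrase at a word boundary, so "bank"
        # matches "bank" and "banking" but not "embankment".
        if re.search(r"\b" + re.escape(phrase), text):
            hits += 1
    return hits


def looks_like_419(body, origin_country, threshold=5):
    """Flag a message only if it comes from a suspect origin AND contains
    at least `threshold` distinct phrases from the list."""
    if origin_country not in SUSPECT_ORIGINS:
        return False
    return count_scam_phrases(body) >= threshold


sample = (
    "Dear friend, let me introduce myself. I am the widow of the late "
    "Mr. Abacha. I need your assistance to transfer a fund deposited "
    "in a bank."
)
print(looks_like_419(sample, "NG"))
print(looks_like_419("Lunch at noon? I'll transfer the file.", "NG"))
```

Requiring both the origin and five distinct phrases keeps an ordinary note that happens to mention a bank transfer from being flagged. A real filter would need far more care than this.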

Given the fantastic job Gmail does of catching other kinds of spam, how hard can it be to stop e-mail with these obvious clues?

Even more confusing is why foreign-language spam isn't flagged. I've been using Gmail for two years, and I have flagged as spam every single e-mail I've received that's entirely in Mandarin or Russian.

Google: I don't speak Chinese! When I get e-mail that's entirely Chinese characters, it's spam, OK? What could be easier than that?

I understand that Google has to be careful. It can’t summarily dismiss notes about widows in Nigeria, or Chinese- or Russian-language e-mail. Some percentage of users actually get legitimate e-mail that falls into these categories. But Gmail should be capable of individual, user-added criteria, such as aggressive Nigerian-scam filtering, and maybe an option that tells Gmail to “consider all mail in this language as spam.”
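A per-user language option could be sketched roughly like this in Python. This is a simplification under stated assumptions: it uses the writing system as a stand-in for the language (Han characters also appear in Japanese, and Cyrillic covers more than Russian), and the names `SCRIPT_RANGES`, `script_fraction`, `blocked_by_user_setting` and the 0.9 threshold are invented for the example.

```python
# Unicode code point ranges used as a rough proxy for a language.
SCRIPT_RANGES = {
    "han": [(0x4E00, 0x9FFF), (0x3400, 0x4DBF)],  # CJK Unified Ideographs + Extension A
    "cyrillic": [(0x0400, 0x04FF)],               # basic Cyrillic block
}


def script_fraction(text, script):
    """Return the share of letters in `text` that belong to `script`.
    Digits, punctuation and whitespace are ignored."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    ranges = SCRIPT_RANGES[script]
    in_script = sum(
        1 for ch in letters
        if any(lo <= ord(ch) <= hi for lo, hi in ranges)
    )
    return in_script / len(letters)


def blocked_by_user_setting(text, blocked_scripts, threshold=0.9):
    """True if the message is written almost entirely in a script
    this particular user has chosen to treat as spam."""
    return any(
        script_fraction(text, script) >= threshold
        for script in blocked_scripts
    )


my_blocked = {"han", "cyrillic"}  # one user's hypothetical preference
print(blocked_by_user_setting("你好，这是一个测试", my_blocked))
print(blocked_by_user_setting("Hello 你好", my_blocked))
```

Because the setting is per user, someone who does correspond in Chinese or Russian would simply leave it off, which addresses the concern about legitimate mail in those languages.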

What about you? Does apparently easy-to-spot spam make it into your Gmail inbox? Tell me about it: mike.elgan@gmail.com




Tags: Google, spam, IT, e-Mail

