Tuesday, June 25, 2024

Why Can’t Google Catch Easy Spam?

Datamation content and product recommendations are editorially independent. We may make money when you click on links to our partners. Learn More.


Google’s Gmail spam filter is legendary, and it is considered one of the best in the business. For the most part, the reputation is deserved. I currently use Gmail’s spam filtering for my Gmail account and also for POP3 mail, which I “launder” through Gmail to take advantage of the great spam filtering.

One of the amazing features of Gmail spam filtering is its low percentage of “false positives,” meaning legitimate e-mail that the system wrongly identifies as spam.

Another important fact is that Gmail’s spam filter gets better over time. Its ability to catch spam keeps going up, and false positives go down.

I currently have 4,776 messages in my “Spam” folder (can anyone top that?), and get about 10 spam messages in my inbox every day. As a ratio, you really can’t beat that.

Gmail’s spam filtering really is great, and I shouldn’t complain. But, hey, it’s my job.

I’m perplexed about Gmail’s poor handling of two specific kinds of easy-to-spot spam: 1) Nigerian 419 e-mail, and 2) foreign-language spam.

I get Nigerian scam e-mails in my inbox every day. They somehow evade the Gmail spam filters. Sometimes I identify them using the “Report Spam” button, and then I get an identical one later in the day.

These e-mails seem to me to be trivially easy to identify. They tend to use oddly out-of-style language, and unusual formality combined with grammatical errors. Just flag any message sent from Nigeria or surrounding countries that contains any five of the following words or phrases: “pray,” “faith,” “proposal,” “introduce myself,” “widow,” “fund,” “transfer,” “expenses,” “bank,” “urgently,” “the late Mr. ,” “need your assistance,” “deposited the sum of,” “asking for your kind assistance,” and that sort of thing.
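
For the curious, here is roughly what that rule might look like in Python. This is only a sketch of the heuristic described above, not how Gmail actually works. The function names are made up for illustration, and the country of origin is assumed to have been worked out elsewhere and passed in as an ISO country code. Which "surrounding countries" belong on the list is left to the caller; only Nigeria ("NG") is included by default.

```python
# Phrases from the list above, lowercased for matching.
SCAM_PHRASES = [
    "pray", "faith", "proposal", "introduce myself", "widow", "fund",
    "transfer", "expenses", "bank", "urgently", "the late mr.",
    "need your assistance", "deposited the sum of",
    "asking for your kind assistance",
]


def matched_phrases(body, phrases=SCAM_PHRASES):
    """Return the listed phrases that occur anywhere in the message body."""
    # Lowercase and collapse whitespace so line breaks don't split a phrase.
    text = " ".join(body.lower().split())
    # Plain substring matching: "fund" also matches "funds" and "refund".
    return [p for p in phrases if p in text]


def looks_like_419(body, origin_country,
                   suspect_countries=frozenset({"NG"}), threshold=5):
    """Flag a message from a suspect country that hits `threshold` phrases.

    `origin_country` and `suspect_countries` are hypothetical inputs;
    the sketch assumes the caller already knows where the mail came from.
    """
    if origin_country not in suspect_countries:
        return False
    return len(matched_phrases(body)) >= threshold


if __name__ == "__main__":
    sample = ("Dear friend, I pray you will permit me to introduce myself. "
              "I am the widow of the late Mr. Smith, who deposited the sum "
              "of $10 million in a bank. I urgently need your assistance "
              "to transfer the fund.")
    print(looks_like_419(sample, "NG"))
```

Substring matching is crude, and a real filter would weigh far more signals than this, but it shows how little it takes to express the rule.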

Given the fantastic job Gmail does of catching other kinds of spam, how hard can it be to stop e-mail with these obvious clues?

Even more confusing is why foreign-language spam isn’t flagged. I’ve been using Gmail for two years, and I have flagged as spam every single e-mail I’ve received that’s entirely in Mandarin or Russian.

Google: I don’t speak Chinese! When I get e-mail that’s entirely Chinese characters, it’s spam, OK? What could be easier than that?

I understand that Google has to be careful. It can’t summarily dismiss notes about widows in Nigeria, or Chinese- or Russian-language e-mail. Some percentage of users actually get legitimate e-mail that falls into these categories. But Gmail should be capable of individual, user-added criteria, such as aggressive Nigerian-scam filtering, and maybe an option that tells Gmail to “consider all mail in this language as spam.”
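
A per-user option like that could be sketched in a few lines of Python. Again, this is an illustration under stated assumptions, not a description of anything Gmail offers. The function names and the 90 percent cutoff are invented here, and the check looks at the writing system (Chinese characters or Cyrillic letters) rather than the language itself, which is a simplification.

```python
# Unicode code point ranges for two writing systems.
SCRIPT_RANGES = {
    "han": (0x4E00, 0x9FFF),       # CJK Unified Ideographs block
    "cyrillic": (0x0400, 0x04FF),  # Cyrillic block
}


def script_share(text, script):
    """Fraction of the letters in `text` that belong to `script`."""
    low, high = SCRIPT_RANGES[script]
    # Ignore digits, punctuation and whitespace; count letters only.
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(low <= ord(ch) <= high for ch in letters) / len(letters)


def blocked_by_user_rule(text, blocked_scripts, min_share=0.9):
    """True if the message is almost entirely in a script the user blocked.

    `blocked_scripts` stands in for a hypothetical per-user setting;
    `min_share` is an arbitrary cutoff chosen for this sketch.
    """
    return any(script_share(text, s) >= min_share for s in blocked_scripts)


if __name__ == "__main__":
    my_blocked = {"han", "cyrillic"}
    print(blocked_by_user_rule("你好，这是一个很好的机会", my_blocked))
    print(blocked_by_user_rule("Hello, see you at lunch", my_blocked))
```

Because the setting belongs to one user, someone who does get legitimate Chinese- or Russian-language mail simply leaves the list empty and nothing changes for them.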

What about you? Does apparently easy-to-spot spam make it into your Gmail inbox? Tell me about it: mike.elgan@gmail.com
