Gmail filters out an additional 100 million spam messages every day thanks to AI

Published on 11/02/2019, updated on 17/08/2021

This article is a summarized translation of a post published on the Google Cloud blog by Neil Kumaran. We draw some conclusions after the translation.

Original title: Spam does not bring us joy - ridding Gmail of 100 million more spam messages with TensorFlow

1.5 billion people use Gmail every month, and 5 million businesses use Gmail via G Suite. For consumers and businesses alike, a large part of Gmail's appeal lies in its built-in security. Good security means always staying ahead of threats, and our current Machine Learning models are highly effective. In combination with our other protections, they prevent more than 99.9% of spam, phishing and malware from reaching Gmail inboxes.

Recently, we implemented new protections using TensorFlow, an open-source machine learning framework developed by Google. These new protections complement the existing ones, whether Machine Learning or rules-based. With TensorFlow, around 100 million additional spam messages are blocked every day.

Where were these 100 million additional spam messages found? Mainly in spam categories that are very difficult to detect. Using TensorFlow, we were able to block image-based messages, emails with hidden content and messages from newly created domains that try to hide a low volume of spam within legitimate traffic.

Given that Gmail already blocks the vast majority of spam, blocking millions more with precision is a feat. TensorFlow makes it possible to block part of the remaining 0.1% without accidentally blocking messages that are important to users.

One person's spam is another person's treasure

Machine Learning makes it possible to catch spam by identifying patterns in large datasets that the humans who write the rules would not be able to detect. It also enables more granular decisions based on many parameters. Consider that each email contains several thousand potential signals. Just because an email contains signals commonly associated with spam doesn't necessarily mean that the message is spam. Machine Learning allows us to weigh all these signals together in order to make a decision. Finally, it also helps us to tailor our spam protection to each individual user. What one person considers spam, another may consider an important message (think of newsletters or regular application notifications).
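To make the idea of weighing signals together more concrete, here is a minimal sketch in Python. It is not Gmail's model (which is not described in the post beyond the use of TensorFlow): the signal names, the weights, the bias and the per-user adjustment are all hypothetical values invented for illustration. It only shows how several signals can be combined into a single score, and how a per-user adjustment can shift the decision for the same message.

```python
import math

# Hypothetical signal weights, invented for illustration only.
# Positive weights push toward "spam", negative weights toward "not spam".
GLOBAL_WEIGHTS = {
    "new_sender_domain": 1.2,
    "image_only_body": 0.9,
    "hidden_text": 1.5,
    "sender_in_contacts": -2.0,
}
BIAS = -1.0  # hypothetical baseline: most mail is assumed legitimate


def spam_probability(signals, user_weights=None):
    """Combine all signals into one probability with a logistic function.

    `user_weights` is an assumed per-recipient adjustment that is added
    to the global weight of each signal.
    """
    user_weights = user_weights or {}
    score = BIAS
    for name, value in signals.items():
        weight = GLOBAL_WEIGHTS.get(name, 0.0) + user_weights.get(name, 0.0)
        score += weight * value
    return 1.0 / (1.0 + math.exp(-score))


# One "spammy" signal on its own is not enough to cross the 0.5 threshold.
p_single = spam_probability({"image_only_body": 1.0})

# Two signals together push the same kind of message above the threshold.
newsletter = {"image_only_body": 1.0, "new_sender_domain": 1.0}
p_default = spam_probability(newsletter)

# A recipient who regularly reads image-heavy newsletters gets a
# (hypothetical) negative adjustment, so the same message scores lower.
engaged_user = {"image_only_body": -1.5}
p_engaged = spam_probability(newsletter, engaged_user)
```

With these made-up numbers, the same newsletter is scored as likely spam for a default recipient and as likely legitimate for the engaged one, which is the "one person's spam is another person's treasure" effect in miniature.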

Some comments "Made in Badsender"

As always when we use email for marketing purposes (as you and we probably do), an advance in spam filter intelligence raises the question of its impact on our campaigns. And it is bound to have one. As Google says, one person's spam can be another person's treasure. The same applies to the perceived legitimacy of a message, which varies greatly from one person to another.

What is very clear, and very much in the air in the current anti-spam battle, is that all marketing messages, and I do mean ALL of them, can potentially be considered spam. Even if you have collected consent, even if your messages have been opened, even if you have exemplary database hygiene.

From this announcement, the part about detecting image-based messages, emails using hidden content or messages coming from "fresh" domains is not necessarily the key takeaway. These practices are clearly reserved for spammers, so it's best to stay as far away from them as possible.

The most important thing to remember is the notion of personalized anti-spam protection. In the past, positive signals (opens, clicks, replies, forwards, filing into folders) and negative signals (spam complaints, deletes before reading, bounces, etc.) affected your reputation as a sender. Now Gmail seems to be adding a personalized reputation for each recipient.

What does this say about the evolution of best practices? All the best practices of the past probably remain valid, but your engagement efforts need to be ever more personalized. It's more important than ever to adjust your marketing pressure and to tailor your content to the individual you are addressing.

The author

Jonathan Loriaux

I've been involved in email marketing for over ten years, and my career path began on the technical side (email campaign integration) before moving on to sales (as an eCRM expert) and finally marketing consultancy. For the past 9 years, I've been the author of the Badsender.com blog. Emailing isn't just an expertise, it's truly become my passion, which is why Badsender is now my main activity, with the creation of an emailing consulting business linked to the site.
