It is not easy to find an analogy when talking about spam filters. The best one that came to mind is a company's recruitment process. You know, those recruitments that sometimes take several months, with many steps, psychological tests, interviews with management, ... It's much the same for an email. To reach its destination and be "hired" by the recipient's mailbox, it must pass many interviews!
The different filtering techniques
As in recruitment, there is no standard procedure for filtering an email. Each recruiter can apply their own methods and especially combine several different ones. Sometimes, a simple meeting or a CV reading will be enough to get you hired. In other cases, you will have to undergo psychological tests, provide numerous documents and meet with half a dozen managers before your application is rejected.
Challenge-response
When you send your CV to a company, you expect a response from them, either a refusal from the outset, or an appointment proposal to which you must respond. The simple fact of responding quickly to this meeting proposal will prove that you are a motivated candidate.
The "Challenge-response" technique in emailing is much the same: if the server you are trying to send an email to does not know you, it will send you a "challenge". A challenge is a message asking for a manual action (for example clicking a link or replying) that serves as proof of your good faith. Once you have done it, your email is delivered and you will not have to repeat the action in the future.
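To make the mechanism concrete, here is a minimal sketch of a challenge-response gate. It is only an illustration under my own assumptions: the class name `ChallengeResponseGate`, the in-memory storage and the random token are all hypothetical, and a real system would send the token back to the sender by email instead of returning it.

```python
import secrets


class ChallengeResponseGate:
    """Hold mail from unknown senders until they answer a challenge."""

    def __init__(self):
        self.verified = set()  # senders who already passed a challenge
        self.pending = {}      # token -> (sender, list of held messages)
        self.inbox = []        # messages actually delivered

    def receive(self, sender, message):
        """Deliver mail from verified senders; challenge everyone else."""
        if sender in self.verified:
            self.inbox.append(message)
            return None
        # Reuse the open challenge if this sender already has one.
        for token, (held_sender, held) in self.pending.items():
            if held_sender == sender:
                held.append(message)
                return token
        token = secrets.token_urlsafe(16)
        self.pending[token] = (sender, [message])
        return token  # a real system would mail this challenge to the sender

    def confirm(self, token):
        """Called when the sender performs the requested manual action."""
        if token not in self.pending:
            return False
        sender, held = self.pending.pop(token)
        self.verified.add(sender)   # no challenge next time
        self.inbox.extend(held)     # release everything that was held
        return True
```

The first message from an unknown sender stays on hold. After `confirm`, it is released, and later messages from the same sender go straight to the inbox.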
Content filters
When a recruiting service receives your resume... they read it. But be careful, if there are too many CVs, it is very likely that the first reading is only superficial. Is the CV well presented, does it include the classic sections, is it free of spelling mistakes, ...
For spam filters that analyze the content of emails, the process is similar:
- Is your email spelled correctly (and not disguising a word through some subterfuge, such as writing "V14gr4")?
- Do the links point to respectable sites?
- Is it free of groups of words that are "listed" as typical of spam?
- Is the text-to-image ratio reasonable?
- ...
However, the disadvantage is the number of "false positives" that this type of filter can cause. Just because a candidate is not good at spelling does not mean that he or she will not turn out to be a brilliant graphic designer.
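The checks listed above can be sketched as a small scoring function. Everything here is an assumption made for illustration: the phrase list, the domain list, the weights and the "20 words per image" limit are invented, and real content filters use far larger and better-tuned rule sets.

```python
import re

LISTED_PHRASES = ("act now", "100% free")  # hypothetical phrase list
SUSPICIOUS_DOMAINS = {"bad.example"}        # hypothetical domain list


def content_score(text, link_domains, image_count):
    """Return a spam score: the higher it is, the more suspicious the email."""
    score = 0
    lowered = text.lower()
    words = re.findall(r"[a-z0-9]+", lowered)

    # Digits wedged between letters, as in "V14gr4".
    score += 2 * sum(1 for w in words if re.search(r"[a-z]\d+[a-z]", w))

    # Links pointing to domains with a bad name.
    score += 3 * sum(1 for d in link_domains if d.lower() in SUSPICIOUS_DOMAINS)

    # Groups of words that appear on the list.
    score += 2 * sum(1 for p in LISTED_PHRASES if p in lowered)

    # Very little text compared with the number of images.
    if image_count > 0 and len(words) / image_count < 20:
        score += 2

    return score
```

Note that a harmless word such as "b2b" also matches the digit rule, which is exactly the kind of false positive described above.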
Blacklists
It's hard to find an equivalent in the recruitment world. In any case, if such a service exists, I have never heard of it. But let's pretend it does. A blacklist in the job search world would mean that a recruiter could compare the applications received with a list published by another company or organization whose objective is to record all the bad employees... that's pretty scary!
In emailing, such lists exist. In fact, there are many of them! These lists record IP addresses or domain names that send spam. They allow system administrators to preventively block all emails coming from a source that appears on them.
Each blacklist has its own criteria: some rely on spontaneous complaints from users, others set up "spamtraps" (mailboxes created to attract spammers), ...
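Many IP blacklists are published through DNS: the receiving server reverses the four parts of the sender's IP address, appends the name of the list and does an ordinary DNS lookup. If the name exists, the address is listed. Here is a minimal sketch of that lookup, assuming an IPv4 address and a list zone called `dnsbl.example`, which is a placeholder and not a real list. The `resolve` parameter is my own addition so the function can be tried without network access.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up, e.g. 192.0.2.10 -> 10.2.0.192.<zone>."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the blacklist zone has an entry for this IP address."""
    try:
        resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # Simplification: any lookup failure is treated as "not listed".
        return False
    return True
```

A production check would also distinguish a timeout from a genuine "no such name" answer, which this sketch does not.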
Greylisting
To test your motivation, a recruiter can leave you hanging! Several days (or weeks) after sending your CV, you have not received any answer, neither positive nor negative. Since it's your dream job, you pick up your phone to check if your CV has arrived at its destination!
Greylisting is a method that assumes a spammer will usually not make a second attempt to send an email if the first one failed (it would cost too much in resources). When the email server does not know a sender, it therefore systematically answers with a temporary error. A legitimate sending server will normally try again some time later, which shows the receiving server that the sender is probably not a spammer.
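Here is a minimal sketch of that logic. I am assuming that the server recognizes a sender by the triplet (client IP, sender address, recipient address) and that a retry is accepted after five minutes; both are common choices but not the only ones. The reply codes are standard SMTP codes: 451 for a temporary failure, 250 for acceptance.

```python
class Greylist:
    """Temporarily reject unknown senders and accept them when they retry."""

    def __init__(self, min_delay=300):
        self.min_delay = min_delay  # seconds to wait before a retry is accepted
        self.first_seen = {}        # triplet -> time of the first attempt

    def check(self, client_ip, sender, recipient, now):
        """Return the SMTP reply code for this delivery attempt."""
        key = (client_ip, sender, recipient)
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return 250  # retried after the delay: accept
        return 451      # first attempt, or retry too soon: try again later
```

The time is passed in as `now` (in seconds) rather than read from the clock, which keeps the example easy to follow: an attempt at time 0 gets 451, and the same triplet retried at time 400 gets 250.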
Reputation
Here we touch on a major trend in recruitment practices. How does a recruiter check your "reputation"? A quick Google search and another on Facebook. And then your six-month prison sentence for theft, or the parties where you drank a little too much, resurface. It's hard to erase this kind of trace.
The email world also loves to check your reputation. However, this is done in a more automated way. Email reputation consists of one or more scores that evolve over the course of your email campaigns. The evolution of these scores depends on the reactions of the recipients of your emails, on the content of these emails, ... An important part of email reputation is identification, that is, what the score is attached to: either your sending domain name (and the domain names of your links) or your IP address.
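As a rough illustration, a reputation score can be pictured as a number attached to an identity (a domain name or an IP address) that moves with each recipient reaction. The scale from 0 to 100, the starting value and the weights below are invented for the example; real receivers do not publish how they compute their scores.

```python
from collections import defaultdict

# Hypothetical weights: how much each recipient reaction moves the score.
EVENT_WEIGHTS = {"open": 1, "bounce": -5, "complaint": -10}


class ReputationTracker:
    """Keep one score per sending identity (domain name or IP address)."""

    def __init__(self, start=50):
        self.scores = defaultdict(lambda: start)

    def record(self, identity, event):
        """Apply one recipient reaction and return the new score (0 to 100)."""
        updated = self.scores[identity] + EVENT_WEIGHTS[event]
        self.scores[identity] = max(0, min(100, updated))
        return self.scores[identity]
```

The point of the sketch is that the domain name and the IP address each carry their own score, so a bad history on one does not disappear by changing the other.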
Bayesian filters
The Bayesian filter in recruitment is the recruiter's experience! Over the years, recruiters have learned to recognize a quality CV at a glance. Where it used to take several hours to sort through a hundred resumes, today it only takes half that time.
Bayesian filters also use experience to filter emails! The more spam the user reports, the better the filter learns which words or groups of words should be considered spam. It is therefore a filter based exclusively on probability calculations. Bayesian filters work with a threshold on the computed spam probability: below this threshold an email is not considered spam, above it the email is filtered.
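For the curious, here is a minimal sketch of such a filter, written as a naive Bayes classifier over single words. The class name, the add-one smoothing and the default threshold of 0.9 are my own assumptions; real filters use more refined word handling and a much larger training set.

```python
import math
from collections import Counter


class BayesFilter:
    """Tiny naive Bayes spam filter trained on user-labelled messages."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.words = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Learn from one message labelled "spam" or "ham" (not spam)."""
        self.words[label].update(text.lower().split())
        self.messages[label] += 1

    def spam_probability(self, text):
        """Estimate the probability that the text is spam."""
        vocab = len(set(self.words["spam"]) | set(self.words["ham"]))
        # Start from how often spam and ham were seen during training.
        log_odds = math.log((self.messages["spam"] + 1) / (self.messages["ham"] + 1))
        for word in text.lower().split():
            for label, sign in (("spam", 1), ("ham", -1)):
                total = sum(self.words[label].values())
                # Add-one smoothing so unseen words never give a zero probability.
                likelihood = (self.words[label][word] + 1) / (total + vocab + 1)
                log_odds += sign * math.log(likelihood)
        return 1 / (1 + math.exp(-log_odds))

    def is_spam(self, text):
        """Apply the threshold described above."""
        return self.spam_probability(text) >= self.threshold
```

Each call to `train` plays the role of the user reporting (or not reporting) a message, and `is_spam` is where the threshold comes in.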
Verification of compliance with the SMTP standard
Ok, here, it's going to be complicated to make the parallel with a resume, a cover letter or a recruitment process... so we'll skip it 😉
The SMTP protocol is the set of technical rules that describe how an email should be sent (and received). Not really the kind of text you want to have on your bedside table. And although it is a very precise document, not everyone applies its rules to the letter. For this reason, some receivers do not hesitate to filter emails according to the quality of the SMTP dialogue. But we'll leave it at that: I'm not a computer network specialist, and it would take many pages to explain what makes a quality SMTP dialogue.
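Without going into those many pages, here is a tiny sketch of one thing such a check can look at: the order of the commands. In SMTP, a client greets the server (HELO or EHLO), then announces the sender (MAIL FROM), then the recipients (RCPT TO), and only then sends the message (DATA). The function name and the wording of the problems are hypothetical, and a real filter looks at much more than this.

```python
def dialogue_problems(commands):
    """Return a list of ordering problems found in a client's SMTP commands."""
    problems = []
    greeted = has_sender = has_recipient = False
    for line in commands:
        cmd = line.strip().upper()
        if cmd.startswith(("HELO", "EHLO")):
            greeted = True
            if len(cmd.split()) < 2:
                problems.append("greeting without a host name")
        elif cmd.startswith("MAIL FROM:"):
            if not greeted:
                problems.append("MAIL before HELO/EHLO")
            has_sender = True
        elif cmd.startswith("RCPT TO:"):
            if not has_sender:
                problems.append("RCPT before MAIL")
            has_recipient = True
        elif cmd == "DATA":
            if not has_recipient:
                problems.append("DATA before RCPT")
    return problems
```

A well-behaved client produces an empty list. A client that skips the greeting or rushes to DATA collects problems, which a receiver could count against it.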
And other purely technical methods
There are still many other filtering techniques, such as Nolisting, which consists of listing several email servers in DNS records, some of which (typically the highest-priority one) deliberately don't work. Or "Reverse DNS" verification, which consists in checking the domain name associated with an IP address.
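A reverse DNS check can be sketched in a few lines. The version below goes one step further than a plain lookup: it finds the host name for the IP address, then checks that this host name points back to the same address. That extra confirmation step is my assumption about how a careful receiver would do it, and the two resolver parameters exist only so the function can be tried without network access.

```python
import socket


def reverse_dns_matches(ip, reverse=socket.gethostbyaddr,
                        forward=socket.gethostbyname_ex):
    """Check that the IP has a host name and that the name maps back to it."""
    try:
        hostname = reverse(ip)[0]        # IP address -> host name
        addresses = forward(hostname)[2] # host name -> list of IPv4 addresses
    except (socket.herror, socket.gaierror):
        return False  # no reverse record, or the name does not resolve
    return ip in addresses
```

A sending server with no reverse record, or with a name that points somewhere else, fails the check. As written, the sketch only handles IPv4 addresses.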
A combination of different filters
You think it's complex? So do I! But the worst is yet to come. None of these filtering technologies is used alone. Each ISP, each webmail, each company uses a mix of all these techniques, with different settings, different tolerance thresholds, different implementations, ...
This is why deliverability is more of a balancing act than an exact science. It can be said that where a spammer tries to bypass the filters, a legitimate sender must try to meet all their criteria (but that will be the subject of another article).
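To picture what "a mix with different settings" means, here is a small sketch in which each filter simply says whether it was triggered, and each receiver applies its own weights and its own threshold. The filter names, weights and thresholds are all invented for the example.

```python
def combined_verdict(signals, weights, threshold):
    """Add up the weights of the filters that fired and compare to a threshold.

    signals: filter name -> True if that filter flagged the email
    weights: filter name -> how much this receiver trusts that filter
    Returns (total score, True if the email is treated as spam).
    """
    total = sum(weights[name] for name, fired in signals.items() if fired)
    return total, total >= threshold


# The same email, seen by two receivers with different settings.
email_signals = {"blacklist": False, "content": True, "reputation": True}
strict_receiver = {"blacklist": 5, "content": 3, "reputation": 3}
lenient_receiver = {"blacklist": 5, "content": 1, "reputation": 2}
```

With a threshold of 5, the strict receiver scores this email at 6 and filters it, while the lenient one scores it at 3 and delivers it. Same email, two different outcomes.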
Where are the filters located?
It's all well and good to talk about filters for hours, but you still need to understand where they are. Some quick answers.
User-side filters
There are two types of filters on the user side:
- Spam filter of the email client: as in Outlook, Apple Mail or Thunderbird, these are mainly Bayesian filters. They work on the basis of the user's previous spam rankings.
- User-created filtering rules: certainly the most difficult filter for email senders to counteract. If the user has decided to filter all emails coming from your domain name in order to put them in their trash, it will be almost impossible to escape.
Server side filters (webmails, ISPs and enterprises)
- In-house filters: some organizations have decided to create their own email filtering systems. In this case, very little information is available about the rules they use.
- Use of off-the-shelf technologies: whether commercial or free (such as the famous SpamAssassin), these technologies are used in combination with others to create a tailor-made anti-spam solution.
- Appliances: mainly used in companies (but not only), appliances are "ready-to-use" servers that filter incoming emails. You just have to connect them upstream of the company's email server, and the system administrator just has to trust them.
But also filters on the router side
Don't think that spam filtering technologies are only used on the recipient's side. In order to prevent spammers from using their platforms, email routing solutions are forced to create preventive filters. But in addition to the above solutions, they will also use other techniques:
- Limiting the sending speed for recent accounts;
- Verification of the complaint rates received from the feedback loops;
- Analysis of the identity of new customers;
- ...
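The first two items of this list can be sketched as follows. The account ages, the hourly limits and the maximum complaint rate are made-up numbers for illustration; every routing platform sets its own.

```python
def hourly_limit(account_age_days):
    """Return how many emails an account may send per hour (hypothetical tiers)."""
    if account_age_days < 7:
        return 100
    if account_age_days < 30:
        return 1000
    return 10000


def should_suspend(complaints, delivered, max_rate=0.003):
    """Flag an account whose complaint rate from feedback loops is too high."""
    if delivered == 0:
        return False
    return complaints / delivered > max_rate
```

A brand-new account is held to a low sending speed, and an account that collects 5 complaints for 1,000 delivered emails would be flagged with these settings.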
Techniques in constant evolution
Don't think that everything written here is set in stone. As in the world of doping, spammers are always one step ahead. Therefore, filtering techniques have to evolve constantly, especially in order to counter new scourges such as, for example, the rise of phishing attacks.
8 responses
Good article Jonathan 🙂
Yes, it's a game of cops and robbers, as soon as one takes protection measures, the other one bypasses them and as soon as new spam techniques are detected, we try to counter them.
And in this, the poor little users in companies, collateral victims, we are struggling... 😉
There are still several simple and effective methods to limit spam. If we consider SPF (Sender Policy Framework) or DKIM (DomainKeys Identified Mail), we can say that a lot of effort (certainly not always coordinated, lobbying being what it is...) has been made to try to offer an answer to this phenomenon. Since SPF is a simple record in the DNS zone, it is very easy to set up. Still, it is necessary to understand the ins and outs of this kind of solution.
Hello Rémy,
Thank you for your comment.
Indeed, SPF and DKIM are essential tools when sending emails. But the goal here was to list the different filtering techniques, not the tools that help limit spam.
Nevertheless, SPF and DKIM are indirectly mentioned in this article when talking about reputation filters, which need emails to be authenticated as fully as possible and therefore rely on these two technologies.
See you soon,
Jonathan
Very interesting and accessible article, even for a beginner like me! Thank you!
I discovered your blog through this article and I'm going to have a look around :o)
Concerning the verification of compliance with the SMTP standard, I think there is a possible analogy with recruitment: if an employer asks for a handwritten letter sent by post, with the reference of the offer in the subject and a CV that fits on one page, the applications that do not respect this "standard" will go directly in the bin... Admittedly, that is not a standard common to all recruiters...
Hello David,
Indeed, it's a very good analogy! It is the standard the recruiter expects to receive, and the candidate only has to conform to it 😉
See you soon,
To answer Rémy: no, it is not true (contrary to popular belief) that DKIM and SPF improve deliverability. This is a long-lived myth! The reason: all good spammers sign their emails (DKIM) and most transactional emails sent by sites are not signed...
I am at the beginning of my mailing work. I have tested sending messages to my personal mailbox and I have found a big problem: they end up as spam. I would like you to help me avoid this. I would like to go far in my work, even if it's not my field of study.
Hello,
I am 84 years old, never sent a single spam. Just e-mails to about twenty schoolmates to give news, or to about twenty friends to forward jokes I find funny.
For the last two weeks or so, about half of my simple emails to friends (only 1 recipient) are getting a "spam detected" message and not arriving. I've never had this problem in the 18 years I've had a PC and corresponded with my friends.
I'm a bit desperate because exchanging mails with friends is about all the pleasure I have left on this earth that I'm about to leave. My very poor computer culture doesn't allow me to understand what to do.
Would a dedicated and competent young person (for me, one is young until 75) miraculously agree to help me?
Thank you in advance. Sincerely. Raymond