Whether you are trying to understand why a specific e-mail ended up in SPAM/Junk as part of your daily administrative duties or for a Red-Team phishing simulation, this script is here to help!
The idea arose while delivering a commercial Phishing Simulation exercise against an MS Office365 E5 estate equipped with MS Defender for Office365. As one can imagine, that is a pretty tough security stack to work against from a phishing-simulation perspective.
After digging manually through all these Office365 SMTP headers and trying to cherry-pick the SCL values, the time came to write a proper parser for SMTP headers.
As time went by, I kept adding support for more and more SMTP headers, and here we have it: a tool that now understands dozens of different headers.
This tool accepts an `*.EML` or `*.txt` file with all the SMTP headers as input. It then extracts a subset of interesting headers and, using **79+** tests, attempts to decode them as much as possible.
The script also extracts all IPv4 addresses and domain names and performs full DNS resolution on them.
The resulting output contains useful information on why the e-mail might have been blocked.
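To give a feel for the extraction step, here is a minimal sketch that uses only the Python standard library. It is not the script's actual code: the names `extract_hosts` and `resolve`, both regular expressions, and the sample headers are assumptions made for illustration, and the lookup is a simplified forward (A record) resolution rather than the full DNS resolution the tool performs.

```python
import re
import socket
from email import message_from_string

# Illustrative patterns (assumptions, not taken from the tool itself).
IPV4_RE = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
)
DOMAIN_RE = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}\b", re.IGNORECASE
)


def extract_hosts(raw_headers):
    """Return (sorted IPv4 addresses, sorted domain names) found in header values."""
    msg = message_from_string(raw_headers)
    text = "\n".join(str(value) for _, value in msg.items())
    ips = set(IPV4_RE.findall(text))
    domains = {d.lower() for d in DOMAIN_RE.findall(text)}
    return sorted(ips), sorted(domains)


def resolve(name):
    """Resolve a host name to an IPv4 address; return None when the lookup fails."""
    try:
        return socket.gethostbyname(name)
    except OSError:
        return None


# Made-up sample using documentation-reserved addresses and domains.
SAMPLE = (
    "Received: from mail.example.org (mail.example.org [203.0.113.7])\n"
    "\tby mx.example.com with ESMTPS; Mon, 1 Jan 2024 10:00:00 +0000\n"
    "From: Alice <alice@example.org>\n"
    "Subject: test\n"
    "\n"
)

ips, domains = extract_hosts(SAMPLE)
print(ips)
print(domains)
```

Calling `resolve(domain)` for each extracted domain needs network access, which is why the sketch keeps it separate from the pure parsing step.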
To polish your phishing HTML code before sending it to your client, you might also want to feed it into my [`phishing-HTML-linter.py`](https://github.com/mgeeky/Penetration-Testing-Tools/blob/master/phishing/phishing-HTML-linter.py). It does a pretty decent job of finding _bad smells_ in your HTML that would raise your e-mail's spam score.
- Chain of MTA servers (nicely parsed `Received` headers):
![1.png](img/1.png)
- Various headers decoded as much as possible, according to publicly available documentation (here _Office365 ForeFront Spam Report_):
![2.png](img/2.png)
- Various custom heuristics implemented to actively validate and look for clues of spam categorization; here, the logic detecting _Domain Impersonation_:
![3.png](img/3.png)
- Script attempts to reverse-engineer and document some of the Office365 Anti-Spam rules, as well as collect public knowledge about other opaque Anti-Spam headers:
![4.png](img/4.png)
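As a rough illustration of the kind of decoding shown in the screenshots, the sketch below splits an `X-Forefront-Antispam-Report` value into its `KEY:value` fields and explains the SCL (Spam Confidence Level). The SCL meanings follow Microsoft's public documentation on anti-spam message headers. The function names and the sample value are assumptions made for this example and are not the script's own code.

```python
# SCL meanings as described in Microsoft's public anti-spam header documentation.
SCL_MEANING = {
    -1: "Skipped spam filtering",
    0: "Not spam",
    1: "Not spam",
    5: "Spam",
    6: "Spam",
    8: "High confidence spam",
    9: "High confidence spam",
}


def parse_forefront_report(value):
    """Split 'KEY:value;KEY:value;...' into a dict with upper-cased keys."""
    fields = {}
    for chunk in value.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        # Split on the first colon only, so values such as IPv6 addresses survive.
        key, _, val = chunk.partition(":")
        fields[key.strip().upper()] = val.strip()
    return fields


def describe_scl(fields):
    """Return a human-readable explanation of the SCL field."""
    try:
        scl = int(fields.get("SCL", ""))
    except ValueError:
        return "SCL missing or not numeric"
    return f"SCL {scl}: {SCL_MEANING.get(scl, 'undocumented value')}"


# Made-up sample value for illustration.
SAMPLE_REPORT = (
    "CIP:203.0.113.7;CTRY:US;LANG:en;SCL:5;SRV:;IPV:NLI;SFV:SPM;"
    "H:mail.example.org;PTR:mail.example.org;CAT:SPM;"
    "SFS:(13230001)(4636009);DIR:INB;"
)

fields = parse_forefront_report(SAMPLE_REPORT)
print(describe_scl(fields))
```

Keeping the parsing generic and the interpretation in lookup tables makes it easy to add explanations for other fields as more of them become understood.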
### Processed headers
Processed headers (**67+** headers are parsed):
- `X-forefront-antispam-report`
- `X-exchange-antispam`
- `X-exchange-antispam-mailbox-delivery`
- `X-exchange-antispam-message-info`
- `X-microsoft-antispam-report-cfa-test`
- `Received`
- `From`
- `To`
- `Subject`
- `Thread-topic`
- `Received-spf`
- `X-mailer`
- `X-originating-ip`
- `User-agent`
- `X-forefront-antispam-report`
- `X-microsoft-antispam-mailbox-delivery`
- `X-microsoft-antispam`
- `X-exchange-antispam-report-cfa-test`
- `X-spam-status`
- `X-spam-level`
- `X-spam-flag`
- `X-spam-report`
- `X-vr-spamcause`
- `X-ovh-spam-reason`
- `X-vr-spamscore`
- `X-virus-scanned`
- `X-spam-checker-version`
- `X-ironport-av`
- `X-ironport-anti-spam-filtered`
- `X-ironport-anti-spam-result`
- `X-mimecast-spam-score`
- `Spamdiagnosticmetadata`
- `X-ms-exchange-atpmessageproperties`
- `X-msfbl`
- `X-ms-exchange-transport-endtoendlatency`
- `X-ms-oob-tlc-oobclassifiers`
- `X-ip-spam-verdict`
- `X-amp-result`
- `X-ironport-remoteip`
- `X-ironport-reputation`
- `X-sbrs`
- `X-ironport-sendergroup`
- `X-policy`
- `X-ironport-mailflowpolicy`
- `X-remote-ip`
- `X-sea-spam`
- `X-fireeye`
- `X-antiabuse`
- `X-tmase-version`
- `X-tm-as-product-ver`
- `X-tm-as-result`
- `X-imss-scan-details`
- `X-tm-as-user-approved-sender`
- `X-tm-as-user-blocked-sender`
- `X-tmase-result`
- `X-tmase-snap-result`
- `X-imss-dkim-white-list`
- `X-tm-as-result-xfilter`
- `X-tm-as-smtp`
- `X-scanned-by`
- `X-mimecast-spam-signature`
- `X-mimecast-bulk-signature`
- `X-sender-ip`
- `X-forefront-antispam-report-untrusted`
- `X-microsoft-antispam-untrusted`
- `X-sophos-senderhistory`
- `X-sophos-rescan`
Most of these headers are not fully documented, so the script cannot pinpoint every detail, but at least it collects everything I could find on them.
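Of the headers listed, `Received` is the one that reveals the chain of MTA servers. The sketch below shows one possible way to rebuild that chain with the standard `email` module. The regular expression, the `received_chain` name, and the sample headers are assumptions for illustration only. Real-world `Received` headers vary a lot, so a simple pattern like this one will miss some of them.

```python
import re
from email import message_from_string

# Simplified pattern (an assumption): "from <sender> ... by <receiver> [... with <protocol>]".
RECEIVED_RE = re.compile(
    r"from\s+(?P<sender>\S+).*?\bby\s+(?P<receiver>\S+)"
    r"(?:.*?\bwith\s+(?P<protocol>\S+))?",
    re.IGNORECASE | re.DOTALL,
)


def received_chain(raw_headers):
    """Return (sender, receiver, protocol) tuples ordered from first hop to last hop."""
    msg = message_from_string(raw_headers)
    hops = []
    # Every MTA prepends its own Received header, so the oldest hop is at the bottom.
    for value in reversed(msg.get_all("Received", [])):
        # Drop the trailing "; <date>" part before matching.
        main_part = value.rsplit(";", 1)[0]
        match = RECEIVED_RE.search(main_part)
        if match:
            hops.append((match["sender"], match["receiver"], match["protocol"]))
    return hops


# Made-up sample using documentation-reserved addresses and domains.
SAMPLE = (
    "Received: from mx.example.com (mx.example.com [198.51.100.2])\n"
    "\tby inbound.example.net with ESMTPS id abc123;\n"
    "\tMon, 1 Jan 2024 10:00:05 +0000\n"
    "Received: from laptop.example.org (unknown [203.0.113.7])\n"
    "\tby mx.example.com with ESMTPSA id def456;\n"
    "\tMon, 1 Jan 2024 10:00:00 +0000\n"
    "Subject: test\n"
    "\n"
)

for hop in received_chain(SAMPLE):
    print(hop)
```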
I'm making significant efforts to spot and understand different Office365 ForeFront Anti-Spam rules (SFS, ENG) despite them not being publicly documented.
The process is purely manual: it relies on sending specifically designed e-mails to the Office365 mail servers and then manually reviewing and correlating the collected rules.
Having sent more than 60 mails already, this is what I can tell by now about Microsoft's rules:
```py
#
# The rules below were collected solely by trial and error or by scraping
# pieces of information from all around the Internet.
#
# They do not represent the actual Anti-Spam rule names or context; at best they represent
# something close to what is understood (or they may have a totally different meaning).
#
# Until we are able to review the anti-spam rules documentation, there is no viable means to map
# rule IDs to their actual meaning.
#
# Excerpt; the dictionary name below is illustrative.
anti_spam_rules = {
    # ...

    # GET parameter with a value that is a URL to another website
    '45080400002' : 'Mail body contained <a> tag with URL containing GET parameter with value of another URL: ex. href="https://foo.bar/file?aaa=https://baz.xyz/"',

    # Message contained <a> with href pointing to a file with a dangerous extension, such as file.exe
    '460985005' : 'Mail body contained HTML <a> tag with href URL pointing to a file with dangerous extension (such as .exe)',

    # Message1 - FirstHop Gmail SMTP Received with ESMTPS.
    # Message2 - FirstHop Gmail SMTP-Relay Received with ESMTPSA.
    '121216002' : 'First Hop MTA SMTP Server used as a SMTP Relay. It\'s known to originate e-mails, but here it acted as a Relay. Or maybe due to use of "with ESMTPSA" instead of ESMTPS?',
}
```
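A mapping like the one above could be applied to a message roughly as follows. This is only a sketch under a few assumptions: that rule IDs appear in the `SFS` field of `X-Forefront-Antispam-Report` as a run of parenthesized numbers, that the hypothetical `explain_sfs` helper and `KNOWN_RULES` table stand in for the script's real logic, and that the shortened descriptions are acceptable paraphrases of the guesses listed above.

```python
import re

# The three rule IDs listed above, with shortened descriptions (still only guesses).
KNOWN_RULES = {
    "45080400002": "<a> href URL carries a GET parameter whose value is another URL",
    "460985005": "<a> href points to a file with a dangerous extension (such as .exe)",
    "121216002": "First-hop MTA acted as an SMTP relay (or used ESMTPSA instead of ESMTPS)",
}

RULE_ID_RE = re.compile(r"\((\d+)\)")


def explain_sfs(report):
    """Map every rule ID found in the SFS field to a description (None when unknown)."""
    sfs = ""
    for chunk in report.split(";"):
        key, _, val = chunk.strip().partition(":")
        if key.upper() == "SFS":
            sfs = val
            break
    return {rule: KNOWN_RULES.get(rule) for rule in RULE_ID_RE.findall(sfs)}


# Made-up sample value for illustration.
SAMPLE = "CIP:203.0.113.7;SCL:5;SFV:SPM;SFS:(13230001)(460985005)(121216002);DIR:INB;"

result = explain_sfs(SAMPLE)
for rule, meaning in result.items():
    print(rule, "->", meaning or "unknown rule")
```

Rule IDs that are not in the table come back as `None`, which makes it easy to list the ones still waiting to be reverse-engineered.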
Should you know anything about any other Office365 anti-spam rules (or have suggestions for the ones described above), let me know in this repo's issues and I'll add it straight away :)
This and other projects are the outcome of sleepless nights and **plenty of hard work**. If you like what I do and appreciate that I always give back to the community,
[consider buying me a coffee](https://github.com/sponsors/mgeeky) _(or better, a beer)_ just to say thank you! 💪