MQL is highly extensible and can integrate virtually any tool or service to build better detection rules.
Request a function!
Don't see a function you want? Let us know via email or Slack!
Files can be delivered via email in a variety of ways, including directly as an attachment or auto-downloaded via links.
beta.message_screenshot() → File
The beta.message_screenshot function takes a screenshot of the message using the message body's HTML section. This screenshot is the same as the one shown in the Message Preview pane when viewing a message, and represents what the end user would see. The resulting file can be passed into other File analysis functions, such as ml.logo_detect and file.explode:
// Check for an embedded Microsoft logo
any(ml.logo_detect(beta.message_screenshot()).brands,
    .name == "Microsoft" and .confidence in ("medium", "high")
)

// Run OCR on a screenshot of the message
any(file.explode(beta.message_screenshot()),
    strings.ilike(.scan.ocr.raw, "*free cooler*")
)
file.explode(input: File | HTML) -> [FileExplodeOutput]
FileExplode uses Strelka, a file extraction and metadata collection system developed by Target.
Strelka uses a variety of scanners to parse files of a specific flavor and performs data collection and/or file extraction on them. Strelka can recursively extract nested files (like a Word doc within a Zip file), identify malicious scripts, suspicious executables and text, run analysis like OCR and Macro detection, and more. For more information on how Strelka works, see the official Strelka documentation.
file.oletools(input: File) -> OleToolsOutput
Use file.oletools to analyze attachments for malware or suspicious indicators like VBA macros, remote OLE objects, encryption, and more.
// detect suspicious macros
any(attachments, file.oletools(.).indicators.vba_macros.exists)
any(attachments, file.oletools(.).indicators.vba_macros.risk == "high")

// detect potential attempts to exploit CVE-2021-40444 (https://msrc.microsoft.com/update-guide/vulnerability/CVE-2021-40444)
any(attachments, any(file.oletools(.).relationships, strings.ilike(.target, "*html:http*")))

// detect external OLE object relationships
any(attachments, file.oletools(.).indicators.external_relationships.count > 0)

// detect encrypted Office documents
any(attachments, file.oletools(.).indicators.encryption.exists)

// detect macros that attempt to auto-execute when the document is opened
any(attachments, any(file.oletools(.).macros.keywords, .type == "autoexec"))

// detect suspicious macro source code
any(attachments, strings.ilike(file.oletools(.).macros.vba_code_all_modules, "*kernel32*", "*GetProcessId*"))
This function is currently in private beta. For access, please contact the Sublime Team.
file.parse_eml(input: Attachment) -> MessageDataModel
The file.parse_eml function takes an EML attachment (file extension .eml or content type message/rfc822) and parses it into a Message Data Model (MDM).
any(attachments,
    (.file_extension == "eml" or .content_type == "message/rfc822")
    and strings.icontains(file.parse_eml(.).subject.subject, "invoice")
)
ml.macro_classifier(input: File) → MLMacrosOutput
The Sublime Macro Classifier introduces machine learning in MQL to detect malicious VBA macro attachments. Using ML within MQL lets users combine the model output with custom detection logic to surface what matters most while reducing the noise commonly associated with black-box ML approaches.
Use ml.macro_classifier to detect suspicious VBA macro attachments.
// detect malicious VBA macros in Office documents, high confidence
any(attachments,
    .file_extension in~ (
        "doc", "docm", "docx", "dot", "dotm", "pptm", "ppsm",
        "xlm", "xls", "xlsb", "xlsm", "xlt", "xltm", "zip"
    )
    and ml.macro_classifier(.).malicious
    and ml.macro_classifier(.).confidence in ("high")
)

// detect malicious VBA macros in Office documents, low or medium confidence
any(attachments,
    .file_extension in~ (
        "doc", "docm", "docx", "dot", "dotm", "pptm", "ppsm",
        "xlm", "xls", "xlsb", "xlsm", "xlt", "xltm", "zip"
    )
    and ml.macro_classifier(.).malicious
    and ml.macro_classifier(.).confidence in ("low", "medium")
)
ml.logo_detect(input: File) -> [LogoDetectOutput]
LogoDetect uses computer vision to detect common brand logos used in attachment-based credential phishing attacks, such as impersonations of PayPal, Adobe, Microsoft, Outlook, Office365, DocuSign, and more. This includes embedded images in the body of messages as CIDs.
Our object detection model identifies logos, which are then cropped into separate images. These images are passed through a Siamese Neural Network to generate a feature vector. We compare this vector to a database of known logos using a similarity calculation. If the score exceeds a predetermined threshold, we confirm it as the respective brand logo.
For text-based logos, we utilize OCR, a computer vision technique for extracting text from images. Combined with Siamese Networks, this approach ensures comprehensive logo detection.
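The matching step described above (compare a feature vector against a database of known logos and accept the best match only above a threshold) can be sketched in a few lines. This is an illustration only, not the production pipeline: cosine similarity, the 0.9 threshold, the three-dimensional vectors, and the names `cosine_similarity`, `match_logo`, and `KNOWN_LOGOS` are all assumptions made for the example. The source says only that "a similarity calculation" and "a predetermined threshold" are used.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_logo(embedding, known_logos, threshold=0.9):
    """Return the best-matching brand name, or None if no score exceeds the threshold."""
    best_brand, best_score = None, -1.0
    for brand, reference in known_logos.items():
        score = cosine_similarity(embedding, reference)
        if score > best_score:
            best_brand, best_score = brand, score
    return best_brand if best_score > threshold else None


# Hypothetical reference vectors; real feature vectors would come from the
# neural network and have many more dimensions.
KNOWN_LOGOS = {
    "BrandA": [1.0, 0.0, 0.0],
    "BrandB": [0.0, 1.0, 0.0],
}
```

A cropped logo whose vector is close to a reference is labeled with that brand, while an ambiguous vector that sits between two references falls below the threshold and produces no match.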
// detect SharePoint logos in attached images
any(attachments,
    .file_type in ('png', 'jpeg', 'jpg', 'bmp')
    and any(ml.logo_detect(.).brands, .name == "Microsoft SharePoint")
)

// detect DocuSign logos in attached images
any(attachments,
    .file_type in ('png', 'jpeg', 'jpg', 'bmp')
    and any(ml.logo_detect(.).brands, .name == "DocuSign")
)

// detect Norton logos in attached PDFs
any(attachments,
    .file_type == "pdf"
    and any(ml.logo_detect(.).brands, .name == "Norton")
)
ADP, AT&T, Adobe, Amazon, American Express, Apple, BB&T Corporation, Bank of America, Box, Capital One Bank, Captcha, Chase, ChicagoTitle, Coinbase, DHL, Discover, DocuSign, Dropbox, Ebay, Facebook, FidelityTitle, FirstAm, GeekSquad, Generic Webmail, Google, GoogleDrive, Gusto, HSBC Bank, Heroku, Hulu, IRS, Instagram, Key Bank, LawyersTitle, Ledger, LinkedIn, M & T Bank, MadisonTitle, Mastercard, Meta, Microsoft, Microsoft Office365, Microsoft OneDrive, Microsoft Outlook, Microsoft SharePoint, Microsoft Teams, Navy Federal Credit Union, Netflix, Norton, Okta, OldRepublicTitle, PayPal, Quickbooks, Rakuten, SBB, Silicon Valley Bank, Slack, Spotify, Square, StewartTitle, SunTrust Bank, Swiss Post, Swisscom, TD Bank, TicorTitle, U.S. Bank, UPS, Venmo, Visa, WeTransfer, Wells Fargo, WhatsApp, Zoom
beta.whois(domain: Domain) -> WhoisOutput
beta.whois performs a WHOIS lookup for domain registration on the .root_domain field of a Domain. It returns the domain age, registrar information, and timing information about the age of the registration record and when it was retrieved.
This function can be used to identify newly registered domains by checking the domain age or whether the domain is found at all. Lookups are performed against Sublime's WHOIS service, which may be delayed by ~24 hours. Because of this delay, searching for .found == false identifies both unregistered and newly registered domains. For some detections, .found == false alone could be a high enough signal.
beta.whois(sender.email.domain).found == false or beta.whois(sender.email.domain).days_old <= 7
any(body.links, beta.whois(.href_url.domain).days_old <= 14)
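The decision logic behind the two rules above can be spelled out as a small function. This is a sketch for illustration, not part of MQL: the function name `is_newly_registered`, the `max_age_days` parameter, and the use of `None` for a missing age are assumptions.

```python
def is_newly_registered(found, days_old, max_age_days=7):
    """Treat a domain as new if WHOIS has no record yet or the record is recent.

    A missing record covers both unregistered domains and domains registered
    so recently that the lookup service has not caught up.
    """
    if not found:
        return True
    # Assumption: days_old may be absent (None) even when a record is found.
    return days_old is not None and days_old <= max_age_days
```

Raising `max_age_days` widens the net, as in the second rule above, which uses 14 days for links in the body.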
Behavior of historical functions
The result of historical functions is always relative to the time of the message that is being evaluated. During live processing, this means the latest possible information is available. However, during a backtest, these functions only take into account messages that were seen prior to that point in time. If there's not enough data, some fields, such as .prevalence, return "unknown". This behavior ensures that during a backtest there's never access to "future" data, which would lead to incorrect results and a false sense of confidence in the efficacy of a rule.
Results are typically delayed by several hours, so the prevalence of a sender can remain "new" for approximately 8-12 hours.
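The point-in-time behavior described above can be sketched as follows. This is a simplified model, not the actual implementation: the message dictionaries, the `min_history` cutoff, and the "seen" label are hypothetical, and the several-hour processing delay is not modeled.

```python
from datetime import datetime, timedelta


def messages_visible_at(history, evaluated_at):
    """Keep only messages received strictly before the evaluated message."""
    return [m for m in history if m["received_at"] < evaluated_at]


def prevalence_at(history, sender_domain, evaluated_at, min_history=1):
    """Classify a sender domain using only data that existed at evaluated_at."""
    visible = messages_visible_at(history, evaluated_at)
    if len(visible) < min_history:
        # Not enough data at that point in time.
        return "unknown"
    seen = sum(1 for m in visible if m["sender_domain"] == sender_domain)
    # "seen" is a placeholder label for this sketch, not an MQL value.
    return "new" if seen == 0 else "seen"


t0 = datetime(2024, 1, 1, 12, 0)
history = [
    {"sender_domain": "example.com", "received_at": t0},
    {"sender_domain": "example.org", "received_at": t0 + timedelta(days=2)},
]
```

Evaluating example.org one day after `t0` yields "new" even though a later message from that domain exists in `history`, because the later message is "future" data relative to the evaluation time.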
profile.by_sender() -> SenderProfile
profile.by_sender uses previously ingested inbound messages to build a profile for messages received from a matching Sender. This profile captures information like the .prevalence of the sender domain within your environment to assess how common or uncommon it is across messages. It also captures information about flagged messages, such as false positives or true positives.
For the profile.by_sender function, the list $free_email_providers determines whether a matching Sender means a matching email address or a matching domain. If the value of sender.email.domain.domain is in $free_email_providers, sender.email.email is used to determine a matching Sender. Otherwise, all messages with a matching sender.email.domain.domain are considered to be from the same Sender. This ensures that, for profile.by_sender, a matching Sender covers messages from an organization rather than an individual whenever the domain is not a free email provider.
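The matching rule above amounts to choosing a grouping key per message. A minimal sketch, assuming a small stand-in set for $free_email_providers and case-insensitive comparison (both are assumptions, as is the function name `sender_profile_key`):

```python
# Stand-in for the $free_email_providers list; the real list is much longer.
FREE_EMAIL_PROVIDERS = {"gmail.com", "outlook.com", "yahoo.com"}


def sender_profile_key(sender_email, free_email_providers=FREE_EMAIL_PROVIDERS):
    """Return the value that identifies a matching Sender for this address.

    Free email providers are shared by unrelated people, so the full address
    is the key. For any other domain, the domain itself is the key, which
    groups everyone at the same organization into one Sender.
    """
    email = sender_email.strip().lower()
    domain = email.rsplit("@", 1)[-1]
    if domain in free_email_providers:
        return email
    return domain
```

Two different addresses at the same company domain map to the same key, while two different gmail.com addresses map to different keys.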
Use profile.by_sender() to find a first-time sender:
type.inbound and profile.by_sender().prevalence == "new"
Using lists to find a first-time sender is equivalent but more verbose:
type.inbound
and (
    (
        sender.email.domain.root_domain in $free_email_providers
        and sender.email.email not in $sender_emails
    )
    or (
        sender.email.domain.root_domain not in $free_email_providers
        and sender.email.domain.domain not in $sender_domains
    )
)
To check against the historical reputation for a sender, check whether a sender has sent at least 1 message flagged as malicious or spam but no confirmed false positives.
type.inbound
and profile.by_sender().any_messages_malicious_or_spam
and not profile.by_sender().any_false_positives
// Additional logic on the suspicious sender.
and ...
profile.by_sender_domain() -> SenderProfile
profile.by_sender_domain uses previously ingested inbound messages to build a profile for messages received from a matching sender domain.
type.inbound
// filter by first-seen domains or anomalous domains in your environment
and profile.by_sender_domain().prevalence in ("outlier", "new")
// scrutinize PDF attachments, for example
and any(attachments, .file_extension == "pdf" and ...)
profile.by_sender_email() -> SenderProfile
profile.by_sender_email uses previously ingested inbound messages to build a profile for messages received from a matching sender email address.
type.inbound
// filter by first-seen or anomalous email addresses in your environment
and profile.by_sender_email().prevalence in ("outlier", "new")
profile.by_sender_email can be used to tell when a domain is common but the sending email address is new:
type.inbound
// not a free email provider
and sender.email.domain.domain not in $free_email_providers
// domain is common in your environment
and profile.by_sender_domain().prevalence == "common"
// but this is the first time you've received messages from this sender
and profile.by_sender_email().prevalence == "new"
beta.linkanalysis(input: Link | URL, mode="default") → LinkAnalysisOutput
LinkAnalysis analyzes a link and classifies it as benign or suspicious. The service sends suspicious URLs to a headless browser, which resolves the effective URL and collects a screenshot. The screenshot is sent to an object detection model to detect brand logos, buttons, and input forms. We chose Phishpedia, an open source object detection project, as our baseline model architecture.
If any logos are detected, those logos are cropped from the original screenshot and compared to a set of protected brand logos commonly used in credential phishing attacks. Discovered brands are available to MQL, along with summary information about login input boxes or captchas in the screenshot.
mode is an optional argument that alters LinkAnalysis's analysis criteria (see note below). When mode is changed from its default of "default" to "aggressive", LinkAnalysis performs extra processing on a link when determining whether to fully analyze it. For example, LinkAnalysis with mode="aggressive" will fetch the destination link of known common click trackers via HEAD and apply normal analysis criteria to that destination link.
// detect links to credential phishing pages
any(body.links,
    all([beta.linkanalysis(.)],
        .credphish.disposition == "phishing"
        and .credphish.brand.confidence in ("medium", "high")
    )
)

// detect any links to credential phishing pages
any(body.links,
    any([beta.linkanalysis(., mode="aggressive")],
        .credphish.disposition == "phishing"
        and .credphish.brand.confidence in ("medium", "high")
    )
)

// detect free subdomain links with a login or captcha
any(body.links,
    all([beta.linkanalysis(.)],
        (
            .credphish.contains_login
            or .credphish.contains_captcha
        )
        and (
            .effective_url.domain.root_domain in $free_subdomain_hosts
            or .original_url.domain.root_domain in $free_subdomain_hosts
        )
    )
)

// analyze the final DOM of a link within the body
any(body.links,
    strings.icontains(beta.linkanalysis(.).final_dom.display_text, "Redirect Notice")
    and strings.contains(beta.linkanalysis(.).final_dom.display_text, ".zip")
)
In order to prevent LinkAnalysis from "clicking" on every link, such as Unsubscribes and one-time password resets, LinkAnalysis uses a URL classification model to determine which links to actually send to the service for analysis.
You can check whether LinkAnalysis analyzed the target page by inspecting the response in the MQL editor.
If you observe LinkAnalysis analyzing links it shouldn't or not analyzing links it should, please send us an email or post in the Slack Community.
ml.nlu_classifier(input: str) -> NluResult
Natural Language Understanding, or NLU, provides users with a machine learning service to analyze text-based content. The service has two primary capabilities:
- Email Classification
- Named Entity Recognition
The Email Classification component takes a body of text as input and provides Intents and/or Tags.
Intents are top-level categories describing common language attackers use to carry out phishing attacks.
- Emails containing urgent language about quick tasks from C-suite, HR, and Accounting Depts.
- Emails containing language about renewing/purchasing services such as tech support, antivirus, or cryptocurrency.
- Emails containing language urging users to visit a link leading to a realistic-looking portal that requires their credentials to log in.
- Emails meant to intimidate victims with threats of blackmail.
- Emails requesting updates to billing information, personal identification, and tax returns.
- Deceptive emails disguised as employment offers to dupe students into divulging sensitive data or becoming unwitting accomplices in criminal or fraudulent schemes.
Tags are subcategories that provide additional context for financial-themed phishing attacks. The service returns the following values:
- These emails contain language about viewing invoices via links or attachments.
- These emails contain language about ACH, EFT, or Wire payments.
- These emails contain language about Purchase Orders, Requests for Quotation.
type.inbound
and any([body.plain.raw, body.html.inner_text],
    any(ml.nlu_classifier(.).intents,
        .name == "bec" and .confidence == "high"
    )
)
// first-time sender
and (
    (
        sender.email.domain.root_domain in $free_email_providers
        and sender.email.email not in $sender_emails
    )
    or (
        sender.email.domain.root_domain not in $free_email_providers
        and sender.email.domain.domain not in $sender_domains
    )
)
Named Entity Recognition (NER) identifies, tags, and extracts important keywords within a body of text. Users can leverage this output to determine if an email contains language commonly associated with urgency, requests, or financial matters. The available entities are listed below:
- Token(s) that aid in the identification of the recipient (examples: hello, dear)
- Token(s) containing financial details such as payments, bank accounts, or real estate transactions (examples: wire, bank details, ACH payment)
- Token(s) containing an organization name (examples: Google, Microsoft)
- Token(s) representing the recipient of the email, either a name or a generic designator (examples: Jane Doe, all)
- Token(s) asking the recipient to act on behalf of the sender (examples: "I need you to", "please open")
- Token(s) signifying the end of the correspondence, which aid in the identification of the sender (examples: thanks, regards)
- Token(s) representing the sender of an email, either a name or a generic designator (examples: Ms. Tyrell, IT Department)
- Token(s) containing language meant to urge the recipient to act immediately (examples: ASAP, immediately)
type.inbound
and sender.display_name in~ $org_display_names
and any(ml.nlu_classifier(body.current_thread.text).entities, .name == "urgency")
and any(ml.nlu_classifier(body.current_thread.text).entities, .name == "request")
It is important to remember that the NLU engine only looks at text, so it needs additional context to be an adequate detector. For example, attackers may craft an email that looks the same as a password reset for your favorite social network. The NLU engine would classify the text as cred_theft, but it would do the same for a legitimate password reset email. Pairing it with a First-Time/Unsolicited Sender check or LinkAnalysis provides the necessary context to make an effective detector.
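The reasoning above, that a text-only classification needs a second signal before it is actionable, can be illustrated with a short sketch. The dictionary shape for intents and the function name `should_flag` are assumptions made for this example, not the actual NluResult structure.

```python
def should_flag(intents, sender_prevalence):
    """Flag only when a high-confidence cred_theft intent meets a new sender.

    The intent alone matches legitimate password resets too, so sender
    context is required before flagging.
    """
    has_cred_theft = any(
        intent["name"] == "cred_theft" and intent["confidence"] == "high"
        for intent in intents
    )
    return has_cred_theft and sender_prevalence == "new"


# Hypothetical classifier output for a password-reset style email.
reset_email_intents = [{"name": "cred_theft", "confidence": "high"}]
```

The same classifier output leads to different outcomes depending on the sender: a known sender is left alone, while a first-time sender is flagged.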