Enrichment functions

MQL is extensible: enrichment functions let detection rules call out to tools and services such as file scanners, WHOIS lookups, and machine learning classifiers.

Request a function!

Don't see a function you want? Let us know via email or Slack!

Files

Files can be delivered via email in a variety of ways, including directly as an attachment or auto-downloaded via links.

file.explode

file.explode(input: File | HTML) -> [FileExplodeOutput]

file.explode uses Strelka, a file extraction and metadata collection system developed by Target.

Strelka runs a variety of scanners, each of which parses files of a specific flavor and performs data collection and/or file extraction on them. It can recursively extract nested files (like a Word doc within a Zip file), identify malicious scripts and suspicious executables and text, run analyses like OCR and macro detection, and more. For more information on how Strelka works, see the official Strelka documentation.

For a list of all available scanners, see the GitHub repo or the official Strelka docs.

View detection rules that use this function

// detect HTML smuggling techniques
any(attachments, .file_extension in~ ('html', 'htm') and
  any(file.explode(.), "unescape" in .scan.javascript.identifiers) 
)

// detect encrypted zip files
any(attachments,
  any(file.explode(.), 
    any(.flavors.yara, . == 'encrypted_zip'))
)

// detect attachments soliciting the user to enable macros using OCR
any(attachments,
  any(file.explode(.),
    strings.icontains(.scan.ocr.raw, "enable macros")
  )
)

// detect macros with auto-open
any(attachments,
  any(file.explode(.),
    any(.scan.vba.auto_exec, . == "AutoOpen"))
)

// detect macros calling an exe
any(attachments,
  any(file.explode(.),
    any(.scan.vba.hex, strings.ilike(., "*exe*")))
)

Coming soon

  • External API integrations, like VirusTotal

file.oletools

file.oletools(input: File) -> OleToolsOutput

Oletools, developed by Philippe Lagadec, analyzes Microsoft OLE2 files such as Microsoft Office documents for malware and other suspicious indicators.

Use file.oletools to analyze attachments for malware or suspicious indicators like VBA macros, remote OLE objects, encryption, and more.

View detection rules that use this function

// detect suspicious macros
any(attachments, file.oletools(.).indicators.vba_macros.exists)
any(attachments, file.oletools(.).indicators.vba_macros.risk == "high")

// detect potential attempts to exploit CVE-2021-40444 (https://msrc.microsoft.com/update-guide/vulnerability/CVE-2021-40444)
any(attachments, any(file.oletools(.).relationships, strings.ilike(.target, "*html:http*")))

// detect external OLE object relationships
any(attachments, file.oletools(.).indicators.external_relationships.count > 0)

// detect encrypted Office documents
any(attachments, file.oletools(.).indicators.encryption.exists)

// detect macros that attempt to auto-execute when the document is opened
any(attachments, any(file.oletools(.).macros.keywords, .type == "autoexec"))

// detect suspicious macro source code
any(attachments, strings.ilike(file.oletools(.).macros.vba_code_all_modules, "*kernel32*", "*GetProcessId*"))

ml.macro_classifier

ml.macro_classifier(input: File) -> MLMacrosOutput

The Sublime Macro Classifier introduces machine learning in MQL to detect malicious VBA macro attachments. Pairing ML with MQL lets users combine the model output with custom detection logic, surfacing what matters most while reducing the noise commonly associated with black-box ML approaches.

The classifier uses XGBoost to analyze VBA keywords, file metadata, and Oletools output to predict whether an attachment is likely to cause harm.

Use ml.macro_classifier to detect suspicious VBA macro attachments.

View rules that use this function

// detect malicious VBA macros in Office documents, high confidence
any(attachments, .file_extension in~ ("doc", "docm", "docx", "dot", "dotm", "pptm", "ppsm", "xlm", "xls", "xlsb", "xlsm", "xlt", "xltm", "zip")
    and ml.macro_classifier(.).malicious
    and ml.macro_classifier(.).confidence in ("high")
)

// detect malicious VBA macros in Office documents, low or medium confidence
any(attachments, .file_extension in~ ("doc", "docm", "docx", "dot", "dotm", "pptm", "ppsm", "xlm", "xls", "xlsb", "xlsm", "xlt", "xltm", "zip")
    and ml.macro_classifier(.).malicious
    and ml.macro_classifier(.).confidence in ("low", "medium")
)
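The classifier output can also be combined with other signals. The following is a sketch rather than a published rule: it reuses the first-time sender logic and the list names ($free_email_providers, $sender_emails, $sender_domains) from the NLU examples later on this page, and assumes those lists are available in your environment.

```mql
// sketch: likely malicious macro attachment from a first-time sender
type.inbound
and any(attachments,
    ml.macro_classifier(.).malicious
    and ml.macro_classifier(.).confidence in ("medium", "high")
)
// first-time sender (logic copied from the NLU examples on this page)
and (
        (
            sender.email.domain.root_domain in $free_email_providers
            and sender.email.email not in $sender_emails
        )
        or (
            sender.email.domain.root_domain not in $free_email_providers
            and sender.email.domain.domain not in $sender_domains
        )
)
```

Requiring an unfamiliar sender trades some coverage for fewer alerts on macro-enabled documents exchanged with known contacts.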

Domains

beta.whois

beta.whois(domain: Domain) -> WhoisOutput

beta.whois performs a WHOIS lookup for domain registration on the .root_domain field of a Domain. It returns the domain age, registrar information, and timing information about the age of the registration record and when it was retrieved.

Use this function to identify newly registered domains, either by checking the domain age or by checking whether the domain is found at all. Lookups are performed against Sublime's WHOIS service, whose data may lag by ~24 hours. Because of this delay, .found == false matches both unregistered and newly registered domains. For some detections, .found == false alone can be a strong enough signal.

View rules that use this function

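// detect sender domains that are not found or were registered in the last 7 days
beta.whois(sender.email.domain).found == false or
beta.whois(sender.email.domain).days_old <= 7

// detect links to domains registered in the last 14 days
any(body.links, beta.whois(.href_url.domain).days_old <= 14)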

Links

beta.linkanalysis

beta.linkanalysis(input: Link | URL) -> LinkAnalysisOutput

LinkAnalysis analyzes a link and classifies it as benign or suspicious. The service sends suspicious URLs to a headless browser, which resolves the effective URL and collects a screenshot. The screenshot is sent to an object detection model to detect brand logos, buttons, and input forms. We chose Phishpedia, an open-source object detection project, as our baseline model architecture.

If any logos are detected, those logos are cropped from the original screenshot and compared to a set of protected brand logos commonly used in credential phishing attacks. Discovered brands are available to MQL, along with summary information about login input boxes or captchas in the screenshot.

View rules that use this function

// detect links to credential phishing pages
any(body.links, 
    all([beta.linkanalysis(.)],
        .credphish.disposition == "phishing"
         and .credphish.brand.confidence in ("medium", "high")
     )
) 

// detect free subdomain links with a login or captcha
any(body.links, 
    all([beta.linkanalysis(.)], (
          .credphish.contains_login
          or .credphish.contains_captcha
     )
     and (
          .effective_url.domain.root_domain in $free_subdomain_hosts
          or .original_url.domain.root_domain in $free_subdomain_hosts
     ))
)

Analysis criteria

To keep LinkAnalysis from "clicking" on every link, such as unsubscribe links and one-time password resets, a URL classification model determines which links are actually sent to the service for analysis.

You can determine whether LinkAnalysis analyzed the link by inspecting the analyzed response property in the MQL editor.

If you observe LinkAnalysis analyzing links it shouldn't, or not analyzing links it should, send us an email or post in the Slack Community.
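As a rough illustration, a rule could check that property directly. This is a sketch only: the text above names the property "analyzed", but the exact path used here (.analyzed on the function output, evaluated as a boolean) is an assumption and should be confirmed in the MQL editor.

```mql
// sketch: match messages where at least one link was actually analyzed
// assumption: the property is exposed as a boolean .analyzed on the LinkAnalysis output
any(body.links, beta.linkanalysis(.).analyzed)
```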

Text

ml.nlu_classifier

ml.nlu_classifier(input: str) -> NluResult

Natural Language Understanding, or NLU, provides users with a machine learning service to analyze text-based content. The service has two primary capabilities:

  • Email Classification
  • Named Entity Recognition

Email Classification

The Email Classification component takes a body of text as input and provides Intents and/or Tags.

Intent

Intents are top-level categories describing common language attackers use to carry out phishing attacks.

name | description
--- | ---
bec | Emails containing urgent language about quick tasks from C-suite, HR, and Accounting Depts.
callback_scam | Emails containing language about renewing/purchasing services such as tech support, antivirus, or cryptocurrency.
cred_theft | Emails contain language urging users to visit a link leading to a realistic-looking portal that requires their credentials to log in.
extortion | Emails meant to intimidate victims with threats of blackmail.
steal_pii | Emails requesting updates to billing information, personal identification, and tax returns.
job_scam | Deceptive emails disguised as employment offers to dupe students into divulging sensitive data or becoming unwitting accomplices in criminal or fraudulent schemes.

Tags

Tags are subcategories that provide additional context for financial-themed phishing attacks. The service returns the following values:

name | description
--- | ---
invoice | These emails contain language about viewing invoices via links or attachments.
payment | These emails contain language about ACH, EFT, or Wire payments.
purchase_order | These emails contain language about Purchase Orders, Requests for Quotation.

Example Usage

type.inbound
and any([body.plain.raw, body.html.inner_text], 
  any(ml.nlu_classifier(.).intents,
    .name == "bec" and .confidence == "high"
  )
)
// first-time sender
and (
        (
            sender.email.domain.root_domain in $free_email_providers
            and sender.email.email not in $sender_emails
        )
        or (
            sender.email.domain.root_domain not in $free_email_providers
            and sender.email.domain.domain not in $sender_domains
        )
)
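Tags can presumably be matched in a similar way. The sketch below is illustrative only: this page does not show how tags are accessed, so the .tags list and its .name field are assumptions modeled on the .intents usage above.

```mql
// sketch: inbound message whose text is tagged as invoice-themed
// assumption: tags are returned as a .tags list with a .name field, mirroring .intents
type.inbound
and any([body.plain.raw, body.html.inner_text],
  any(ml.nlu_classifier(.).tags, .name == "invoice")
)
```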

Entity Recognition

Named Entity Recognition (NER) identifies, tags, and extracts important keywords within a body of text. Users can leverage this output to determine if an email contains language commonly associated with urgency, requests, or financial matters. The available entities are listed below:

name | description | example(s)
--- | --- | ---
greeting | Token(s) that aid in the identification of the recipient | hello, dear
financial | Token(s) containing financial details such as payments, bank accounts, or real estate transactions | wire, bank details, ACH payment
org | Token(s) containing an organization name | Google, Microsoft
recipient | Token(s) representing the recipient of the email. Either a name or a generic designator. | Jane Doe, all
request | Token(s) asking the recipient to act on behalf of the sender | "I need you to", "please open"
salutation | Token(s) signifying the end of the correspondence, aids in the identification of the sender | thanks, regards
sender | Token(s) representing the sender of an email. Either a name or a generic designator. | Ms. Tyrell, IT Department
urgency | Token(s) containing language meant to urge recipient to act immediately | ASAP, immediately

Example Usage

type.inbound
and sender.display_name in~ $org_display_names
and any([body.plain.raw, body.html.inner_text], 
  any(ml.nlu_classifier(.).entities, .name == "urgency") and
  any(ml.nlu_classifier(.).entities, .name == "request")
)
// first-time sender
and (
        (
            sender.email.domain.root_domain in $free_email_providers
            and sender.email.email not in $sender_emails
        )
        or (
            sender.email.domain.root_domain not in $free_email_providers
            and sender.email.domain.domain not in $sender_domains
        )
)

Considerations

Keep in mind that the NLU engine only looks at text, so it needs additional context to be an adequate detector. For example, attackers may craft an email that looks the same as a password reset for your favorite social network. The NLU engine would classify the text as cred_theft, but it would do the same for a legitimate password reset email. Pairing it with a first-time/unsolicited sender check or LinkAnalysis provides the context needed to make an effective detector.
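One way to add that context is to require both a cred_theft intent and a link that LinkAnalysis reports as containing a login form. This is a minimal sketch built only from fields used in the examples on this page, not a tuned or published rule; accessing .credphish.contains_login directly on the function output, rather than through all([...]) as in the LinkAnalysis examples, is an assumption.

```mql
// sketch: credential theft language plus a link that leads to a login form
type.inbound
and any([body.plain.raw, body.html.inner_text],
  any(ml.nlu_classifier(.).intents,
    .name == "cred_theft" and .confidence == "high"
  )
)
// context from LinkAnalysis: at least one link resolves to a page with a login
and any(body.links, beta.linkanalysis(.).credphish.contains_login)
```

The first-time sender block from the examples above can be appended as a further condition to narrow matches to unfamiliar senders.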