Rapid7’s Managed Detection and Response (MDR) services team leverages specialized toolsets, malware analysis, tradecraft, and collaboration with Rapid7’s Threat Intelligence researchers to detect and remediate threats. Recently, we identified increased use of a type of malicious document that leverages malformed document headers, white fonts to hide obfuscated JScript code, and embedded VBA macros that execute the document’s contents using WScript.

Rapid7 determined that the techniques related to the sample analyzed in this blog post are commonly used one at a time across many distinct malware families as one-off antivirus bypasses. However, the multi-layered antivirus evasion techniques found in this sample highlight the increasing sophistication of commodity malware campaigns’ dropper payloads. Our MDR team determined that at the time of analysis, the document sample’s actions resulted in the execution of a final-stage payload that contained a configuration file colloquially associated with the TrickBot family of malware. Malicious document dropper techniques are often final-stage-agnostic, so this analysis will focus on the malicious document itself. No familiarity with TrickBot is required.

Malicious document sample:
Filename: 18575DOC18575.docm
MD5: 1acfb8c3d7d2f4b72facc09e4d2631ad
SHA1: 984d556e7ed72666a63c2053c4f2e787b3612162
SHA256: 6779afbdb100e56b118495d0745a7c8ae4bed6beeac6b6c26f578daeffc35c49
SHA512: 978f29068eb3c865cee95a7bff6d4a9b0c4a6fe209cc06e8870097c348bd6c4eda9fcdad05480a9bff21dd22b9a42e68570ca4b6b2bcac3e91b581c02a53360c
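
Before working with a retrieved copy of the sample, it can help to confirm that it matches the published hashes. The following is a minimal Python sketch and not part of the original analysis; the names file_digests and matches_expected and the chunk size are illustrative choices.

```python
import hashlib

# Digests published above for the sample (two shown; the others work the same way).
EXPECTED = {
    "md5": "1acfb8c3d7d2f4b72facc09e4d2631ad",
    "sha256": "6779afbdb100e56b118495d0745a7c8ae4bed6beeac6b6c26f578daeffc35c49",
}


def file_digests(path, algorithms=("md5", "sha1", "sha256", "sha512"), chunk_size=65536):
    """Return {algorithm: hex digest} for a file, reading it in chunks."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches_expected(path, expected=EXPECTED):
    """True only if every expected digest matches the file on disk."""
    digests = file_digests(path, algorithms=tuple(expected))
    return all(digests[name] == value.lower() for name, value in expected.items())
```

Calling matches_expected on a local copy returns True only when every listed digest agrees, which guards against analyzing a different or truncated file.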

To begin malicious document analysis, our MDR team often detonates samples in open source automated malware analysis sandboxes to gain initial insight into the sample's behavior.

In this instance, upon opening the document in Microsoft Word, the malicious document displays the following error: "We're sorry. We can't open 18575DOC18575 because we found a problem with its contents."

Clicking "Details" reveals the following message: "The file is corrupt and cannot be opened."

word-file-corrupt-cannot-be-opened

After clicking "OK," a new error dialog box displays the following: "Word found unreadable content in 18575DOC18575. Do you want to recover the contents of this document? If you trust the source of this document, click Yes."

word-unreadable-conten-trust-source

The error is caused by the inclusion of form-data in place of the document header:

form-data-inclusion

It is possible that the form-data headers were prepended by mistake. However, the nature of the resulting error suggests that the malformed headers are used deliberately, either to circumvent antivirus solutions or as an anti-sandboxing technique:

word-document-created-earlier-version
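
A .docm file is a ZIP-based container, so a well-formed one begins with the ZIP local file header signature (the bytes "PK" followed by 0x03 0x04). As a rough triage aid, the Python sketch below reports how many bytes sit in front of the first such signature. It is an illustration rather than the tooling used in this analysis, and the function names are hypothetical.

```python
ZIP_LOCAL_HEADER = b"PK\x03\x04"  # signature that starts a well-formed ZIP/OOXML file


def leading_junk_length(data: bytes) -> int:
    """Return the number of bytes before the first ZIP local file header.

    0 means the container starts where it should; -1 means no header was found.
    """
    return data.find(ZIP_LOCAL_HEADER)


def describe_container(data: bytes) -> str:
    """Summarize whether anything was prepended to the ZIP container."""
    offset = leading_junk_length(data)
    if offset == 0:
        return "starts with ZIP header"
    if offset < 0:
        return "no ZIP header found"
    return f"{offset} bytes prepended before ZIP header"
```

A nonzero offset on an Office Open XML document is worth a closer look, since scanners that identify file types from the first few bytes may not recognize such a file as a document at all.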

If a user enables the document's macros, the following VBA is executed when the document is closed:

The macro subprocedures CopyTemplateToRepo and StupidError are "junk code" taken from an open source project; they are included but never called. Malicious actors pad macros with benign, unused code like this to circumvent antivirus software.

Upon closing the document, the remaining code copies the hidden (white font) text from the document into a file at <Application.StartupPath>\var.jse, then runs the PowerShell cmdlet "Get-History" before executing the copied file (<Application.StartupPath>\var.jse) with Windows' default '.jse' file handler (typically WScript.exe).

word-document-created-earlier-version-code

The document's text is heavily obfuscated and is shown truncated below:

document-text-obfuscated-truncated

During initial malware triage, analysts require a quick way to extract and act upon indicators of compromise (IoCs) before full analysis is complete.

The dropper payload and its ilk rely on convoluted function calls that return individual characters (this obfuscation technique is quickly identified by the large number of calls to one function).
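
One quick way to spot this pattern is to count how often each function name is called. The Python sketch below is a rough heuristic of our own for illustration, not the exact tooling used in this analysis; the regular expression and keyword list are simplifying assumptions.

```python
import re
from collections import Counter

# An identifier immediately followed by "(" is treated as a call site.
CALL_PATTERN = re.compile(r"\b([A-Za-z_]\w*)\s*\(")
# Keywords that can precede "(" but are not function calls.
KEYWORDS = {"if", "for", "while", "switch", "catch", "function", "return", "typeof", "with"}


def top_called_functions(source: str, limit: int = 5):
    """Return the most frequent call-like identifiers as (name, count) pairs."""
    counts = Counter(
        name for name in CALL_PATTERN.findall(source) if name not in KEYWORDS
    )
    return counts.most_common(limit)
```

A function declaration is counted once as well, which does not matter when the decoder function is called thousands of times. A single name dominating the list is a strong hint about which function to analyze first.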

Using a tool such as CyberChef's "JavaScript Beautify" recipe, we can format the obfuscated code to be more readable.

Beautified Code

We find the following functions at lines 116–131:

Rapid7 reviewed JScript Math functions and determined function obDlaao99 will return 0.

Rapid7 analyzed function obDlaao and determined that it returns the result of fromCharCode(first_parameter).

To begin manual deobfuscation of large JScript payloads, Rapid7 uses scripting languages (notably Python) and regular expressions to speed up deobfuscation.

Because the function declarations and their return values all sum two integer values and pass the result to the "fromCharCode" wrapper function (obDlaao), Rapid7 used the following Python script and regular expression to quickly extract IoCs from the document's hidden white-font JScript code.

This prints the following:

We can immediately identify function calls, an error message, Microsoft binaries, a malicious URL, and GET parameters, all of which could prove useful as IoCs.
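
Once a URL has been recovered, splitting it into host, path, and parameter names makes it easier to turn into detections. The Python sketch below is illustrative only: it parses a defanged URL (such as the one reported later in this post) without contacting it, and the helper names refang and url_iocs are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def refang(indicator: str) -> str:
    """Undo the common "[.]" defanging so the URL can be parsed (never fetched)."""
    return indicator.replace("[.]", ".")


def url_iocs(defanged_url: str) -> dict:
    """Split a URL into the pieces most useful as indicators of compromise."""
    parts = urlsplit(refang(defanged_url))
    return {
        "host": parts.hostname,
        "path": parts.path,
        # Parameter names are often more stable across requests than their values.
        "params": sorted(parse_qs(parts.query, keep_blank_values=True)),
    }
```

The parameter names are kept separately from their values because, in this sample, several values are random per request while the names stay fixed.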

After identifying a pattern in the function call, we can reduce the malicious payload from roughly 11,000 lines to fewer than 600.

Deobfuscated Code

Rapid7 found this pattern by looking for similarities in the structure surrounding the function call. We are able to grab capture groups that are added before being passed to the function. This particular regular expression will not work for most samples, but the methodology is effective across samples.
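
The methodology can be sketched in Python as follows. This is not the original script: the call shape obDlaao(<integer> + <integer>) is an assumption made for illustration, and the regular expression would need to be adapted to the structure actually observed in a given sample.

```python
import json
import re

# ASSUMED call shape: obDlaao(<int> + <int>). Adjust to match the sample at hand.
CALL = re.compile(r"obDlaao\(\s*(\d+)\s*\+\s*(\d+)\s*\)")


def decode_calls(source: str) -> str:
    """Replace each decoder call with the string literal it would return."""
    def to_literal(match):
        char = chr(int(match.group(1)) + int(match.group(2)))
        return json.dumps(char)  # double-quoted, escaped string literal
    return CALL.sub(to_literal, source)


def collapse_concatenation(source: str) -> str:
    """Naively merge adjacent literals: "a" + "b" becomes "ab"."""
    return re.sub(r'"\s*\+\s*"', "", source)


def deobfuscate(source: str) -> str:
    return collapse_concatenation(decode_calls(source))
```

For example, deobfuscate('var u = obDlaao(100 + 4) + obDlaao(110 + 6) + obDlaao(116 + 0) + obDlaao(56 + 56);') yields 'var u = "http";'. The concatenation step is deliberately simple and is meant for surfacing readable strings during triage, not for producing a faithful rewrite of the script.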

After removing unused variables, creating comments, and adding suitable variable names, Rapid7 created the following corresponding pseudocode:

After in-depth analysis of the malicious document, Rapid7 determined the following about the malicious document's white-font hidden JScript code:

  • Checks whether the running script is in %TEMP% by searching for the substring "\temp" in WScript[ScriptFullName].

    • If the running script is not in %TEMP%, the sample produces an error message popup, copies the contents of the document to a variable and appends "var seed<random_integer>=<random_integer>;" to the variable.
  • Uses WMI queries against the Win32_OperatingSystem, Win32_ComputerSystem, and Win32_Process operating system classes to fingerprint the host.

    • POSTs fingerprint to C2
    • These WMI task fingerprinting techniques have been associated with OSTAP droppers in the past, which indicates this is an artifact from older samples.
  • Acquires a positive random integer smaller than 2^32, which it uses as a .txt filename and a "&z=" GET parameter.

  • Saves a copy of the white-font hidden JScript from the existing variable (with the appended seed) to the random integer named text file (which we will now call persistence.txt).

  • Creates an .LNK shortcut file named maxp.lnk in the Windows Startup folder.

    • The .LNK file has a target path of: WScript, and arguments: /B /e:Jscript <path to persistence.txt>
    • Attackers use this technique to persist across shutdowns and restarts.
  • Attempts to acquire the second-stage payload from the following URL (replacing parentheses and variables in caps): https://185.130.104[.]187/nana/kum.php?pi=18b&tan=cezar&z=(RANDOM_INTEGER_FROM_PERSISTENCE_OR_444444)&n=0&u=0&an=(RANDOM_INTEGER_BETWEEN_3_AND_11779)

  • Checks the response and determines whether it is an executable based on the presence of the MZ header.

    • If the response isn't an executable, the sample uses ROT13 and Base64 to decode the response.
    • The server's response depends on the "&z=" GET parameter: it returns obfuscated JScript when &z=444444, and an encoded executable for other valid random integers.
  • If the "produce" response header is not '0', writes the response to an alternate data stream (ADS).

    • The alternate data stream has the following format: (RANDOM_INTEGER_BETWEEN_1_AND_1000).xml:RE(RANDOM_INTEGER_BETWEEN_1_AND_15001) and is written to %TEMP%
    • Alternate data streams are an older, less common technique that has fallen out of fashion with malware authors; this dropper likely uses them as an antivirus evasion technique.
  • If the "produce" response header is '0', writes the response to the existing persistence.txt file.

  • Checks for all available drives, and all available files matching the following wildcard file masks: *.doc *.xls *.pdf *.rtf *.txt *.pub *.odt

  • Attempts to overwrite all matching files with the contents of persistence.txt

    • This spreading technique relies upon shared network drives where users share documents and would thereby open a compromised document.
  • Attempts to execute the second-stage response using ShellExecute, or ShellExecute in conjunction with PowerShell's Start-Process cmdlet.
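
The response-handling step in the list above can be approximated in Python when triaging a captured second-stage response. This is a sketch under stated assumptions: the analysis names ROT13 and Base64 but not the order in which they are applied, so the order used here (undo ROT13 first, then Base64) is a guess to verify against the sample, and the function names are hypothetical.

```python
import base64
import codecs


def is_pe(data: bytes) -> bool:
    """True if the buffer starts with the 'MZ' DOS header."""
    return data[:2] == b"MZ"


def decode_stage(body: bytes) -> bytes:
    """Return payload bytes, decoding when the body is not already an executable.

    ASSUMPTION: ROT13 is undone first, then Base64. The analysis above names
    both encodings but does not state their order.
    """
    if is_pe(body):
        return body
    text = body.decode("ascii")
    return base64.b64decode(codecs.decode(text, "rot_13"))
```

If decode_stage raises an error or returns bytes for which is_pe is False, the assumed order is probably wrong for that response and the two steps should be tried in reverse.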

Overall, malicious documents are one of the most common methods of initial breach. The sample analyzed incorporated malformed document headers, malicious embedded VBA macros, PowerShell commands, heavily obfuscated JScript, startup persistence, and payload hiding in NTFS alternate data streams. Together, these techniques evaded detection by many antivirus solutions.

engines-detected-files

When facing unknown threats, or the aforementioned evasion techniques, it is advantageous to have a team of around-the-clock experts monitoring to defend against threats and stop malicious actors in their tracks.

For additional context, we recommend reading the following adjacent and insightful work: