Practical Malware Analysis

Lab 5 — IDA Pro

Chris Eastwood
Published in Malware Analysis
11 min read, Dec 29, 2021

Solutions for Lab 5 within Practical Malware Analysis.

IDA Pro

IDA Pro (the Interactive Disassembler) generates assembly language source code from an executable program. It can disassemble an entire program and performs tasks such as function discovery, stack analysis, and local variable identification, helping the analyst understand (or change) the program's functionality.

This lab utilises IDA to explore a malicious .dll and demonstrates various techniques for navigation and analysis. Any useful shortcuts will be identified.

Practical Malware Analysis
Download Labs

Labs skip from 3 to 5 because there is no Lab 4-x in the book. Chapter 4 covers x86 disassembly, which is covered here (coming soon).

________________________________________________________________

Lab 5–1

This lab analyses the malware found in the file Lab05-01.dll. It is a longer lab designed to demonstrate features of IDA Pro and give hands-on experience.

1. What is the address of DllMain?
2. Use the Imports window to browse to gethostbyname. Where is the import located?
3. How many functions call gethostbyname?
4. Focusing on the call to gethostbyname located at 0x10001757, can you figure out which DNS request will be made?
5. How many local variables has IDA Pro recognized for the subroutine at 0x10001656?
6. How many parameters has IDA Pro recognized for the subroutine at 0x10001656?
7. Use the Strings window to locate the string \cmd.exe /c in the disassembly. Where is it located?
8. What is happening in the area of code that references \cmd.exe /c?
9. In the same area, at 0x100101C8, it looks like dword_1008E5C4 is a global variable that helps decide which path to take. How does the malware set dword_1008E5C4? (Hint: Use dword_1008E5C4’s cross-references.)
10. A few hundred lines into the subroutine at 0x1000FF58, a series of comparisons use memcmp to compare strings. What happens if the string comparison to robotwork is successful (when memcmp returns 0)?
11. What does the export PSLIST do?
12. Use the graph mode to graph the cross-references from sub_10004E79. Which API functions could be called by entering this function? Based on the API functions alone, what could you rename this function?
13. How many Windows API functions does DllMain call directly? How many at a depth of 2?
14. At 0x10001358, there is a call to Sleep (an API function that takes one parameter containing the number of milliseconds to sleep). Looking backward through the code, how long will the program sleep if this code executes?
15. At 0x10001701 is a call to socket. What are the three parameters?
16. Using the MSDN page for socket and the named symbolic constants functionality in IDA Pro, can you make the parameters more meaningful? What are the parameters after you apply changes?
17. Search for usage of the in instruction (opcode 0xED). This instruction is used with a magic string VMXh to perform VMware detection. Is that in use in this malware? Using the cross-references to the function that executes the in instruction, is there further evidence of VMware detection?
18. Jump your cursor to 0x1001D988. What do you find?
19. If you have the IDA Python plug-in installed (included with the commercial version of IDA Pro), run Lab05-01.py, an IDA Pro Python script provided with the malware for this book. (Make sure the cursor is at 0x1001D988.) What happens after you run the script?
20. With the cursor in the same location, how do you turn this data into a single ASCII string?
21. Open the script with a text editor. How does it work?

0. Before we get started.

To help with navigation of IDA, some useful settings and windows should be configured. First enable Line Prefixes, set Opcode bytes to 6, and enable Auto Comments. This will provide some clarity to the assembly. The windows will likely be present by default, but can be switched to with the shortcuts.

1. What is the address of DllMain?

The address of DllMain is 0x1000D02E. This can be found within the graph mode, or within the Functions window (figure 2).

Figure 2: Address of DllMain

2. Where is the import gethostbyname located?

gethostbyname is located at 0x100163CC within .idata (figure 3). This is found by opening the Imports window and double-clicking the function. Here we can also see that gethostbyname takes a single parameter, something like a string.

Figure 3: Location of gethostbyname

3. How many functions call gethostbyname?

Searching the xrefs (Ctrl+X) on gethostbyname shows it is referenced 18 times: 9 are of type (p) for the near call, and the other 9 are reads (r) (figure 4). Each call produces one (p) and one (r) entry, so there are 9 calls in total, made from 5 unique calling functions.

Figure 4: gethostbyname xrefs

4. For gethostbyname at 0x10001757, which DNS request is made?

Pressing G and navigating to 0x10001757, we see a call to the gethostbyname function, which we know takes one parameter; in this case, whatever is in eax, which is loaded from off_10019040 (figure 5).

Figure 5: gethostbyname at 0x10001757

The contents of off_10019040 points to a variable aThisIsRdoPicsP which contains the string [This is RDO]pics.practicalmalwareanalysis.com. This is moved into eax (figure 6).

Figure 6: Contents of off_10019040 (aThisIsRdoPicsP)

Importantly, 0Dh is added to eax, which moves the pointer along the current contents. Pressing H in IDA converts 0Dh to decimal 13. This means eax now points 13 characters into the string, skipping past the prefix [This is RDO] and resulting in the DNS request being made for pics.practicalmalwareanalysis.com.
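
The pointer arithmetic can be checked outside IDA. The following is a minimal Python sketch, assuming the string contents shown in figure 6; the variable names are illustrative and do not come from the malware.

```python
# String stored at aThisIsRdoPicsP, as shown in the disassembly.
raw = b"[This is RDO]pics.practicalmalwareanalysis.com"

# "add eax, 0Dh" advances the pointer by 13 bytes.
offset = 0x0D
prefix = raw[:offset]
hostname = raw[offset:]

print(offset)             # 13
print(prefix.decode())    # [This is RDO]
print(hostname.decode())  # pics.practicalmalwareanalysis.com
```

Slicing at 0x0D plays the same role as the add instruction: everything before the offset is ignored by gethostbyname.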

5 & 6. How many parameters and local variables are recognized for the subroutine at 0x10001656?

There are a total of 24 variables and parameters for sub_10001656 (figure 7).

Figure 7: sub_10001656 parameters and variables

Local variables correspond to negative offsets, of which there are 23. Many are generated by IDA and prefixed with var_, but some have been resolved, such as name or commandline. As we work through the function, we generally rename any of the important ones.

Parameters have positive offsets. Here there is one, currently lpThreadParameter. This may also be seen as arg_0 if not automagically resolved.

7. Where is the string \cmd.exe /c located in the disassembly?

Press Alt+T to perform a text search for \cmd.exe /c, or locate it in the Strings window (Shift+F12). The string is stored as aCmdExeC and is referenced within sub_1000FF58 at 0x100101D0 (figure 8).

Figure 8: Location of ‘\cmd.exe /c’

8. What happens around the referencing of \cmd.exe /c?

The command cmd.exe /c opens a new instance of cmd.exe and the /c parameter instructs it to execute the command then terminate. This suggests that there is likely a construct of something to execute somewhere nearby.

Taking a cursory look around sub_1000FF58, we see several indications of what might be happening. Look for push offset X for quick wins.

Towards the top of the function, we see an offset that is quite telling of what is happening. The offset aHiMasterDDDDDD, referenced at 0x1001009D, contains a long message which includes several format specifiers for system time information (initialised just before), but more notably a reference to a Remote Shell (figure 9).

Figure 9: Contents of offset aHiMasterDDDDDD

Further on throughout the function, there are more interesting offset addresses with strings that may provide an indication of activity.

Figure 10: Offset strings within sub_1000FF58

Some of these are likely part of the command-line activity, whereas others may be additional modules. Some of the notable ones are aInject, aIexploreExe, and aCreateProcessG, which could be indicative of process injection into iexplore.exe.

9. At 0x100101C8, dword_1008E5C4 indicates which path to take. How does the malware set dword_1008E5C4?

The comparison of dword_1008E5C4 and ebx determines whether \cmd.exe /c or \command.exe /c is pushed, likely based upon the operating system version so that the correct command prompt is used (figure 11).

Figure 11: cmd.exe or command.exe options

Following the xrefs of dword_1008E5C4, we see it written (type w) in sub_10001656 with the value of eax. There is a preceding call to sub_10003695, a function that looks at the system's version information (using the API call GetVersionExA) (figure 12).

Figure 12: sub_10003695 querying version information with GetVersionExA

There is a comparison between the VersionInformation.dwPlatformId and 2, so looking at the Windows Platform IDs we see that it is looking to see if ‘The operating system is Windows NT or later.’ If it is, then \cmd.exe /c is pushed. If not, then it is \command.exe /c.
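
The decision can be summarised in a short Python sketch. The constant names are the documented dwPlatformId values; pick_shell is a hypothetical helper written for illustration, not a function from the malware.

```python
# Platform ID values documented for OSVERSIONINFO.dwPlatformId.
VER_PLATFORM_WIN32s = 0
VER_PLATFORM_WIN32_WINDOWS = 1
VER_PLATFORM_WIN32_NT = 2


def pick_shell(platform_id: int) -> str:
    """Hypothetical helper mirroring the branch described above."""
    # The malware compares dwPlatformId with 2 and stores the outcome
    # in dword_1008E5C4; that flag later selects the command prompt.
    if platform_id == VER_PLATFORM_WIN32_NT:
        return "\\cmd.exe /c"
    return "\\command.exe /c"


print(pick_shell(VER_PLATFORM_WIN32_NT))       # \cmd.exe /c
print(pick_shell(VER_PLATFORM_WIN32_WINDOWS))  # \command.exe /c
```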

10. What happens if the string comparison to robotwork is successful?

The robotwork string comparison is completed using the function memcmp, which returns 0 if the two strings are identical. The JNZ branch jumps if the result Is Not Zero. This means, if the robotwork comparison is successful, returning 0, then the jump does not execute (the red path). If the memcmp was unsuccessful, then some other non-zero value would be returned and the jump (green path) would be followed (figure 13).

Figure 13: memcmp of robotwork
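
The branch logic can be sketched in Python. Here memcmp is a rough stand-in for the C function and next_path is a hypothetical helper; the comparison length is assumed to be the length of the keyword.

```python
def memcmp(a: bytes, b: bytes, n: int) -> int:
    """Rough stand-in for C memcmp: 0 when the first n bytes match."""
    x, y = a[:n], b[:n]
    if x == y:
        return 0
    return -1 if x < y else 1


def next_path(command: bytes) -> str:
    """Hypothetical helper: which edge of the graph is followed."""
    keyword = b"robotwork"
    result = memcmp(command, keyword, len(keyword))
    # jnz jumps only when the result is non-zero (strings differ).
    if result != 0:
        return "jump taken (green path)"
    return "fall through (red path) to sub_100052A2"


print(next_path(b"robotwork"))
print(next_path(b"something else"))
```

A return value of 0 means "equal", so the interesting code sits on the fall-through path rather than on the jump target.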

Not jumping (following the red path) leads to a new function, sub_100052A2, which references the registry key SOFTWARE\Microsoft\Windows\CurrentVersion and the values WorkTime and WorkTimes. The function queries WorkTime and WorkTimes (RegQueryValueExA) and, if they exist, formats their values into the relevant aRobotWorktime strings (via %d) (figure 14).

Figure 14: Querying SOFTWARE\Microsoft\Windows\CurrentVersion WorkTime and WorkTimes registry keys

The function takes a SOCKET parameter, s, which is then passed to a new function (sub_100038EE) along with the registry values (ebp) (figure 15).

Figure 15: Passing registry values through SOCKET s

Therefore, if the string comparison for robotwork is successful, the registry keys SOFTWARE\Microsoft\Windows\CurrentVersion WorkTime and WorkTimes are queried and the values passed through (likely) the remote shell connection.

11. What does the export PSLIST do?

Open the Exports window and find the exported function PSLIST (figure 16).

Figure 16: Exports view

Navigating here, we see three subroutine calls. One of them queries OS version information (similar to Q9, but this time it also checks whether dwMajorVersion is 5 for more specific OS footprinting) and, depending on the outcome, either sub_10006518 or sub_1000664C is called (figure 17).

Figure 17: PSLIST exported function paths

Both sub_10006518 and sub_1000664C use CreateToolhelp32Snapshot to take a snapshot of the running processes and associated information, and then query each process's ID, name, and number of threads. sub_1000664C also takes the SOCKET (s) to send the output to (figure 18).

Figure 18: Using CreateToolhelp32Snapshot, querying running processes, and sending to socket

12. Which API functions could be called by entering sub_10004E79?

A useful way to quickly see which API functions are called by a certain subroutine is the Proximity Browser view. This transforms the standard Graph or Text view into a much more condensed graph highlighting which API functions or subroutines are called (figure 19).

Figure 19: Proximity View of sub_10004E79
Figure 20: Functions called by sub_10004E79

The functions called from sub_10004E79 (figure 20) indicate that its purpose is to identify the language used on the system, and then pass that information through the SOCKET (via sub_100038EE, which we have seen before). It might make sense to rename sub_10004E79 to something like getSystemLanguage. While we're at it, we might as well rename sub_100038EE to something like sendSocket.

13. How many Windows API functions does DllMain call directly, and how many at a depth of 2?

Another way to view the API functions called from a given location is through View -> Graphs -> User XRef Chart. Set the start and end addresses to DllMain and the recursion depth to 1 to see four API functions called (figure 21). At a depth of 2, there are around 32, with some duplicates.

Figure 21: API functions called by DllMain.

Some of the more notable API calls which may provide an indication of functionality are: Sleep, WinExec, gethostbyname, inet_ntoa, CreateThread, WSAStartup, inet_addr, recv, send, socket, connect, and LoadLibraryA.

14. How long will the Sleep API function at 0x10001358 execute for?

At first glance, one might think that the value passed to Sleep is 3E8h (1000), equating to 1 second. However, it is an operand of an imul instruction, which means the value in eax is multiplied by 1000. Looking up, we see that the address of aThisIsCti30 is moved into eax and then the pointer is advanced by 13 (0Dh), similar to what was seen in Q4 (figure 22).

Figure 22: Sleep for 30 seconds

This means that when eax is pushed, it points at the text 30. atoi converts that string to the integer 30, which is then multiplied by 1000. Therefore, Sleep is called with 30,000 milliseconds, and the program sleeps for 30 seconds.
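
The arithmetic can be reproduced with a small Python sketch. The string contents are an assumption inferred from the IDA name aThisIsCti30 and the 13-byte prefix; the variable names are illustrative.

```python
# Assumed contents of aThisIsCti30 (inferred from the IDA name).
raw = b"[This is CTI]30"

digits = raw[0x0D:]             # add eax, 0Dh: skip the 13-byte prefix
seconds = int(digits.decode())  # atoi: "30" -> 30
sleep_ms = seconds * 0x3E8      # imul eax, 3E8h: multiply by 1000

print(seconds)   # 30
print(sleep_ms)  # 30000
```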

15 & 16. What are the three parameters for the call to socket at 0x10001701?

The three values pushed to the stack are the three parameters used for the call to socket. They are labeled protocol, type, and af, and have the values 6, 1, and 2 respectively (figure 23).

Figure 23: Call to socket at 0x10001701

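These determine what type of socket is created. Using the socket documentation, we can determine that af = 2 is AF_INET (IPv4), type = 1 is SOCK_STREAM, and protocol = 6 is IPPROTO_TCP, so this is a TCP socket over IPv4. At this point, we might as well apply those symbolic constants to the operands (right-click the operand and choose Use standard symbolic constant) (figure 24).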

Figure 24: Definitions and renaming of socket parameters
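
The numeric values can be cross-checked against the constants exposed by Python's standard socket module. This is only a quick sanity check under the assumption that the host platform uses the same values as Winsock for these three constants; the dictionary layout is illustrative.

```python
import socket

# Values pushed before the call to socket at 0x10001701.
af, sock_type, protocol = 2, 1, 6

# Map each raw value to its symbolic name when it matches.
symbolic = {
    "af": "AF_INET" if af == socket.AF_INET else hex(af),
    "type": "SOCK_STREAM" if sock_type == socket.SOCK_STREAM else hex(sock_type),
    "protocol": "IPPROTO_TCP" if protocol == socket.IPPROTO_TCP else hex(protocol),
}

for name, value in symbolic.items():
    print(f"{name:8} = {value}")
```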

17. Is there VM detection?

The in instruction (opcode 0xED) is used with the magic string VMXh to determine whether the malware is running inside VMware. Search for the byte 0xED with a binary search (Alt+B) and look through the results for the in instruction (figure 25).

Figure 25: Searching for the in instruction using 0xED in binary.

From here, we can navigate into the function and see what is going on within sub_10006196.

Figure 26: in instruction within sub_10006196

Directly around the in instruction, we see evidence of the string VMXh (converted from original hex value) (figure 26), which is potentially indicative of VM detection. If we look at the other xrefs of sub_10006196 we see three occurrences, each of which contains aFoundVirtualMa, indicating the install is canceling if a Virtual Machine is found (figure 27).

Figure 27: Found Virtual Machine string found after VMXh string
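
The hex-to-string conversion that IDA performs on the magic value can be verified in Python. The constant 0x564D5868 is the well-known VMware backdoor magic value; it is stated here from general knowledge rather than copied from the figures.

```python
# VMware backdoor magic value, loaded into eax before the in instruction.
magic = 0x564D5868

# Interpret the 32-bit value as four ASCII characters, most
# significant byte first, which is how IDA displays it.
text = magic.to_bytes(4, "big").decode("ascii")
print(text)  # VMXh

# And the reverse direction, from string back to the constant.
print(hex(int.from_bytes(b"VMXh", "big")))  # 0x564d5868
```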

18, 19, & 20. What is at 0x1001D988?

The data starting at 0x1001D988 appears illegible. We can convert it to an ASCII string (by pressing A), but it remains unreadable (figure 28).

Figure 28: Random data at 0x1001D988

We have been provided with a Python script for the lab, Lab05-01.py, which is intended to be run as a simple IDAPython script. For 0x50 bytes from the current cursor position, the script performs an XOR with 0x55 and prints out the resulting bytes, likely to decode the text (figure 29).

Figure 29: XOR 0x55 script
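
The decoding logic described above can be sketched in plain Python without IDA. This is not the lab's script: xor_decode is a hypothetical helper, and the sample bytes are made up for a round-trip demonstration rather than taken from the DLL.

```python
def xor_decode(data: bytes, key: int = 0x55, length: int = 0x50) -> bytes:
    """XOR up to `length` bytes of `data` with a single-byte key."""
    return bytes(b ^ key for b in data[:length])


# Hypothetical sample: encode some text, then decode it again.
sample_plain = b"hypothetical sample text"
sample_encoded = bytes(b ^ 0x55 for b in sample_plain)

decoded = xor_decode(sample_encoded)
print(sample_encoded)
print(decoded.decode("ascii"))
```

Because XOR is its own inverse, the same routine both encodes and decodes. To decode the real data, the 0x50 bytes starting at 0x1001D988 would be copied out of IDA and passed to the helper.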

We are unable to do this within the free version of IDA. However, we can loosely do it ourselves by taking the bytes from 0x1001D988 and manually XORing them with 0x55.

Evidently, the conversion to ASCII and manual decoding has messed up some of the capitalisation, but we can see some plaintext and determine the completed message (figure 30).

Figure 30: Manual XOR 0x55

rxdoor is this backdoor, string decoded for Practical Malware Analysis Lab :)1234


Chris Eastwood
Malware Analysis

Incident Response, Forensic Investigations, and Threat Hunting professional, writing things to learn them better.