WepMetering

Written by Bjarke Viksoe.
This article was submitted 9/4/2001.

Description

WepMetering is an experiment to monitor calls made to 3rd party or Windows system DLLs. In this example, the WinSOCK DLL will be targeted and any data sent to the internet will be intercepted.

So the goal of this experiment is to intercept and log all calls made to the WinSOCK functions send() and sendto() by any process on the computer. We could have chosen any function inside any DLL, but spying on what the Firefox application is sending seemed fun enough to build a sample around.

Before diving into the fun-but-grungy ASM, it would be useful to see the big picture.
To intercept calls that another application makes to a specific DLL, we need to either intercept the call before it is made (which is almost impossible) or instrument the target DLL with a logging function, which is called as it executes the actual DLL code.

Ways to do it

There are a few methods you can use for remote process spying. They all have pros and cons. I'll summarize:

Replace the DLL

The easiest way to spy is simply to create a new DLL that is an exact match of the DLL you want to spy on. It must have the same export signature and will act as a bridge between the caller and the original DLL function.
The fake DLL will log any incoming call and simply pass the call on to the original DLL where it is processed.
The obvious disadvantage is that you must physically replace the DLL you want to spy on. Furthermore, many system DLLs contain undocumented export entries, which will be difficult to bridge.

Replace the process' Import Address Table

This method was described by Matt Pietrek in an early MSJ article. It takes advantage of the fact that the compiler doesn't emit calls to functions imported at link time as direct jumps into the DLL, but rather routes them through an Import Address Table, so the system DLL loader can quickly resolve the addresses at load time.
The spy code is injected into the calling process and replaces entries in the Import Address Table. Nice and clean.
The problem with this approach is that the DLL must be in the import table of the process. Calls into a DLL loaded dynamically with the ::LoadLibrary() Windows API do not go through the table and are not caught.
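
The idea can be illustrated with a toy model. This is only a sketch: a real hook would have to walk the PE import directory of the process, which is not shown here. The single global slot g_importSlot stands in for one Import Address Table entry, and RealSend, SpySend and HookImportSlot are hypothetical names invented for this illustration.

```cpp
// Toy model of one Import Address Table entry (not real PE parsing).
using SendFn = int (*)(const char* data, int len);

// Stand-in for the function exported by the target DLL.
int RealSend(const char* /*data*/, int len) { return len; }

// The "application" always calls through this slot, just as compiled
// code calls imported functions through its import table.
SendFn g_importSlot = RealSend;

SendFn g_original    = nullptr;  // saved original entry
int    g_loggedBytes = 0;        // what the spy has seen so far

// The spy: log, then forward to the original function.
int SpySend(const char* data, int len)
{
    g_loggedBytes += len;
    return g_original(data, len);
}

// Replace the table entry, remembering the original.
void HookImportSlot()
{
    g_original   = g_importSlot;
    g_importSlot = SpySend;
}
```

After HookImportSlot() every call made through g_importSlot is logged, while a caller that holds the address of RealSend directly is not affected. That is the same limitation as described above for dynamically loaded DLLs.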

Hook into the remote DLL code

A more brutal approach is to inject your spying code right into the remote code. Any call made to the hooked function is automatically redirected to the actual spying code, which returns control to the original code when done.
This is sometimes referred to as a Trampoline, and this is the approach described here. The problems with this approach will be highlighted later.

Getting inside

To inject anything into a remote process you must first be able to control the remote process. This was very easy on Windows 3.1, got a little more difficult on Windows 9X, and Windows NT imposed even more problems. Still, it can be done.
Windows NT separates processes as a security precaution. One process is usually not allowed to do anything inside another process. The exception to this is when you are debugging the process - or if you can trick the remote process into loading and running your code inside its address space.
Windows NT offers at least two possibilities to have a custom DLL loaded into every process on the machine.

Use the AppInit_DLLs registry value. It contains a list of DLLs that the system will load into every process that loads USER32.DLL.
Use a system-wide message hook. The message hook is really intended for computer-based training (replay of keyboard strokes etc), but works by injecting the hook DLL into every process so a callback function can be called in the process' context.

In this example the WH_CALLWNDPROC system-wide message hook is used. It enables us to monitor the messages sent to window procedures. This is not really what we wanted, but it allows us to install the real spying mechanism.
The message hook must reside in its own DLL. Since it's injected into all the processes running on the machine, we should try to keep it as fast and tiny as possible.
Here is the message callback logic:

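// Per-process flag: each process that gets the hook DLL has its own copy
static bool bFirstTime = true;

LRESULT CALLBACK HookProc(int nCode, WPARAM wp, LPARAM lp)
{
   if( bFirstTime ) {
      bFirstTime = false;
      // Only install the spy in processes we care about
      if( ProcessRecognized() ) {
         InjectWinsockHook();
      }
   }
   // Always pass the message on to the next hook in the chain
   return ::CallNextHookEx(NULL, nCode, wp, lp);
}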

The HookProc gets called by the system for all messages sent - usually starting as the first window opens.
To limit the number of processes that we spy on further, only processes whose names we recognize get injected with the spy mechanism.
The installation of the actual spy hook is located in another DLL to keep the size of the message hook DLL down.
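
The name check could look something like the following. This is a minimal sketch and not the code from the sample: IsRecognizedProcess and the kTargets list are made up for illustration, and the path is assumed to come from a call such as ::GetModuleFileName(NULL, ...) made inside the hooked process.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Returns true if the executable name (without directory, compared
// case-insensitively) is one we want to spy on.
bool IsRecognizedProcess(const std::string& exePath)
{
    // Strip the directory part, accepting both kinds of slashes.
    const std::size_t slash = exePath.find_last_of("\\/");
    std::string name = (slash == std::string::npos)
                           ? exePath
                           : exePath.substr(slash + 1);

    // Windows file names are case-insensitive, so compare in lower case.
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });

    // Hypothetical list of targets; extend as needed.
    static const char* const kTargets[] = { "firefox.exe" };
    for (const char* target : kTargets) {
        if (name == target) return true;
    }
    return false;
}
```

Keeping this test cheap matters, since it runs once in every process that receives the message hook.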

Installing the spy

The spy hook we're going to install makes sure that the WinSOCK DLL is loaded by forcing it in (using ::LoadLibrary()). It then gets the address of the send() function by using another common Win32 API function: ::GetProcAddress().
Because we're calling this from the system-wide message hook, we're actually calling it from inside the remote process. From there the spy can be installed directly into the WinSOCK function.

The spy code itself is really all about redirecting the call from the WinSOCK function to our own logging function. Then we return control back to WinSOCK and let it execute its code.

To redirect the WinSOCK function to the logging function, we simply insert a JMP assembler instruction into the first bytes of the target function (WinSOCK.send). Assuming that it's at least 5 bytes long, there is plenty of room to do this. Of course since the JMP instruction will destroy some of the original WinSOCK code, we make a backup of the first few bytes so we can execute them later.
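
The 5 bytes of the JMP are the opcode E9 followed by a 32-bit displacement, which is counted from the end of the JMP instruction and not from its start. As a sketch of how the bytes could be computed (EncodeJmpRel32 is a name chosen for this illustration, and addresses are treated as plain 32-bit numbers):

```cpp
#include <array>
#include <cstdint>

// Encode a near "JMP rel32" located at address `from` that lands on `to`.
// The CPU adds the displacement to the address of the NEXT instruction,
// which is from + 5.
std::array<std::uint8_t, 5> EncodeJmpRel32(std::uint32_t from, std::uint32_t to)
{
    // Unsigned arithmetic wraps around, so backward jumps work too.
    const std::uint32_t rel = to - (from + 5);

    // Opcode followed by the displacement in little-endian byte order.
    std::array<std::uint8_t, 5> code = {{
        0xE9,
        static_cast<std::uint8_t>(rel),
        static_cast<std::uint8_t>(rel >> 8),
        static_cast<std::uint8_t>(rel >> 16),
        static_cast<std::uint8_t>(rel >> 24)
    }};
    return code;
}
```

In the real hook these bytes are written over the start of the target function after the memory page has been made writable (for instance with ::VirtualProtect()).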

One of the biggest hurdles is that we need to inject the JMP hook right into unknown WinSOCK assembler code. We know exactly how many bytes the JMP instruction requires, but how can we be sure that the backup we make is a valid assembler opcode sequence? We're going to execute the backed up code later, so we better make sure that we don't cut an assembler instruction in half.
A little study of how compilers generate code will tell us that they usually generate some stack frame setup code in the beginning of every function:

WinSOCK.send:
                     ; Stack frame setup
55                   push        ebp
8B EC                mov         ebp,esp
83 EC 10             sub         esp,10h
                     ; The actual function code starts here
33 D2                xor         edx,edx

This is 6 bytes of assembler code, one more than the 5 bytes our JMP instruction needs. If we can identify these assembler instructions as a known stack frame setup stub, we can also safely back up the 6 bytes for later execution.
This is in fact the greatest weakness of this approach. We rely on compiler-specific output here. Though the number of ways to generate stack frame setup code is limited, it could still be a problem for some DLLs. To be 100% accurate we would actually need to write a little disassembler to analyse the code, but that's clearly out of scope for this sample.
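
A pattern check of this kind might be sketched as follows. MatchPrologue is a hypothetical helper, not the code from the sample. The first pattern is the one from the listing above. The other two (the variant with a 32-bit immediate and the "mov edi,edi" hot-patch prologue) are additions of mine, included only to show how the table of known stubs could grow.

```cpp
#include <cstddef>
#include <cstdint>

// Returns the length in bytes of a recognized stack frame setup stub
// at `code`, or 0 if the bytes match none of the known patterns.
// `avail` is the number of readable bytes at `code`.
std::size_t MatchPrologue(const std::uint8_t* code, std::size_t avail)
{
    // push ebp / mov ebp,esp / sub esp,imm8    -> 55 8B EC 83 EC ib
    if (avail >= 6 && code[0] == 0x55 && code[1] == 0x8B &&
        code[2] == 0xEC && code[3] == 0x83 && code[4] == 0xEC)
        return 6;

    // push ebp / mov ebp,esp / sub esp,imm32   -> 55 8B EC 81 EC id
    if (avail >= 9 && code[0] == 0x55 && code[1] == 0x8B &&
        code[2] == 0xEC && code[3] == 0x81 && code[4] == 0xEC)
        return 9;

    // mov edi,edi / push ebp / mov ebp,esp     -> 8B FF 55 8B EC
    if (avail >= 5 && code[0] == 0x8B && code[1] == 0xFF &&
        code[2] == 0x55 && code[3] == 0x8B && code[4] == 0xEC)
        return 5;

    return 0;  // Unknown code: refuse to hook rather than risk a crash
}
```

Every pattern is at least 5 bytes long and ends on an instruction boundary, so the returned length is always a safe number of bytes to back up. All the instructions involved are position independent, so they can be executed from another address without modification.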

After injecting the JMP instruction, the code from before will look like this:

WinSOCK.send:
                     ; The injected JMP instruction
E9 D8 3C 00 00       jmp         hook_stub
10                   ; <- This is a left over byte!
                     ;    It's not a valid instruction.
                     ; Here continues the WinSOCK code...
33 D2                xor         edx,edx

We now need to construct the actual logging function. One thing is for sure: when the logging is done, we need to return control to the original WinSOCK code. In fact, we need to start by executing the original code (the stack frame setup stub) that we overwrote with our JMP hook.
Here is an example of what the final ASM code for the hook stub for the send() function will look like:

hook_stub:
         ; Call logging code
         pushad
         call        my_logging_function
         popad
         ; This is a copy of the original
         ; first 6 bytes the WinSOCK function
         ; contained.
         push        ebp
         mov         ebp,esp
         sub         esp,10h
         ; Return control to WinSOCK
         jmp         WinSOCK.send + 6

And that's basically all there is to it.

The logging function itself (here called my_logging_function) is actually not interesting. It can do anything. The only challenge is to extract the actual arguments passed to the send() function. But they are still to be found on the stack, and as long as you take into account what the pushad and call instructions push on top of them, and the order in which the calling convention places the arguments (send() is __stdcall, so they are pushed right to left), it's relatively easy.
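
To make the offsets concrete, here is a small sketch that treats the stack as an array of 32-bit words, starting at the value ESP has on entry to my_logging_function. SendArgs and ReadSendArgs are names invented for this illustration. The layout assumes the hook stub shown above on 32-bit x86, and a logging function that has not yet pushed anything itself.

```cpp
#include <cstdint>

// The four arguments of send(SOCKET s, const char* buf, int len, int flags),
// kept as raw 32-bit values as they appear on the stack.
struct SendArgs {
    std::uint32_t socket;
    std::uint32_t buffer;   // address of the data in the hooked process
    std::uint32_t length;
    std::uint32_t flags;
};

// `esp` points at the top of the stack on entry to the logging function.
SendArgs ReadSendArgs(const std::uint32_t* esp)
{
    // esp[0]       return address into hook_stub (pushed by "call")
    // esp[1..8]    the eight registers saved by "pushad"
    // esp[9]       return address into the caller of send()
    // esp[10..13]  s, buf, len, flags (first argument at the lowest address)
    const std::uint32_t* args = esp + 1 + 8 + 1;
    return SendArgs{ args[0], args[1], args[2], args[3] };
}
```

So the first argument is found 40 bytes above the stack pointer: 4 bytes for the call, 32 for pushad and 4 for the original return address.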

The hook stub code above will be dynamically constructed (using the ::VirtualAlloc() API) on the fly as we hook the function. Using self-modifying code is not something I would usually recommend, but it comes in very handy here.
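
Constructing the stub is mostly a matter of appending bytes to a buffer. The sketch below shows one way it could be done. BuildHookStub and AppendRel32 are hypothetical names, addresses are plain 32-bit numbers, and the result is an ordinary byte vector. In the real hook the bytes would be copied to executable memory obtained from ::VirtualAlloc(), and stubBase would be the address of that memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append a 5-byte "opcode rel32" instruction (E8 = call, E9 = jmp).
// The displacement is relative to the address following the instruction.
static void AppendRel32(std::vector<std::uint8_t>& out, std::uint8_t opcode,
                        std::uint32_t stubBase, std::uint32_t target)
{
    const std::uint32_t next =
        stubBase + static_cast<std::uint32_t>(out.size()) + 5;
    const std::uint32_t rel = target - next;
    out.push_back(opcode);
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>(rel >> shift));
}

// Build the hook stub for a function at targetAddr whose first
// prologueLen bytes have been backed up in `prologue`.
std::vector<std::uint8_t> BuildHookStub(std::uint32_t stubBase,
                                        std::uint32_t loggerAddr,
                                        std::uint32_t targetAddr,
                                        const std::uint8_t* prologue,
                                        std::size_t prologueLen)
{
    assert(prologueLen >= 5);  // must cover the 5-byte JMP we injected

    std::vector<std::uint8_t> stub;
    stub.push_back(0x60);                                       // pushad
    AppendRel32(stub, 0xE8, stubBase, loggerAddr);              // call logger
    stub.push_back(0x61);                                       // popad
    stub.insert(stub.end(), prologue, prologue + prologueLen);  // saved code
    AppendRel32(stub, 0xE9, stubBase,                           // jmp back
                targetAddr + static_cast<std::uint32_t>(prologueLen));
    return stub;
}
```

For the 6-byte stub of send() this produces 18 bytes. Note that the backed-up bytes are copied verbatim, which only works because push, mov and sub do not depend on the address they execute from.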

You may ask why we need to bother with the system wide message hook at all? Why not just load the WinSOCK DLL in our own process and inject the hook code once and for all?
The answer to that is the Virtual-Memory protection that Windows uses. It will share the memory pages containing code in any loaded DLL among all processes. But it also has a "copy-on-write" policy: Whenever you modify a shared page it will make a local copy of that page before it applies your modification. So to modify the WinSOCK DLL in a remote process, we must make sure we modify the local copy in that process.

Notes

This sample does not log return values. You cannot apply the same logic directly if, for instance, you wanted to spy on the recv() function in the WinSOCK DLL: recv() only has its buffer filled by the time it returns, whereas my example only logs at function entry. However, an example of how to do this is given in the mentioned 1996 MSJ article by Matt Pietrek (the guru).

With the arrival of Internet Explorer 8, the process isolation seems to prevent the automatic injection of the DLL into the IE browser. The sample still works with Firefox and most other applications.

Source Code Dependencies

Microsoft Visual C++ 6.0

Download Files

Source Code (37 KiB)