This is a guest blog entry written by Piotr Bania.

Disclaimer
The author takes no responsibility for any actions taken using the provided information or code. This article is copyright (C) 2009 Piotr Bania, all rights reserved. Any duplication of code or text provided here in electronic or printed publications is not permitted without the author's agreement.

Prologue
About a month ago Laurent Gaffié released an advisory in which he described the SMB 2.0 NEGOTIATE PROTOCOL REQUEST Remote BSoD vulnerability. Fortunately for some and unfortunately for others, this vulnerability is remotely exploitable. At the time of writing, there are only two exploits available for this flaw: one written by Immunity Inc., which only provides a copy to paying customers, and one written by Stephen Fewer and included in the Metasploit Framework. Unfortunately, Stephen Fewer's exploit seems to be unreliable against physical machines (as opposed to VMs) because it relies on a hardcoded address in the BIOS/HAL memory region (0xFFD00D09), which must contain a "POP ESI; RET" sequence. In this article I am going to describe a method for exploiting this vulnerability that only requires a stable absolute memory address (filled with NULL bytes).

Step One. Where to?
First, let's take a look at the vulnerable code. We will assume a Windows Vista SP2 operating system and SRV2.SYS version 6.0.6002.18005:

At offset 0x000056B3, EAX is initialized with a word from [ESI+0Ch]. The [ESI+0Ch] location points into the SMB2 packet, giving the attacker complete control over the lower 16 bits of the EAX register (AX). In the next instruction (0x000056B7) our controlled EAX is used as an array index. There is only one safety check on this value, which verifies that *(DWORD*)ValidateRoutines[EAX*4] is not NULL. This is the cause of the vulnerability: there is no check to determine whether the EAX value (the array index) exceeds the number of elements in the ValidateRoutines array. Further in the code, the location pointed to by ValidateRoutines[EAX*4] is executed by the "call EAX" instruction (0x000056CA).
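The difference between a NULL test and a range test can be modelled in a few lines. This is only a sketch and not SRV2.SYS code: the table contents, the table length and the function names are all hypothetical, chosen to show how an index past the end of a table can still pass a "slot is not NULL" check.

```python
# Hypothetical model: memory is a flat list of dwords, and the dispatch
# table occupies only the first TABLE_LEN slots of it.
TABLE_LEN = 4
memory = [0x1000, 0x2000, 0, 0x3000,   # the dispatch table itself
          0xDEAD, 0, 0xBEEF]           # unrelated data that follows it


def lookup_unchecked(index):
    """Mirrors the flawed logic: only a NULL slot is rejected."""
    target = memory[index]
    return target if target != 0 else None


def lookup_checked(index):
    """Adds the missing range check before the slot is read."""
    if not 0 <= index < TABLE_LEN:
        return None
    target = memory[index]
    return target if target != 0 else None
```

With these made-up values, index 4 lies outside the table but is accepted by `lookup_unchecked`, while `lookup_checked` rejects it.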

In summary, we can redirect execution to any location (as long as it is not NULL) stored between ValidateRoutines and ValidateRoutines + (0xFFFF * 4). This gives us about 2^16 potential memory locations to check. In practice the usable set is smaller, since we cannot assume that any memory location outside the SRV2.SYS address space will be consistent across multiple machines (device driver ImageBase addresses change on every boot). To make my life less miserable, I wrote a little program that dumps the SRV2.SYS address space from system memory, then disassembles every potential region that can be reached through ValidateRoutines[INDEX*4]. Additionally, I set some boundaries that ensure we are operating only on the SRV2.SYS address space. Here are the results I have obtained:
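The size of that search space follows directly from the index width and the slot size. The sketch below is only the arithmetic; the table offset and image size in the filter are hypothetical placeholders, not values taken from SRV2.SYS.

```python
ENTRY_SIZE = 4        # each table slot is a 32-bit pointer
MAX_INDEX = 0xFFFF    # largest value a 16-bit index can hold


def slot_offset(index):
    """Byte offset of a slot, relative to the start of the table."""
    return index * ENTRY_SIZE


def slots_inside_image(table_rva, image_size):
    """Count the indexes whose slot still lies inside the driver image.

    table_rva and image_size are hypothetical inputs for illustration.
    """
    return sum(
        1 for i in range(MAX_INDEX + 1)
        if table_rva + slot_offset(i) + ENTRY_SIZE <= image_size
    )


total_slots = MAX_INDEX + 1                           # 65536 candidates
span_bytes = slot_offset(MAX_INDEX) + ENTRY_SIZE      # 0x40000 bytes
```

A 16-bit index therefore reaches 65,536 slots spread over 256 KiB, and any boundary on the image size cuts that number down further.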

I must confess that I was confused at first, not because of the results obtained, but due to the Immunity exploit video that was released. In this video, they stated that exploitation is based on time values. This led me to focus on any function that manipulated time values. I noticed that the SrvBalanceCredits function (index 0x31, 0x4b7) can be used to modify the CurrentTime structure (0x0001D320), which can then be used again later as the memory address for a "call EAX". However, since KeQuerySystemTime returns the time as a count of 100-nanosecond intervals since January 1, 1601, and the system time is typically updated approximately every ten milliseconds, it is very unlikely that this value can serve as a reliable offset. An alternative would be to use the BootTime variable and reboot the machine to reset it; however, my results were still not satisfying (the BootTime and CurrentTime values are both returned as part of a normal SMB2 NEGOTIATE_RESPONSE packet, so it is possible to query these remotely).
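To get a feel for how quickly such a time value moves, the units can be worked out directly. This is a sketch of the arithmetic only; the helper names are made up, and the ten millisecond tick is the approximate figure quoted above rather than a measured value.

```python
from datetime import datetime, timezone

EPOCH_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)
UNITS_PER_SECOND = 10_000_000            # one unit is 100 nanoseconds
UNITS_PER_TICK = UNITS_PER_SECOND // 100  # an assumed 10 ms clock tick


def to_100ns(moment):
    """Convert a timezone-aware datetime to 100 ns units since 1601."""
    delta = moment - EPOCH_1601
    seconds = delta.days * 86400 + delta.seconds
    return seconds * UNITS_PER_SECOND + delta.microseconds * 10


def low_dword(value):
    """The low 32 bits, which is what a 32-bit pointer would be read from."""
    return value & 0xFFFFFFFF


unix_epoch = to_100ns(datetime(1970, 1, 1, tzinfo=timezone.utc))
wrap_seconds = 2**32 / UNITS_PER_SECOND  # how often the low dword wraps
```

Under these assumptions each tick moves the value by 100,000 units, and the low 32 bits wrap roughly every 429 seconds, which is why landing on one specific address is so unlikely.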

I decided that the time approach was a dead end and that it was time to start over from scratch and never watch Immunity videos again :-) After abandoning the time approach, I decided to look into functions that would corrupt the stack by accepting a different number of arguments than the original function. The following indexes showed the most promise: 0x217 (srv2!SrvSnapShotScavengerTimer), 0x237 (srv2!SrvScavengerTimer), 0x1e3 (srv2!SrvScavengeDurableHandlesTimer), and 0x1bb (srv2!SrvProcessOplockBreakTimer). Stephen Fewer's exploit uses 0x217 (srv2!SrvSnapShotScavengerTimer) as an index value. All four of those indexes have something in common:

Each of those functions ends with a "ret 10h", indicating that the function expects four arguments and will adjust the stack to account for them when it returns. To see how this helps us, let's take one more look at the vulnerable code:

As you can see, the procedure pointed to by EAX is called (0x000056CA) with one argument on the stack (see 0x000056C9 - PUSH EBX). SRV2.SYS assumes that the called function uses the stdcall convention (the callee is responsible for cleaning up the stack). Since we forced EAX to point to one of the "ret 10h" functions, the callee will clean the stack, but it will adjust it for four parameters, not just the single parameter that was passed in (0x10=16 -> 16/4=4). How does this influence the execution flow? Take a look:
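The resulting imbalance is easy to compute. The following sketch models only the stack pointer arithmetic of a stdcall call on 32-bit x86; the starting ESP value is arbitrary and the function name is made up for illustration.

```python
DWORD = 4


def esp_after_call(esp, args_pushed, ret_imm):
    """Track ESP across: push args, CALL, then the callee's RET imm16."""
    esp -= DWORD * args_pushed   # caller pushes its arguments
    esp -= DWORD                 # CALL pushes the return address
    esp += DWORD                 # RET pops the return address
    esp += ret_imm               # RET imm16 releases the callee's arguments
    return esp


start = 0x1000                                   # arbitrary example value
balanced = esp_after_call(start, 1, 0x04)        # callee expects one argument
skewed = esp_after_call(start, 1, 0x10)          # callee expects four
skew_in_dwords = (skewed - start) // DWORD
```

With one argument pushed and a "ret 10h" callee, ESP ends up 12 bytes (three dwords) higher than the caller expects, so the following POP instructions read different stack slots than intended.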

The first "d ESP" command shows the stack before the "CALL EAX" (where EAX points to one of the "ret 10h" procedures). The second "d ESP" shows the stack after the "ret 10h" function was executed. The important part is that when the "POP ESI" (0x000056D0) instruction is executed, ESI will be loaded with the pointer to our SMB packet (see "d poi(esp+4)"); this will bring us some serious kudos later. Additionally, even if the stack pointer is invalid at that moment (because we haxored it), it will be reinitialized correctly by the instruction at 0x000056D9. As you probably know, the LEAVE instruction (also called High Level Procedure Exit) sets ESP to EBP and pops EBP. In other words, despite the fact that we have mangled the stack and forced ESI to point to our packet data, ESP will be "good" again. That is important, since otherwise it would cause an exception when executing the "ret 4". Let's assume we used 0x237 (srv2!SrvScavengerTimer) as an index; after a few instructions we land here:
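Why LEAVE repairs the stack pointer can be seen from a small model of the instruction. This is a sketch with made-up register and memory values; the point is only that the resulting ESP depends on EBP and not on the ESP value that was there before.

```python
def leave(esp, ebp, memory):
    """Model of x86 LEAVE: mov esp, ebp ; pop ebp."""
    esp = ebp             # the incoming ESP is discarded entirely
    ebp = memory[esp]     # pop the saved frame pointer...
    esp += 4              # ...and release its slot
    return esp, ebp


# Hypothetical frame: the saved EBP lives at address 0x2000.
memory = {0x2000: 0x2040}

good = leave(0x1FF0, 0x2000, memory)     # ESP as the function expects it
mangled = leave(0x1FFC, 0x2000, memory)  # ESP skewed by 12 bytes
```

Both calls return the same pair, so a skewed ESP does not survive past LEAVE as long as EBP itself is intact.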

As you can see, ESI still points to our packet. The instruction at 0x0001FAB1 (setnl cl) is also a key factor in the way I have chosen to exploit this, since the setnl result depends on the value returned by our "faked function". The CL register must be 1 before the PUSH ECX is executed, which is why a function like 0x1e3 (srv2!SrvScavengeDurableHandlesTimer) will not work. This will be discussed later.

Step Two. Mum I want a Trampoline!
In this step we will create a trampoline that will transfer code execution to the shellcode. Stephen's exploit code depended on a static "pop esi; ret" address, which made it unreliable on many non-virtual machines. With my technique, we just need to find a stable 4-byte memory region filled with NULL bytes (or any other predictable value) and we will force the SMB code to build a trampoline for us, using just 351 packets. After some digging I found the following piece of code interesting (located at the end of the _SrvProcPartialCompleteCompoundedRequest@8 function):

The instruction located at 0x0002115F is used to atomically increase the value pointed to by the EAX register by ECX (=1). This is actually a variation of the InterlockedExchangeAdd function. The key point here is that the EAX register value is controlled by the SMB packet and ECX is set to 1. Let's review how the EBX register value is computed:

In the code above, you can see that EBX is equal to the [packet+0xAC] field. This means that the memory region that is increased by the xadd instruction is located at [packet+0xAC]+0xBC (this offset changes among the different Vista versions). This provides us with full control of the area that will be increased by each request. So what are we going to do with it? We are going to build a trampoline, dumbass :-)
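The effect of that instruction on memory can be modelled on a plain byte buffer. This is a sketch, not a reproduction of the driver code: the buffer, the field value and the helper names are hypothetical, and only the 0xBC displacement comes from the description above.

```python
import struct

DISPLACEMENT = 0xBC   # added to the packet-controlled field, as described above


def xadd_dword(mem, offset, addend):
    """Model of XADD on a little-endian dword: add in place, return old value."""
    (old,) = struct.unpack_from("<I", mem, offset)
    struct.pack_into("<I", mem, offset, (old + addend) & 0xFFFFFFFF)
    return old


def touched_offset(field_value):
    """Offset that gets incremented for a given value of the controlled field."""
    return field_value + DISPLACEMENT


mem = bytearray(0x200)                 # stand-in for a zero-filled region
target = touched_offset(0x40)          # a made-up field value -> offset 0xFC
for _ in range(3):                     # three requests, one increment each
    xadd_dword(mem, target, 1)
```

After the loop, the dword at offset 0xFC holds 3 and every other byte is still zero, which is the one-increment-per-request primitive the rest of this step builds on.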

To do that, we must consider:

1) We need an absolute memory address that is executable (see DEP) and is filled with constant data (NULLs in our case, however thanks to the xadd arithmetic operation any stable value works). We need four bytes of NULLs at the address and an additional three bytes before it to handle overlapping writes to reduce the number of packets required.

2) We need to know what value to compute and how many requests it will take to accomplish this.

Answers:

1) Let's use the same BIOS/HAL region chosen by Stephen's exploit, since the memory here is readable, writable, and executable. NULL bytes in this region are much easier to find than a POP ESI;RET for sure!

2) It seems that the opcode sequence "INC ESI; PUSH ESI; RET" (0x46 0x56 0xC3) would be the easiest way to bounce to our shellcode using this as a trampoline. However, writing the value 0x4656C3 with a single increment per request would require us to send 4,609,731 packets. Fortunately, there is a solution that reduces this to just 351 packets, a much more reasonable number. The trick is to divide the process into three stages, where each stage is responsible for increasing only one byte. For example, we send 0x46 packets to increment address 0, 0x56 packets to increment address 1, and 0xC3 packets to increment address 2.
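The packet counts can be checked with a short simulation on a zero-filled buffer. This is a sketch of the arithmetic only: it works on a local bytearray, the helper names are made up, and each loop iteration stands in for one request that increments a little-endian dword by one.

```python
import struct


def increment_dword(mem, offset):
    """One simulated request: add 1 to the little-endian dword at offset."""
    (value,) = struct.unpack_from("<I", mem, offset)
    struct.pack_into("<I", mem, offset, (value + 1) & 0xFFFFFFFF)


def build_bytewise(target):
    """Raise each byte in its own stage by incrementing at offsets 0, 1, 2, ..."""
    mem = bytearray(len(target) + 3)   # room for the overlapping dword writes
    requests = 0
    for offset, byte in enumerate(target):
        for _ in range(byte):
            increment_dword(mem, offset)
            requests += 1
    return mem, requests


wanted = bytes([0x46, 0x56, 0xC3])     # the three byte values of 0x4656C3
mem, staged_requests = build_bytewise(wanted)
single_value_requests = 0x4656C3       # count if done as one big number
```

Because no stage ever pushes a byte past 0xFF, there are no carries into neighbouring bytes, and the staged count is simply the sum 0x46 + 0x56 + 0xC3 = 351, compared with 4,609,731 for the single-value approach.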

Step Three. Code Execution
Now that the trampoline is ready we just need to jump to it, here is the code responsible for that:

EAX (the call destination address) is fully controlled by the value from the SMB packet (ESI+168h). This offset does change between different Vista versions. Here's the general schema of my attack:


That is all for now. Expect to see an updated Metasploit module in the near future that takes advantage of this technique.