3. Recap
In the previous post we wrote a dummy shellcode, first in C and then in assembly; we tested it and saw that the execution failed. In this post we will try to fix the problems of the dummy shellcode so that it executes successfully.
The two most common pitfalls that shellcode writers must address in their code are the null bytes problem and the addressing problem.
4. The null bytes problem
Very often the shellcode is injected into the vulnerable program through C string functions such as strcpy, so its content is treated as an array of char values terminated by the special NULL character ('\0'). When the shellcode contains a NULL byte, that byte is interpreted as the string terminator and the copy stops there, so only the part of the shellcode before the NULL byte reaches the target.
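To make the truncation concrete, here is a minimal Python sketch that mimics how a strcpy-style copy treats a payload. The helper name `strcpy_like` and the 7-byte sample are illustrative assumptions, not code from the original shellcode project; the sample is simply the encoding of `mov eax,0x4` followed by `int 0x80`.

```python
def strcpy_like(payload: bytes) -> bytes:
    """Mimic a C string copy: keep everything before the first NULL byte.

    Hypothetical helper, used only to illustrate the truncation.
    """
    end = payload.find(b"\x00")
    return payload if end == -1 else payload[:end]


# b8 04 00 00 00 = mov eax,0x4 ; cd 80 = int 0x80
sample = bytes.fromhex("b804000000cd80")
copied = strcpy_like(sample)
print(f"{len(copied)} of {len(sample)} bytes copied: {copied.hex()}")
# prints "2 of 7 bytes copied: b804"
```

Only the opcode and the first byte of the operand survive the copy, which is why the injected code cannot run.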
To fix the problem you must not use NULL bytes in the shellcode, but first you have to find them. The easiest way to spot them is to use the objdump tool.
The output of the objdump -d hello -M intel command is the following:
hello:     file format elf32-i386

Disassembly of section .text:

08048080 <_start>:
 8048080:  b8 04 00 00 00    mov eax,0x4
 8048085:  bb 01 00 00 00    mov ebx,0x1
 804808a:  b9 a4 90 04 08    mov ecx,0x80490a4
 804808f:  ba 0d 00 00 00    mov edx,0xd
 8048094:  cd 80             int 0x80
 8048096:  b8 01 00 00 00    mov eax,0x1
 804809b:  bb 05 00 00 00    mov ebx,0x5
 80480a0:  cd 80             int 0x80
hello is the binary containing the original shellcode; the null bytes are the 00 values in the byte column of the disassembly.
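The same check can be scripted. The sketch below is only an illustration: the bytes are copied by hand from the disassembly above, and the name `null_offsets` is a hypothetical helper, not part of objdump or of the original project.

```python
# Machine code of the original hello shellcode, one instruction per literal,
# transcribed from the objdump listing.
HELLO_TEXT = bytes.fromhex(
    "b804000000" "bb01000000" "b9a4900408" "ba0d000000" "cd80"
    "b801000000" "bb05000000" "cd80"
)


def null_offsets(code: bytes) -> list:
    """Return the offset of every NULL byte in `code`."""
    return [offset for offset, byte in enumerate(code) if byte == 0]


offsets = null_offsets(HELLO_TEXT)
print(f"{len(offsets)} NULL bytes out of {len(HELLO_TEXT)}, first at offset {offsets[0]}")
# prints "15 NULL bytes out of 34, first at offset 2"
```

Every `mov` of a small constant into a 32-bit register contributes three NULL bytes, because the immediate is stored on 4 bytes.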
The easiest way to fix the null byte problem is to xor the register with itself and then load the desired value into its smallest sub-register. For example, the instruction:
mov eax,0x4
will be replaced by:
; eax is now full of zeroes
xor eax,eax
; move 4 into the al register only, not into the whole eax
mov al,0x4
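The effect on the machine code can be sketched in a few lines of Python. The opcodes used (`b8` for `mov eax, imm32`, `31 c0` for `xor eax,eax`, `b0` for `mov al, imm8`) are the ones visible in the objdump listings of this section; the function names are hypothetical and the sketch only covers the eax register.

```python
import struct


def mov_eax_imm32(value: int) -> bytes:
    """Encode `mov eax, value`: opcode b8 plus a 4-byte little-endian immediate."""
    return b"\xb8" + struct.pack("<I", value)


def xor_then_mov_al(value: int) -> bytes:
    """Encode `xor eax,eax` followed by `mov al, value`.

    Assumption: value is between 1 and 255; a zero immediate would
    reintroduce a NULL byte, and larger values do not fit in al.
    """
    if not 1 <= value <= 0xFF:
        raise ValueError("value must fit in one non-zero byte")
    return b"\x31\xc0\xb0" + bytes([value])


print(mov_eax_imm32(4).hex(" "))    # prints "b8 04 00 00 00"
print(xor_then_mov_al(4).hex(" "))  # prints "31 c0 b0 04"
```

The replacement is one byte shorter and contains no NULL byte, at the cost of one extra instruction.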
After changing all the faulty instructions, the new Hello World dummy shellcode has the following structure:
global _start

section .text

_start:
    ; execute write(1, "Hello World!\n", 13);
    xor eax, eax
    mov al, 0x4
    xor ebx, ebx
    mov bl, 0x1
    mov ecx, message
    xor edx, edx
    mov dl, 0xD
    int 0x80

    ; execute _exit(5);
    xor eax, eax
    mov al, 0x1
    xor ebx, ebx
    mov bl, 0x5
    int 0x80

section .data
message: db "Hello World!", 0xA
We can check again whether there are any NULL bytes in the new file. As you can see, there are none left:
helloNoNull:     file format elf32-i386

Disassembly of section .text:

08048080 <_start>:
 8048080:  31 c0             xor eax,eax
 8048082:  b0 04             mov al,0x4
 8048084:  31 db             xor ebx,ebx
 8048086:  b3 01             mov bl,0x1
 8048088:  b9 a0 90 04 08    mov ecx,0x80490a0
 804808d:  31 d2             xor edx,edx
 804808f:  b2 0d             mov dl,0xd
 8048091:  cd 80             int 0x80
 8048093:  31 c0             xor eax,eax
 8048095:  b0 01             mov al,0x1
 8048097:  31 db             xor ebx,ebx
 8048099:  b3 05             mov bl,0x5
 804809b:  cd 80             int 0x80
5. The addressing problem
The addressing problem is linked to the data used by the shellcode; in our case it is the string “Hello World!”. As you can see in the assembly code, the ecx register will contain the memory address of the message to write on the screen:
mov ecx, message
and it is translated by the assembler and the linker into the following instruction:
mov ecx,0x80490a0
where 0x80490a0 is a memory location computed statically at build time. When the shellcode is executed inside another process, that memory location will almost certainly contain something else. This is why, when we executed our shellcode (see the last screenshot of the previous post), the output was some strange characters and not the expected string.
To summarize, the shellcode must compute the memory addresses of all its data dynamically. There are two ways to do this: the jump-call-pop technique and/or pushing the data onto the stack.
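A small Python sketch shows that the address really is baked into the instruction bytes. The 5 bytes are taken from the helloNoNull disassembly; the helper name `absolute_operand` is hypothetical.

```python
import struct

# mov ecx,0x80490a0 exactly as it appears in the helloNoNull disassembly
MOV_ECX = bytes.fromhex("b9a0900408")


def absolute_operand(instr: bytes) -> int:
    """Return the 32-bit little-endian immediate that follows the 1-byte opcode."""
    (address,) = struct.unpack("<I", instr[1:5])
    return address


print(hex(absolute_operand(MOV_ECX)))  # prints "0x80490a0"
```

Wherever these 5 bytes are copied, they keep pointing at 0x80490a0, while the message itself (which lives in the .data section) is not even part of the copied code.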
5.1 Compute memory location using “Jump Call Pop”
When the Intel call instruction is executed, the address of the next instruction is pushed onto the stack (the ESP register then points to it). So the trick is to place the data whose address you want to compute right after a call instruction and then pop that address from the stack. Here is some pseudocode:
functionThatWillUseData:
    ; the top of the stack (pointed to by ESP) holds the address of the data
    pop eax
    ; now eax contains the address of the data

    call functionThatWillUseData
data: db "blabla", 0xA
Now, we will rewrite our dummy shellcode to compute the address of the “Hello World” string. Here is the new version of the dummy shellcode:
global _start

section .text

_start:
    jmp short data

shellcode:
    ; execute write(1, "Hello World!\n", 13);
    xor eax, eax
    mov al, 0x4
    xor ebx, ebx
    mov bl, 0x1
    ; ecx receives the address of the message variable
    pop ecx
    xor edx, edx
    mov dl, 0xD
    int 0x80

    ; execute _exit(5);
    xor eax, eax
    mov al, 0x1
    xor ebx, ebx
    mov bl, 0x5
    int 0x80

data:
    call shellcode
    message: db "Hello World!", 0xA
At this point we have fixed all the problems, so the shellcode should execute successfully. If you want to know how to test it, go to the “Test your shellcode” paragraph of my previous post.
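One detail worth spelling out is why the code first jumps forward and then calls backward. A near call is encoded as the opcode e8 followed by a signed 32-bit displacement measured from the end of the call instruction. The Python sketch below illustrates this; the byte counts (2 bytes for `jmp short`, 25 bytes for the instructions between the `shellcode` and `data` labels) are my own tally and assume the assembler emits the usual short encodings, and `call_rel32` is a hypothetical helper.

```python
import struct


def call_rel32(call_offset: int, target_offset: int) -> bytes:
    """Encode a near call: e8 plus the displacement from the end of the 5-byte call."""
    displacement = target_offset - (call_offset + 5)
    return b"\xe8" + struct.pack("<i", displacement)


JMP_LEN = 2    # assumed size of `jmp short data`
BODY_LEN = 25  # assumed size of the code between `shellcode` and `data`

shellcode_off = JMP_LEN
data_off = JMP_LEN + BODY_LEN  # offset of the call instruction
message_off = data_off + 5     # return address pushed by the call

backward = call_rel32(data_off, shellcode_off)
forward = call_rel32(0, 30)    # hypothetical forward call skipping 25 bytes

print(backward.hex(" "))  # prints "e8 e2 ff ff ff"
print(forward.hex(" "))   # prints "e8 19 00 00 00"
```

A backward call has a negative displacement, whose upper bytes are ff instead of 00, so it does not bring the null bytes back. The return address pushed by the call is the offset right after it, which is exactly where the message starts.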
5.2 Compute memory location by pushing the data on the stack
The second technique (which will make your code smaller than the previous one) is to push the data you want to use directly onto the stack. If your data is longer than 4 bytes, you can split it into multiple chunks and push each of them; in our case we will split the data into 4 chunks.
Another point worth mentioning is that the chunks are pushed onto the stack in reverse order, because the stack grows from “up” to “down” (from higher memory addresses to lower memory addresses). To make things more complex, the order of the letters inside each chunk is also reversed when it is written as a hexadecimal constant, because Intel is a little-endian architecture. So, in the end, the data on the stack will look like this:
Higher memory
0\n
!dlr
oW o
lleH
Lower memory
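Computing these constants by hand is error-prone, so here is a minimal Python sketch that derives them. The function name `push_constants` is hypothetical, and the sketch assumes 32-bit pushes.

```python
def push_constants(data: bytes, width: int = 4) -> list:
    """Return the values to push, in push order, so that `data` ends up
    in reading order starting at ESP (hypothetical helper)."""
    chunks = [data[i:i + width] for i in range(0, len(data), width)]
    # Last chunk is pushed first; each chunk is read as a little-endian integer.
    return [int.from_bytes(chunk, "little") for chunk in reversed(chunks)]


for value in push_constants(b"Hello World!\n"):
    print(f"push 0x{value:08X}")
# prints:
# push 0x0000000A
# push 0x21646C72
# push 0x6F57206F
# push 0x6C6C6548
```

A short final chunk is padded with zeroes in its upper bytes. For a single byte below 0x80 this is harmless as long as the assembler picks the one-byte immediate form of push, but a 2- or 3-byte remainder would need extra care to keep the instruction free of NULL bytes.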
Here is the new version of our dummy shellcode:
global _start

section .text

_start:
    xor eax, eax
    mov al, 0x4
    xor ebx, ebx
    mov bl, 0x1

    ; push the string "Hello World!\n" followed by a NULL dword on the stack
    ; NULL dword
    xor ecx, ecx
    push ecx
    ; "\n"
    push 0x0A
    ; "!dlr"
    push 0x21646C72
    ; "oW o" (0x20 is the space character)
    push 0x6F57206F
    ; "lleH"
    push 0x6C6C6548
    ; ecx now points to the beginning of the string
    mov ecx, esp
    xor edx, edx
    mov dl, 0xD
    int 0x80

    xor eax, eax
    mov al, 0x1
    xor ebx, ebx
    mov bl, 0x5
    int 0x80
Again, the shellcode should execute successfully; you can go and test it.
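Before assembling, the pushes can be replayed in Python to check what the write system call will find at ESP. This is only a simulation under simple assumptions (32-bit pushes, memory modelled as a byte string); `stack_after_pushes` is a hypothetical helper, and the “o Wo” chunk is written with 0x20, the ASCII code of the space character.

```python
import struct


def stack_after_pushes(values) -> bytes:
    """Simulate 32-bit pushes and return the memory starting at the final ESP."""
    memory = b""
    for value in values:
        # Each push stores the value little-endian at a lower address,
        # that is, in front of what is already on the stack.
        memory = struct.pack("<I", value) + memory
    return memory


pushes = [0x00000000, 0x0000000A, 0x21646C72, 0x6F57206F, 0x6C6C6548]
memory = stack_after_pushes(pushes)
print(memory[:13])  # prints b'Hello World!\n'
```

The first 13 bytes at ESP are exactly the string passed to write, which matches the length 0xD loaded into dl.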
6. Main points to remember
Leaving the technical details aside, these are the basic steps to remember when writing a shellcode for Linux:
- Write the shellcode in C.
- Rewrite the shellcode in assembly using the same system calls made by the C version of the shellcode.
- Fix the addressing problem and the null bytes problem, if present.
- Test your shellcode.
All the source code can be found on GitHub.