dirtbox: a Highly Scalable x86/Windows Emulator

Black Hat USA 2010

Presented by: Georg Wicherski
Date: Thursday July 29, 2010
Time: 16:45 - 18:00
Location: Augustus 3+4
Track: Cloud Virtualization

The growing volume of new malware each day pushes anti-virus companies to their limits in handling these samples and creating new detection signatures. Network security providers and administrators face a related and growing problem: learning how samples affect the networks they try to protect. Dynamic analysis of malware by execution in sandboxes has been applied successfully in both scenarios; however, classic sandbox approaches suffer from severe scalability problems. Most of them rely on setting up a real target system, such as the Windows XP operating system, as a virtual machine with additional software that logs the actions performed. While these are easy to develop and set up, they require a separate virtual machine instance for each malware sample to be analyzed and therefore do not scale with today's malware growth.

Anti-virus vendors have tried to circumvent performance issues in file analysis by developing custom emulators that can be deployed on a customer end host for detection and do not require a whole operating system inside a virtual machine. However, these emulators are often software interpreters for the x86 instruction set and therefore run into execution speed limitations of their own. Additionally, they are detectable because they try to emulate every single Windows API but do so inaccurately.

dirtbox is an attempt to implement a highly scalable x86/Windows emulator that can be used both for simple malware detection and for detailed behavior analysis reports. Instead of emulating every single x86 instruction in software, malware instructions are executed directly on the host CPU, one basic block at a time. A disassembly pass over each basic block ensures that no privileged or control-flow-subverting instructions are executed. Virtual memory that is separated from the emulator's memory is implemented with special LDT segments and by switching segment selectors before executing guest instructions.
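The per-basic-block check described above can be illustrated with a toy scanner. This is a minimal sketch, not dirtbox's disassembler: the opcode tables cover only a handful of one-byte 32-bit x86 opcodes, and the names `scan_basic_block`, `PLAIN`, `TERMINATORS`, `FORBIDDEN` and `UnsafeInstruction` are hypothetical. It walks the bytes of a block, refuses instructions that must not run natively, and stops at the first control-flow instruction so the emulator can decide where execution continues.

```python
# Toy illustration only: a hand-written table for a few 32-bit x86 opcodes
# without prefixes or ModRM bytes. A real emulator needs a full disassembler.

# Harmless instructions: opcode -> total instruction length in bytes.
PLAIN = {0x90: 1}                                   # nop
PLAIN.update({op: 1 for op in range(0x40, 0x60)})   # inc/dec/push/pop r32
PLAIN.update({op: 5 for op in range(0xB8, 0xC0)})   # mov r32, imm32

# Control-flow instructions that end a basic block.
TERMINATORS = {0xC3: 1, 0xEB: 2, 0xE9: 5, 0xE8: 5}  # ret, jmp rel8, jmp rel32, call rel32

# Instructions this sketch refuses to execute natively (assumed policy).
FORBIDDEN = {0xF4: "hlt", 0xFA: "cli", 0xFB: "sti", 0xCC: "int3", 0xCD: "int imm8"}


class UnsafeInstruction(Exception):
    """Raised when a block contains an instruction that must not run natively."""


def scan_basic_block(code, start=0):
    """Return the offset of the instruction that terminates the block."""
    pos = start
    while pos < len(code):
        op = code[pos]
        if op in FORBIDDEN:
            raise UnsafeInstruction(f"{FORBIDDEN[op]} at offset {pos}")
        # Two-byte opcodes 0F 05 (syscall) and 0F 34 (sysenter) enter the kernel.
        if op == 0x0F and pos + 1 < len(code) and code[pos + 1] in (0x05, 0x34):
            raise UnsafeInstruction(f"kernel entry at offset {pos}")
        if op in TERMINATORS:
            return pos
        if op not in PLAIN:
            raise ValueError(f"opcode {op:#04x} not covered by this toy table")
        pos += PLAIN[op]  # skip the whole instruction, including immediates
    raise ValueError("block runs past the end of the buffer")
```

For example, the bytes `90 B8 01 00 00 00 50 C3` (nop; mov eax, 1; push eax; ret) form one block whose terminator sits at offset 7, while a block containing `F4` (hlt) is rejected before anything is executed.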

High performance is achieved because no instrumentation such as instruction rewriting is performed, disassembler results can be cached per basic block, and all execution happens in the same process without context switches.
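The caching idea can be sketched as a lookup table keyed by the block's start address, so that a block executed repeatedly is disassembled only once. The class `BlockCache` and the stand-in scanner `find_ret` are hypothetical names for illustration; the abstract does not describe how dirtbox organizes its cache.

```python
class BlockCache:
    """Remember the scan result for each basic block start address."""

    def __init__(self, scanner):
        self._scanner = scanner   # callable(code, start) -> block end offset
        self._blocks = {}         # start offset -> cached block end offset
        self.hits = 0
        self.misses = 0

    def lookup(self, code, start):
        if start in self._blocks:
            self.hits += 1
        else:
            # First visit: run the (expensive) scan once and keep the result.
            self._blocks[start] = self._scanner(code, start)
            self.misses += 1
        return self._blocks[start]


def find_ret(code, start):
    # Stand-in scanner (assumption): treats the first 0xC3 byte as the block end.
    return code.index(0xC3, start)
```

A cache like this is only valid while the guest code at the cached address is unchanged, so an emulator using it would also need some way to invalidate entries when the guest writes to its own code.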

The operating system is emulated at the syscall layer. While this layer is mostly undocumented and implementing it accurately is a challenging task in itself, the fact that no register changes are leaked from Ring 0 thwarts many detection techniques. To support the high-level APIs, the corresponding libraries are mapped directly into the virtual memory as well. Detection mechanisms such as:

Furthermore, process and heap layout resemble those of a genuine process, since the original ntdll PE loading and heap management code can be executed and used.
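Emulation at the syscall layer, as described above, amounts to routing each syscall number to a handler inside the emulator and returning an NTSTATUS value to the guest. The following is a minimal sketch under stated assumptions: `GuestKernel`, `open_handle` and `nt_close` are hypothetical names, and `SYS_CLOSE` is a placeholder number, not a real Windows syscall index. Only the three NTSTATUS constants are real values.

```python
STATUS_SUCCESS = 0x00000000
STATUS_NOT_IMPLEMENTED = 0xC0000002
STATUS_INVALID_HANDLE = 0xC0000008

SYS_CLOSE = 0x1000  # placeholder number, NOT a real Windows syscall index


class GuestKernel:
    """Toy syscall layer: a handle table, a dispatch table and a call log."""

    def __init__(self):
        self.handles = {}        # handle value -> emulated object
        self.next_handle = 4
        self.log = []            # raw material for a behavior report
        self.table = {SYS_CLOSE: self.nt_close}

    def open_handle(self, obj):
        handle = self.next_handle
        self.next_handle += 4
        self.handles[handle] = obj
        return handle

    def nt_close(self, handle):
        if handle not in self.handles:
            return STATUS_INVALID_HANDLE
        del self.handles[handle]
        return STATUS_SUCCESS

    def dispatch(self, number, *args):
        # Record every call, including ones the emulator cannot handle yet.
        self.log.append((number, args))
        handler = self.table.get(number)
        if handler is None:
            return STATUS_NOT_IMPLEMENTED
        return handler(*args)
```

Because every call passes through `dispatch`, the same table serves both purposes named in the abstract: answering the guest plausibly and recording what the sample did.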

Georg Wicherski

Georg Wicherski is a 20-year-old German university student with experience in botnet tracking and mitigation, malware analysis, and network engineering. He co-authored the Honeynet Project's paper "Know Your Enemy: Tracking Botnets" and two papers submitted to ESORICS and the DFN-Cert Workshop. He also published his paper "Medium Interaction Honeypots" on the Internet. Additionally, he presented at Black Hat Asia 2006 and the 23C3. His fields of interest besides malware and botnets include robotics engineering and programming as well as wireless appliances. He is the author of the mwcollectd medium-interaction honeypot and a nepenthes developer. He founded and now leads the mwcollect Alliance, a non-profit organization aiming at collecting malware, which now holds over 150000 unique in-the-wild samples. For more info: www.pixel-house.net & www.mwcollect.org

