Fuzzing, Binary Rewriting, System Programming, Emulation

Research Overview

My research focuses on improving modern fuzzing techniques: overcoming their current limitations and making new types of applications testable by automated security analysis tools.

Nils Bars

Profile Page

Master's Thesis: Uncovering Fuzzing Roadblocks in Widely Used Software

Fuzzing is a widely used, automated technique to uncover bugs in modern software. For fuzzing to be effective, it is paramount to generate inputs that cover as many code paths as possible in the target program. To this end, prior research has proposed many techniques, such as symbolic execution or taint tracking. While these techniques target specific data constructs such as checksums or magic values, it remains unclear whether other types of common roadblocks are blocking the fuzzers' progress.

In this thesis, you are asked to thoroughly analyze common obstacles blocking the AFL++ fuzzer during fuzzing of widely used software. To this end, you are provided with fuzzing results amounting to eight CPU years, but you will need to set up your own fuzzing experiments as well.

Requirements

Preferably Python or Rust
Development of high-performance, multithreaded code for data processing
Previous experience or interest in using tools for tracing binaries during execution

Master's Thesis: Auto Harnessing of Functions

Fuzzing has emerged as a powerful technique for identifying software vulnerabilities by generating diverse inputs to test target applications thoroughly. The goal of this master's thesis is to implement an auto-harnessing approach that identifies unconstrained input bytes passed to specific functions within the target application. This information allows us to fuzz small parts of the target codebase by creating a harness around these functions. A crucial challenge is ensuring that, when a bug is found in one of these functions, the approach can generate an input that reproduces the bug in the whole target application (not only in the function).

Summary

Develop a technique to determine which bytes are consumed by a function.
Implement an approach to detect which bytes processed by a function are unconstrained with respect to the target's input.
Implement a prototype of this approach based on an existing fuzzer, such as LibAFL or AFL++.
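
As a rough illustration of the first two summary items, the following Python sketch flips one input byte at a time and observes how the arguments reaching a function change. Everything here is hypothetical: `toy_target`, `parse_header`, and `influencing_bytes` are invented names, and a real implementation would more likely rely on taint tracking or fuzzer instrumentation than on this brute-force differential approach.

```python
from typing import Callable, List, Set, Tuple

Args = Tuple[int, ...]


def parse_header(length: int, flags: int) -> None:
    """Hypothetical function of interest inside the target."""


def toy_target(data: bytes, record: Callable[[Args], None]) -> None:
    """Hypothetical target: checks a magic value, then calls parse_header."""
    if len(data) < 6 or data[:2] != b"MZ":
        return  # inputs without the magic value never reach parse_header
    length = data[2] | (data[3] << 8)
    flags = data[4]
    record((length, flags))  # log the arguments passed to parse_header
    parse_header(length, flags)


def observe(data: bytes) -> List[Args]:
    """Run the target once and return all argument tuples seen at the function."""
    calls: List[Args] = []
    toy_target(data, calls.append)
    return calls


def influencing_bytes(seed: bytes) -> Tuple[Set[int], Set[int]]:
    """Classify byte offsets of `seed` by flipping each byte once.

    Returns (consumed, gating):
      consumed: flipping the byte changes the function's arguments
      gating:   flipping the byte prevents the function from being reached
    """
    baseline = observe(seed)
    consumed: Set[int] = set()
    gating: Set[int] = set()
    for i in range(len(seed)):
        mutated = bytearray(seed)
        mutated[i] ^= 0xFF  # flip all bits of one byte
        calls = observe(bytes(mutated))
        if baseline and not calls:
            gating.add(i)
        elif calls != baseline:
            consumed.add(i)
    return consumed, gating


seed = b"MZ\x10\x00\x01\xAA"
consumed, gating = influencing_bytes(seed)
print(sorted(consumed), sorted(gating))
```

In this toy setup, the bytes in `consumed` are candidates for being fuzzed through a function-level harness, while the bytes in `gating` must stay fixed so that a finding can still be replayed against the whole target. A single flip per byte can miss dependencies, so the result is only an approximation.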

Requirements

Knowledge of low-level concepts and C
Preferably basic knowledge of fuzzing

Master's Thesis: Using Dynamic Likely Invariants to Improve Input Diversity and Coverage during Fuzzing

Modern fuzzers, such as AFL (American Fuzzy Lop) and its variants, have revolutionized vulnerability discovery by efficiently exploring target programs using coverage feedback. However, achieving high code coverage and generating inputs diverse enough to thoroughly exercise the target remains challenging. The goal of this master's thesis is to implement a new feedback metric for fuzzing based on likely invariants [1,2,3] that are periodically updated during the fuzzing process (i.e., dynamic likely invariants). Each time an input violates a likely invariant, the input is retained, which enables the generation of more diverse inputs that better cover the input space. Notably, because they are updated dynamically, the likely invariants are iteratively tailored toward the invariant the program actually enforces, which may diverge from the one the developer anticipated. Where the two diverge, the program may exhibit undefined behavior (e.g., a crash).

This research aims to enhance the overall effectiveness and efficiency of the fuzzing process, leading to improved vulnerability detection.
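
The feedback idea described above can be illustrated with a minimal Python sketch. It assumes the simplest possible invariant form, a value range per program point, and widens the range immediately after a violation instead of re-inferring invariants periodically. `RangeInvariants`, `toy_target`, and `process` are hypothetical names for illustration only and are unrelated to the tools cited in [1,2,3].

```python
from typing import Dict, List, Tuple


class RangeInvariants:
    """Likely invariants of the form lo <= value <= hi, one per program point."""

    def __init__(self) -> None:
        self.ranges: Dict[str, Tuple[int, int]] = {}

    def learn(self, observations: Dict[str, int]) -> None:
        """Widen each range so that it covers the observed values."""
        for point, value in observations.items():
            lo, hi = self.ranges.get(point, (value, value))
            self.ranges[point] = (min(lo, value), max(hi, value))

    def violated(self, observations: Dict[str, int]) -> List[str]:
        """Return the program points whose current invariant is violated."""
        broken = []
        for point, value in observations.items():
            if point not in self.ranges:
                continue  # no invariant yet; plain coverage feedback covers this case
            lo, hi = self.ranges[point]
            if not lo <= value <= hi:
                broken.append(point)
        return broken


def toy_target(data: bytes) -> Dict[str, int]:
    """Hypothetical instrumented target: values observed at named program points."""
    return {"len": len(data), "first": data[0] if data else -1}


def process(inv: RangeInvariants, corpus: List[bytes], candidate: bytes) -> List[str]:
    """Retain `candidate` if it violates a likely invariant, then update the invariants."""
    obs = toy_target(candidate)
    broken = inv.violated(obs)
    if broken:
        corpus.append(candidate)  # the violation serves as the feedback signal
        inv.learn(obs)            # dynamic update of the likely invariants
    return broken


inv = RangeInvariants()
corpus: List[bytes] = [b"AB", b"ABCD"]
for seed in corpus:
    inv.learn(toy_target(seed))

for candidate in (b"ABC", b"ZB", b"MB", b""):
    process(inv, corpus, candidate)

print(corpus)
print(inv.ranges)
```

In this run, `b"ABC"` and `b"MB"` are discarded because they stay within the learned ranges, while `b"ZB"` and `b""` are retained because they each break at least one range.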

[1]: https://www.usenix.org/conference/usenixsecurity21/presentation/fioraldi

[2]: https://plse.cs.washington.edu/daikon/

[3]: https://www.sciencedirect.com/science/article/pii/S016764230700161X

Summary

Use Daikon to extract likely invariants of functions in the target.
Implement a technique to instrument the target such that invariant violations can be detected. This may happen via recompilation or JIT compilation, e.g., using LLVM patch points.
Extend LibAFL (or another fuzzer) to implement the idea, so that invariants are periodically re-determined and the target is instrumented accordingly during fuzzing.

Requirements

Low-level concepts, C and x86-64 Assembly
Preferably experience in Rust or interest in learning it
Previous experience or interest in using Daikon
Preferably basic knowledge of fuzzing

Master's Thesis: Improving Binary Fuzzing through Trace Re-Translation

Modern fuzzers, such as AFL++, rely on dynamic binary translators such as QEMU to fuzz targets that are only available as binaries. Such tools allow augmenting binaries during execution, mostly to inject custom code that implements efficient feedback mechanisms such as code coverage. Unfortunately, these capabilities come at the cost of performance degradation, since translating the target binary to an intermediate representation (IR), modifying this IR, and then translating it back to machine code is not free. However, unlike the common use case of emulators, fuzzers execute the target repeatedly and in rapid succession while applying (generally) relatively small modifications to its input. Consequently, in many cases, consecutive executions exercise the same code path. This insight allows us to focus on optimizing such individual hot paths.

In this thesis, your task is to implement a fuzzing-specific QEMU (user mode) optimization that exploits the abovementioned observation.
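
To make the observation concrete, the following Python sketch counts how often each basic-block trace occurs across a batch of inputs and reports the dominant one. It only measures the effect and does not implement any QEMU optimization; `toy_execute`, `hottest_trace`, the block IDs, and the `min_share` threshold are all hypothetical and chosen purely for illustration.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

Trace = Tuple[int, ...]


def toy_execute(data: bytes) -> Trace:
    """Hypothetical target: returns the sequence of basic-block IDs it executed."""
    trace = [0]                      # entry block
    if data[:1] == b"H":
        trace.append(1)
        if data[1:2] == b"I":
            trace.append(2)          # rarely reached block
    else:
        trace.append(3)              # common early-exit block
    trace.append(4)                  # exit block
    return tuple(trace)


def hottest_trace(inputs: Iterable[bytes], min_share: float = 0.5) -> Optional[Trace]:
    """Return the most frequent trace if it accounts for at least `min_share` of runs."""
    traces = [toy_execute(data) for data in inputs]
    if not traces:
        return None
    trace, hits = Counter(traces).most_common(1)[0]
    return trace if hits / len(traces) >= min_share else None


# Small mutations of one seed mostly end up on the same path.
batch = [b"xx", b"xy", b"Hx", b"zz", b"HI"]
print(hottest_trace(batch))
```

A trace that dominates in this way is a natural candidate for being translated and optimized as a single unit, with a fallback to regular block-wise translation whenever an execution leaves the trace.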

Requirements

Low-level concepts, JIT, C and x86-64 Assembly
Preferably Python or Rust
Development of high-performance code
Previous experience or interest in using tools for tracing binaries during execution

Master's Thesis: Hypervisor-based Cheats

Cheating has been an integral part of gaming since its earliest days. While it was originally limited to single-player games, it can now even cause economic damage when used in contests that award prize money. Professional tournaments provide hardware to participants in a controlled environment, but gamers can also participate remotely in competitions with relatively low payouts. Game developers try to counter cheating with so-called anti-cheat engines, such as VAC, PunkBuster, or BattlEye. However, these anti-cheat engines focus mainly on cheats running in user space or kernel space, and it is unknown whether they reliably detect hypervisor-based cheats or even attempt to do so.

In this thesis, you will develop a cheat that runs inside a virtual machine monitor (VMM) and can apply arbitrary modifications to the virtualized OS. Furthermore, you are asked to answer several research questions such as:

Do modern games employ any kind of countermeasures to prevent execution inside a VM?
Can such a cheat be implemented for a modern multiplayer game?
What methods can detect such cheats or the presence of a VMM?
In what ways can a VMM-based cheat be used to gain an advantage?
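
As a starting point for the detection question, one well-known indicator on x86 is the CPUID hypervisor-present bit, which Linux exposes as the `hypervisor` flag in `/proc/cpuinfo`. The Python sketch below checks for that flag on a Linux guest. The function names are hypothetical, and since a VMM can hide this bit, the check can only suggest the presence of a hypervisor and never rule it out.

```python
from typing import Optional


def has_hypervisor_flag(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line of a cpuinfo dump lists 'hypervisor'."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "hypervisor" in value.split():
            return True
    return False


def running_under_hypervisor(path: str = "/proc/cpuinfo") -> Optional[bool]:
    """Check the local machine; returns None if the file cannot be read."""
    try:
        with open(path, encoding="utf-8") as f:
            return has_hypervisor_flag(f.read())
    except OSError:
        return None


print(running_under_hypervisor())
```

More robust detection would have to combine several signals, which is part of what the research questions above ask you to investigate.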

Requirements

Low-level concepts, paging, C and x86-64 Assembly
Preferably Python or Rust
Working IOMMU setup with GPU passthrough and some games to test it
Preferably experience with QEMU system emulation